Triton Inference Server Feature Requests
Sample Java client
Have a sample Java client in TRTIS in addition to the Python and C++ clients already provided. This would support customers whose frontend is primarily in Java.
Created 21 Nov 19:51
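For illustration, a minimal sketch of what such a Java client could look like, assuming it talks to the server's default HTTP endpoint on localhost:8000 and the /api/health/ready route; the host, port, and path are assumptions about the existing HTTP API, not a proposed client library.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    // Rough sketch: check TRTIS readiness over plain HTTP from Java.
    // localhost:8000 and /api/health/ready are assumptions about the server's
    // default HTTP endpoint, not part of a proposed Java client API.
    public class TrtisReadyCheck {
        public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://localhost:8000/api/health/ready"))
                    .GET()
                    .build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            // HTTP 200 indicates the server is ready for inference requests.
            System.out.println("Server ready: " + (response.statusCode() == 200));
        }
    }

A full client would also need to build inference requests (tensor data plus request metadata) the way the existing Python and C++ clients do.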
Model Repository Integration with NGC
Be able to pull models directly from the NGC model repository instead of having to set up a TRTIS model repository in a persistent volume.
Created 21 Nov 19:49
Configurable Per-Model Scheduling Queues
Some users would rather have requests die in the queue than wait for an available instance to become ready for inference. TRTIS should let you configure each model's queue with a different length and behavior.
Created 21 Nov 19:48
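A hypothetical sketch of how such a per-model queue policy might look in a model's config.pbtxt; dynamic_batching and preferred_batch_size exist today, while the queue-policy fields (max_queue_size, default_timeout_microseconds, timeout_action) are illustrative names for the requested behavior, not current TRTIS options.

    # Hypothetical per-model queue policy (queue-policy field names are
    # illustrative, not existing TRTIS configuration options).
    dynamic_batching {
      preferred_batch_size: [ 4, 8 ]
      default_queue_policy {
        max_queue_size: 16                   # reject new requests beyond this depth
        default_timeout_microseconds: 50000  # drop requests queued longer than this
        timeout_action: REJECT               # fail fast instead of waiting for an instance
      }
    }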
Auto generated model config
TRTIS should automatically generate the model configuration file based on the model's metadata. Today this works for TensorFlow and TensorRT; enable this functionality for all model formats accepted by TRTIS.
Created 21 Nov 19:46
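For reference, this is roughly the config.pbtxt TRTIS expects today and would need to derive from the model's own metadata; the model name, platform, and tensor names/shapes below are placeholder values.

    # Example config.pbtxt that would need to be generated from model metadata
    # (name, platform, tensor names, and shapes are placeholders).
    name: "my_model"
    platform: "onnxruntime_onnx"
    max_batch_size: 8
    input [
      {
        name: "input__0"
        data_type: TYPE_FP32
        dims: [ 3, 224, 224 ]
      }
    ]
    output [
      {
        name: "output__0"
        data_type: TYPE_FP32
        dims: [ 1000 ]
      }
    ]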
Granular GPU Metrics
Metrics for the percentage of GPU cores used over time.
Created 21 Nov 19:31
CPU Metrics
Add basic CPU metrics for models run on the CPU in TRTIS, e.g. CPU usage and CPU memory.
Created 21 Nov 19:27
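A hypothetical sketch of how such metrics might surface on the server's existing Prometheus metrics endpoint; the metric names below are illustrative, not metrics TRTIS exposes today.

    # Hypothetical additions to the Prometheus metrics endpoint
    # (metric names are illustrative only).
    nv_cpu_utilization 0.42
    nv_cpu_memory_used_bytes 3221225472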
MXNet backend
Add MXNet support to TRTIS as a backend so native MXNet models can be served without any conversion through ONNX.
Created 21 Nov 19:15
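For illustration, a sketch of how a native MXNet model might sit in the model repository if such a backend existed; the layout follows the existing repository structure, the symbol/params file names follow MXNet's usual export format, and an "mxnet" platform value in config.pbtxt would be new.

    # Hypothetical model repository layout for a native MXNet model.
    model_repository/
      resnet50_mxnet/
        config.pbtxt
        1/
          model-symbol.json
          model-0000.params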