Configurable Per-Model | Triton Inference Server Feature Requests

Configurable Per-Model Scheduling Queues

Some users would rather have a request die in the queue than wait for a model instance to become available for inference. TRTIS should let you configure each model's scheduling queue with a different length and different behavior.
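To make the request concrete, here is a minimal Python sketch of the behavior being asked for. It is not TRTIS code and does not reflect any existing TRTIS option: `QueuePolicy`, `ModelQueue`, `max_queue_size`, and `timeout_s` are hypothetical names chosen for illustration. Each model gets its own queue, which rejects new requests once it is full and drops requests that have waited longer than the configured timeout.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List, Optional, Tuple


@dataclass
class QueuePolicy:
    """Hypothetical per-model queue settings (illustrative names only)."""
    max_queue_size: int = 0   # 0 means the queue is unbounded
    timeout_s: float = 0.0    # 0 means queued requests never expire


@dataclass
class ModelQueue:
    """One scheduling queue per model, governed by its own policy."""
    policy: QueuePolicy
    _items: Deque[Tuple[str, float]] = field(default_factory=deque)
    rejected: List[str] = field(default_factory=list)

    def enqueue(self, request_id: str, now: float) -> bool:
        """Admit a request, or reject it immediately if the queue is full."""
        limit = self.policy.max_queue_size
        if limit and len(self._items) >= limit:
            self.rejected.append(request_id)
            return False
        self._items.append((request_id, now))
        return True

    def dequeue(self, now: float) -> Optional[str]:
        """Return the oldest live request, dropping any that waited too long."""
        timeout = self.policy.timeout_s
        while self._items:
            request_id, enqueued_at = self._items.popleft()
            if timeout and now - enqueued_at > timeout:
                self.rejected.append(request_id)  # request died in the queue
                continue
            return request_id
        return None


# Example: a latency-sensitive model with a short queue and a short timeout.
queue = ModelQueue(QueuePolicy(max_queue_size=2, timeout_s=0.5))
queue.enqueue("a", now=0.0)
queue.enqueue("b", now=0.4)
queue.enqueue("c", now=0.4)      # queue is full, so "c" is rejected
next_request = queue.dequeue(now=0.6)  # "a" has expired and is skipped
```

Timestamps are passed in explicitly so the sketch stays deterministic. A real scheduler would read a monotonic clock and return an error to the client for every rejected request.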

Guest
Nov 21 2019
