Triton Inference Server Feature Requests

UVM to avoid oversubscribing GPU memory

We want to be able to load more models into TRTIS GPU memory, but we cannot reliably anticipate GPU memory consumption and fragmentation. Supporting UVM (Unified Virtual Memory) would solve this by allowing GPU memory to be oversubscribed, with the driver paging data between host and device on demand.

  • Guest
  • Aug 27 2019