NVIDIA Dynamo · control plane: KV-Aware Router, SLO Planner, Disaggregated Serving
Backend engine:
  vLLM: one device / instance
  SGLang: one device / instance
  TensorRT-LLM: NVIDIA GPU only
  OpenInfer runtime: routes the whole fleet
Your fleet
Same Dynamo control plane. Swap the backend and watch how much of the fleet it can actually reach.