SqueezeBits
[vLLM vs TensorRT-LLM] #13. Vision-Language Models
This article provides a comparative analysis of serving vision-language models on vLLM and TensorRT-LLM.
[vLLM vs TensorRT-LLM] #12. Automatic Prefix Caching
This article provides a comparative analysis of automatic prefix caching in vLLM and TensorRT-LLM.
[vLLM vs TensorRT-LLM] #11. Speculative Decoding
This article provides a comparative analysis of speculative decoding in vLLM and TensorRT-LLM.
[vLLM vs TensorRT-LLM] #10. Serving Multiple LoRAs at Once
This article provides a comparative analysis of the multi-LoRA serving capabilities of the vLLM and TensorRT-LLM frameworks.