Tag
2 verified claims carrying this tag. Each has 2+ primary sources and an HMAC-SHA256 signature.
vLLM introduced in: Kwon et al. 2023 — high-throughput LLM serving via PagedAttention.
468a9e2c047d8f2f · 2 sources · 100% confidence
Triton Inference Server publicly released: 2018-11 by NVIDIA — formerly TensorRT Inference Server.
78ec1ceed08a221c · 2 sources · 100% confidence
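The per-claim signatures above could be produced with an HMAC-SHA256 over the claim text. A minimal sketch, assuming a symmetric signing key and treating the claim string itself as the message; the actual key and the canonical claim serialization are not given here, so `key` and `sign_claim` are hypothetical:

```python
import hashlib
import hmac

def sign_claim(claim_text: str, key: bytes) -> str:
    """Return the hex HMAC-SHA256 tag for a claim record."""
    return hmac.new(key, claim_text.encode("utf-8"), hashlib.sha256).hexdigest()

def verify_claim(claim_text: str, key: bytes, signature: str) -> bool:
    """Constant-time comparison of a stored signature against a recomputed one."""
    return hmac.compare_digest(sign_claim(claim_text, key), signature)

# Hypothetical key for illustration only; not the system's real key.
key = b"example-signing-key"
claim = "vLLM introduced in: Kwon et al. 2023"
sig = sign_claim(claim, key)
assert verify_claim(claim, key, sig)
```

A full hex digest is 64 characters; the 16-character IDs shown above would then be truncations of a digest, not the complete signature.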