SGLang introduced in: Zheng et al. 2024 — efficient LLM serving with structured outputs.
Object
Zheng et al. 2024 — efficient LLM serving with structured outputs
Primary source · preprint · 2023-12-12
SGLang: Efficient Execution of Structured Language Model Programs — arXiv (Zheng, Yin, Xie, Sun, Huang, Yu, Cao, Kozyrakis, Stoica, Gonzalez, Barrett, Sheng / Stanford, UC Berkeley, et al.)