Conference Publications
“FaST-GShare: Enabling efficient spatio-temporal GPU sharing in serverless computing for deep learning inference,” J. Gu, Y. Zhu, P. Wang, M. Chadha, and M. Gerndt, in Proc. 52nd ICPP, 2023. DOI
“HAS-GPU: Efficient Hybrid Auto-scaling with Fine-grained GPU Allocation for SLO-aware Serverless Inferences,” J. Gu, P. Wang, I. Nunez, K. Huang, and M. Gerndt, in Proc. Euro-Par, 2025. DOI