FaST-GShare: Enabling efficient spatio-temporal GPU sharing in serverless computing for deep learning inference
Enable fine-grained spatio-temporal GPU sharing for DL inference in serverless computing.