
Towards Fast Setup and High Throughput of GPU Serverless Computing (2404.14691v1)

Published 23 Apr 2024 in cs.DC

Abstract: Integrating GPUs into serverless computing platforms is crucial for improving efficiency. However, existing GPU-enabled serverless platforms face two significant problems caused by coarse-grained GPU management: long setup time and low function throughput. To address these issues, we propose SAGE, a GPU serverless framework with fast setup and high throughput. First, exploiting the fact that the data a GPU function needs is knowable ahead of actual execution, SAGE devises a parallelized function setup mechanism, which parallelizes data preparation and context creation. In this way, SAGE achieves fast setup of GPU function invocations. Second, SAGE proposes a sharing-based memory management mechanism, which shares read-only memory and context memory across multiple invocations of the same function. This memory sharing avoids repeated data preparation and the unnecessary data-loading contention it causes. As a consequence, function throughput can be improved. Our experimental results show that SAGE reduces function duration by 11.3X and improves function density by 1.22X compared to the state-of-the-art serverless platform.
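The two mechanisms named in the abstract can be illustrated with a toy, CPU-only sketch. This is not SAGE's implementation: every name below (`create_context`, `load_read_only_data`, `SharedReadOnlyCache`, `setup_invocation`, `run_demo`) is hypothetical, and the sleeps only stand in for GPU context creation and data loading. The sketch assumes the two setup phases are independent, so they can overlap, and that a function's read-only data can be loaded once and reused by later invocations of the same function.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def create_context(delay=0.05):
    # Hypothetical stand-in for GPU context creation; the sleep models latency only.
    time.sleep(delay)
    return {"context": "ready"}


def load_read_only_data(name, delay=0.05):
    # Hypothetical stand-in for preparing read-only data (e.g. model weights).
    time.sleep(delay)
    return f"weights-for-{name}"


class SharedReadOnlyCache:
    """Loads each function's read-only data once and shares it across invocations."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}
        self.loads = 0  # counts how many real loads happened

    def get(self, name):
        # Holding the lock during the load keeps the sketch simple: concurrent
        # callers wait for the single load instead of repeating it.
        with self._lock:
            if name not in self._data:
                self._data[name] = load_read_only_data(name)
                self.loads += 1
            return self._data[name]


def setup_invocation(name, cache, pool):
    # Submit both setup phases at once so they run in parallel,
    # rather than preparing data only after the context exists.
    ctx_future = pool.submit(create_context)
    data_future = pool.submit(cache.get, name)
    return ctx_future.result(), data_future.result()


def run_demo(invocations=3):
    cache = SharedReadOnlyCache()
    with ThreadPoolExecutor(max_workers=2) as pool:
        results = [setup_invocation("resnet", cache, pool) for _ in range(invocations)]
    return cache.loads, results
```

Calling `run_demo(3)` sets up three invocations of the same hypothetical function while loading its read-only data only once. A real system would additionally have to manage GPU memory lifetimes and isolation between invocations, which this sketch does not model.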

Authors (9)
  1. Han Zhao (159 papers)
  2. Weihao Cui (11 papers)
  3. Quan Chen (91 papers)
  4. Shulai Zhang (6 papers)
  5. Zijun Li (14 papers)
  6. Jingwen Leng (50 papers)
  7. Chao Li (429 papers)
  8. Deze Zeng (3 papers)
  9. Minyi Guo (98 papers)
