
Fast yet Safe: Early-Exiting with Risk Control (2405.20915v2)

Published 31 May 2024 in cs.LG, cs.AI, cs.CV, and stat.ML

Abstract: Scaling machine learning models significantly improves their performance. However, such gains come at the cost of inference being slow and resource-intensive. Early-exit neural networks (EENNs) offer a promising solution: they accelerate inference by allowing intermediate layers to exit and produce a prediction early. Yet a fundamental issue with EENNs is how to determine when to exit without severely degrading performance. In other words, when is it 'safe' for an EENN to go 'fast'? To address this issue, we investigate how to adapt frameworks of risk control to EENNs. Risk control offers a distribution-free, post-hoc solution that tunes the EENN's exiting mechanism so that exits only occur when the output is of sufficient quality. We empirically validate our insights on a range of vision and language tasks, demonstrating that risk control can produce substantial computational savings, all the while preserving user-specified performance goals.
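The recipe the abstract describes, exiting at the first intermediate head that is confident enough, with the confidence threshold calibrated post hoc on held-out data, can be illustrated with a short sketch. This is a hypothetical illustration, not the authors' code: it assumes the EENN exposes per-layer softmax outputs, measures risk as disagreement with the full model's final prediction (one way to formalize "the output is of sufficient quality"), and uses a Hoeffding-style upper confidence bound, a standard distribution-free risk-control ingredient. All names are illustrative.

```python
# Minimal sketch of risk-controlled early exiting for an early-exit
# neural network (EENN). Not the authors' implementation; the function
# names and the Hoeffding bound are illustrative assumptions.
import numpy as np

def early_exit_predict(layer_probs, lam):
    """Exit at the first layer whose top softmax probability reaches `lam`;
    otherwise fall back to the final (full-model) layer."""
    for probs in layer_probs:              # each probs: (num_classes,) array
        if probs.max() >= lam:
            return probs.argmax()
    return layer_probs[-1].argmax()

def calibrate_threshold(cal_layer_probs, eps=0.05, delta=0.1):
    """Pick the smallest exit threshold whose upper-bounded risk stays
    below `eps` with probability >= 1 - delta.

    Risk here is disagreement between the early-exit prediction and the
    full model's own last-layer prediction on a held-out calibration set."""
    n = len(cal_layer_probs)
    slack = np.sqrt(np.log(1.0 / delta) / (2.0 * n))   # Hoeffding term
    best_lam = 1.0                                     # lam = 1: (almost) never exit early
    for lam in np.linspace(1.0, 0.0, 101):             # scan safe -> aggressive
        losses = [
            float(early_exit_predict(lp, lam) != lp[-1].argmax())
            for lp in cal_layer_probs
        ]
        if np.mean(losses) + slack <= eps:
            best_lam = lam                 # lower lam => earlier exits => faster
        else:
            break                          # stop once the risk bound is violated
    return best_lam
```

Lowering the threshold makes exits earlier and inference faster; the calibration loop keeps lowering it only while the bounded risk stays under the user-specified level, which mirrors the "fast yet safe" trade-off the abstract describes.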

Authors (7)
  1. Metod Jazbec (9 papers)
  2. Alexander Timans (8 papers)
  3. Tin Hadži Veljković (3 papers)
  4. Kaspar Sakmann (22 papers)
  5. Dan Zhang (171 papers)
  6. Christian A. Naesseth (28 papers)
  7. Eric Nalisnick (44 papers)
Citations (4)

