Papers
Topics
Authors
Recent
Assistant
AI Research Assistant
Well-researched responses based on relevant abstracts and paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses.
GPT-5.1
GPT-5.1 104 tok/s
Gemini 3.0 Pro 36 tok/s Pro
Gemini 2.5 Flash 133 tok/s Pro
Kimi K2 216 tok/s Pro
Claude Sonnet 4.5 37 tok/s Pro
2000 character limit reached

QuantTune: Optimizing Model Quantization with Adaptive Outlier-Driven Fine Tuning (2403.06497v1)

Published 11 Mar 2024 in cs.CV and cs.MM

Abstract: Transformer-based models have gained widespread popularity in both the computer vision (CV) and NLP fields. However, significant challenges arise during post-training linear quantization, leading to noticeable reductions in inference accuracy. Our study focuses on uncovering the underlying causes of these accuracy drops and proposing a quantization-friendly fine-tuning method, \textbf{QuantTune}. Firstly, our analysis revealed that, on average, 65\% of quantization errors result from the precision loss incurred by the dynamic range amplification effect of outliers across the target Transformer-based models. Secondly, \textbf{QuantTune} adjusts weights based on the deviation of outlier activations and effectively constrains the dynamic ranges of the problematic activations. As a result, it successfully mitigates the negative impact of outliers on the inference accuracy of quantized models. Lastly, \textbf{QuantTune} can be seamlessly integrated into the back-propagation pass in the fine-tuning process without requiring extra complexity in inference software and hardware design. Our approach showcases significant improvements in post-training quantization across a range of Transformer-based models, including ViT, Bert-base, and OPT. QuantTune reduces accuracy drops by 12.09\% at 8-bit quantization and 33.8\% at 7-bit compared to top calibration methods, outperforming state-of-the-art solutions by over 18.84\% across ViT models.

Definition Search Book Streamline Icon: https://streamlinehq.com
References (1)
Citations (1)

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

X Twitter Logo Streamline Icon: https://streamlinehq.com

Tweets

This paper has been mentioned in 2 tweets and received 0 likes.

Upgrade to Pro to view all of the tweets about this paper: