FP8 Formats for Deep Learning

(2209.05433)
Published Sep 12, 2022 in cs.LG

Abstract

FP8 is a natural progression for accelerating deep learning training and inference beyond the 16-bit formats common in modern processors. In this paper we propose an 8-bit floating point (FP8) binary interchange format consisting of two encodings - E4M3 (4-bit exponent and 3-bit mantissa) and E5M2 (5-bit exponent and 2-bit mantissa). While E5M2 follows IEEE 754 conventions for representation of special values, E4M3's dynamic range is extended by not representing infinities and having only one mantissa bit-pattern for NaNs. We demonstrate the efficacy of the FP8 format on a variety of image and language tasks, effectively matching the result quality achieved by 16-bit training sessions. Our study covers the main modern neural network architectures - CNNs, RNNs, and Transformer-based models, leaving all the hyperparameters unchanged from the 16-bit baseline training sessions. Our training experiments include large, up to 175B parameter, language models. We also examine FP8 post-training quantization of language models trained using 16-bit formats that resisted fixed-point int8 quantization.

Figure: Impact of casting a GPT-3 model to the E4M3 format on perplexity, without per-tensor scaling adjustments.

Overview

  • The paper investigates FP8, an 8-bit floating point format, evaluating its effectiveness in deep learning tasks compared to 16-bit formats like FP16 and bfloat16.

  • Two FP8 encodings, E4M3 and E5M2, are explored for different deep learning applications, highlighting the balance each strikes between dynamic range and precision (a short sketch quantifying this trade-off follows this list).

  • Empirical evaluation demonstrates that models trained with FP8 can achieve accuracy comparable to those trained with higher precision formats without adjusting model architectures or training parameters.

  • The study suggests that FP8 could reduce computational resource requirements, making advanced AI technologies more accessible and paving the way for future research into optimization techniques and hardware support.
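
To make the range/precision trade-off concrete, here is a minimal Python sketch (not from the paper) that derives each encoding's largest finite value and smallest subnormal from its exponent and mantissa widths; the function fp8_limits and its parameters are illustrative names of our own.

    def fp8_limits(exp_bits, man_bits, ieee_specials):
        """Largest finite value and smallest subnormal of an FP8 encoding."""
        bias = 2 ** (exp_bits - 1) - 1
        if ieee_specials:                      # E5M2: IEEE-style inf/NaN
            max_exp = (2 ** exp_bits - 2) - bias
            max_mantissa = 2 ** man_bits - 1   # every mantissa pattern is usable
        else:                                  # E4M3: only S.1111.111 is NaN
            max_exp = (2 ** exp_bits - 1) - bias
            max_mantissa = 2 ** man_bits - 2   # all-ones mantissa is reserved
        max_finite = 2.0 ** max_exp * (1 + max_mantissa / 2 ** man_bits)
        min_subnormal = 2.0 ** (1 - bias - man_bits)
        return max_finite, min_subnormal

    print("E4M3:", fp8_limits(4, 3, ieee_specials=False))  # (448.0, 2**-9 = 0.001953125)
    print("E5M2:", fp8_limits(5, 2, ieee_specials=True))   # (57344.0, 2**-16 ~ 1.5e-05)

Trading one mantissa bit for one exponent bit gives E5M2 a 128x larger maximum (57344 vs. 448) and a 128x smaller subnormal (2**-16 vs. 2**-9), at the cost of one bit of precision per value.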

FP8 Formats for Deep Learning: An Analysis

Introduction to FP8

In the realm of deep learning, the quest for efficiency and speed in training and inference processes is unending. The transition from 32-bit floating point (FP32) to 16-bit formats (FP16 and bfloat16) has been a significant step forward, enabling faster computations and lower memory requirements. Building upon this foundation, an 8-bit floating point format, FP8, emerges as the next frontier in precision reduction, offering potential for further accelerating deep learning tasks. This paper presents a comprehensive investigation into two FP8 encodings, E4M3 (4-bit exponent, 3-bit mantissa) and E5M2 (5-bit exponent, 2-bit mantissa), evaluating their effectiveness across a spectrum of deep learning applications, including large-scale language models and various image and language tasks.

FP8: The Proposed Formats

FP8 aims to strike a balance between computational efficiency and the precision necessary for deep learning tasks. The E4M3 format, recommended primarily for weight and activation tensors, extends dynamic range by not representing infinities and reserving only a single mantissa bit pattern for NaNs, which frees the remaining top-exponent encodings for normal values. E5M2, recommended for gradient tensors, adheres closely to IEEE 754 conventions, which keeps conversion between FP16 and FP8 straightforward.
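
As a rough sketch of the bit-level semantics described above (assuming the usual sign / exponent / mantissa layout with biases 7 and 15), a decoder for a single FP8 byte could look like the following; decode_fp8 and its arguments are names introduced here for illustration, not an API from the paper.

    def decode_fp8(byte, exp_bits, man_bits, ieee_specials):
        """Decode one FP8 byte to a Python float.

        E4M3: exp_bits=4, man_bits=3, ieee_specials=False (no infinities;
              only the all-ones exponent with all-ones mantissa is NaN).
        E5M2: exp_bits=5, man_bits=2, ieee_specials=True (IEEE 754-style
              infinities and NaNs occupy the all-ones exponent).
        """
        sign = -1.0 if (byte >> (exp_bits + man_bits)) & 1 else 1.0
        exp_field = (byte >> man_bits) & ((1 << exp_bits) - 1)
        mantissa = byte & ((1 << man_bits) - 1)
        bias = (1 << (exp_bits - 1)) - 1                # 7 for E4M3, 15 for E5M2

        if exp_field == (1 << exp_bits) - 1:            # all-ones exponent
            if ieee_specials:
                return sign * float("inf") if mantissa == 0 else float("nan")
            if mantissa == (1 << man_bits) - 1:         # E4M3: single NaN pattern
                return float("nan")
        if exp_field == 0:                              # zero or subnormal
            return sign * (mantissa / (1 << man_bits)) * 2.0 ** (1 - bias)
        return sign * (1 + mantissa / (1 << man_bits)) * 2.0 ** (exp_field - bias)

    # Largest finite E4M3 value, bit pattern S.1111.110:
    print(decode_fp8(0b0_1111_110, exp_bits=4, man_bits=3, ieee_specials=False))  # 448.0
    # Largest finite E5M2 value, bit pattern S.11110.11:
    print(decode_fp8(0b0_11110_11, exp_bits=5, man_bits=2, ieee_specials=True))   # 57344.0

Because E4M3 gives up infinities and all but one NaN pattern, its otherwise-reserved top exponent yields finite values up to 448; E5M2 keeps the IEEE behavior, so its top exponent encodes only infinities and NaNs.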

Empirical Validation

The paper's empirical evaluation shows that models trained with FP8 match the accuracy of those trained in higher-precision formats (FP16 or bfloat16) across a variety of tasks, without altering model architectures or training hyperparameters. Significant findings include:

  • FP8 training matches 16-bit baselines across the main modern architectures (CNNs, RNNs, and Transformer-based models) with all hyperparameters left unchanged from the 16-bit runs.

  • The results hold for large language models of up to 175B parameters.

  • FP8 post-training quantization succeeds on language models trained in 16-bit formats that resisted fixed-point int8 quantization.
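
The figure referenced earlier notes that casting a GPT-3 model to E4M3 without per-tensor scaling hurts perplexity, which is why FP8 use in practice pairs the cast with a per-tensor scale factor. The following is a simplified quantize-dequantize sketch of that idea, under assumptions made explicit in the comments (saturating overflow to the largest finite value, rounding via np.round); it is not the paper's implementation.

    import numpy as np

    def fp8_e4m3_simulate(x, scale=1.0):
        """Round-trip a tensor through a simulated E4M3 cast with a per-tensor scale.

        Illustrative only: scales the tensor, rounds each element to the
        nearest E4M3-representable value, saturates overflow at the largest
        finite magnitude (448) instead of producing NaN, and undoes the scale.
        """
        man_bits, bias = 3, 7
        max_finite = 448.0                        # bit pattern S.1111.110
        min_normal = 2.0 ** (1 - bias)            # 2**-6

        y = np.asarray(x, dtype=np.float32) * np.float32(scale)
        # Per-element exponent, floored at the smallest normal binade so that
        # tiny values share the fixed subnormal spacing of 2**-9.
        exp = np.floor(np.log2(np.maximum(np.abs(y), min_normal)))
        quantum = 2.0 ** (exp - man_bits)         # spacing of representable values
        y = np.round(y / quantum) * quantum
        y = np.clip(y, -max_finite, max_finite)   # saturate instead of NaN/inf
        return y / scale

    # A hypothetical tensor with one element outside E4M3's range:
    t = np.array([1e-4, 0.37, 900.0], dtype=np.float32)
    print(fp8_e4m3_simulate(t))               # ~[0, 0.375, 448]: 900 saturates
    print(fp8_e4m3_simulate(t, scale=0.25))   # ~[0, 0.375, 896]: scaled into range first

With scale=1.0 the out-of-range element collapses to 448, while pre-scaling by 0.25 keeps its quantized representation within E4M3's range and the subsequent rescale recovers 896; this is the role per-tensor scale factors play when casting tensors to FP8.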

Theoretical Implications and Practical Considerations

FP8's introduction and validation bear substantial implications. Theoretically, FP8 challenges the prevailing assumptions about the necessity of higher precision for deep learning training and inference. Practically, it heralds a shift towards more resource-efficient computing, potentially lowering the barriers to training larger models and democratizing access to state-of-the-art AI technologies.

Future Directions

The results open avenues for further research into optimization techniques tailored to FP8 and into its applicability across a wider range of models and tasks. Moreover, hardware support for both encodings, with their different exponent and mantissa widths, could catalyze FP8's adoption, making efficient AI more accessible.

Concluding Thoughts

The exploration of FP8 formats for deep learning makes a compelling case for precision reduction as a pathway to accelerating AI innovation. By investigating FP8's efficacy across a broad set of deep learning tasks, retaining IEEE 754 conventions where practical (E5M2) and relaxing them where dynamic range matters more (E4M3), this paper lays a foundation for the next step in AI computation. The convergence of theoretical motivation and empirical validation underscores FP8's potential to improve the efficiency and accessibility of AI technologies.
