Wanted: Floating-Point Add Round-off Error instruction (1603.00491v1)
Abstract: We propose a new instruction (FPADDRE) that computes the round-off error in floating-point addition. We explain how this instruction benefits high-precision arithmetic operations in applications where double precision is not sufficient. Performance estimates on Intel Haswell, Intel Skylake, and AMD Steamroller processors, as well as Intel Knights Corner co-processor, demonstrate that such an instruction would improve the latency of double-double addition by up to 55% and increase double-double addition throughput by up to 103%, with smaller, but non-negligible benefits for double-double multiplication. The new instruction delivers up to 2x speedups on three benchmarks that use high-precision floating-point arithmetic: double-double matrix-matrix multiplication, compensated dot product, and polynomial evaluation via the compensated Horner scheme.
Collections
Sign up for free to add this paper to one or more collections.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.