IEEE-754 precision-p base-β Arithmetic Implemented in Binary
Publication Type
Journal Article
Date Issued
2023-12-15
Language
English
Volume
49
Issue
4
Article Number
32
Citation
ACM Transactions on Mathematical Software 49 (4): 32 (2023-12-15)
Publisher DOI
Scopus ID
Publisher
Association for Computing Machinery
We show how an IEEE-754 conformant precision-p base-β arithmetic can be implemented based on some binary floating-point and/or integer arithmetic. This includes the four basic operations and square root subject to the five IEEE-754 rounding modes, namely the nearest roundings with roundTiesToEven and roundTiesToAway, the directed roundings downwards and upwards, as well as rounding towards zero. Exceptional values like ∞ or NaN are covered according to the IEEE-754 arithmetic standard. The results of the precision-p base-β operations are computed using some underlying precision-q binary arithmetic. We distinguish two cases. When using a precision-q binary integer arithmetic, the base-β precision p is limited for all operations by β^{2p} ≤ 2^q, whereas using a precision-q binary floating-point arithmetic imposes stronger limits on the base-β precision, namely β^{2p} ≤ 2^q for addition and multiplication, β^{2p} ≤ 2^{q-1} for division and β^{2p} ≤ 2^{q-3} for the square root. Those limitations cannot be improved. The algorithms are implemented in a Matlab/Octave flbeta-toolbox with the choice of using uint64 or binary64 as underlying arithmetic. The former allows larger precisions, the latter is advantageous for the square root, whereas computing times are similar. The flbeta-toolbox offers precision-p base-β scalar, vector and matrix operations including sparse matrices as well as corresponding interval operations. The base β can be chosen in the range β ∈ [2,64]. The flbeta-toolbox will be part of Version 13 of INTLAB [18], the Matlab/Octave toolbox for reliable computing.
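The precision limits stated in the abstract can be made concrete with a small calculation. The following Python sketch is not part of the flbeta-toolbox and does not use its interface. It only evaluates the inequalities β^{2p} ≤ 2^q, β^{2p} ≤ 2^{q-1} and β^{2p} ≤ 2^{q-3} with exact integers. The function names `max_precision` and `limits` are hypothetical, and taking q = 64 for uint64 and q = 53 for binary64 is an assumption made here for illustration.

```python
# Hypothetical helper, not the toolbox API: evaluates the abstract's inequalities.

def max_precision(beta: int, q: int, slack: int = 0) -> int:
    """Largest p >= 0 with beta**(2*p) <= 2**(q - slack), using exact integers."""
    bound = 2 ** (q - slack)
    p = 0
    while beta ** (2 * (p + 1)) <= bound:
        p += 1
    return p


# Exponent reductions for a floating-point backend, read off the abstract:
# 2^q for addition/multiplication, 2^(q-1) for division, 2^(q-3) for square root.
SLACK_FLOAT = {"add/mul": 0, "div": 1, "sqrt": 3}


def limits(beta: int) -> dict:
    """Maximal p per operation, assuming q = 64 (uint64) and q = 53 (binary64)."""
    out = {"uint64 (all ops)": max_precision(beta, 64)}
    for op, slack in SLACK_FLOAT.items():
        out[f"binary64 {op}"] = max_precision(beta, 53, slack)
    return out


if __name__ == "__main__":
    for beta in (2, 10, 64):
        print(beta, limits(beta))
```

Under these assumptions, base 10 would allow up to 9 digits with the integer backend and 7 digits with the floating-point backend for every operation, while base 2 would allow 32 bits with the integer backend and 26, 26 and 25 bits for addition/multiplication, division and square root with the floating-point backend.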
Subjects
base-β
double rounding
Floating-point arithmetic
IEEE-754
interval arithmetic
INTLAB
precision-p
DDC Class
510: Mathematics