###### Options

# IEEE-754 precision-p base-β Arithmetic Implemented in binary

Publikationstyp

Journal Article

Publikationsdatum

2023-12-15

Sprache

English

Enthalten in

Volume

49

Issue

4

Article Number

32

Citation

ACM Transactions on Mathematical Software 49 (4): 32 (2023-12-15)

Publisher DOI

Scopus ID

Publisher

Association for Computing Machinery

We show how an IEEE-754 conformant precision-p base-β arithmetic can be implemented based on some binary floating-point and/or integer arithmetic. This includes the four basic operations and square root subject to the five IEEE-754 rounding modes, namely the nearest roundings with roundTiesToEven and roundTiesToAway, the directed roundings downwards and upwards, as well as rounding towards zero. Exceptional values like ∞ of NaN are covered according to the IEEE-754 arithmetic standard. The results of the precision-p base-β operations are computed using some underlying precision-q binary arithmetic. We distinguish two cases. When using a precision-q binary integer arithmetic, the base-β precision p is limited for all operations by β2p ≤ 2q, whereas using a precision-q binary floating-point arithmetic imposes stronger limits on the base-β precision, namely β2p ≤ 2q for addition and multiplication, β2p ≤ 2q-1 for division and β2p ≤ 2q-3 for the square root. Those limitations cannot be improved. The algorithms are implemented in a Matlab/Octave flbeta-toolbox with the choice of using uint64 or binary64 as underlying arithmetic. The former allows larger precisions, the latter is advantageous for the square root, whereas computing times are similar. The flbeta-toolbox offers precision-p base-β scalar, vector and matrix operations including sparse matrices as well as corresponding interval operations. The base β can be chosen in the range β [2,64]. The flbeta-toolbox will be part of Version 13 of INTLAB [18], the Matlab/Octave toolbox for reliable computing.