Summer of Cod: libmath

• Subject: Summer of Cod: libmath
• From: "S. Gilles" <sgilles@xxxxxxxxxxxx>
• Date: Fri, 11 May 2018 03:32:46 -0400
• To: myrddin-dev <myrddin-dev@xxxxxxxxxxxxxx>

As part of the Summer of Cod, I'm working on libmath, a library to
implement some IEEE-754 functions (cos, sin, sqrt, &c). A first
milestone has been pulled. Here's what we've got:

- some initial IEEE-754 functions: exp, expm1, sqrt, trunc, ceil,
floor, rn, scalB,

- some utility functions: algorithms for efficiently and accurately
computing sums, evaluating polynomials, and performing fused

- improved support for feature flag detection with mbld: we can
now check for fancy, non-universal CPU features like sse,

- an out-of-tree tool for comparing various IEEE-754 implementations:
http://repo.or.cz/fpmath-consensus.git

Right now, the algorithms are coming from the survey work of Muller,
and in particular I'm working through an article series by Tang
describing tabular + polynomial approximation methods: see
lib/math/references in the tree.

Based on some informal analysis, we're achieving accuracy at least
on par with glibc and musl, and better in some situations. Our speed
appears to be at least within an order of magnitude of glibc for
the software implementations, and where hardware implementations
exist we're even better.

==

Finally, here's something interesting I found out when implementing
sqrt. One of the best ways to calculate x = sqrt(a) is to use
Newton–Raphson to find a solution to f(x) = x² - a = 0. However,
this gives the iteration

xₙ₊₁ = xₙ - f(xₙ)/f'(xₙ) = (1/2)(xₙ - a/xₙ),

which involves expensive division by xₙ. If instead we compute the
inverse square root x = 1/sqrt(a) by solving 1/x² - a = 0, the
iteration becomes

xₙ₊₁ = xₙ - (xₙ⁻² - a)/(-2xₙ⁻³) = (1/2)(3xₙ - xₙ³a),

which only divides by the nice constant 2. So sqrt(a) is cheaply
computed as a·(1/sqrt(a)). This explains why so much research has
gone into optimizing N–R for *inverse* square root, resulting in
the famous Quake III snippet, for example.

--
S. Gilles