Reducing the number of trigonometric function calls?
While reading through the IDG paper and the
idg_cpu code I wondered if there is a fairly straightforward way of reducing the required number of
sin/cos calls in exchange for more FMA instructions...
Assuming that the frequency channels in the input data are equidistantly spaced, the phases that need to be applied to every visibility for every pixel in a subgrid (i.e. they depend on subgrid pixel, channel and time step) can be written as
phase0 + i_chan*d_phase
where phase0 and d_phase depend only on subgrid pixel and time step, but no longer on the channel. Sine and cosine of the total phase can then be computed via the angle addition theorems (boiling down to complex multiplication / FMA instructions) once the sines and cosines of both phase0 and d_phase are known.
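To make the idea concrete, here is a minimal Python sketch of the recurrence. It is not taken from idg_cpu: the function names `phasors_by_recurrence` and `max_abs_error` are made up for illustration, and it works in double precision on scalar inputs, whereas a real kernel would presumably be vectorised and use single precision.

```python
import math


def phasors_by_recurrence(phase0, d_phase, n_chan):
    """Return [(cos, sin)] of phase0 + i_chan * d_phase for i_chan in range(n_chan).

    Only two sin/cos pairs are evaluated; every further channel is obtained
    by one complex multiplication (the angle addition theorems).
    """
    c, s = math.cos(phase0), math.sin(phase0)      # phasor of phase0
    dc, ds = math.cos(d_phase), math.sin(d_phase)  # phasor of d_phase
    out = []
    for _ in range(n_chan):
        out.append((c, s))
        # cos(a + b) = cos(a)cos(b) - sin(a)sin(b)
        # sin(a + b) = sin(a)cos(b) + cos(a)sin(b)
        c, s = c * dc - s * ds, s * dc + c * ds
    return out


def max_abs_error(phase0, d_phase, n_chan):
    """Largest deviation of the recurrence from direct sin/cos evaluation."""
    err = 0.0
    for i_chan, (c, s) in enumerate(phasors_by_recurrence(phase0, d_phase, n_chan)):
        phase = phase0 + i_chan * d_phase
        err = max(err, abs(c - math.cos(phase)), abs(s - math.sin(phase)))
    return err
```

Per (subgrid pixel, time step) this needs two sin/cos pairs instead of one per channel, plus four multiplications and two additions per channel. Rounding errors accumulate along the channel axis, so `max_abs_error` is a quick way to see how far the recurrence drifts for a given number of channels; in single precision the drift would be correspondingly larger.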
The total computational cost of this approach should be lower, but I admit that there may be accuracy issues. Is this something you have already tried?