Reducing the number of trigonometric function calls?
While reading through the IDG paper and the idg_cpu code, I wondered if there is a fairly straightforward way of reducing the required number of sin/cos calls in exchange for more FMA instructions...
Assuming that the frequency channels in the input data are equally spaced, the phases that need to be applied to every visibility for every pixel in a subgrid (i.e. they depend on subgrid pixel, channel and time step) can be written as
phase0 + i_chan*d_phase
where both phase0 and d_phase depend only on subgrid pixel and time step, but no longer on the channel. The sine and cosine of the total phase can then be computed via the angle addition theorems (boiling down to complex multiplications / FMA instructions) once the sines and cosines of both phase0 and d_phase are known.
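To make the idea concrete, here is a minimal sketch in Python. It is not taken from idg_cpu: the function names and the test values are made up for illustration, and it runs in double precision, so it says little about how the error would behave in a single-precision kernel. It only shows that two sin/cos pairs plus one complex multiplication per channel reproduce the per-channel phasors.

```python
import cmath
import math


def phasors_direct(phase0, d_phase, n_chan):
    """Reference version: one sin/cos evaluation per channel."""
    return [cmath.exp(1j * (phase0 + i_chan * d_phase)) for i_chan in range(n_chan)]


def phasors_recurrence(phase0, d_phase, n_chan):
    """Two sin/cos evaluations in total, then one complex multiply per channel."""
    phasor = complex(math.cos(phase0), math.sin(phase0))
    step = complex(math.cos(d_phase), math.sin(d_phase))
    out = []
    for _ in range(n_chan):
        out.append(phasor)
        # Angle addition theorem written as a complex product:
        # cos(a + d) = cos(a)cos(d) - sin(a)sin(d)
        # sin(a + d) = sin(a)cos(d) + cos(a)sin(d)
        phasor *= step
    return out


# Hypothetical values for a single (subgrid pixel, time step) pair.
direct = phasors_direct(0.3, 0.01, 64)
recur = phasors_recurrence(0.3, 0.01, 64)
max_err = max(abs(a - b) for a, b in zip(direct, recur))
```

Here max_err measures how far the recurrence has drifted from the direct evaluation over 64 channels. Because each step reuses the previous result, rounding errors accumulate with the channel index; restarting the recurrence from a freshly computed sin/cos every few channels would be one way to bound that drift, at the cost of some extra trigonometric calls.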
The total computational cost of this approach should be lower, but I admit that there may be accuracy issues, since rounding errors accumulate from one channel to the next. Is this something you have already tried?