diff --git a/doc/erko_teaser_talks.txt b/doc/erko_teaser_talks.txt index c6c183887bae5894160d58f5c750671c6616c955..d878d06e3f98a84e2b8e2d11d2857cd9cf65d5d4 100644 --- a/doc/erko_teaser_talks.txt +++ b/doc/erko_teaser_talks.txt @@ -27,26 +27,153 @@ at university. 1) Teaser talk: Quantization in LOFAR2.0 Station Firmware * floating point - fixed point - integer - . 2**+127 -------------------------- 1. ------------------ 2**-127 - <n bit int> - . <n bit fxp> fraction only - <n bit fxp> with fraction - <n bit fxp> . scaled - . two complement, so range e.g. -8 to +7 for 4 bit value + <p bit exponent> + . max -----<w bit mantissa>----- 1. ------------------ min floating point + + <w bit int> integer + + . <w bit fxp> fixed point fraction only + <w bit fxp> fixed point with fraction + <w bit fxp> . fixed point scaled + + . format: - - unsigned : u(w, p) - - signed : s(w, p) - -* Operations (*, +) cause bit growth - - rounding (to remove LSbits) - . truncation: int(x), // (-7 // 6 = -2, 7 // 6 = 1) - . half away: python2, matlab - . half up - . half to even: python3, SDPFW - - clipping or wrapping (to remove MSbits) - . intermediate beamlet sum in BF uses wrapping - . final subband output and beamlet output use clipping + - w = number of bits including sign bit + - p = number of bits after point (p > 0) or before point (p < 0) + - p = 0 for integer + - u = unsigned : u(w, p) + - s = signed : s(w, p) + + + * two complement representation of signed numbers + + - E.g. range -8 to +7 for 4 bit value: + + -1 1111 + -2 1110 + -3 1101 + -4 1100 + -5 1011 + -6 1010 + -7 1001 + -8 1000 + +7 0111 + +6 0110 + +5 0101 + +4 0100 + +3 0011 + +2 0010 + +1 0001 + 0 0000 + + - E.g. range -4 to +3 for 3 bit value: + + -1 111 + -2 110 + -3 101 + -4 100 + +3 011 + +2 010 + +1 001 + 0 000 + + - MSbit is the sign bit + - Sign extension of MSbit + -1 +1 +3 bit 111 001 +4 bit 1111 0001 +8 bit 11111111 00000001 + + . overflow wrapping, e.g. with 4 bit: + 7 + 1 = -8 + -8 - 1 = +7 + -(-8) = -8, negating most negative + + . product has 2w bits, if most negative * most negative is excluded then 2w - 1 bits: + + -8 * -8 = 64 = 01000000 = largest positive requires 2w bits + 7 * +7 = 49 = 00110001 = largest positive fits in 2w-1 bits + -8 * +7 = -56 = 11001000 = smallest negative fits in 2w-1 bits + +* Operations (*, /, +, -) in the digital processing cause bit growth + - Keep as many LSbits as needed to preserve sensitivity + - Keep as many MSbits as needed to preserve dynamic range + + Note: + . typically we can avoid using a / b in VHDL: + . do a * 1/b if b is a constant + . use bit shift if b is power of 2 + . similar for a % b to get remainder after division by b + side note: beware of a % b == 1, result is language dependent when a or b is negative + + +* Removing LSbits (unused resolution, insignificant bits) + - truncation by discarding LSbit is free in logic: + + . E.g. 4 bit value discard 2 LSbit + + -1 1111 -1 11 + -2 1110 -1 11 + -3 1101 -1 11 + -4 1100 -1 11 + -5 1011 -2 10 + -6 1010 -2 10 + -7 1001 -2 10 + -8 1000 -2 10 + +7 0111 +1 01 + +6 0110 +1 01 + +5 0101 +1 01 + +4 0100 +1 01 + +3 0011 0 00 + +2 0010 0 00 + +1 0001 0 00 + 0 0000 0 00 + + Corresponds to: + . shift right by n bits >> : -5 >> 2 = -2, 5 >> 2 = 1 + . divide // 2**n : -5 // 4 = -2, 5 // 4 = 1 + . floor: floor(-5 / 4) = -2, floor(5 / 4) = 1 + + ==> discarding b LSbits in logic = a >> b = a // 2**b in python + ==> a // b = floor(a / b) for integer a, b + + - rounding LSbits costs logic: + + rounding at whole + . ceil() + . int(x): int(-5 / 4) = -1, int(5 / 4) = 1 + ==> beware: int(x) != floor(x) + + + rounding at half + . rounding scheme for half, becomes more significant when fraction has fewer bits + . round() : language dependent + . half up + . half away: python2, matlab --> no DC bias + . half to even: python3, SDP Firmware --> no DC bias and no power bias + -1.500000 --> -2 + -0.500000 --> 0 + 0.500000 --> 0 + 1.500000 --> 2 + 2.500000 --> 2 + +* Removing MSbits (unused dynamic range) + - wrapping + . truncation by discarding MSbits is free in logic + ==> beware wrap(x, 4 bits) != x % 2**4 + + . wrap operator is distributive similar as the modulo operator: + (a + b) % n = [(a % n) + (b % n)] % n + ==> intermediate beamlet sum in SDP digital beamformer uses wrapping + + - clipping + . clipping to min and max costs logic + ==> subband output for uses clipping + ==> final beamlet output in SDP output to CEP use clipping + +* Reuse standard requantization component + - supports removing MSbits and LSbits + - to ensure same Q scheme is used consistently throughout the SDP firmware + * SDP signal path - Task: Preserve sensitivity of the ADC input and maintain sufficient dynamic range @@ -56,6 +183,7 @@ at university. . beamlets (BF processing gain) - BST - beamlet output (8 bit samples to CEP) + - Show where Q is applied, always same component is used for Q - Figure of internal signal levels . dBFS . SNR, P_quant @@ -63,6 +191,52 @@ at university. . coherent input (sine), incoherent input (sky noise, weak astronomical signal burried in noise) + Received signal at ADC consists of (q = one quantization step): + + . RFI, typically narrow band + . sky noise >> q + . receiver noise < sky noise + . quantization noise = 0.29 q + . astronomical signal << q, buried in the sky noise + + The sky noise sets the nominal input level for the ADC --> sigma_sky = 1 bit is enough for 2% SNR degredation + + The RFI sets the maximum input level for the ADC --> -50 dBFS margin to avoid overflow (clipping) + + Used FPGA build in multipliers of 18b * 18b or 27b * 18b + + FIR filter + - Consists of multiply (real coefficients) accumulates (taps) + - DC gain of the filter is 1 so sum does not grow beyond 30b + - Treat the ADC data as integers and the coefficients as fixed point numbers between +-1 + - W_adc = 14b and W_coeff = 16b yields product of 30b + - Round 30b filter output to 14b + - Next step (FFT) can fit 18b input, so round to 18b to keep some more accuracy + + FFT + - Consists of multiply (complex twiddle factors) and accumulates (butterfly) + - Narrow band input (sine wave) has gain of 0.5 = -1b + - FFT has processing gain of sqrt(N_point) = 5b for N_point = 1024 + ==> Need 14b -1 + 5 = 18b to fit the subband samples + ==> internally the FFT calculates with 26 bits data to avoid accumulation of rounding errors + + Subband calibration using complex subband weigths + - Consists of complex multiply to calibrate fine gain and fine delay differences between antenna inputs + - Initially we used 18b subbands, but that yielded artifacts in the SST + - Now we keep 26b subbands until after applying the subband weigths ands the round to 18b + + Beamforming using complex weights and summation of N antenna inputs + - std level of coherent input increases with N_ant, e.g. RFI in the beam, the (weak) astronomical signal in the beam + - std level of incoherent noise increases with sqrt(N_ant), therefore we can round log2(sqrt(N_ant)) bits + + - Subband statistics + . relate SST level to subband level + - Subband weights + . calibrate for gain and phase (= fine delay) differences between signal inputs + . can also calibrate cross-polarization between X and Y per antenna (used in Disturb) + . no equalization accross the subband frequencies is done + - Beamlet weights + * Implementation details - Use separate function to do DFT for two real ADC inputs with complex FFT - Spectral inversion to have incrementing subband indexes and frequencies in all Nyquist zones @@ -72,6 +246,8 @@ at university. - Interally extra LSbit inside PFB and before applying the weights, see try_round_weight.py * Conclusion: + - int(), floor(), ceil(), round(), //, % differ in details and can depend on sign, try + to be sure per language - Fixed point arithmetic uses less FPGA resources (multipliers, RAM, logic) than floating point, but requires carefull bookkeeping or the fixed point position in the FW implementation.