diff --git a/applications/lofar2/doc/prestudy/station2_sdp_transient_buffer.txt b/applications/lofar2/doc/prestudy/station2_sdp_transient_buffer.txt
index 3ef9ffe37fbb4b309c88355bf14dac4a442656a3..7b11f321ca3c51170c2005b9d4d4002f59c736dd 100644
--- a/applications/lofar2/doc/prestudy/station2_sdp_transient_buffer.txt
+++ b/applications/lofar2/doc/prestudy/station2_sdp_transient_buffer.txt
@@ -9,6 +9,7 @@ Detailed design: Transient Buffer (TBuf) function for LIFT project
 5) TBuf ICD SC-SDP, SDPTR-SDPFW
 6) TBuf ICD STAT/SDP-CEP
 7) Transient detection (TDet) Design
+8) Planning
 
 References:
 
@@ -132,15 +133,32 @@ Design decision 16GByte DDR4 na L2SDP-854, 850
 
     read from DDR4 --> demux to
       . 1 MM
-      . 1 retrieve --> unpack 256b to 64b --> check CRC --> add output hdr --> dump
+      . 1 retrieve --> unpack 256b to 64b --> check CRC --> add output hdr
+                   --> pass on via ring -->
+                   --> dump via 10GbE
 
 - support MP on buffer state
   . signal input index
   . frozen, buffering, reading
   . start address (time), end address (time)
 
+- unb2c_test_ddr_16G resource usage
+  . git/hdl/boards/uniboard2c/designs/unb2c_test/revisions/unb2c_test_ddr_16G/unb2c_test_ddr_16G_resource_usage.jpg
+  . DDR4 max_burst_size = 64 --> 64 * 256b words = 64 * 32B = 2 kB
+  . M20K = 20 kbit = 2.5 kB
+  . per module:
+      wr_fifo  13 M20K
+      tech_ddr  9 M20K
+      rd_fifo   4 M20K
+      diag db   0 M20K
+      diag bg   0 M20K
+      --> Total 26 M20K/DDR4 module
+  . board common:
+      MMM : 69 M20K for Nios memory
+      ctrl: 42 M20K for MMAP ROM and 1GbE
+
 - Provide direct MM access interface to DDR4
-  . New access multiplexer component to interface with io_ddr with:
+  . New access multiplexer component (crossbar) to interface with io_ddr with:
     . write 12 signal input streams for TBuf recording + 1 MM write stream
     . read 1 stream for TBuf readout + 1 MM read stream
   . Write multiplexer for 12 + 1 = 13 inputs will take ~100 M20K,
@@ -162,6 +180,8 @@ Design decision 16GByte DDR4 na L2SDP-854, 850
     sequentially, rather than in parallel, and to still have full freedom of
     allocating memory space to signal inputs.
 
+- Apply TBuf freeze immediately when the freeze message is received
+
 - support partial dump
   . lightning >~ 1 s, cosmic ray >~ 1 ms
   . dump t0 - t1
@@ -170,16 +190,17 @@ Design decision 16GByte DDR4 na L2SDP-854, 850
 - packetize voor buffer write of na buffer read? --> voor
   . packtetize at 64b or 256b ?
   . 16b -> 64b packetize --> 64b --> 256b store
-  . data in buffer must have CRC --> 64b CRC ?
   . ddr page packet format: SSN + packed data + CRC
     - 8 KByte page to have integer number of pages (= slots) in 16G memory, so
       that DDR4-I can wrap without a gap or extend to DDR-II without a gap.
-    - 14b packed data
     - SSN = 64b = 8B
-    - CRC = 64b = 8B
-    - 8K - 8 - 8 = 8192 - 16 = 8176 B / 14b = 4672 samples per page
-    - 4672 * 5 ns = 23.36 us per page, so ~42.8 pages / ms
-
+    - 14b packed data
+      . nof_samples_per_page = (8192 - 8 - 8) B * 8b / 14b = 8176 * 8 / 14 = 4672
+      . 4672 * 5 ns = 23.36 us per page, so ~42.8 pages / ms
+    - CRC = 64b = 8B, use 64b CRC to match 64b words, no need to reduce stored
+      CRC to fewer bits, because nof_samples_per_page preferably fits a multiple
+      of 64b words.
+    - Memory storage overhead is 16 / 8192 = 0.2%
 
 - dp_offload_tx header is the same for all 12 signal inputs, only si differs,
   so create one header for all and modify si field to save logic and RAM
@@ -191,19 +212,6 @@ Design decision 16GByte DDR4 na L2SDP-854, 850
 - use SSN as timestamp, SSN = BSN * N_fft, so can be derived from bsn_source BSN,
   or do we need a dp_ssn_source.vhd?
 
-- unb2c_test_ddr_16G resource usage
-  . git/hdl/boards/uniboard2c/designs/unb2c_test/revisions/unb2c_test_ddr_16G/unb2c_test_ddr_16G_resource_usage.jpg
-  . per module:
-      wr_fifo  13 M20K
-      tech_ddr  9 M20K
-      rd_fifo   4 M20K
-      diag db   0 M20K
-      diag bg   0 M20K
-      --> Total 26 M20K/DDR4 module
-  . board common:
-      MMM : 69 M20K voor Nios memory
-      ctrl: 42 M20K voor MMAP ROM en 1GbE
-
 - store and send 14b packed data
   . so do not use 16b (with 2b sign extension), to optimize for memory usage and
     transport capacity (at the expense of requiring tools to observe the payload
@@ -217,8 +225,10 @@ Design decision 16GByte DDR4 na L2SDP-854, 850
   . max memory size 16GB
   . max payload size 8kB
     --> 16G / 8k = 2M packets --> log2(2M) = 21b
-  . use packet serial number, instead of sop, eop bit fields, to show progress of
-    the packet dump to CEP
+  . to detect missing packets and to show progress of read dump to CEP:
+    - sop, eop bit fields, or
+    - use packet serial number / total nof packets, or
+    - nof_packets_remaining to tell how many more packets will be dumped for this si.
   . allocate start_page/nof_pages per si to memory, wrap at max memory size
     circular buffer per si, wrap after nof_pages
     keep track of nof_recorded pages, when > nof pages then circular buffer is
@@ -233,8 +243,19 @@ Design decision 16GByte DDR4 na L2SDP-854, 850
 
   . FPGA_tbuf_alloc_start_page_RW [pn][si]   # in range 0:2M-1
     FPGA_tbuf_alloc_nof_pages_RW [pn][si]    # 0 = free, > 0 = in use
-  . FPGA_tbuf_record_RW [pn][si]   # True = start/continue, False = stop/freeze recording
-  . FPGA_tbuf_retrieve_timestamp_RW [pn][si]
+    or, if the allocation is fixed, then only MPs:
+
+    FPGA_tbuf_alloc_start_page_R [pn][si]    # in range 0:2M-1
+    FPGA_tbuf_alloc_nof_pages_R [pn][si]
+
+    * In LOFAR1 the LCU determines which si has to be read out towards CEP. The
+      TBB uP then ensures that the nof pages are sent.
+
+  . FPGA_tbuf_record_RW [pn][si]   # True = start/continue, False = stop/freeze recording immediately
+  . FPGA_tbuf_record_stop_timed_RW [pn][si]   # Stop recording at specified SSN time, not needed for raw data ???
+
+  . FPGA_tbuf_retrieve_inter_packet_gap_RW --> wait time between packets sent to CEP, in FPGA_tbuf_sample_period_R units
+    FPGA_tbuf_retrieve_timestamp_RW [pn][si]
     FPGA_tbuf_retrieve_nof_pre_pages_RW [pn][si]
     FPGA_tbuf_retrieve_nof_post_pages_RW [pn][si]
   . FPGA_tbuf_retrieve_enable_RW
@@ -245,7 +266,13 @@ Design decision 16GByte DDR4 na L2SDP-854, 850
   . FPGA_tbuf_output_hdr_udp_destination_port_RW
   . FPGA_tbuf_output_enable_RW
 
+  . FPGA_tbuf_memory_address_RW[pn]
+  . FPGA_tbuf_memory_read_nof_words_RW[pn] --> read nof words (256b) from FPGA_tbuf_memory_address_RW
+    FPGA_tbuf_memory_read_data_R[pn] --> read data results from FPGA_tbuf_memory_read_nof_words_RW
+  . FPGA_tbuf_memory_write_data_words_RW[pn] --> write data words (256b) to FPGA_tbuf_memory_address_RW
+
 - Monitor Points (MP):
+  . FPGA_tbuf_ddr4_present_R --> True if DDR4 memory is available, False if DDR4 calibration failed or DDR4 is not present
   . FPGA_tbuf_total_nof_pages_R --> 16GB / 8kB = 2M
   . FPGA_tbuf_page_size_R --> 8 kByte
   . FPGA_tbuf_nof_samples_per_page_R --> 4672   # = (8kB - 16) * 8b / 14b
@@ -254,7 +281,8 @@ Design decision 16GByte DDR4 na L2SDP-854, 850
   . FPGA_tbuf_page_period_R   # = FPGA_tbuf_sample_period_R * 23.36 us
 
   . FPGA_tbuf_recording_R [pn][si]
-  . FPGA_tbuf_retrieving_R [pn][si]
+  . FPGA_tbuf_retrieving_R [pn][si] --> boolean, or report remaining nof pages
+
   Maybe:
   . FPGA_tbuf_last_page [pn][si] --> index of last recorded page
   . FPGA_tbuf_last_ssn [pn][si] --> ssn of last recorded page
@@ -264,24 +292,55 @@ Design decision 16GByte DDR4 na L2SDP-854, 850
 
 6) TBuf ICD STAT/SDP-CEP
 
-- application header fields:
+
+- LOFAR1:
+  - 16b preamble = 0xA55A --> 8b marker + 8b version_id (as in beamlet packet)
+  . 8b station_id --> 16b station_id (as in beamlet packet)
+  . 8b rsp_id --> 8b gn_index
+  . 8b rcu_id --> 8b signal_input_index (0-191)
+  . 8b sample_freq in MHz --> 1b f_adc, sample period is 5 ns or 6.25 ns (as in beamlet packet)
+  . 32b seqnr --> not needed
+  . 32b time in seconds --> 64b SSN = Sample Sequence Number
+  . 32b samplenr in current sec --> 64b SSN = Sample Sequence Number
+    10b bandnr + 22b slicenr --> for spectral data, not needed
+  . 512b bandsel --> for spectral data, not needed
+  . 16b spare
+  . 16b header_crc --> not needed (also not in beamlet packet)
+  . payload 1948 bytes
+  . 32b payload_crc --> not needed (also not in beamlet packet)
+
+- beamlets application header fields (32 bytes = 4 * 64b)
+
+- tbuf application header fields (20 bytes + 4 reserved bytes = 24 bytes = 3 * 64b):
   . 8b marker
   . 8b version_id
   . 16b station_id
-  . 32b source_info
-    - 1b antenna_band_index
-    - 1b nyquist_zone_index
-    - 1b f_adc --> sample period is 5 ns or 6.25 ns
+  . 16b source_info (as in beamlet packet)
+    - 2b reserved
+    - 1b antenna_band_index (LB, HB)
+    - 2b nyquist_zone_index
+    - 1b f_adc --> sample clock rate, period is 5 ns or 6.25 ns
     - 1b memory_error --> based on DDR4 read CRC
-    - 5b sample_width --> 14b
-  . 8b signal_input_index
-  . 16b nof_samples_per_packet
-  . 24b packet serial number in current dump
-  . 24b total nof packets in current dump
+    - 4b sample_width --> 14b, where 16b is represented by 0
+    - 5b gn_index --> purpose fault analysis
+  . 32b reserved
+  . 8b signal_input_index --> 0..191
+  . 16b nof_samples_per_packet --> (8kB - 16) * 8b / 14b = 4672 (= 1022 words of 64b) --> ceil(log2(4672)) = 13b
+  . 24b nof_packets_remaining in current dump (log2(2M pages) = 21b) ???, to detect lost packets and progress
   . 64b SSN = Sample Sequence Number
-  No need for:
-    - 32b observation_id --> also not in LOFAR1
-    - 5b gn_index --> signal_input_index provides already all this information
+
+Not needed ???:
+  - 32b observation_id --> like for beamlets, not needed, also not in LOFAR1
+  - 1b udp_error --> based on ETH/IP/UDP CRC error in case LCU does UDP to TCP,
+                     not needed, because CRC error packets will be dropped
+  - header_crc (covered by eth crc)
+  - payload_crc (covered by eth crc and by memory_error bit)
+
+- headers: 14 + 20 + 8 + 24 = 66 bytes
+  crc: 4 bytes
+  data: 8kB - 16 = 8176 bytes
+  --> packet size = 66 + 8176 + 4 = 8246 bytes
+  --> packet overhead is (66 + 4) / 8246 = 0.85 %
 
 
 7) Transient detection (TDet) Design
@@ -301,3 +360,28 @@ Design decision 16GByte DDR4 na L2SDP-854, 850
   magnitude response doesn't have to be 0 at π, therefore they have increased
   bandwidths. So for odd length filters the useful bandwidth is limited to
   0 < w < π.
+
+
+8) Planning
+
+- ICD STAT-CEP --> tbuf packet format
+- ICD SC-SDP --> OPC-UA CP and MP
+- ICD SDPTR-SDPFW:
+  . Design TBuf control and readout
+  . MM registers
+- L5 SDPFW detailed design TBuf
+  . investigate LOFAR1 TBB, Alert firmware
+  . investigate io_ddr
+  . provide MM rd/wr interface to DDR4
+  . design 12 (signal inputs) + 1 (MM) DDR4 access crossbar
+  . estimate RAM usage
+  . tbuf FW block diagram
+  . transient data transport via ring
+  . transient data output via 10GbE
+  . identify VHDL test benches
+  . identify HW tests
+- Implementation/test effort SDPTR software for TBuf
+- Implementation/test effort SDPFW firmware for TBuf
+- Verify TBuf within SDP on HW
+- Integration effort with SC
+
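
Review note: the page and packet sizing quoted in the notes above (4672 samples per page, 23.36 us per page, 2M pages, 21b page index, 0.2 % storage overhead, 8246 byte packets with 0.85 % overhead) can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration under the sizes stated in the notes; the class name `TbufPageGeometry`, the function `packet_budget` and all field names are hypothetical and are not SDP firmware or register names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TbufPageGeometry:
    """Hypothetical helper; defaults are the sizes quoted in the design notes."""
    memory_bytes: int = 16 * 2**30   # 16 GByte DDR4 module
    page_bytes: int = 8 * 2**10      # 8 KByte page (= slot)
    ssn_bytes: int = 8               # 64b SSN per page
    crc_bytes: int = 8               # 64b CRC per page
    sample_bits: int = 14            # packed sample width
    sample_period_ns: float = 5.0    # 5 ns or 6.25 ns

    @property
    def payload_bytes(self) -> int:
        # Page size minus SSN and CRC words.
        return self.page_bytes - self.ssn_bytes - self.crc_bytes

    @property
    def nof_samples_per_page(self) -> int:
        # Number of whole packed samples that fit in the page payload.
        return (self.payload_bytes * 8) // self.sample_bits

    @property
    def total_nof_pages(self) -> int:
        return self.memory_bytes // self.page_bytes

    @property
    def page_index_bits(self) -> int:
        # Bits needed to address page indices 0 .. total_nof_pages - 1.
        return (self.total_nof_pages - 1).bit_length()

    @property
    def page_period_us(self) -> float:
        return self.nof_samples_per_page * self.sample_period_ns / 1000.0

    @property
    def storage_overhead(self) -> float:
        # Fraction of each page spent on SSN + CRC.
        return (self.ssn_bytes + self.crc_bytes) / self.page_bytes


def packet_budget(payload_bytes, eth=14, ip=20, udp=8, app=24, eth_crc=4):
    """Return (header bytes, packet bytes, overhead fraction) for one packet."""
    headers = eth + ip + udp + app
    size = headers + payload_bytes + eth_crc
    return headers, size, (headers + eth_crc) / size


geom = TbufPageGeometry()
headers, packet_size, packet_overhead = packet_budget(geom.payload_bytes)

print("samples per page :", geom.nof_samples_per_page)
print("page period [us] :", geom.page_period_us)
print("pages per ms     :", round(1000.0 / geom.page_period_us, 1))
print("total pages      :", geom.total_nof_pages, "->", geom.page_index_bits, "bit index")
print("storage overhead :", f"{geom.storage_overhead:.2%}")
print("packet size [B]  :", packet_size, f"(overhead {packet_overhead:.2%})")
```

Passing `sample_period_ns=6.25` gives the page period for the other sample clock mentioned in the header description; the page geometry itself does not change.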