diff --git a/libraries/technology/ip_agi027_xxxx/ram/README.txt b/libraries/technology/ip_agi027_xxxx/ram/README.txt
new file mode 100755
index 0000000000000000000000000000000000000000..86b2e05154011e1dacac7bdf9d579038d763e5da
--- /dev/null
+++ b/libraries/technology/ip_agi027_xxxx/ram/README.txt
@@ -0,0 +1,247 @@
+README.txt for $HDL_WORK/libraries/technology/ip_agi027_xxxx/ram
+VERSION 01 - 20231110
+
+Contents:
+
+1) RAM components
+2) ROM components
+3) Agilex7 IP
+4) Inferred IP
+5) Memory initialization file
+6) Implementation options (LUTs or block RAM)
+7) Synthesis trials
+8) Agilex7 issues
+9) References
+
+
+
+1) RAM components:
+
+  Available:
+    ip_agi027_xxxx_ram_cr_cw    = One read port with clock and one write port with clock, with separate address and same data width on both ports.
+    ip_agi027_xxxx_ram_crk_cw   = One read port with clock and one write port with clock, with separate address and different data widths on both ports.
+                                  The data port widths maintain a power of two ratio between them.
+    ip_agi027_xxxx_ram_r_w      = Single clock, one read port and one write port, with separate address and same data width on both ports.
+    ip_agi027_xxxx_ram_rw_rw    = Two read/write ports, both ports with the same clock, with separate address per port and same data width on both ports.
+
+  Unavailable:
+    ip_agi027_xxxx_ram_crw_crw  = Two read/write ports, each port with its own port clock, with separate address and same data width on both ports.
+                                  For the Agilex 7 this IP can only be generated with 'Emulate TDP dual clock mode'. What this entails is described
+                                  under '8) Agilex7 issues'. With this mandatory option enabled, this IP cannot be used as it was for previous technologies.
+    ip_agi027_xxxx_ram_crwk_crw = Two read/write ports, each port with its own port clock, with a power of two ratio between the port widths.
+                                  Not available, because the Agilex 7 does not support ratio widths in combination with true dual port mode.
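+
+  As a minimal sketch of the power of two ratio rule for ip_agi027_xxxx_ram_crk_cw, the Python helper below derives the read
+  port depth from the write port settings. The function names are hypothetical (they are not part of this library), and the
+  sketch assumes that both ports address the same total number of memory bits, as in the 32 x 32 bit write / 16 x 64 bit
+  read test condition from '7) Synthesis trials'.

```python
def is_power_of_two(n):
    """Return True if n is a positive power of two (1, 2, 4, 8, ...)."""
    return n > 0 and (n & (n - 1)) == 0


def rd_nof_words(wr_nof_words, wr_dat_w, rd_dat_w):
    """Derive the read port depth from the write port settings.

    Hypothetical helper, assuming both ports see the same total number
    of memory bits and that the width ratio must be a power of two.
    """
    wide = max(wr_dat_w, rd_dat_w)
    narrow = min(wr_dat_w, rd_dat_w)
    if narrow <= 0 or wide % narrow != 0 or not is_power_of_two(wide // narrow):
        raise ValueError("port width ratio must be a power of two")
    total_bits = wr_nof_words * wr_dat_w
    if total_bits % rd_dat_w != 0:
        raise ValueError("memory size is not a whole number of read words")
    return total_bits // rd_dat_w


if __name__ == "__main__":
    # Test condition from the synthesis trials: 32 words x 32 bit write port,
    # 64 bit read port.
    print(rd_nof_words(32, 32, 64))
```

+  With the synthesis trial settings (32 write words of 32 bits, read width 64 bits) the helper returns 16 read words. A width
+  ratio of 3, such as 8 and 24 bits, is rejected.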
+
+
+2) ROM components:
+    ip_agi027_xxxx_rom_r_w = Not available and not needed, because the ip_agi027_xxxx_ram_r_w can be used for ROM IP by not connecting the
+                             write port. The IP could be created and then the vhd file can be derived from the generated HDL files and the
+                             existing ip_stratixiv_rom_r.vhd file.
+
+
+
+3) Agilex7 IP
+
+  The RAM IPs were ported manually from Quartus v19.4 for arria10_e2sg to Quartus 23.2 for agi027_xxxx by creating them in Quartus
+  using the same parameter settings, via one of two methods:
+
+  - method A:
+    . copy original ip_arria_e2sg_<ram_name>.vhd and ip_arria_e2sg_<ram_name>.ip files.
+    . rename ip_arria_e2sg_<ram_name>.ip and .vhd into ip_agi027_xxxx_<ram_name>.ip and .vhd (also replace name inside the .vhd file)
+    . open the .ip file in Quartus 23.2.0 build 94, set device family to Agilex7 and device part to AGIB027R31A1I1VB.
+      Let Quartus finish the automatic conversion to the "new" IP and note differences such as the version.
+    . then generate HDL (select VHDL for both sim and synth) using the Quartus tool or generate HDL in the build directory using the
+      terminal command generate_ip_libs <buildset> and finish to save the changes.
+    . compare the generated files to the copied .vhd file for version, library, generics and ports. Make adjustments if
+      necessary to make it work.
+    . also git commit the ip_agi027_xxxx_<ram_name>.ip to preserve the original in case it needs to be modified.
+
+  - method B:
+    . copy original ip_arria_e2sg_<ram_name>.vhd file.
+    . rename ip_arria_e2sg_<ram_name>.vhd into ip_agi027_xxxx_<ram_name>.vhd (also replace name inside the .vhd file).
+    . open ip_arria_e2sg_<ram_name>.ip file in Quartus 19.4.0 build 64. No device family and device part need to be set.
+    . also open Quartus 23.2.0 build 94, set device family to Agilex7 and device part to AGIB027R31A1I1VB.
+    . select the corresponding IP in the IP catalog in Quartus 23.2.0 and provide the filename as ip_agi027_xxxx_<ram_name>.ip
+      Let Quartus finish the conversion to IP and note differences such as the version.
+    . save the changes and then generate HDL (select VHDL for both sim and synth) using the Quartus tool or generate HDL in the build
+      directory using the terminal command generate_ip_libs <buildset> to finish it.
+    . compare the generated files to the copied .vhd file for version, library, generics and ports. Make adjustments if
+      necessary to make it work.
+    . also git commit the ip_agi027_xxxx_<ram_name>.ip to preserve the original in case it needs to be modified.
+
+  this yields:
+
+    ip_agi027_xxxx_ram_cr_cw.ip
+    ip_agi027_xxxx_ram_crk_cw.ip
+      is derived from the ip_arria10_e2sg_ram_crwk_crw by modifying it to feature a single read and a single write port,
+      and incorporating a dual-clock design with distinct clocks for reading and writing.
+    ip_agi027_xxxx_ram_r_w.ip
+    ip_agi027_xxxx_ram_rw_rw.ip
+      is derived from the ip_arria10_e2sg_ram_crw_crw, incorporating the modification to operate with a single clock.
+
+
+  The IP only needs to be generated with generate_ip_libs <buildset> if it needs to be modified, because the ip_agi027_xxxx_ram_*.vhd
+  directly instantiates the altera_syncram component. The buildset for the agi027_xxxx is iwave.
+
+  The instantiation is copied manually from the ip_agi027_xxxx_ram_*/ram_2port_2040/sim/ip_agi027_xxxx_ram_*.vhd and saved in the
+  ip_agi027_xxxx_<ram_name>.vhd file. So the generated HDL files are no longer needed, because they can easily be derived
+  from the IP file and the files will be generated in the build directory (under iwave/qsys-generate/) when using the terminal command
+  generate_ip_libs <buildset>.
+
+  It appears that the altera_syncram component can be synthesized even though it comes from the altera_lnsim package,
+  which is a simulation package. This resembles how it worked for Stratix IV with altera_mf.
+
+
+
+4) Inferred IP
+
+  The inferred Altera code was obtained using template insert with Quartus 14.0a10. The IPs with different port widths,
+  like the ram_crk_cw, cannot be inferred from RTL code.
+  For the RAM the g_inferred generic is set to FALSE because the inferred instances do not yet support g_init_file.
+  It is possible to init the RAM using a function, e.g. see the README.txt for arria10. But this is probably not being
+  applied (for now) because it is easier to generate an IP and use the altera_syncram component. The inferred ones
+  require more effort to make them work, because the structure of the inferred Altera code should let Quartus know
+  to use a RAM block for implementation.
+
+
+
+5) Memory initialization file
+
+  Often referred to as a .mif file. It is used to initialize the content of memory blocks within the design, specifying
+  the data to be stored in each memory location. This file must be included in the Quartus project. During synthesis,
+  the tool uses this file to initialize the memory blocks in the design.
+  Supporting g_init_file requires first reading the file in a certain format, by providing a file path as a string,
+  which indicates the location of the file. This path is relative to the project folder in the build directory.
+  These files use the Intel hex standard and are word addressed (32 bits per address). For us an integer format or SLV
+  format with one value per line (line number = address) would be fine. Using SLV format is necessary if the RAM data
+  is wider than 32 bit, because the VHDL integer range is only 2**32. The tb_common_pkg has functions to read such a file.
+  Previously Quartus created a mif file from this when it inferred the RAM. However the UniBoard1 designs provided a mif
+  file that fits the RAM IP. Therefore it was easier initially to also use the RAM IP for Arria10, and this still
+  holds, also for the Agilex7. For RadioHDL a generic RAM init file format is preferable though. Currently the args tooling
+  with the command gen_rom_mmap.py (refer to [8]) is used to generate the register map as a text file, compress it, and
+  then create the corresponding .hex or .mif file from it. For other RAM initialization we generate a hex file ourselves, with
+  Python or with the Memory Initialization Tool that the Quartus Prime GUI provides. This tool allows you to
+  specify the initial contents of memories in your design visually.
+
+
+
+6) Implementation options (LUTs or block RAM)
+
+  The IP (and also the inferred) RAM can be set to use LUTs (MLAB), block RAM (M20K) or LCs, however this is not supported yet.
+
+  . For IP RAM this would imply adding a generic to set the appropriate parameter in the altera_syncram
+  . For inferred RAM this would imply adding a generic to be used for the syntype attribute.
+    For an example see the README.txt for arria10.
+
+
+
+7) Synthesis trials
+
+  All the synth .vhd files have been simulated and performed well.
+  The quartus/ram.qsf could be derived from the ip_arria10/ram/ folder and changed to only the following assignments:
+    set_global_assignment -name FAMILY "Agilex 7"
+    set_global_assignment -name DEVICE AGIB027R31A1I1VB
+    set_global_assignment -name LAST_QUARTUS_VERSION "23.2.0 Pro Edition"
+    set_global_assignment -name ERROR_CHECK_FREQUENCY_DIVISOR 256
+    set_global_assignment -name MIN_CORE_JUNCTION_TEMP "-40"
+    set_global_assignment -name MAX_CORE_JUNCTION_TEMP 100
+    set_global_assignment -name PWRMGT_VOLTAGE_OUTPUT_FORMAT "LINEAR FORMAT"
+    set_global_assignment -name PWRMGT_LINEAR_FORMAT_N "-12"
+    set_global_assignment -name POWER_APPLY_THERMAL_MARGIN ADDITIONAL
+  quartus_qsf_files = $HDL_WORK/libraries/technology/ip_agi027_xxxx/ram/quartus/ram.qsf could be added to the hdllib.cfg under
+  [quartus_project_file]. Use the terminal command quartus_config <buildset> to create/update all the project files for iwave.
+  The Quartus project ip_agi027_xxxx_ram.qpf from $HDL_BUILD_DIR/iwave/quartus/ip_agi027_xxxx_ram/ was used to verify that the block RAM IP
+  actually synthesises to the appropriate FPGA resources. The current version of the inferred RAM is verified for arria10. Use the Quartus
+  GUI to manually select a top level component for synthesis, e.g. by right clicking the entity vhd file in the file tab of the Quartus
+  project navigator window. For the (default) test condition the generics are set to 32 words memory size and 8 bits wide. They only differ
+  for crk_cw, where the generics are set to 32 words memory size for writing, 32 bits wide for the write port, 16 words memory size for
+  reading and 64 bits wide for the read port. Then check the resource usage in the synthesis and fitter reports.
+  The most important information from these reports is (found under Place Stage > Resource Usage Summary / Resource Utilization by Entity):
+  . for g_nof_words equal to 32 and for g_dat_w equal to 8:
+    . one M20k block ram is used, but it is not completely filled. 8 * 32 = 256 block memory bits.
+    . no M20k block ram is used. Instead, 256 MLAB memory bits are used along with combinational ALUT usage and 8 memory ALUT usage.
+  . for g_nof_words equal to 1024 and for g_dat_w equal to 20, exactly one M20k block ram is used and filled completely.
+    20 * 1024 = 20480 block memory bits.
+  . for g_wr_nof_words equal to 32, g_wr_dat_w equal to 32, g_rd_nof_words equal to 16, and g_rd_dat_w equal to 64, two M20k block RAMs are
+    used, but they are not completely filled. Only 32 * 32 = 1024 block memory bits are used. A reasonable explanation
+    for this is that the data width is greater than 40 bits, which is the maximum data width with this memory size for one block ram. [2]
+  . the total number of M20k blocks is 13272. Thus the total number of block memory bits available is 13272 * 20480 = 271810560 when used optimally.
+  . no dsp blocks are used.
+  . the total number of dsp blocks on the device is 8528.
+  . the dedicated logic registers are 5 of the primary type.
+  . the total logic registers are 1825600 for each type.
+  . the number of used LABs is 5 (4 logic/1 memory).
+  . the total number of LABs on the device is 91280.
+  . no ALMs needed for cr_cw, crk_cw and rw_rw.
+  . the ALMs needed [=A-B+C] for r_w is 13.
+  . the total number of ALMs on the device is 912800.
+  . due to a critical warning that occurred during synthesis of cr_cw (refer to [4]), it was identified that the issue arises when it uses dual
+    clock in conjunction with the read_during_write_mode_mixed_ports => "OLD_DATA". According to the altera_syncram user guide, this configuration
+    is only supported when the same clock is used for both ports. Currently, the parameter value is kept at "OLD_DATA", because setting it
+    to "DONT_CARE" for the ip_arria10_e2sg_ram_crw_crw eliminates the warning, but the regression test then fails. Implementing this
+    correctly across all technologies requires additional effort. It is possible that this configuration may be applied in the future.
+  . due to the same parameter an error occurs for rw_rw. As a result the parameter value is now set to DONT_CARE instead of OLD_DATA to resolve
+    the error. [3]
+
+
+
+8) Agilex7 issues
+
+  The ip_agi027_xxxx_ram_crw_crw and *_crwk_crw cannot be used (directly). The other .vhd synth files based on generated HDL files of the IPs did not
+  encounter any issues.
+
+  crw_crw (dual-clock-read-write port RAM):
+  -Cause:
+    Due to the error that occurs in the Quartus configuration (refer to [5]), the parameter "emulate TDP dual clock mode" needs to be enabled.
+    As a result, this synthesis file cannot easily be ported. While the IP can be configured successfully, it cannot be used without
+    a significant latency. This limitation arises because the VHDL synthesis code of this IP must use the TDP dual clock emulator, which consists
+    of two DCFIFOs and a single RAM block. However, it is preferable to resolve this issue at a higher layer where the implementation occurs.
+
+  -Explanation:
+    According to the user manual of the Agilex 7, when you engage the TDP dual clock emulator feature (refer to [1]):
+    . the clock connection to port A must be a slow clock (clock A).
+    . the clock connection to port B must be a fast clock (clock B).
+    . the clock frequency ratio of clock B divided by clock A must be greater than or equal to seven.
+    . port A and port B have different latencies. The emulated RAM can only be used with a minimum latency of five clock cycles (of clock A), which is significant.
+    . the latency for port A decreases as the difference between the two clock frequencies increases.
+    . the latency for port B is fixed to two clock cycles and the output registers are enabled for this configuration.
+    . the FIFO addresses clock domain crossing (CDC) issues for the control signals and serves as a temporary buffer for storing data before and after
+      being processed by the RAM block.
+    . the FIFO depth can be adjusted with the use of a generic.
+    . the FIFO depth must be a power of 2 and must exceed the clock frequency ratio (B/A) to ensure the proper functioning of the emulated TDP.
+
+  -Solution:
+    Therefore a newly created IP, ip_agi027_xxxx_ram_rw_rw, which is a single-clock dual-read-write RAM, is used instead of *_crw_crw, and
+    the solution is addressed at the higher-level layers where the implementation occurs. This is appropriate due to the structure of the HDL git repository.
+    For this new IP, tech_memory_ram_rw_rw is created, wherein rw_rw functionality is constructed for the previous technology identifiers using the crw_crw
+    IP synthesis files in only one clock domain by providing the same clock signal twice, so no new rw_rw IPs need to be generated.
+    The 'common_ram_rw_rw' and 'common_paged_ram_rw_rw' files had to be modified to facilitate the integration of this new RAM IP. Additionally, an extra
+    testbench is created to simulate the "paged" file by duplicating the '*_crw_crw' version. This adjustment was necessary because previously the
+    'common_(paged_)_ram_crw_crw' files were used underneath the 'rw_rw' files, and the usage has now been shifted to these files.
+
+  crwk_crw (dual-clock-read-write port with a power of two data width ratio):
+  -Cause:
+    Due to the errors that occur in the Quartus configuration (refer to [5], [6] and [7]), the ip_agi027_xxxx_ram_crwk_crw cannot be ported.
+    This IP has the same issue due to the clocking method as crw_crw, but also has additional issues due to the incompatibility of different data widths
+    with true dual port RAM.
+
+  -Solution:
+    To facilitate a specific aspect of the functionality provided by crwk_crw, specifically its integration into common_ram_cr_cw_ratio, a new IP,
+    ip_agi027_xxxx_crk_cw, is created instead of *_crwk_crw. This is a dual-clock simple dual port RAM with one read port and one write port. Unfortunately, there is no built-in
+    implementation or solution for achieving the same functionality as crwk_crw for backward compatibility with Arria10 in the Quartus tool.
+    This implies that a custom implementation must be created at higher-level layers to achieve this functionality.
+    For this new IP, tech_memory_ram_crk_cw is created, wherein crk_cw functionality is made compatible for the existing technology identifiers using the crwk_crw
+    IP synthesis files, by utilizing only the read port for one clock domain and only the write port for the other, eliminating the need to generate new rw_rw IPs.
+    The 'common_ram_cr_cw_ratio' file had to be modified to facilitate the integration of this new RAM IP. No additional testbench is created for simulation,
+    as there is also no testbench for the underlying 'common_ram_crw_crw_ratio' file that was utilized.
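+
+  The clock and FIFO depth rules of the TDP dual clock emulator listed under crw_crw '-Explanation' can be sketched as a small Python
+  check. This is only an illustration of the rules as stated in this README; the function names and the idea of passing the
+  clock frequencies in Hz are assumptions and not part of the Quartus tooling or of this library.

```python
def min_fifo_depth(clk_ratio):
    """Smallest power of two that strictly exceeds the clock ratio B/A."""
    depth = 1
    while depth <= clk_ratio:
        depth *= 2
    return depth


def check_tdp_emulator(clk_a_hz, clk_b_hz, fifo_depth):
    """Return a list of violated rules (an empty list means all rules hold).

    Hypothetical helper: clock A is the slow clock on port A, clock B is the
    fast clock on port B, fifo_depth is the (integer) emulator FIFO depth.
    """
    problems = []
    ratio = clk_b_hz / clk_a_hz
    if ratio < 7:
        problems.append("clock ratio B/A must be greater than or equal to seven")
    if fifo_depth <= 0 or (fifo_depth & (fifo_depth - 1)) != 0:
        problems.append("FIFO depth must be a power of 2")
    if fifo_depth <= ratio:
        problems.append("FIFO depth must exceed the clock ratio B/A")
    return problems


if __name__ == "__main__":
    # Example values chosen for illustration only.
    print(check_tdp_emulator(100e6, 700e6, 8))
    print(min_fifo_depth(8))
```

+  For example, with clock A at 100 MHz and clock B at 700 MHz the ratio is exactly seven, so a FIFO depth of 8 passes all
+  checks. At a ratio of eight the depth must grow to 16, because the depth has to exceed the ratio.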
+
+
+9) References:
+
+  [1] https://www.intel.com/content/www/us/en/docs/programmable/683241/23-2/true-dual-port-dual-clock-emulator.html
+  [2] https://www.intel.com/content/www/us/en/docs/programmable/683241/23-2/embedded-memory-configurations.html
+  [3] https://www.intel.com/content/www/us/en/docs/programmable/683241/23-2/mixed-port-read-during-write-mode.html
+  [4] Critical Warning(15003): "mixed_port_feed_through_mode" parameter of RAM atom gen_ip.u_altera_syncram|auto_generated|altera_syncram_impl1|ram_block2a5 cannot have value "old" when different read and write clocks are used.
+  [5] Error: In 'Clks/Rd, Byte En' tab. 'Emulate TDP dual clock mode' must be enabled if clocking method is 'Customize clocks for A and B ports' for Agilex 7 while using two read/write ports.
+  [6] Error: In 'Widths/Blk Type' tab, the valid ratio between widths of port A and port B is 1 for device family Agilex 7 while using two read/write ports.
+  [7] Error: In 'Widths/Blk Type' tab. 'Use different data widths on different ports' feature cannot be enabled as the valid ratio between port A and B must be 1 for Agilex 7 while using two read/write ports.
+  [8] ARGS tool script to generate fpgamap.py M&C Python client include file