From 0021dd314f6f166c80944d2759cd0b4f33b2b276 Mon Sep 17 00:00:00 2001
From: Eric Kooistra <kooistra@astron.nl>
Date: Thu, 5 Nov 2020 14:26:08 +0100
Subject: [PATCH] Copied from https://svn.astron.nl/UniBoard_FP7/UniBoard/trunk/Firmware/doc/howto/

---
 doc/how_to_write_VHDL.txt | 1462 +++++++++++++++++++++++++++++++++++++
 1 file changed, 1462 insertions(+)
 create mode 100644 doc/how_to_write_VHDL.txt

diff --git a/doc/how_to_write_VHDL.txt b/doc/how_to_write_VHDL.txt
new file mode 100644
index 0000000000..00d190ffc5
--- /dev/null
+++ b/doc/how_to_write_VHDL.txt
@@ -0,0 +1,1462 @@
+How to write VHDL coding style
+==============================
+
+Contents:
+1) Introduction
+2) Coding style
+3) RTL development views
+4) State machines
+5) Clocked and combinatorial process
+    a) RTL (Register Transfer Level)
+    b) No latches
+    c) Sensitivity list
+    d) Limited use of variables
+    e) Avoiding delta-delay problems with derived clocks
+    f) Default only use positive edge triggered flip-flops
+    g) Use reset to clear flip-flops only
+    h) Clock domain crossing
+    i) Register outputs
+6) No-variables method
+7) Gaisler two-process method
+8) Directory structure
+9) Naming conventions
+    a) Language key words
+    b) Files, entities, architectures
+    c) Library directory
+    d) Constants and generics
+    e) Variables
+    f) Signals
+    g) Types
+    h) Processes, instances, generates
+    i) Procedures, functions
+    j) Packages
+    k) Do not use other HDL reserved words
+10) Coding conventions
+    a) Default use descending bus order
+    b) Default use named association for the generic map and port map
+    c) Component instantiation
+    d) Only use IEEE.std_logic_1164 and IEEE.numeric_std libraries
+    e) Do not use the BLOCK statement
+    g) Do not use CONFIGURATIONs
+    h) Avoid embedded, tool-specific synthesis commands
+    i) Use enumerate values for FSM states
+11) File layout
+    a) Copyright statement
+    b) Purpose and description
+    c) Put entity, architecture and configuration in same file
+    d) Port order
+    e) Declaration order
+    f) Avoid mixing structural and RTL description code in the same architecture
+    g) Comment
+    h) Indent and alignment
+    i) TABs and spaces - Do not use TABs
+    j) Line length
+    k) Use separate line for each statement
+    l) Place each declaration on a separate line
+12) Standard packages
+13) Use of records
+14) Use of two-dimensional arrays
+15) Use of constants, generics and packages
+16) Functions
+17) Procedures
+18) Test benches
+    a) Verifying a DUT
+    b) Test bench interface packages and components
+    c) Multi-level test bench
+    d) Self checking and self stopping
+    e) Python test case using MM interface
+19) DP streaming component development example
+20) Simulation and synthesis debugging
+
+
+1) Introduction
+
+The basic idea of the VHDL coding style discussed here is that the structure
+and naming of the code closely reflects the task that the code performs. In
+summary the key aspects of this VHDL coding style are:
+
+  * a structured hierarchy of the design
+  * a structured way of naming all elements in the VHDL
+
+The VHDL coding style may be used as a cosmetic step during or at the end of a
+VHDL implementation. However the coding style can also be used as an integral
+part of the development. Not only after the coding has been done but right from
+the start and during all phases of the coding. In this way the coding style
+becomes more than just a matter of cosmetics, in fact it then is a valuable
+tool for developing robust and correct code.
+
+Something about VHDL and digital design:
+
+VHDL has its flaws (e.g. why is the INTEGER type restricted to <= 32 bit), but
+is a good language for describing and implementing digital design.
+Writing VHDL may look like programming software like C but it is fundamentally
+different. In particular writing VHDL differs from sequential programming
+because digital designs are:
+
+- massively parallel, e.g. each register, RAM or LUT (lookup table) acts as a
+  parallel entity
+- restricted by low level details like e.g. multiple clock domains and hardware
+  specific resources
+
+Some more aspects of digital design are:
+
+- Digital design often needs to cover the entire range from high level
+  application function to low level physical aspects of the targeted hardware.
+  Modular design and using hierarchy help to separate these two levels. However
+  the targeted hardware often changes with every new project, this means that
+  digital design will inherently always also involve low level implementation
+  developments.
+- The VHDL language offers many language constructs, but not all of these are
+  possible or wise to use for describing hardware.
+
+Using a proper coding style helps to cope with these aspects of VHDL and digital
+design and to make good code that will work on the hardware (in time and with
+quality). Note that many of the coding style aspects discussed here in fact
+apply to any programming language (C, MATLAB, Python, TCL, BASH, Erlang, ...).
+
+
+
+2) Coding style
+
+The subsequent sections describe the coding style. Following a coding style is
+not a burden or something to do afterwards as a code clean up session. Instead
+following a coding style is an important development method at any stage of the
+code writing development. Proper code without loose ends is important because
+in combination with some testing this ensures that the code is correct. Dirty
+code may pass the testing as well, but does not give the same level of
+confidence that the code is correct. It is often impossible to cover all
+conditions in the tests. Therefore proper coding is an integral part of robust
+development. At the end of the development you have polished code that
+performs what it needs to do and a test bench to confirm that it is correct.
+Similar to a piece of music, the code and the comment should have no false
+notes. The code should flow like a melody and all parts should be in harmony.
+
+At intermediate steps in the coding the sources should be committed in SVN.
+Such a step e.g. can be a cosmetic improvement, a correction or an added
+feature. The SVN commits can be seen as hooks that a mountaineer uses to secure
+the climb. The hooks (SVN commits) help to ensure that we develop in atomic
+steps. The hooks are useful for doing a 'diff' to highlight the changes or to
+see the progress by reading the commit comments. Typically we seldom need to
+fall back to the hooks (i.e. return to a previous state in SVN). The hooks (SVN
+commits) help to stay on the right track and at any time during the development
+they show a tractable trace towards the end result. By developing in small
+steps, each with clear added value, we break down a complex problem into an
+easier problem.
+
+Obsolete or redundant code like declarations, assignments, comments, etc have
+to be removed already during development, because they obscure the true working
+of the code. It is like with the construction of a building. During the work
+the place needs to remain tidy and once it is finished the scaffoldings are not
+left behind.
+
+Some important aspects of the coding style are:
+
+- Use symmetry and similarity, e.g. similar functions should look similar in
+  code. This not only applies within a file, but also between files and in
+  fact throughout the entire development directory.
+- Use consistent names for related designs, entities, functions, etc. E.g.
+  there are components common_round, common_resize, common_requantize. Then
+  it would be 'wrong' to call a new related component common_clipper instead
+  of common_clip.
+- The code complexity must be equal to or less than the complexity of the
+  problem. Make use of the redundancy and don't care situations in the problem
+  to simplify the code.
+- The comment must have added value and be phrased correctly.
+- Avoid redundant code, e.g.:
+  . Move the same code in both the 'then' and the 'else' of an 'if' statement
+    above the 'if'
+  . Consider making a component of it or a function
+- Use minimal interfaces, i.e. structure the design such that the blocks have
+  clear tasks to do without too many interdependencies.
+- Use interface names that describe the output of a task
+- Minimize the amount of control. Only gently touch the input so much as to get
+  to the required output. Too many control conditions are hard to cover in
+  testing and harder to understand. A sign of too much control can be e.g.:
+  . Any control at all (always consider whether you can do without)
+  . A conditional statement with more than 2 levels
+  . Avoid unnecessary restrictions, make use of redundancy and don't care
+    situations
+  . A power of 2 counter wraps automatically, no need to check for max
+  . Do not force invalid data to a value, let it hold its current value
+- The code should always look clear. If the code does not look in harmony or
+  looks 'ugly' then this can be improved for sure. The cause can be e.g.:
+  . The function interfaces have not been placed at the right place.
+  . The problem is not clearly understood yet.
+  The solution is e.g.:
+  . Determine what is the IO and what really needs to be done and when.
+  . Break the problem down into atomic steps that each reveal a clear action
+    for solving the problem.
+  . Find the resemblance in other problems that have already been solved,
+    identify what is the difference or perhaps there is no real difference.
+  . Reuse existing components.
+  . Try to let the default behaviour be the wanted behaviour.
+- Perhaps the only excuse for an 'ugly' solution is when the input is presented
+  in an 'ugly' way and you can not change this (e.g. because it comes from an
+  existing device). However this then may be handled by first defining a
+  component that 'beautifies' the input, so that the rest of the problem
+  solution can again be implemented in harmonious VHDL.
+- Use names that are accurate, i.e. not too specific and not too broad
+- From input to output the code names and functionality should gradually change.
+  E.g. similar to Escher's metamorphosis transformation drawings, e.g. with
+  birds on one side that gradually change into fishes on the other side. Note
+  e.g. the symmetry and similarity of the internal streaming interface names
+  from rx input via control to tx output in eth.vhd:
+
+    SIGNAL rx_adapt_sosi   : t_dp_sosi;
+    SIGNAL rx_crc_sosi     : t_dp_sosi;
+    SIGNAL rx_hdr_sosi     : t_dp_sosi;
+    SIGNAL rx_channel_sosi : t_dp_sosi;
+    SIGNAL demux_sosi_arr  : t_dp_sosi_arr(0 TO c_mux_nof_ports-1);
+    SIGNAL eth_rx_sosi     : t_dp_sosi;
+    SIGNAL rx_frame_*
+    SIGNAL eth_tx_sosi     : t_dp_sosi;
+    SIGNAL mux_sosi_arr    : t_dp_sosi_arr(0 TO c_mux_nof_ports-1);
+    SIGNAL tx_mux_sosi     : t_dp_sosi;
+    SIGNAL tx_hdr_sosi     : t_dp_sosi;
+
+
+
+3) RTL development views
+
+Basically there are four views on RTL code:
+
+a) as text
+b) as schematic (drawing flipflops and combinatorial logic like 'and','or', mux,
+   demux, +, * etc at RTL level or of components at structural level)
+c) as a timing diagram (like the Modelsim wave window)
+d) state machine drawing using state circles and arrows between them
+
+The end result is a text file, but for the development it is often useful to
+also make a schematic drawing or to draw a timing diagram to better understand
+the functions and their dependencies (latencies) in time. A state machine
+drawing is useful to ensure that the state machine is correct. Many engineers
+only use the text view as development view. However the other views can be
+quite helpful to understand and improve the text code.
+
+
+
+4) State machines
+
+There are two types of state machine: Mealy and Moore. The difference is that
+for a Moore state machine the outputs only depend on the states, whereas for a
+Mealy state machine the outputs also depend on the inputs. We typically only
+use the Mealy state machines.
+A state machine drawing is useful when designing a state machine. In the
+drawing each state is represented by a circle and the arrows indicate the
+condition for a state change and the effect of this change on the outputs. For
+each circle the sum of the arrow conditions should be '1' (i.e. TRUE).
+In the RTL code all outputs get a default value and a case statement lists
+the conditions and effects on the outputs per state.
+Note that in fact all RTL logic can be regarded as Mealy state machines, but
+for e.g. a counter it is less useful to view it as a state machine.
+
+
+
+5) Clocked and combinatorial process
+
+a) RTL (Register Transfer Level)
+The RTL logic functionality can be described in clocked processes (registers,
+flip flops) and combinatorial (transfer) processes. The convention is to
+clearly separate the clocked process from the combinatorial processes, whereby
+the clocked process only lists the q <= nxt_q register assignments. The nxt_q
+assignments are done in a separate combinatorial process.
+
+  p_clk : PROCESS(clk)
+  BEGIN
+    IF rising_edge(clk) THEN
+      q <= nxt_q;  -- q becomes its d input called nxt_q
+    END IF;
+  END PROCESS;
+
+  nxt_q <= f(...);  -- concurrent statement
+
+or:
+
+  p_comb : PROCESS(sensitivity list)  -- process block statement
+  BEGIN
+    nxt_q <= f(sensitivity list);
+  END PROCESS;
+
+Note that this scheme describes any logic function that we need to implement.
+Logic always consists of a clocked part and a combinatorial part (i.e. a Mealy
+machine). The path delay through the combinatorial part is what determines the
+maximum clock frequency. By adding more pipelining register stages the
+combinatorial paths can be shortened to increase the maximum clock frequency.
+
+Important clocking rules:
+- Do not use different clock edges. Within a clock domain one should only use
+  the rising_edge(clk).
+- Do not use data signals as clock. Two exceptions are at a single central
+  point or at the output pin of the FPGA. The data that is used for the clock
+  must come directly from a flipflop, because for a combinatorial output
+  different internal processing delays can cause glitches during the set up.
+- Do not gate clocks, because within an FPGA this creates a new clock and the
+  number of clock trees is limited. In an ASIC clock gating can be used to
+  save power, but it then needs to be done centrally and in a controlled way
+  to ensure that no glitches will occur due to the gating logic.
+- Use resets that are synchronized to their corresponding clock domain.
+
+b) No latches
+The logic must have no latches, we only use flipflops and registers. Therefore
+every conditional statement (IF, CASE) should address all conditions. In VHDL
+it is often possible to first assign a default value to the nxt_* signals and
+then in the rest of the process assign the conditional value(s). E.g.:
+
+  nxt_q <= q;
+  IF cnt=b THEN
+    nxt_q <= 0;
+  END IF;
+
+The Quartus synthesis report warns about inferred latches. These unintentional
+latches must be corrected in the RTL code.
+
+c) Sensitivity list
+Make sure that all signals that are read in a combinatorial process are also in
+the sensitivity list, because that is how the synthesis and hardware will
+interpret it. The signals that are read in a process are:
+- Signals that are part of a condition e.g. '>, <, =' in e.g. an IF statement.
+  The signal can be on either side of the condition.
+- Signals that are assigned to another signal or variable, i.e. at the
+  right of <= or :=.
+If a signal is used but not in the sensitivity list of the combinatorial
+process then the results can be different in simulation, because then the
+process will not be evaluated if that input changes.
+The Quartus synthesis report warns about incomplete sensitivity lists. These
+warnings must be corrected in the RTL code.
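+
+As an illustration of rules a), b) and c), here is a minimal sketch of a
+counter in this style. The entity example_cnt, its generic and its ports are
+hypothetical and not part of the firmware library. The default assignment in
+p_cnt covers all conditions, so no latch is inferred, and every signal that
+p_cnt reads is in its sensitivity list:
+
+  LIBRARY IEEE;
+  USE IEEE.std_logic_1164.ALL;
+  USE IEEE.numeric_std.ALL;
+
+  ENTITY example_cnt IS  -- hypothetical example entity
+    GENERIC (
+      g_cnt_w : NATURAL := 8
+    );
+    PORT (
+      rst     : IN  STD_LOGIC;
+      clk     : IN  STD_LOGIC;
+      cnt_clr : IN  STD_LOGIC;
+      cnt_en  : IN  STD_LOGIC;
+      cnt     : OUT STD_LOGIC_VECTOR(g_cnt_w-1 DOWNTO 0)
+    );
+  END example_cnt;
+
+  ARCHITECTURE rtl OF example_cnt IS
+    SIGNAL i_cnt   : UNSIGNED(g_cnt_w-1 DOWNTO 0);  -- register q
+    SIGNAL nxt_cnt : UNSIGNED(g_cnt_w-1 DOWNTO 0);  -- register d input
+  BEGIN
+    cnt <= STD_LOGIC_VECTOR(i_cnt);
+
+    -- Clocked process, only lists the register assignment
+    p_clk : PROCESS(rst, clk)
+    BEGIN
+      IF rst='1' THEN
+        i_cnt <= (OTHERS=>'0');
+      ELSIF rising_edge(clk) THEN
+        i_cnt <= nxt_cnt;
+      END IF;
+    END PROCESS;
+
+    -- Combinatorial process, all signals that are read are in the list
+    p_cnt : PROCESS(i_cnt, cnt_clr, cnt_en)
+    BEGIN
+      nxt_cnt <= i_cnt;  -- default: hold, so no latch is inferred
+      IF cnt_clr='1' THEN
+        nxt_cnt <= (OTHERS=>'0');
+      ELSIF cnt_en='1' THEN
+        nxt_cnt <= i_cnt + 1;  -- wraps automatically at 2**g_cnt_w
+      END IF;
+    END PROCESS;
+  END rtl;
+
+Leaving out e.g. cnt_en from the sensitivity list of p_cnt would still
+synthesize to the same logic, but in simulation nxt_cnt would then not be
+updated when only cnt_en changes.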
+
+d) Limited use of variables
+Variables are preferably not used, because they are not visible in the
+ModelSim Wave window, which makes it difficult to debug them with respect to
+signals.
+
+To describe digital logic variables are not needed, using only signals is
+sufficient. In fact using variables to describe digital logic is somewhat of a
+misconception, because variables are assigned sequentially in a process, whereas
+implemented logic is inherently parallel. For example a filter, a counter, but
+also every LUT or flipflop: in hardware they all run in parallel. Signals get
+their value at the end of a process, this reflects the parallel behaviour of
+logic. Hence all logic can be described using only signals. The sequential
+behaviour of variables resembles ordinary programming code that runs on a
+processor, hence it seems that variables are added to VHDL to pamper software
+designers and make VHDL look like e.g. C. In VHDL test benches variables can be
+used, e.g. to access a file.
+
+e) Avoiding delta-delay problems with derived clocks
+Assigning a clock to another signal does cause a delta delay. If data is passed
+on using these two (identical) clocks, then the simulation can mismatch the
+reality (synthesis), because the data is then captured one clock cycle later
+in simulation due to the delta-delay between the rising_edge() of the two
+clocks. To avoid these delta-delay problems all data processing logic
+should only use the derived clock. For reset signals it is less critical,
+because typically a design does not and should not rely on reset release being
+one cycle later or not.
+Passing a clock through hierarchy does not cause delta-delays. However, for a
+clock that is created inside a component and then used both internally and
+externally the clock needs to be assigned to the output port via an auxiliary
+signal like: clk_out <= i_clk_out, because an output port signal cannot be read
+in VHDL. To avoid the delta-delay problem this implies that the derived clock
+needs to be output as clk_out and then back input as clk_in. Both internally and
+externally the clk_in should then be used to clock the data.
+
+f) Default only use positive edge triggered flip-flops
+
+g) Use reset to clear flip-flops only
+Asynchronous and synchronous resets should only place flip-flops in a known
+start-up state. Resets should not be used for other purposes. It should be
+clear where a reset is generated and to which clock domain it belongs.
+
+h) Clock domain crossing
+Use the dedicated components from the common library to cross clock domains:
+
+- common_areset.vhd            -- for a reset
+- common_async.vhd             -- for a level signal
+- common_spulse.vhd            -- for a pulse signal
+- common_reg_cross_domain.vhd  -- for MM data vector
+- common_fifo_dc.vhd           -- for streaming data vector
+
+i) Register outputs
+Default register the outputs of a component. Typically there is then no need
+to register the inputs. The output registering between components eases timing
+closure within the components. Exceptions are e.g. the ready output for
+streaming flow control which typically needs to be a combinatorial output to
+maintain the ready latency.
+
+
+
+6) No-variables method
+
+The RTL scheme described at point 5) is referred to as the 'no-variables
+method'. The no-variables method was used in LOFAR and is also used in the
+UniBoard firmware. The no-variables method separates the clocked process that
+defines the registers from the combinatorial process that defines the function.
+There is only one clocked process per clock domain but there can be multiple
+combinatorial processes. The processes do not use variables.
+
+
+7) Gaisler two-process method
+
+The Gaisler two-process method can be regarded as a clever method that helps
+to bridge the gap between a software approach (more sequential thinking) and
+a hardware approach (more parallel thinking) towards developing logic. In the
+Gaisler two-process method there is:
+
+- one clocked process and
+- one combinatorial process.
+
+All local registers are grouped in a local record type such that the clocked
+process becomes quite simple and uniform and whereby the functional operation
+is in the combinatorial process:
+
+  TYPE t_reg IS RECORD
+    -- local registers (flip flops)
+  END RECORD;
+
+  CONSTANT c_reg_rst : t_reg := (<reset values for the t_reg fields>);
+
+  SIGNAL r, nxt_r : t_reg;
+
+  -- calling the local registers 'r' also fits the Gaisler style
+  -- or instead to fit the Gaisler style call 't_reg' --> 'reg_type'
+  -- or instead to fit the Gaisler style call 'nxt_r' --> 'rin'
+
+  -- Map t_reg outputs to entity outputs
+  <entity outputs> <= r outputs;
+
+  -- p_reg
+  r <= nxt_r WHEN rising_edge(clk);
+
+  p_comb : PROCESS(rst, r, <other inputs>)
+    VARIABLE v : t_reg;
+  BEGIN
+    -- Default
+    v := r;
+
+    -- Functionality
+    <Here the logical operations on r and the other inputs to determine
+     nxt_r and the combinatorial outputs are defined.>
+
+    -- Reset and nxt_register
+    IF rst='1' THEN
+      v := c_reg_rst;
+    END IF;
+
+    nxt_r <= v;
+  END PROCESS;
+
+Note that p_reg is similar to the RTL clocked process style defined in 5a,
+but the big advantage of the Gaisler style is that all local registers get
+nicely grouped into one record.
+
+The single combinatorial process uses next register value variable v : t_reg
+and some more auxiliary variables if necessary. The rst is applied at the end
+so that it acts like a synchronous reset and then v is assigned to nxt_r. The
+variable v gets initialized with the current register value r and the rest of
+the process code describes and defines the logical operation on r and the
+other process inputs to get the nxt_r. The nxt_r is only assigned to, so
+therefore it is not in the sensitivity list.
+
+The description in dp_packet_merge.vhd explains how to use v and r in p_comb.
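+
+To make the template concrete, here is a minimal sketch of a complete
+component in the Gaisler two-process style. It counts the valid input cycles.
+The entity example_gaisler_cnt and its ports are hypothetical and chosen only
+for illustration, this is not code from the firmware library:
+
+  LIBRARY IEEE;
+  USE IEEE.std_logic_1164.ALL;
+  USE IEEE.numeric_std.ALL;
+
+  ENTITY example_gaisler_cnt IS  -- hypothetical example entity
+    GENERIC (
+      g_cnt_w : NATURAL := 4
+    );
+    PORT (
+      rst     : IN  STD_LOGIC;
+      clk     : IN  STD_LOGIC;
+      in_val  : IN  STD_LOGIC;
+      out_cnt : OUT STD_LOGIC_VECTOR(g_cnt_w-1 DOWNTO 0);
+      out_val : OUT STD_LOGIC
+    );
+  END example_gaisler_cnt;
+
+  ARCHITECTURE rtl OF example_gaisler_cnt IS
+    TYPE t_reg IS RECORD
+      cnt : UNSIGNED(g_cnt_w-1 DOWNTO 0);  -- number of valid cycles
+      val : STD_LOGIC;                     -- registered copy of in_val
+    END RECORD;
+
+    CONSTANT c_reg_rst : t_reg := (cnt => (OTHERS=>'0'), val => '0');
+
+    SIGNAL r, nxt_r : t_reg;
+  BEGIN
+    -- Map t_reg outputs to entity outputs
+    out_cnt <= STD_LOGIC_VECTOR(r.cnt);
+    out_val <= r.val;
+
+    -- p_reg
+    r <= nxt_r WHEN rising_edge(clk);
+
+    p_comb : PROCESS(rst, r, in_val)
+      VARIABLE v : t_reg;
+    BEGIN
+      -- Default
+      v := r;
+
+      -- Functionality, v is only assigned and r is read
+      v.val := in_val;
+      IF in_val='1' THEN
+        v.cnt := r.cnt + 1;  -- wraps automatically at 2**g_cnt_w
+      END IF;
+
+      -- Reset and nxt_register
+      IF rst='1' THEN
+        v := c_reg_rst;
+      END IF;
+
+      nxt_r <= v;
+    END PROCESS;
+  END rtl;
+
+Because the reset is applied in p_comb, r only takes the c_reg_rst value at a
+rising edge of clk while rst is '1'.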
+
+The default/preferred coding style rule is that v is only assigned, so not
+read. This means that v does not occur in an if condition and also not right
+of :=. If v needs to be read then first use another variable to determine this
+intermediate combinatorial result based on the inputs and r, and then assign
+this variable to v. In this way the use of v is focused on defining the next
+input for r. Using nxt_r or v seems similar in this way, because in both
+schemes they are only assigned to. A difference is that the scheme with nxt_r
+uses an asynchronous rst, whereas the Gaisler scheme with v uses a synchronous
+rst.
+
+The default/preferred coding style is to treat each v field implementation for
+r in a separate section in the p_comb process. The alternative would be to
+have a separate section per input and per r field, but that seems less clean
+in general.
+
+The reset is applied in p_comb so it is used as a synchronous reset, for most
+functionality this is appropriate. Still the reset could instead be applied
+within p_reg to have an asynchronous reset. The advantage of an asynchronous
+reset is that it gets applied even without a clock. The advantage of a
+synchronous reset is that its timing with respect to the clock gets taken care
+of automatically like any other register data input.
+
+In 5d it is argued to minimize the use of variables. However within p_comb for
+the Gaisler method using variables is appropriate. In fact the variables are
+more used as auxiliary or temporary variables, therefore they can have
+insignificant names while the signals represent the function of the process and
+have the significant functional names. Using variables as temporary variables
+fits a sequential way of thinking and fits the VHDL definition of variables in
+a VHDL process, and it does not clash with the parallel nature of digital
+logic. Therefore the general rule for using variables seems to be to describe
+the true functionality in signals and use variables only as auxiliary variables
+to hold temporary results.
+
+A variable in a process can also be used to hold a dynamic semi-constant value
+that gets determined at process entry, but that does not get modified further
+on, e.g. as with v_siso_arr_* in dp_bsn_align.
+
+A mixed style is also possible whereby the t_reg record is used to have the
+clarity and ease of the single assignment clocked process, but whereby the
+logical functionality is still defined in one or more combinatorial processes
+and/or concurrent statements to reflect the parallel behaviour of digital
+logic.
+
+
+
+8) Directory structure
+
+It is proper to clearly separate the VHDL that describes logic that can run on
+hardware from the VHDL that describes the test bench.
+
+<module name>/build/sim/modelsim/
+             /build/synth/quartus/
+             /data/
+             /src/vhdl
+             /tb/vhdl
+
+
+
+9) Naming conventions
+
+All names and text must be in English.
+
+All names must reflect the use at the correct level. E.g. a general purpose
+counter can be called common_cnt, while its instance is called u_rx_cnt. It
+would be wrong to call the counter entity rx_cnt if it is in fact a general
+purpose counter.
+
+Names like tmp, help, cnt2 are bad. In general if you can not give an object
+(e.g. a signal, an entity) a proper name then that is a sign that you do not
+(yet) have a clear view on what your design should do and how it should work.
+Hence spending time on defining accurate names is an integral part of proper
+design.
+
+The structure of a design consists of components and processes. A general
+naming convention is to give interfaces within the structure a name and use
+this interface name as pre, middle or post fix in the corresponding signal
+names. Note that in this way the naming directly relates to creating a proper
+design structure.
+ + +a) Language key words +The convention for manually written code is to use capitals for all VHDL key +words and small characters for entities, signals etc. Underscores are used to +seperate parts of a name, so eg. my_signal_name, proc_name. + + +b) Files, entities, architectures +Hierarchy within a module is represented via the VHDL file naming. Typically +the entity and architecture are kept in the same file. + + - The file name is always equal to the entity name, with the suffix .vhd. + - If a specific architecture is used, architecture filename ends with _a.vhd. + - General package filenames end with _pkg.vhd. + - Component declaration package filenames end with _component_pkg.vhd. + +For the VHDL architecture names we use the following +category names within /UniBoard/Firmware: + + - pkg = VHDL package + - str = Structure architecture containing only components + - rtl = RTL architecture containing Register Transfer Level code (i.e. + processes) + - wrap = Wrapper structure + - stratix4 = Wrapper structure containing Stratix4 specific components + - beh = Architecture containing behavioral code (e.g. for a test bench + model of an I2C sensor, a flash, an ADC, etc) + - empty = Empty architecture + - tb = Test bench architecture + +The two main architecture categories are 'rtl'and 'str'. Typically the top level +components consist of 'str' architectures and the lowest level components (the +'leaves') contain the 'rtl' architectures that actually define the function. +For external IP like from the MegaWizard a 'wrap' architecture hides these IP +architectures. In practise it can occur that RTL code needs to be combined with +instantitated components, so then the architecture name is a bit arbitrary. + +Putting the entity and architecture into separate files only seems useful when +the architecture is FPGA vendor specific and would cause problems if e.g. a +'stratix4' and a 'virtex6' architecture are both visible to the synthesis +tool. 
Keeping the package in a separate file e.g. st_pkg.vhd remains useful, +because it avoids unnecessary recompilations. The category name can then be +used in the VHDL architecture file name name by combining the entity name +with the architecture category as post fix, so: + + <entity file name> = <entity name>.vhd + <package file name> = <package name>_pkg.vhd + = <package name>_component_pkg.vhd + <architecture file name> = <entity name>_a_category.vhd + +Instead of this file name post fixes also braces () as with (pkg) and with +(architecture category) have been used in the file name, but it appears that +'make' under Linux can not cope well with using braces in the file names. +Therefore do not use braces () in the file names. + + +c) Library directory +Within a library of that has several files all files should start with the +same prefix. That prefix corresponds to the library directory name and is +typically also used for all items in a library_pkg.vhd file it that is used. + +For example some module called ST could look like: + + /modules/st/sim/modelsim/st.mpf -- modelsim project file + st/src/vhdl/st.vhd -- top ST entity with str architecture + st_pkg.vhd -- file name postfix '_pkg' + st_ctrl.vhd + st_ctrl_tx.vhd + st_calc.vhd + st_calc_a_str.vhd -- file name post fix '_a_' + + -- architecture name + st/tb/vhdl/tb_st.vhd -- test bench for st + tb_tb_st.vhd -- test bench of multiple tb_st + tb_st_calc.vhd -- test bench for st_calc + +In this example the '_tx' functionality in st_ctrl_tx is specific to st_ctrl, +therefore it is not put in a more general seperate module. A general type +of IO like an I2C interface would be kept as a seperate module so that it +can be used easily within other modules. + +This ST module may be used in different modules or designs, typically using +generics to adapt it to the design specific requirements (e.g. 
word width) + + +d) Constants and generics +All constant values must be identified by a name in order to easily search for +them. It is not allowed to used numbers directly in statements. Entity port +widths must be defined via generics, even if a range is fixed. This to clearly +distinghuis the meaning of the range value. For some port widths the generic +may even allow choosing another value dependent on the usage. Furthermore +constants and generics have to be clearly distinghuised from signals. Therefore +use prefixes: + +. c_ constant +. k_ constant +. g_ generic + +For example: + + CONSTANT c_word_sz : NATURAL := 4; + CONSTANT c_byte_w : NATURAL := 8; + CONSTANT c_word_w : NATURAL := c_byte_w*c_word_sz; + +The k_ prefix is rarely used, but it can e.g. be used as a reference for the +actual constant: + + CONSTANT k_sel : STD_LOGIC_VECTOR(c_max_w-1 DOWNTO 0) := "000"; -- c_max_w = 3 + CONSTANT c_sel : STD_LOGIC_VECTOR(c_actual_w-1 DOWNTO 0) := k_sel(c_actual_w-1 DOWNTO 0); -- c_actual_w = 2 <= c_max + +or e.g. as a local constant in a package that should not be used outside that +package. + +e) Variables +Similar variables have to be distinghuised from signals. Therefore use pre fix: + +. v_ variable +. v next value for local register record in Gaisler two-process method + + +f) Signals +Signals do not have to have a prefix or a post fix to identify them as signals, +instead for signals a pre or post fix is better used to clarify its functional +meaning: + +. _ack acknowledge +. _adr address +. _addr address +. _address address +. _arr array +. _avail available +. _bi bit index, or byte index +. buf buffer, memory +. _clk clock (used in a rising_edge process) +. clr clear +. cnt counter +. ctrl control +. _cpx complex +. _cplx complex +. _complex complex +. _cur current +. _d delayed signal +. _dat data +. _data data +. _dec decode +. _decode decode +. _depth depth, size +. _dis disable +. _dly delayed signal +. dp_ data path streaming interface signal +. 
_done done +. _ely early signal, equivalent to 'nxt_' and to the opposite of '_dly' +. _early early signal +. _en enable +. _enc encode +. _encode encode +. _eof end of frame +. _eop end of packet +. _err error +. _evt event +. _fevt falling event +. ff flipflop +. fifo first in first out +. hdr header +. i_ internal copy of an output signal +. in input +. io input/output +. _i internal auxiliary signal, where signal 'x_i' closely relates to 'x' +. _im imaginary +. _imag imaginary +. _late late signal +. _lat latency +. _latency latency +. _len length in e.g. bytes or data words +. mem memory +. _max maximum +. _min minimum +. _miso master in slave out +. _mosi master out slave in +. mm_ memory mapped interface signal +. _n active low +. _N negative pin of a differential signal +. nat natural +. nof_ number of +. next_ next functional value +. nxt_ next value of a register, e.g. q <= nxt_q +. ofs offset +. offset offset +. out output +. _org original +. _p pipeline delay (or early) +. _pp pipeline delay two clock cycles +. _ppp pipeline delay three clock cycles +. _P positive pin of a differential signal +. _phs phase +. pulse pulse +. prev_ previous signal value, equivalent to '_dly' +. r local register record in Gaisler two-process method +. ram memory +. rd read +. _re real +. _real real +. read read +. _rdy ready +. _reg registered signal +. _revt rising event +. req request +. rst reset +. rsl resolution +. rcv receive +. rx receive +. s_ state machine state enumerate name +. sel select +. _siso source in sink out +. _sosi source out sink in +. sl standard logic +. slv standard logic vector +. _sof start of frame +. _sop start of packet +. st_ state machine state name +. st_ streaming interface signal +. _size size, depth +. _sz size in e.g. bytes +. _sync synchronisation +. this_ this_ signal as related to some next_ signal +. tr transmit/receive, transceiver +. tx transmit +. _val valid +. _vec vector +. _wi word index +. wr write +. write write +. 
_w width in bits +. x times, e.g. clk_2x +. xmt transmit +. zdly z^(-1) DSP sample period delay + +For example: + + CONSTANT c_nof_mp : NATURAL := 1; + CONSTANT c_nof_rcu_per_mp : NATURAL := 4; + CONSTANT c_nof_rcu : NATURAL := c_nof_mp * c_nof_rcu_per_mp; + + SIGNAL tbb_frame_hdr : t_frame_hdr_arr; + +Usage examples: + +- A bus of _dat[], _val, _sync together indicates the data, whether it is valid and + the relation to some external time. +- A bus of _hdr, _dat, _val, _sof, _eof can indicate a packet with header and + data in a stream of words. +- At the rising clock edge the q output of a flipflop or register becomes the d + input, therefore the d input is called nxt_q. The 'nxt_' is used as prefix for + all register d inputs and the clocked processing only contains q <= nxt_q + like assignments. + + +g) Types +. t_ type +. t_c_ a type for constant values, typically a record +. t_e_ an enumerate type +. t_*_enum an enumerate type +. t_reg local registers record for Gaisler two-process method (reg_type) +. _arr array with indexing (I) +. _2arr two-dimensional array defined as array of arrays with indexing (I)(J), index range J is fixed by the type +. _3arr three-dimensional array defined as array of arrays of arrays with indexing (I)(J)(K), index ranges J and K are fixed by the type +. _mat matrix, two-dimensional array with indexing (I,J) +. _matrix matrix, two-dimensional array with indexing (I,J) +. _cub cube, three-dimensional array with indexing (I,J,K) +. _cube cube, three-dimensional array with indexing (I,J,K) +. _rec record + +For example: + + TYPE t_frame_hdr_arr IS ARRAY (0 TO c_nof_rcu-1) OF t_frame_hdr_rec; + TYPE t_sl_matrix IS ARRAY (INTEGER RANGE <>, INTEGER RANGE <>) OF STD_LOGIC; + TYPE t_data_mat IS ARRAY (0 TO c_nof_tlen-1, 0 TO c_nof_input-1) OF STD_LOGIC_VECTOR(g_data_w-1 DOWNTO 0); + + +h) Processes, instances, generates +. p_ process name +. u_ component instance name +. gen_ generate name +. 
no_ 'no generate' name + +The clocked process that creates the registers can typically be called p_clk +or p_reg. The combinatorial processes get the name of their main output signal +or a name that reflects the task of that process. + +A component instance typically gets the component name preceded by 'u_' as +instance name, or an instance name that reflects the task of the component or +the main output of the component. + +Generate example: + + CONSTANT c_debug_no_cep : NATURAL := sel_sim_synth(g_sim, 0, 0); + no_cep : IF c_debug_no_cep /= 0 GENERATE + -- default signal assignments + END GENERATE; + gen_cep : IF c_debug_no_cep = 0 GENERATE + -- instantiate module cep + END GENERATE; + + +i) Procedures, functions + +Procedures and functions may have a prefix or postfix. They may also use capitals +to separate parts of their name. Less common functions and procedures should +have these prefixes in addition to their package prefix: + +. func_<package_prefix>_... function prefix +. proc_<package_prefix>_... procedure prefix + +or + +. <package_prefix>_func_... function prefix +. <package_prefix>_proc_... procedure prefix + + +j) Packages + +All declarations in a package must have a package_prefix. This package_prefix +must be relatively short and sufficient to give a clue on where the item +is defined. Package prefix examples: + +- common_pkg -> no prefix, this is an exception, because this package is + so common +- common_mem_pkg -> '_mem_' +- dp_stream_pkg -> '_dp_' +- diag_pkg -> '_diag_' +- tr_nonbonded_pkg -> '_trnb_' +- ddr3_pkg -> '_ddr3_' + + +k) Do not use other HDL reserved words +Avoid using reserved words from Verilog in VHDL and vice versa. Also avoid +using them e.g. as record field names. + + + +10) Coding conventions + +a) Default use descending bus order + +Descending (DOWNTO) bus order must be used whenever possible in the declaration 
Ascending (TO) bus order can be used for signals for which the +ascending number may have a specific significance (stages of a processing, +filter tap indexes, etc.). + +b) Default use named association for the generic map and port map +Avoid instantiating components using positional association for generics or +ports. Using named associations avoids hard to detect errors and improves +the readability. An exception can be a component instance in a wrapper, like +the technology independent IP wrappers in RadioHDL. + +c) Component instantiation +Default use entity instantiation to instantiate a component. This avoids the +need for component declarations (either locally or in a component package). + +For technology independent code component instantiation is used to support +multiple IP components from different vendors or FPGA families without having +to compile this IP, because only the plain VHDL component declaration needs +to be known. + +d) Only use IEEE.std_logic_1164 and IEEE.numeric_std libraries +Do not use deprecated numeric libraries: +- Do not use bit or bit-vector. Use only STD_LOGIC and STD_LOGIC_VECTOR for + bits and bit vectors. +- Do not use older (and unofficial) std_logic_unsigned and std_logic_signed. +- Do not use std_logic_arith. + +Note that types with the same name (e.g. SIGNED) from different packages are +actually different types and are not interoperable. Mixing them causes +compatibility issues. + +e) Do not use the BLOCK statement +Because it is awkward and not necessary. + +g) Do not use CONFIGURATIONs +Because they are awkward and not necessary. + +h) Avoid embedded, tool-specific synthesis commands +Synthesis directives are best placed into constraint files. Generalized +tool-independent directives for synthesis commands are emerging. Use them +sparingly and check they are supported in at least the major synthesis tools +and FPGA families. 
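+
+As an illustration of conventions a) to d), the sketch below shows a small
+component that uses DOWNTO ranges and only IEEE.numeric_std for arithmetic.
+The entity example_incr and all its generic, port and signal names are
+hypothetical and only serve this example:
+
+  LIBRARY IEEE;
+  USE IEEE.STD_LOGIC_1164.ALL;
+  USE IEEE.NUMERIC_STD.ALL;
+
+  ENTITY example_incr IS  -- hypothetical entity, for illustration only
+    GENERIC (
+      g_dat_w : NATURAL := 8
+    );
+    PORT (
+      rst     : IN  STD_LOGIC;
+      clk     : IN  STD_LOGIC;
+      in_dat  : IN  STD_LOGIC_VECTOR(g_dat_w-1 DOWNTO 0);
+      out_dat : OUT STD_LOGIC_VECTOR(g_dat_w-1 DOWNTO 0)
+    );
+  END example_incr;
+
+  ARCHITECTURE rtl OF example_incr IS
+    SIGNAL nxt_out_dat : STD_LOGIC_VECTOR(g_dat_w-1 DOWNTO 0);
+  BEGIN
+    -- Explicit conversion to UNSIGNED from numeric_std for the arithmetic
+    nxt_out_dat <= STD_LOGIC_VECTOR(UNSIGNED(in_dat) + 1);
+
+    p_clk : PROCESS(rst, clk)
+    BEGIN
+      IF rst = '1' THEN
+        out_dat <= (OTHERS => '0');
+      ELSIF rising_edge(clk) THEN
+        out_dat <= nxt_out_dat;
+      END IF;
+    END PROCESS;
+  END rtl;
+
+The component can then be instantiated by entity instantiation with named
+association. This fragment belongs in the architecture body of the
+instantiating design, where the constant c_dat_w and the connected signals
+are assumed to be declared:
+
+  u_example_incr : ENTITY work.example_incr
+  GENERIC MAP (
+    g_dat_w => c_dat_w
+  )
+  PORT MAP (
+    rst     => rst,
+    clk     => clk,
+    in_dat  => in_dat,
+    out_dat => out_dat
+  );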
+ +i) Use enumerate values for FSM states +Use enumerate values for Finite State Machine (FSM) states and define them +with a 's_' prefix. Do not hard-code FSM State Vector values (VHDL), so do +not code them explicitly. + + + +11) File layout + +a) Copyright statement + +The file should start with a copyright statement that includes the date of +creation and the author(s) and the affiliation. For example for Astron: + +------------------------------------------------------------------------------- +-- +-- Copyright (C) 2011 +-- ASTRON (Netherlands Institute for Radio Astronomy) <http://www.astron.nl/> +-- P.O.Box 2, 7990 AA Dwingeloo, The Netherlands +-- +-- This program is free software: you can redistribute it and/or modify +-- it under the terms of the GNU General Public License as published by +-- the Free Software Foundation, either version 3 of the License, or +-- (at your option) any later version. +-- +-- This program is distributed in the hope that it will be useful, +-- but WITHOUT ANY WARRANTY; without even the implied warranty of +-- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +-- GNU General Public License for more details. +-- +-- You should have received a copy of the GNU General Public License +-- along with this program. If not, see <http://www.gnu.org/licenses/>. +-- +------------------------------------------------------------------------------- + +b) Purpose and description +Next the file should contain a 'purpose' that summarizes the purpose of the +component in one or at most two sentences. The 'description' then provides a +more detailed description of the function of the component. Optionally there +can also be a 'remarks' section that lists some particularities of the +component. For test benches there also should be a 'usage' section that +shows how to run the testbench in simulation and that briefly describes the +expected result. + +-- Purpose: +-- Description: +-- Remarks: +-- . +-- . 
+-- Usage: -- for test benches +-- > as 10 +-- > run -all +-- The tb is self stopping and self checking + + +c) Put entity, architecture and configuration in same file +Use a single file to completely describe a module. A possible exception is if +completely different architectures must be used e.g. for different logic +families. Limit this situation to a few low level IP modules. + +Default there is only one entity and architecture in the file. Exception can +be: +- Auxiliary entities that are instantiated only in this file. The top level + entity corresponds to the file name and needs to appear last in the file + due to the compile order. Eg. dp_repack_data.vhd which uses local entities + dp_repack_in and dp_repack_out. +- Multiple architectures that show different implementation schemes. The + last architecture is the default. Eg. uth_rx.vhd and uth_tx.vhd. + +d) Port order +Ports are listed in logic groups, with clock and reset first. Typically first +list the streaming inputs and then the streaming outputs. The MM control ports +are also grouped. In case of multiple clock domains the clock may be listed +close to its group or central at the top. + +e) Declaration order +In the declaration section of an architecture, if possible place declarations +in the following order: + +• Local types +• Local constants +• Local functions, procedures +• Signals + +The detailed declaration order should as much as possible follow the functional +flow from component input to output and from begin to end of the architecture. + + +f) Avoid mixing structural and RTL description code in the same architecture +In general an architecture contains only RTL code (ie. with process statements) +or it only instantiates other components to create structure. An exception +is eg. the use of some low level components from the common library in an 'rtl' +architecture. Keeping only structural instantiations in a 'str' architecture +makes the functionality more clean and therefore more clear. 
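+
+A minimal sketch of an 'rtl' architecture that follows the declaration order
+of e) and uses 's_' enumerate values for the FSM states. The entity
+example_cnt and all names in it are hypothetical. The constant c_cnt_max
+assumes that g_cnt_w is small enough for 2**g_cnt_w to fit in a NATURAL:
+
+  LIBRARY IEEE;
+  USE IEEE.STD_LOGIC_1164.ALL;
+  USE IEEE.NUMERIC_STD.ALL;
+
+  ENTITY example_cnt IS  -- hypothetical entity, for illustration only
+    GENERIC (
+      g_cnt_w : NATURAL := 8
+    );
+    PORT (
+      rst     : IN  STD_LOGIC;
+      clk     : IN  STD_LOGIC;
+      cnt_en  : IN  STD_LOGIC;
+      cnt_out : OUT STD_LOGIC_VECTOR(g_cnt_w-1 DOWNTO 0)
+    );
+  END example_cnt;
+
+  ARCHITECTURE rtl OF example_cnt IS
+
+    -- Local types
+    TYPE t_state_enum IS (s_idle, s_count);
+
+    -- Local constants
+    CONSTANT c_cnt_max : NATURAL := 2**g_cnt_w - 1;
+
+    -- Local functions, procedures
+    FUNCTION func_incr(cnt : UNSIGNED) RETURN UNSIGNED IS
+    BEGIN
+      RETURN cnt + 1;
+    END;
+
+    -- Signals
+    SIGNAL state     : t_state_enum;
+    SIGNAL nxt_state : t_state_enum;
+    SIGNAL cnt       : UNSIGNED(g_cnt_w-1 DOWNTO 0);
+    SIGNAL nxt_cnt   : UNSIGNED(g_cnt_w-1 DOWNTO 0);
+
+  BEGIN
+
+    cnt_out <= STD_LOGIC_VECTOR(cnt);
+
+    p_state : PROCESS(state, cnt, cnt_en)
+    BEGIN
+      nxt_state <= state;  -- default assignments, so no latches
+      nxt_cnt   <= cnt;
+      CASE state IS
+        WHEN s_idle =>
+          IF cnt_en = '1' THEN
+            nxt_state <= s_count;
+          END IF;
+        WHEN s_count =>
+          IF cnt = c_cnt_max THEN
+            nxt_cnt   <= (OTHERS => '0');
+            nxt_state <= s_idle;
+          ELSE
+            nxt_cnt <= func_incr(cnt);
+          END IF;
+      END CASE;
+    END PROCESS;
+
+    p_clk : PROCESS(rst, clk)
+    BEGIN
+      IF rst = '1' THEN
+        state <= s_idle;
+        cnt   <= (OTHERS => '0');
+      ELSIF rising_edge(clk) THEN
+        state <= nxt_state;
+        cnt   <= nxt_cnt;
+      END IF;
+    END PROCESS;
+
+  END rtl;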
+ + +g) Comment +Typically the HDL code needs to be self explanatory, i.e. properly set up, so +there is no need for comment. The purpose of the module can be described in the +HDL source (e.g. above the entity) or in a separate design document. + +In some cases inline comment is needed though to help the reader. Try to write +the comment such that it is still valid even if the code is changed. Care must +be taken that the comment is up to date and accurate. This is often not +the case, because most designers will forget to update the comment as well when +they modify the code. This is a good reason to minimize the amount of comment, +because inaccurate comment is worse than no comment. + +In any case the comment should not explain the VHDL syntax (the assumption is +that the reader knows VHDL). Comment should also not be used to disable +obsolete code, because we have SVN to keep the obsolete data. In general +comment should also not be used as a compile option, because for that generate +statements should be used. + + +h) Indent and alignment + +Default use a fixed indent of 2 spaces. VHDL has BEGIN and END statements that +group a section so 2 spaces indent is sufficient, however eg. 4 spaces is also +allowed. Within one file the indent should be fixed. Take care of proper +alignment, e.g. all declared signals should be aligned at the colon ':'. + + +i) TABs and spaces - Do not use TABs +Using TABs is not allowed in the source code, because they have different sizes +in different editors which can disturb the layout. Instead set the editor to +fill in 2 spaces when the TAB key is used. Note setting the TAB size to 2 spaces +is not enough, set TAB such that it truly prints 2 spaces. + + +j) Line length +For the purpose and description try to keep the comment line length <= 80 to +allow printing it in Courier on an A4 without line wraps. For the rest of the +file the line length can be larger up to about 200, such that they still fit +on a screen. 
+ +k) Use separate line for each statement + +l) Place each declaration on a separate line + + + + +12) Standard packages + +For signed and unsigned only use (so do not use other packages with similar +functions): + + LIBRARY IEEE; + USE IEEE.STD_LOGIC_1164.ALL; + USE IEEE.NUMERIC_STD.ALL; + + + +13) Use of records + +Using records makes signal interfaces more clear. Typically the input signals +of an interface are grouped into one record and the output signals of that +interface are grouped into another related record. Defining too many different +types of records also makes the VHDL less readable, because it is then not +easy to remember what each record type means. For the main Memory Mapped (MM) +and Streaming (ST, also called DP for data path) interfaces it appears possible +to define just a few standard record types that can be used in all MM and ST +components. These records are defined in: + +- MM : common_mem(pkg).vhd +- ST : dp_stream(pkg).vhd + + + +14) Use of two-dimensional arrays + +Multi-dimensional arrays can be declared and indexed in several ways: + +a) As matrix of elements -> index(I,J): + TYPE t_sl_matrix IS ARRAY(INTEGER RANGE<>, INTEGER RANGE <>) OF STD_LOGIC; + TYPE t_data_matrix IS ARRAY (0 TO c_nof_tlen-1, 0 TO c_nof_input-1) OF STD_LOGIC_VECTOR(g_data_w-1 DOWNTO 0); + + SIGNAL a_mat : t_sl_matrix(0 TO c_nof_tlen-1, 0 TO c_nof_input-1); + SIGNAL d_mat : t_data_matrix; + +b) As array of arrays -> index(I)(J): + -- The element range needs to be set in the TYPE. 
+ TYPE t_sosi_2arr IS ARRAY (INTEGER RANGE <>) OF t_dp_sosi_arr(0 TO c_nof_input-1); + SIGNAL a_sosi_2arr : t_sosi_2arr(0 TO c_nof_tlen-1); + +c) As a 1-dimensional array --> index( I * length(J) + J) + +d) As a fixed number of declared 1 dim arrays --> index I in name and J in arr + SIGNAL input_arr0 : t_dp_sosi_arr(0 TO c_nof_input-1); + SIGNAL input_arr1 : t_dp_sosi_arr(0 TO c_nof_input-1); + SIGNAL input_arr2 : t_dp_sosi_arr(0 TO c_nof_input-1); + SIGNAL input_arr3 : t_dp_sosi_arr(0 TO c_nof_input-1); + +Of course option d) is less nice, because the number of I arrays is fixed in +declared row names. However for some cases it may be appropriate. Options a) +and b) are the true 2-dim arrays. Option a) has the disadvantage that the +elements can only be assigned per element, it is not possible to assign all J +with a 1-dim array of equal length. This is possible with option b), but +option b) has the disadvantage that the length of the element array must be +known at the TYPE definition. Typically option b) then requires defining this +2-dim array in a module package, because only then the TYPE can be used in +multiple files. E.g.: + + c_<module_name>_nof_j_max = 32; + TYPE t_<module_name>_i_j_2arr IS ARRAY (INTEGER RANGE <>) OF t_dp_sosi_arr(c_<module_name>_nof_j_max-1 DOWNTO 0); + + out_sosi_2arr : t_<module_name>_i_j_2arr(g_nof_i-1 DOWNTO 0); -- the J range is fixed by the TYPE + +An alternative can be to not use a 2-dim array like a) or b) but instead use an +aggregate 1-dim array like c). E.g.: + + out_sosi_arr : OUT t_dp_sosi_arr(g_nof_i*g_nof_j-1 DOWNTO 0); + +The advantage of such an aggregate 1-dim array is that it can be entirely set +by means of generics. The disadvantage is that the 2-dim indexing is a bit +more complicated and that the 2-dim properties of a signal do not show as such +in a Wave Window. A way around is to internally still declare a local signal +of a true 2-dim array type like a) and use that to monitor the 1-dim port +signal. E.g. 
for port out_sosi_arr above this then yields out_sosi_matrix below: + + SIGNAL out_sosi_matrix : t_integer_matrix(g_nof_i-1 DOWNTO 0, g_nof_j-1 DOWNTO 0); + + p_wires : PROCESS(out_sosi_arr) + BEGIN + FOR I IN g_nof_i-1 DOWNTO 0 LOOP + FOR J IN g_nof_j-1 DOWNTO 0 LOOP + out_sosi_matrix(I,J) <= out_sosi_arr(I*g_nof_j+J); + END LOOP; + END LOOP; + END PROCESS; + + + +15) Use of constants, generics and packages + +How to define a constant and where depends on the function of the constant. +Constants can be defined: + +1) Locally in the architecture +2) As default value of a generic that typically does not need to be changed +3) As a global constant in a package that can be used directly in any + architecture that uses this package +4) As a global constant in a package that is used as default value for a + generic + +The advantage of using generics is that the parameter can be changed by the +block that instantiates it. If the generic is propagated up the hierarchy +then it can be controlled in a test bench, which is useful to verify multiple +generic settings in simulation without having to edit the code. + +The disadvantage of using generics to pass on constant values is that this +can bother the user with detailed knowledge of the component. + +Constants in a package are useful to make them known in lower hierarchy blocks +without having to pass them on via the entity generics. The limitation of +constants in packages is that they can only be changed by editing the file. + + + +16) Functions + +Functions have input arguments and return a single value. Functions cannot +wait for a clock, so they work combinatorially (= immediately). Functions +are useful to: + - derive constant or generic values that depend on input conditions + - map a signal to another signal, eg. 
to change the order in an array + +For many examples of reusable functions and some procedures see: + - common_pkg.vhd + - dp_stream_pkg.vhd + + + +17) Procedures + +Procedures have input, output or inout arguments and no return value: + + - A procedure with only inputs seems not useful. + - A procedure can drive one or more outputs. Procedures can wait for a + clock, so they can output (generate) a sequence in time. This is useful + in a test bench. + - Using an inout argument is useful if the procedure needs to maintain some + storage, eg. the previous value of the current output. The storage cannot + be defined inside the procedure. The inout argument is connected to + a signal that is declared outside the procedure to act as the storage. + +When using in, out and inout arguments the difference between a procedure +and an entity becomes small. The advantage of an entity is that it can have +internal state (storage) and components are more intuitive for creating +schematic structure and hierarchy. A procedure can be synthesized, but +typically procedures are more used in test benches and entities are more used +in synthesizable code. + +The signal connected to a procedure output can be an array(I) element +provided that the index I comes from a GENERATE statement. If the index I +comes from a LOOP statement then the compiler gives Error: "(vcom-1450) +Actual (indexed name) for formal "output argument" is not a static signal +name" + +For many examples of reusable procedures and some functions see: + - tb_common_pkg.vhd + - tb_dp_pkg.vhd + + + +18) Test benches + +A device under test (DUT) can be a component, a module or a complete VHDL +design that will run on the FPGA. In general all components, modules and +designs should have a VHDL test bench to verify that they work. +Consider that if you do not make a test bench for some component then you +should not have made that component. If it is not worth testing it is not worth +implementing. 
This rule can help development to focus on the functionality that +is really essential for the design. + +Test benches are equally important as the design itself. Therefore the test +bench code should also be well structured. + +Test bench code does not get into the FPGA, so therefore it is kept in a +separate directory together with behavioral VHDL models of some peripherals +(like e.g. an ADC or an I2C slave). + +The testing of a DUT in simulation is called 'verification'. The testing of a +DUT on FPGA hardware is called 'validation'. + +a) Verifying a DUT + Test benches provide the environment in VHDL to verify a device under test + (DUT). The DUT can be anything between a low level component and an entire system. + The test bench environment consists of: + + - Stimuli that drive inputs of the DUT (e.g. clk, data) + - The DUT + - Behavioral models that model an external device (e.g. a sensor, an ADC) + - Verification that checks outputs of the DUT + + The verification can be: + + - Monitoring (e.g. manually observing a signal in the Wave window) + - Logging to a file that can be checked by means of a diff with a golden + reference file. + - Self checking (comparing the DUT output with pre-calculated reference + data or with stored golden reference data) + +b) Test bench interface packages and components + The following packages provide useful procedures for making test benches: + + - tb_common_pkg - General flow + - tb_common_mem_pkg - MM interfacing + - tb_dp_pkg - DP interfacing + + The tst/ module provides file IO. + +c) Multi-level test bench + A test bench can have generics that allow multiple variations of the + test bench to be instantiated in a higher level test bench. This multi-test + bench can then simulate these variations in parallel and serve as a + regression test for the DUT. + These DUT regression test benches can again be instantiated in yet another + test bench to have a system level regression test bench. 
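+
+  A minimal sketch of such a multi-test bench is given below. The test bench
+  entity tb_example and its generic g_nof_dat are hypothetical, and
+  tb_example is assumed to be compiled into library work:
+
+    ENTITY tb_tb_example IS  -- hypothetical multi-test bench
+    END tb_tb_example;
+
+    ARCHITECTURE tb OF tb_tb_example IS
+    BEGIN
+      -- Each instance is a complete test bench. The instances simulate in
+      -- parallel, each with its own generic setting.
+      u_short  : ENTITY work.tb_example GENERIC MAP (g_nof_dat =>   10);
+      u_medium : ENTITY work.tb_example GENERIC MAP (g_nof_dat =>  100);
+      u_long   : ENTITY work.tb_example GENERIC MAP (g_nof_dat => 1000);
+    END tb;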
+ + For self stopping test benches the tb_end signal needs to be output via the + test bench entity, such that the multi-level test bench can issue its tb_end + when all tb instances have raised their tb_end. + +d) Self checking and self stopping + The preferred scheme is a self checking and self stopping test bench that + reports errors if they occur and runs as long as necessary. This is useful + when the test bench is run manually and moreover it prepares the test bench + to be used in a regression test. + + Errors can be reported via ASSERT or REPORT with SEVERITY ERROR. + + A simulation of a test bench can be stopped automatically by stopping all + toggling when the stimuli have finished. Typically all toggling can be + stopped by making tb_end <= '1' and using tb_end in the clock statement, eg: + + clk <= NOT clk OR tb_end AFTER clk_period/2; + + For some IP stopping the clocks and applying the resets using tb_end is not + enough to stop the simulation, apparently because some process still keeps + on free running (eg. a VCO in a PLL). In those cases the simulation can be + forced to stop by asserting a FAILURE: + + REPORT "Tb simulation finished." SEVERITY FAILURE; + + The disadvantage of this scheme is that it can be confusing that the test + went OK but finishes with a failure. A better solution is therefore to + still set tb_end <= '1' to signal the end of the simulation, but use the Modelsim + 'when' command to actually stop the simulation. The when command can be + issued only when a simulation is loaded: + + when -label tb_end {tb_end == '1'} {echo "Tb simulation finished" ; stop ;} + +e) Python test case using MM interface + For DUT with an MM interface the stimuli and verification can be done using + a Python test case. The Python test case then communicates with the DUT + in Modelsim via file IO. + + The Python test case can stop the simulation by raising tb_end via a + dedicated file IO handler, eg similar to the one used for polling the simulation + time. 
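+
+  The self checking and self stopping scheme of d) can be sketched in VHDL as
+  follows. This is only an illustration without a DUT: the entity tb_example,
+  its generic g_nof_dat and all signal names are hypothetical, and an
+  incrementing counter takes the place of the DUT output:
+
+    LIBRARY IEEE;
+    USE IEEE.STD_LOGIC_1164.ALL;
+    USE IEEE.NUMERIC_STD.ALL;
+
+    ENTITY tb_example IS  -- hypothetical test bench, no DUT
+      GENERIC (
+        g_nof_dat : NATURAL := 100
+      );
+      PORT (
+        tb_end : OUT STD_LOGIC  -- output for use in a multi-level test bench
+      );
+    END tb_example;
+
+    ARCHITECTURE tb OF tb_example IS
+
+      CONSTANT c_clk_period : TIME    := 10 ns;
+      CONSTANT c_dat_w      : NATURAL := 8;
+
+      SIGNAL i_tb_end : STD_LOGIC := '0';
+      SIGNAL rst      : STD_LOGIC := '1';
+      SIGNAL clk      : STD_LOGIC := '0';
+      SIGNAL dat      : UNSIGNED(c_dat_w-1 DOWNTO 0) := (OTHERS => '0');
+      SIGNAL prev_dat : UNSIGNED(c_dat_w-1 DOWNTO 0) := (OTHERS => '0');
+      SIGNAL val      : STD_LOGIC := '0';
+
+    BEGIN
+
+      tb_end <= i_tb_end;
+
+      -- The clk stops toggling when i_tb_end = '1', so that the simulation
+      -- runs out of events and stops
+      clk <= NOT clk OR i_tb_end AFTER c_clk_period/2;
+      rst <= '1', '0' AFTER 3*c_clk_period;
+
+      -- Stimuli: an incrementing counter in place of a DUT
+      p_stimuli : PROCESS
+      BEGIN
+        WAIT UNTIL rst = '0';
+        FOR I IN 0 TO g_nof_dat-1 LOOP
+          WAIT UNTIL rising_edge(clk);
+          dat <= dat + 1;
+          val <= '1';
+        END LOOP;
+        WAIT UNTIL rising_edge(clk);
+        val      <= '0';
+        i_tb_end <= '1';
+        WAIT;
+      END PROCESS;
+
+      -- Verification: every valid dat must be the previous dat plus one
+      p_verify : PROCESS(clk)
+      BEGIN
+        IF rising_edge(clk) THEN
+          prev_dat <= dat;
+          IF val = '1' THEN
+            ASSERT dat = prev_dat + 1
+              REPORT "Wrong dat increment." SEVERITY ERROR;
+          END IF;
+        END IF;
+      END PROCESS;
+
+    END tb;
+
+  The internal signal i_tb_end is used because in VHDL-93 an OUT port cannot
+  be read back inside the architecture.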
+ + + +19) DP streaming component development example + +a) DP component example: dp_packet_merge.vhd + The description in dp_packet_merge.vhd explains how a DP streaming component + with flow control can be developed. It also explains how to do this within + the Gaisler two-process method. +b) DP test bench example: tb_dp_example_no_dut.vhd + The tb_dp_example_no_dut.vhd provides an example test bench without DUT that + describes and shows how a streaming DP component can be verified using + stimuli procedures and verification procedures from tb_dp_pkg. + + + +20) Simulation and synthesis debugging + +a) Compile errors +Most compile errors are easily fixed. Fix the first 1 or 2 errors first and +then recompile, because some more errors may then disappear as well and new +errors may appear. Sometimes it is more difficult to identify the cause of a +compile error. Then first carefully read the error message and eg. use Modelsim +'verror <error message number>' and/or Google (part) of the error message +to get more info. If the cause still is not clear then try to isolate the cause +by commenting out more and more parts of the code until the error disappears. + +b) Simulation error +Make sure that the simulation has loaded the implementations of all components +(eg. there are no empty black boxes and all memory initialization files have +been found). Most simulation errors can be debugged by tracing the signals in +the Wave Window and through the source code. It is almost never necessary to use +break points and to step through the VHDL code. + +c) General debugging +- Try to consider a bug as a valuable insight into how the design is (not) + working. When the bug is fixed the design will be better than it was before. +- Discuss the debug steps with one or more colleagues to get fresh ideas and to + avoid misconceptions and blind spots +- Narrow down the problem till it disappears and then build up the design again + till it reappears. 
+- Make step by step changes, that are sufficiently small to isolate the + problem. +- Compare the design that fails with a working design (eg. a previous version) + and identify the differences. +- In case of IP related bugs carefully read the manual (again), search the web + for reports on how to tackle IP problems +- Stay calm and carefully observe the results of each debug step to determine + the best next step +- A bug that takes longer than expected to solve can feel like a burden (eg. + due to project time pressure, due to lack of progress, due to incompetence), + be open about this when discussing this with colleagues ('sof' = share our + feelings). + +d) Hardware debugging +A device under test (DUT) can be simulated using a VHDL test bench. With proper +models of the DUT environment the simulation of the DUT should give the same result +as when running the synthesized DUT on FPGA hardware. If there does occur a +mismatch between simulation and hardware behaviour then check the following: + +- In case of a new board: + . are the chips mounted correctly and are the chip type numbers correct + . are the values of e.g. termination resistors and capacitors correct +- In case of a new FPGA: + . First try all FPGA interfaces using Xilinx/Altera reference designs that + have been shown to work on development boards that use the same FPGA + . Ask on-site help of a Xilinx/Altera engineer to speed up the learning curve + on debugging the FPGA IO (especially transceivers and DDR memory). + . is the FPGA type used for synthesis correct? + . do you use the latest version of the synthesis tools, because with new + FPGAs there may be important tool updates +- Is the board still OK, ie is eg unb_minimal still working? Are the board + power supplies OK and if necessary are external clock and PPS connected? +- If the design did work and now does not anymore recall what source files were + modified. 
+- If an old design still worked then use a clean SVN check out and a clean + build to go back to this design. If this design works then go to newer + version in SVN and try that on hardware. Continue this kind of binary search + until the SVN version that works and the next that does not have been found. + Check the SVN log and the SVN diff of the file(s) that differ to determine + how these differences can cause the malfunctioning. +- Be precise in noticing what exactly is not working without jumping to + conclusions too soon. First observe and analyse the bug to obtain as many + clues as possible. +- Is the environment model sufficiently accurate? If feasible try to improve + the model. +- If the simulation works but on hardware it fails, then check that all + generics and derived constants that are different on hardware have scaled + accordingly. Maybe a generic or derived constant is hardcoded and keeps its + simulation value, which is then wrong for hardware. +- Check all warnings in the synthesis and fitter reports. +- Does the synthesized DUT pass the timing constraints? If not then first fix + this. +- Is the resource usage per instance as expected or is there functionality + that gets optimized away unintentionally? --> search for 'away' in the + synthesis report. +- Is the pin definitions file correct and complete? +- Are the RAM initialization files found by the synthesis tool? --> eg. for + the Nios program, a BG, the coefficients of a FIR filter +- Are the constraint files found and interpreted correctly by the synthesis tool? +- Does the RTL code have special VHDL constructions that might be interpreted + differently by the synthesis tool (e.g. comparing std_logic_vectors of + different length --> better compare them as unsigned values)? +- Are the process sensitivity lists complete (no used inputs missing)? --> + search for 'sensitivity' in the synthesis report. +- Are there no latches in the design? 
--> search for 'latch' in the synthesis + report. +- There should not be any combinational loops in the design, --> search for + 'combinational loop' in the fitter report. +- Are there uninitialized signal values that may be interpreted differently in + synthesis? --> eg. use the c_dp_siso_rst/rdy/hold/flush constants to default + assign a record signal rather than assigning each record field individually + to avoid missing a field. +- Are the RAM block initializations the same for both the simulation and + synthesis? +- Are the correct clocks and resets connected to the different clock domains? +- In case of occasional failure on hardware maybe a clock domain crossing is + not done properly. +- Are the signal (pulses) and busses transferred properly across clock + domains? +- Does the (software) functionality take account of the latency that it takes + to let a signal reliably cross a clock domain? +- For DDR3 and gigabit transceiver IP there may be an IP toolkit option to + observe IP internal status via JTAG +- For bugs that take weeks to solve consider asking help from an external + (Altera or Xilinx) engineer that knows the IP. + +If this all does not help to identify the cause then: +- Add some more DUT signals to a monitor MM register that can be accessed via + the 1GbE control interface +- Do a functional simulation of the post synthesis netlist of the design (ie + without the timing, but with the FPGA logic) to verify whether synthesis has + interpreted the RTL the same as the simulator. +- Use e.g. Altera SignalTap or Xilinx ChipScope to add an embedded logic + analyser to the DUT that can be accessed via JTAG. -- GitLab