Author: Eric Kooistra, jan 2018
Title: Status of FPGA firmware development at DESP
Purpose:
- Explain how we currently develop FPGA firmware at DESP
1) Develop FPGA hardware boards
- Review board design document and schematic, so that the board will not contain major bugs and
so that firmware engineers can already learn about the board and get familiar with it
- Pinning design to verify schematic
- Vendor reference designs to verify the IO
- Heater design to verify the cooling and the power supplies
- Board architecture:
. RSP ring with 4 AP (with ADC) and 1 BP
. UniBoard1 mesh with 4 BN (with ADC) and 4 FN, 4 transceivers per 10GbE, DDR3
. UniBoard2 4 PN, 1 transceiver per 10GbE, DDR4
. Gemini 1 FPGA, 25 Gb transceivers, DDR4, HBM
2) Technology independent FPGA:
- Wrap IP (IO, DSP, memory, PLL) --> needed for board_heater design, board_minimal, board_test
- Use to support:
* different vendors:
. Xilinx (LOFAR, SKA CSP Low)
. Altera (Aartfaac, Apertif, Arts)
* FPGA type and sample versions
* synthesis tool versions
3) Board firmware
- board_minimal design that provides control access to the FPGA board and the board control functions.
It uses the monitoring and control protocol via the MM bus (UniBoard using Nios and UCP, Gemini using a hard
coded Gemini protocol)
- board_test design that contains the minimal design plus interfaces to use the board IO (gigabit transceivers,
DDR)
- board library with back and mesh models
- pinning files
4) OneClick
- OneClick is our umbrella name for new ideas and design methods ('ideeën vijver', i.e. idea pool), with focus on firmware specification and design.
New tools that are created within OneClick may end up as part of the RadioHDL environment. This has happened for example
with ARGS. OneClick is about 'what' we could do, RadioHDL is about 'how' we do it.
- The name OneClick relates to our 'goal at the horizon' to get in one click from design to realisation.
- Automate design flow
- Array notation (can be used in document and in code --> aims for simulatable specification)
- Modelling in python of data move, DSP and control
- We now work with data move libraries, but could we not better program these ad hoc in one process? (see the sketch below)
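A minimal Python sketch of this 'one process' idea (hypothetical function names, not an existing DESP tool): chain the
data move steps as generators in a single process:

    # Hypothetical sketch: chain data move steps as Python generators
    # that run in one process, instead of separate stream components.

    def source(n_blocks, block_size):
        """Produce blocks of consecutive sample indices."""
        for b in range(n_blocks):
            yield [b * block_size + i for i in range(block_size)]

    def reorder(blocks):
        """Example data move step: reverse the samples in each block."""
        for blk in blocks:
            yield blk[::-1]

    def sink(blocks):
        """Consume the stream and return all blocks for inspection."""
        return list(blocks)

    result = sink(reorder(source(n_blocks=3, block_size=4)))
    print(result)  # [[3, 2, 1, 0], [7, 6, 5, 4], [11, 10, 9, 8]]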
5) RadioHDL
- Board toolset (unb1, unb2a, rsp, gmi, etc) to manage combinations of board version, FPGA version, tool version
- RadioHDL is our umbrella name for a set of tool scripts that we use for firmware development, with focus on implementation.
- The name RadioHDL covers HDL code for RadioAstronomy as a link to what we do at Astron. However by using only the word Radio we keep
the name a bit more general, because in fact the RadioHDL tool scripts can be used for any (FPGA) HDL development.
- Automate implementation flow (source --> config file --> tool script --> product, a product can be the source of a next product)
- Organize code in libraries using hdllib.cfg
- Manage tool versions using hdltool_<toolset name>.cfg
- Create project files for sim and synth
- ARGS (Automatic Register Generation System using MM bus and MM register config files in yaml)
. add more configuration levels:
peripheral configuration yaml
fpga configuration yaml
board yaml (board with 1 or more FPGA)
application (application image on one or more FPGA)
system (one or more application images that together form the entire FPGA system)
. add constants configuration yaml --> to define the terminology and parameter section in the specification document and to use
these also in firmware and software.
- Create FPGA info (used to be called system info) address map stored in FPGA to allow dynamic definition of address maps. The definition
of the MM register fields is kept in files because it typically remains fixed.
- Easily set up the environment on a new PC and introduce a new employee to it (to be done)
6) VHDL design:
- Clean coding
- Reuse through HDL libraries
- Standard interfaces: MM & ST, support Avalon, AXI using VHDL records mosi/miso, sosi/siso
- Build FPGA application design upon a board minimal design and the relevant IO from the board test design
- dp_sosi :
. data, re/im : real or complex data
. valid : strobe to indicate clock cycles that carry valid data samples, not needed for ADC input
. sop, eop : strobes to indicate start of packet and end of packet for blocks of data
. sync and bsn : timing strobe, block sequence number is timestamp, alignment of parallel streams
. channel : valid at sop to multiplex multiple channels in one stream
. empty : valid at eop
. error : valid at eop
- dp_siso:
. ready : backpressure flow control per data valid, only used for components that really need it, to avoid complexity and
to ease timing closure. The ready can be pipelined with dp_pipeline_ready.vhd. The ready flow control is e.g.
used to insert a header in front of data blocks to create a packet.
. xon : backpressure flow control per block of data. The xon flow control is used to stop the input source to avoid
overflow of internal FIFOs. Together these FIFOs must at least be capable of storing the current blocks. Our
applications are data driven, so making xon low will cause data to be dropped. For an application that reads
data from a disk, like in the all data storage systems, xon can be used by the application to read the disk as
fast as possible, so DSP driven !!!
- The synthesis tool ensures that the logic per clock cycle is reliable; we have to ensure at the functional level that only
complete blocks of data are being passed on !!!:
. Incomplete blocks must be dropped at the input
. FIFOs should never overflow and should not be reset. Avoid overflow by using xon. Clear a FIFO by reading it empty
- Streaming data versus store and forward !!!
. dp_bsn_aligner.vhd, aligns input streams using the BSN
- The resource usage of the dp_bsn_aligner in Apertif Correlator (14 dec 2018) is:

               nof        ALM             FF
               streams    align    MM     align    MM
      input    3          502      213    657      319
      mesh     8          1162     544    1346     784
- Fill level of dp_bsn_aligner input FIFOs in Apertif Correlator (18 dec 2018) measured with util_dp_fifo_fill.py:

               FIFO size                      max        used (min/max)
                                              (min/max)  1st time   2nd time
      input    (3+5)*176 (or 180) = 1408      178/762    178/504    139/428
      mesh     (4+3)*88  (or 120) =  616      159/516    150/262    64/255
. dp_sync_checker.vhd, detects incomplete sync intervals; these are recovered using data from the next sync interval, so
the next sync interval will get lost. To avoid this would require store and forward of the data of a sync interval,
because then it is possible to fill in missing blocks with dummy data. With store and forward it is also possible
to recover block order if necessary. The disadvantage of store and forward is latency and memory. Store and forward
is the general concept for how software operates (on CPU and GPU).
This scheme of sync interval recovery is only acceptable if dropped packets occur very rarely, because if one
stream has a dropped packet then the output of the BSN aligner and sync checker will drop a sync interval. E.g.
Apertif X needs to align N_dish * N_pol = 12*2 = 24 input streams and the sync interval = 1.024 s. These input
streams come from 10GbE links. A bit error rate of 1e-10 means 1 bit error per second per link. A bit error will
cause a CRC error; assume that the packet then gets dropped. The BSN aligner then cannot align that block, so one
sync interval gets corrupted and the next gets lost. After that the BSN aligner will have recovered. Suppose this
should only occur once per 8 hour observation = 28800 s. So with 24 links the BER per link should then be less
than 1e-10 / 28800 / 24 ~= 1.4e-16, i.e. about one bit error per link per 8 days (see the check below).
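The BER budget above can be checked with a few lines of Python (a sketch; the 10GbE line rate is taken as 1e10 bit/s):

    # Check the BER requirement derived above for the Apertif BSN aligner.
    line_rate = 1e10     # 10GbE, bits per second per link
    n_links = 24         # N_dish * N_pol = 12 * 2 input streams
    t_obs = 8 * 3600     # 8 hour observation = 28800 s

    # Allow at most one dropped packet (one corrupted plus one lost sync
    # interval) per observation over all links together.
    ber_max = 1 / (line_rate * t_obs * n_links)
    print(ber_max)       # ~1.4e-16 per link

    # At this BER one link sees one bit error per:
    print(1 / (ber_max * line_rate) / 86400, 'days')  # 8.0 days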
. dp_packet_rx.vhd, ensure that only complete packets enter the FPGA
. FIFO overflow is a bug, as serious as an FPGA logic error
- Pass on sosi.info fields through a function that only needs data and valid
. dp_fifo_fill --> use FIFOs to delay sop info and eop info with variable latency
. dp_paged_sop_eop_reg --> use an array of register pages to delay sop info and eop info by a fixed latency. If
the latency is many sops or if only sync and BSN need to be passed on, then consider
using dp_block_gen_valid_arr
. dp_block_gen_valid_arr --> recreate sync, local BSN, sop, eop based on valid and pass on global BSN at sync or at
all sop. Useful if the latency is >= 1 sync interval or many sops.
- Component improvements:
. Verify flow control in tb of dp_offload_rx and dp_offload_tx_dev (wrapper of dp_concat_field_blk.vhd)
. reorder_matrix.vhd with timestamp accurate page swap
. dp_fifo_fill_eop.vhd : fill the FIFO with one block instead of some number of words, to avoid that the FIFO cannot be read empty
. dp_bsn_aligner.vhd:
- A dp_bsn_aligner without flow control would make it much simpler.
- A further simplification is to make a dp_sync_aligner that can only recover alignment at a sync, instead of at
every sop (via the BSN).
- Instead of xoff_timeout it is also possible to wait until the FIFO has been read empty for all inputs.
- Timing and sync intervals
. At the ADC input the BSN timestamps are attached to the block data. The block size for the BSN depends on the length of
the FFT. This BSN relates the data to UTC. MAC initializes the BSN relative to 1 jan 1970.
. With an ADC clock of 800MHz and FFT size of 1024 this yields 800M/1024 = 781250 subbands per sec. We process the data at
200MHz, so we have 4 streams in parallel, each with 781250/4 = 195312.5 blocks per sec. In LOFAR we had a similar
situation and there we defined odd and even second sync intervals. The even interval then has 195313 blocks and the
odd interval has 195312 blocks. This was awkward for control. In Apertif a similar fractional block issue
occurred in the correlator with 781250 / 64 = 12207.03125 channels per second. Therefore for Apertif we increased the sync
interval to 1.024 s, such that we have 800000 / 64 = 12500 channels per sync interval. Now we do not have
even and odd seconds anymore, but this 1.024 s sync interval is still awkward because it does not align with the
1 s grid that humans use and that other parts of the telescope also use.
Possible solutions for future systems would be to use a sampling frequency that is a multiple of the FFT size, so
e.g. 809.6MHz with FFT size = 1024, or 800MHz with FFT size = 800. The latter scheme has the additional advantage that
the subband bandwidth is 1 MHz, which fits the typical bandwidth grid in VLBI and also fits the fact that the
Apertif LO can be tuned in steps of 10MHz. With a subband bandwidth of 781250 Hz the subbands align with the
10MHz grid only once every 50 MHz, because 64*781250 = 50M.
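The subband and block rate arithmetic above, checked in Python:

    # Block rate arithmetic from the sync interval discussion above.
    f_adc = 800e6               # ADC sample rate in Hz
    n_fft = 1024                # FFT size
    print(f_adc / n_fft)        # 781250.0 subbands per second
    print(781250 / 4)           # 195312.5 blocks per 200MHz stream
    print(781250 / 64)          # 12207.03125 channels per second
    print(800000 / 64)          # 12500.0 channels per 1.024 s sync interval
    print(64 * 781250 / 1e6)    # 50.0, subbands align with the 10 MHz LO
                                # grid only once every 50 MHz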
. Using an oversampled filterbank introduces yet another block grid. For example with oversampling factor 32/27 and an
FFT block size of 1024 the oversampled block size becomes 1024 * 27/32 = 864. This oversampled 864 block grid only
aligns with the 1024 block grid once every 27 blocks of size 1024. For Apertif the 781250 blocks of 1024 align with
the 1 sec grid, but the 32/27 oversampled blocks will only align every 27 sec. Hence with oversampling it is
necessary to accept that it becomes impossible to maintain block alignment within a 1 second grid.
. In APERTIF the misalignment between the channel period and the one second grid was avoided by defining a sync
interval of 1.024 s and using that sync interval as integration period. A sync interval of 1.024 s for LOFAR would mean
that a sync interval contains 160000 blocks at f_adc = 160M and 200000 blocks at f_adc = 200 MHz. However if other
parts of the system rely on a one second or e.g. ten second grid, then using a 1.024 second grid does not fit well
with those parts. As noted above, an oversampled filterbank with r_os = 32/27 and FFT block size N_fft = 1024 has
an oversampled block size M_blk = 1024 * 27/32 = 864. This M_blk = 864 block grid only integer aligns with the one
second grid once every 27 seconds, because 200M / 864 * 27 = 6250000 and 160M / 864 * 27 = 5000000 both yield an
integer. The alternative would be to define a sync interval that is an integer multiple of M_blk and close to 1 s.
Preferably T_int is the same for f_adc = 200M and 160MHz. The ratio 160M / 200M = 4 / 5, so choose the sync interval
to be a multiple of 4 * 5 * 864 = 17280 blocks. This then yields e.g. ceil(200M / 17280) * 17280 = 200016000 and
ceil(160M / 17280) * 17280 = 160012800, which both correspond to T_int = 1.00008 s exactly. However LOFAR 2.0 needs
to be compatible with LOFAR 1.0, so the fact that 1.00008 != 1 will cause misalignment of the statistics like SST,
BST, XST between a LOFAR 1.0 station and a LOFAR 2.0 station. Furthermore, to read the statistics and update the BF
weights the LCU needs to keep track of the 1.00008 s grid. Therefore it is best to keep the one second grid and
accept that some sync intervals contain 1 block more than the other sync intervals. For the critically sampled
filterbank as in LOFAR 1.0 with r_os = 1 this yields 200M / 1024 = 195312.5 blocks per second on average, so the
number of blocks per sync interval then repeats with period 2 s as: 195312 + 0,1. For the oversampled filterbank
with e.g. r_os = 32/27 and M_blk = 864 this yields 200M / 864 = 231481.481 blocks per second on average, so the
number of blocks per sync interval then repeats with period 27 s as:
231481 + 0,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,1, because 13 / 27 = 0.481. The variation in the
number of blocks per sync interval is sufficiently small, 1/231481 = 4.3e-6, that it does not significantly affect
the accuracy of the statistics values per sync interval.
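A sketch in Python to generate such repeating patterns of extra blocks per sync interval (note that the phase of the
pattern depends on the chosen starting offset):

    # Base number of blocks per one second sync interval plus the
    # repeating 0/1 extra-block pattern, for a block rate f_adc / m_blk.
    from math import gcd

    def extra_blocks(f_adc, m_blk):
        base, frac = divmod(f_adc, m_blk)      # rate = base + frac/m_blk
        period = m_blk // gcd(frac, m_blk)     # repeat period in seconds
        pattern = [((k + 1) * frac) // m_blk - (k * frac) // m_blk
                   for k in range(period)]
        return base, pattern

    print(extra_blocks(200_000_000, 1024))     # (195312, [0, 1])
    base, pattern = extra_blocks(200_000_000, 864)
    print(base, sum(pattern), len(pattern))    # 231481 13 27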
- Flow control
. dp_siso : ready and xon
- Useful libraries and packages:
. base: common, dp, mm, diag, reorder, uth
. dsp: wpfb, bf, correlator, st
. io: eth, io_ddr, i2c
7) Applications:
- Build upon reused libraries.
- New functions are first added as libraries and then used in the application
- Qsys only used for the MM bus generation
8) VHDL testing:
- detailed unit tests per HDL library using entity IO
. verify corner cases
. often use stimuli --> DUT --> inverse DUT --> expected results (sketched in Python after this list)
e.g.
rx - tx
encode - decode
mux - demux
. sometimes the same component can support both directions:
dp_repack
dp_deinterleave
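The stimuli --> DUT --> inverse DUT pattern sketched in Python, with simple mux/demux models (hypothetical, for
illustration only):

    # Sketch of the stimuli --> DUT --> inverse DUT --> verify pattern,
    # using trivial Python models of a mux/demux pair.

    def mux(streams):
        """DUT: interleave samples of parallel streams into one stream."""
        return [x for samples in zip(*streams) for x in samples]

    def demux(stream, n):
        """Inverse DUT: de-interleave one stream into n streams."""
        return [stream[i::n] for i in range(n)]

    stimuli = [[0, 1, 2, 3], [10, 11, 12, 13]]  # two parallel test streams
    result = demux(mux(stimuli), n=2)
    assert result == stimuli, 'mux/demux roundtrip failed'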
- integration top level or multi FPGA tests using MM file IO
. MM file IO for testbenches at design level, 'breaking the hierarchy' in VHDL or providing access to Modelsim simulation with Python
. do not test the details; those must be covered in the unit tests
- regard the firmware as a data computer, so independent of its functional (astronomical) use we need to verify and validate that for a
known stream of input data it outputs the expected output data.
- Verification via simulation:
. use of g_sim, g_sim_record to differentiate between simulation and hardware
. use g_design_name to differentiate between revisions, e.g. to speed up simulation or synthesis
. behavioral models of external IO (DDR, Transceivers, ADC, I2C)
. break up data path using WG, BG, DB, data force
. optional use of transparent DSP models to pass on indices.
. verify data move by transporting meta data (indices) via the sosi data fields
. profiler to identify time consuming parts
- VHDL regression test (if not there, then it is not used)
- Validation on hardware
. using Python peripherals for MM control using --cmd options per peripheral
. construct more complicated control scripts using sequence of peripheral scripts and --cmd
. we need proper data capture machines, to validate 10GbE, 40GbE data output (e.g. using wireshark and some Python code)
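A hypothetical sketch of such a Python peripheral script with a --cmd option (the peripheral commands are invented for
illustration; a real script would do MM accesses):

    # Hypothetical sketch of a peripheral control script with --cmd.
    import argparse

    def main():
        parser = argparse.ArgumentParser(description='MM control peripheral')
        parser.add_argument('--cmd', required=True,
                            choices=['read_status', 'write_enable'],
                            help='peripheral command to execute')
        args = parser.parse_args()
        if args.cmd == 'read_status':
            print('status: ok')   # placeholder for an MM read
        elif args.cmd == 'write_enable':
            print('enabled')      # placeholder for an MM write

    if __name__ == '__main__':
        main()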
9) Documentation
- Documentation is needed to specify what we have to make
. Detailed design document uses array notation to clearly describe all internal and external interfaces
. Detailed design document also identifies test logic that is needed for the integration top level tests
- No need to document what we have made, except for readme file and manuals
- The code is self explanatory (with comments in docstring style using purpose and description)
- The project scripts identify what is relevant for a product
- The regression tests identify what is relevant code (if it is not tested it is not important and should not have been made)
- It would be nice to have YouTube movies that show our workflow and boards
10) Project planning
- Wild ass guess based on time logs of previous projects
- System engineering design approach for total product life cycle
- Agile style with scrum and sprints
. If it is not an allocated epic/story/task in Redmine then it will not be done
- Roles within the team
. System architects remain actively involved during entire project to ensure that design ideas are preserved or
correctly adjusted
- Whiteboard meetings to steer detailed design
. with wide team to get common understanding and focus
- Definition of done
- What maintenance support do we provide after a project has finished
. firmware tends to become hardware in time, 'het verstoft' (it gathers dust)
. using virtual machines (dockers) to bundle a complete set of operating system, tools and code for the future or to
export as a starting point to an external party (e.g. for outsourcing)
11) Ethernet networks
- 1GbE, 10GbE, 40GbE, 100GbE IP
- Knowledge of switches
- Knowledge of UDP, IP, VLAN
- Monitoring and Control protocol (UniBoard, Gemini)
- Streaming data offload
12) Outreach, papers, collaborations, recruiting
- Oliscience opencores
- NWO digital special interest group
- student assignments
- Write paper on ARGS (done by Mia @ CSIRO)
- Write paper on RadioHDL (= also intro paper / user guide for RadioHDL on OpenCores)
- Write paper on RL = 0 coding style with state reg and pipeline reg clearly separated. The design should also work
without pipeline. Possibly the pipelining should be added automatically and only where needed.
13) DESP pillars
- All data storage
14) External info
* Technolution in B&C 2019/2
- own IP libraries
- self-checking testbenches made by developer
- own HDL implementation of the open source RISC-V processor (instruction set architecture), can run Linux
- generic build server for simulation and synthesis (with all tool versions)
- regression test using nightly build
- version control using GIT (merge request --> review by colleague --> discussion via GIT server)
- HW regression test using a stimuli generator (e.g. video)
* High Tech Institute: System Configuration Management
- start with a model of the company processes
- first organize then automate
- baselining is creating timestamped version numbers of components (HW, SW)
* Dutch system architecting conference 20 june 2019 Den Bosch
Author: Eric Kooistra, jan 2018
Title: Key aspects of FPGA firmware development at DESP
Purpose:
- Provide a list of key aspects of FPGA firmware development at DESP
- Identify libraries or tool scripts that we could isolate and make public via e.g. OpenCores or GitHub
- Identify topics that we need to focus on in the future
1) Develop FPGA hardware boards
- Review board design document and schematic and layout, so that the board will not contain major bugs and
so that firmware engineers can already learn about the board and get familiar with it
- Pinning design to verify schematic
- Vendor reference designs to verify the IO
- Heater design to verify the cooling and the power supplies
2) Technology independent FPGA:
- Wrap IP (IO, DSP, memory, PLL)
- Xilinx (LOFAR, SKA CSP Low)
- Altera (Aartfaac, Apertif, Arts)
3) VHDL design:
- Clean coding
- Reuse through HDL libraries
- Standard interfaces: MM & ST, support Avalon, AXI using VHDL records mosi/miso, sosi/siso
- Use records not only for signals but also for generics, because adding a record field does not change the
component interface.
- Distinguish between state registers and pipeline registers.
. For example: dp_block_resize.vhd, dp_counter.vhd.
- Board minimal design that provides control access to the FPGA board and the board control functions
- Board test design that contain the minimal design plus interfaces to use the board IO (transceivers, DDR)
- Build FPGA application design upon a board minimal design and the relevant IO from the board test design
- Useful libraries and packages:
. base: common, dp, mm, diag, reorder, uth
. dsp: wpfb, bf, correlator, st
. io: eth, io_ddr, i2c
- Design for scalability with generics that can be scaled over the logical range, e.g. >= 0, even if the
application only requires a certain fixed value. The reasons are:
. During development the application typically starts small (e.g. a BF with 4 inputs) while the final
application is much larger (e.g. a BF with 64 inputs). With generics both can be supported through a
parameter change.
. For simulation it is often necessary to reduce the size of the design to be able to simulate it in a
reasonable time. By scaling it down via generics the design preserves its structure but becomes much
smaller.
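The same scaling idea expressed in a Python model (a sketch of the principle, not the actual BF firmware): one
n_inputs parameter covers both the small development case and the full application:

    # Sketch of design-for-scalability: one parameter scales the model
    # from a small development size to the full application size.

    def beamformer(samples, weights):
        """Weighted sum over n_inputs streams (n_inputs = len(samples))."""
        assert len(samples) == len(weights)
        return sum(w * x for w, x in zip(weights, samples))

    n_inputs = 4                                       # small during development
    print(beamformer([1] * n_inputs, [2] * n_inputs))  # 8
    n_inputs = 64                                      # full application size
    print(beamformer([1] * n_inputs, [2] * n_inputs))  # 128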
4) VHDL testing:
* Levels of application verification
- use reference designs to verify the vendor phy IO IP; in the application these are replaced by models.
For example: transceiver, DDR3, MM interface via MM file IO, ...
- detailed unit tests per HDL library using entity IO to prove that the unit is correct in all relevant
use cases and corner cases, such that application tests can focus on integration tests.
- integration top level or multi FPGA tests using MM file IO
. MM file IO for testbenches at design level, 'breaking the hierarchy' in VHDL or providing access to Modelsim simulation with Python
. preferably use MM file IO and revisions of the top level design to verify parts in the top level design, rather than making
a testbench for only that part using the IO of that part. The control interface should be enough to test the part, therefore
using MM file IO is enough and avoids testbenches that make use of other entity IO signals. Typically the revision can contain BG
and DB (with MM interface) to also have direct streaming access to the part in the top level.
* regard the firmware as a data computer, so independent of its functional (astronomical) use we need to verify and validate that for a
known stream of input data it outputs the expected output data.
* Verification via simulation:
. use of g_sim, g_sim_record to differentiate between simulation and hardware
speed up MM clk, I2C clk, skip PHY startup time, reduce size while keeping the structure,
skip or bypass functions
. use g_design_name to differentiate between revisions, e.g. to speed up simulation or synthesis
. behavioral models of external IO (DDR, Transceivers, ADC, I2C)
. break up data path using WG, BG, DB, data force
. optional use of transparent DSP models to pass on indices.
. verify data move by transporting meta data (indices) via the sosi data fields
. profiler to identify time consuming parts
* VHDL regression test (if not there, then it is not used)
* Validation on hardware
. using Python peripherals for MM control using --cmd options per peripheral
. construct more complicated control scripts using sequence of peripheral scripts and --cmd
. we need proper data capture machines, to validate 10GbE, 40GbE data output (e.g. using wireshark and some Python code)
5) RadioHDL
- RadioHDL is our umbrella name for a set of tool scripts that we use for firmware development, with focus on implementation. RadioHDL makes
it easier for developers to organize different versions and combinations of their firmware, tools and boards. RadioHDL is a platform?
- The name RadioHDL covers HDL code for RadioAstronomy as a link to what we do at Astron. However by using only the word Radio we keep
the name a bit more general, because in fact the RadioHDL tool scripts can be used for any (FPGA) HDL development. Outside Astron
the word RadioHDL can be advertised as an HDL radio station that one likes to listen to, ie. to use, so a feel good name with a
strong link to HDL but otherwise not explicitely telling what it is. The word RadioHDL also has no hits in Google search, so no
conflict or confusion with others.
- Automate implementation flow (source --> config file --> tool script --> product, a product can be the source of a next product)
- Organize code in libraries using hdllib.cfg
- Manage tool versions using hdltool_<toolset name>.cfg
- Create project files for sim and synth
- ARGS (Automatic Register Generation System using MM bus and MM register config files in yaml)
- Create FPGA info (used to be called system info) address map stored in FPGA to allow dynamic definition of address maps. The definition
of the MM register fields is kept in files because it typically remains fixed.
- Easily enroll the environment on a new PC and introduce a new employee (to be done --> OpenCores, Ruud Overeem)
6) OneClick
- OneClick is our umbrella name for new ideas and design methods, with focus on firmware specification and design. New tools that are
created within OneClick may end up as part of the RadioHDL environment. This has happened for example with ARGS.
- The name OneClick relates to our 'goal at the horizon' to get in one click from design to realisation.
- Automate design flow
- Array notation (can be used in document and in code --> aims for simulatable specification)
- Modelling in python of data move, DSP and control
7) New hardware, tools and languages
- FPGA, GPU, DSP, ASIC
- OpenCL
- HLS
- Compaan, Clash, Wavecore
8) Documentation
- Documentation is needed to specify what we have to make
. Detailed design document uses array notation to clearly describe all internal and external interfaces
. Detailed design document also identifies test logic that is needed for the integration top level tests
- No need to document what we have made, except for readme file and manuals
- The code is self explanatory (with comments in docstring style using purpose and description)
- The project scripts identify what is relevant for a product
- The regression tests identify what is relevant code (if it is not tested it is not important and should not have been made)
- It would be nice to have YouTube movies that show our workflow and boards
9) Project planning
- Wild ass guess based on time logs of previous projects
- Agile style with backlog, scrum and 3 week sprints (If it is not an allocated epic/story/task in Redmine then it will not be done).
- Review process:
. purpose is to ensure value and quality and to spread knowledge and awareness
. coder works based on a ticket in Redmine, all production code must be reviewed by another team member
. coder delivers code according to coding style, with purpose-description/docstring and with regression test
. reviewer reviews code and function, reports via redmine ticket
. reviewer only reports
. coder does corrections and merges branch to trunk
- Roles within the team
- Definition of done
- Outsourcing
- Hiring temporary consultants
- What maintenance support do we provide after a project has finished
. firmware tends to become hardware in time, 'het verstoft' (it gathers dust)
. using virtual machines (dockers) to bundle a complete set of operating system, tools and code for the future or to
export as a starting point to an external party (e.g. for outsourcing)
- What if we would be with 10 - 15 digital/firmware engineers instead of about 5 as now (Gijs, Leon, Pieter, Jonathan, Eric, Daniel)
10) Ethernet networks
- 1GbE, 10GbE, 40GbE, 100GbE IP
- Knowledge of switches
- Knowledge of UDP, IP, VLAN
- Monitoring and Control protocol (UniBoard, Gemini)
- Streaming data offload
11) Outreach, collaborations, recruiting
- Oliscience opencores
- NWO digital special interest group
- student assignments
12) DESP pillars
- All data storage
13) Version control
- We use SVN and work on the trunk. This is feasible because we are a small team, and it has the advantage that issues are noticed
at an early stage
- Common practice in larger software development teams is that code is developed on branches and merged to the trunk after
it has been verified
- In future use GIT?
14) FPGA - GPU
- ASTRON_MEM_193_Comparison_FPGA_GPU_switch
- FPGAs are good at:
. can interface to an ADC (not possible with a GPU, so there is always a need for a glue logic FPGA, but such an FPGA with gigabit
transceivers is also capable of quite some processing)
. can support many external IO ports via up to ~100 transceivers (a GPU has only a few fast external IO ports)
. reorder blocks of data, e.g. in a packet payload
. low latency applications (e.g. low latency trading), fixed latency (e.g. fast control loops, absolute timing)
. embedded, standalone applications
. have a lifetime / support time of > 10 years (GPU < 5 years)
- GPUs are good at:
. more general to program
. fast compile times (< minutes, versus > hours for FPGA)
. use floating point arithmetic by default (versus fixed point by default for FPGA)
. matrix operations