diff --git a/doc/papers/2010/SPM/spm.tex b/doc/papers/2010/SPM/spm.tex
index 6c92a61ac5e6568be93070eff078e6c673ef17b5..d903e1a7c7a19a8716693a792529ffbbefde73a8 100644
--- a/doc/papers/2010/SPM/spm.tex
+++ b/doc/papers/2010/SPM/spm.tex
@@ -65,6 +65,29 @@ Oude Hoogeveensedijk 4, 7991 PD\ \ Dwingeloo, The Netherlands \\
 \maketitle
 \begin{abstract}
+A recent development in radio astronomy is to replace traditional dishes
+with many small antennas. The signals are combined to form one large,
+virtual telescope. The enormous data streams are cross-correlated to
+filter out noise. A recent trend is to correlate in software instead of dedicated hardware. Examples
+include e-VLBI and LOFAR.
+In this paper, we explain how to implement and optimize a correlator
+ on multi-core CPUs
+and many-core architectures, such as NVIDIA and ATI GPUs,
+and the \mbox{Cell/B.E.} The correlator is a streaming, real-time
+application, and is much more I/O intensive than applications that are
+typically implemented on many-core hardware today. We compare with
+the LOFAR production correlator on an IBM Blue Gene/P supercomputer.
+We identify several important architectural problems which cause
+architectures to perform suboptimally, and also deal with programmability.
+Our findings are applicable to signal processing applications in general.
+
+The results show that the processing power and memory bandwidth of
+current GPUs are highly imbalanced. While
+the production correlator on the Blue Gene/P achieves a superb 96\% of the
+theoretical peak performance, this is only 14\% on ATI GPUs, and 26\%
+on NVIDIA GPUs. The \mbox{Cell/B.E.} processor, in contrast, achieves an
+excellent 92\%. The research presented is an
+important pathfinder for next-generation telescopes.
 \end{abstract}
 \section{Introduction}
@@ -206,7 +229,7 @@ FPGAs for on-the-field processing and a Blue Gene/P supercomputer to perform
 real-time, central processing.
 We describe LOFAR in more detail below.
-% @@@ dit past hier niet
+% dit past hier niet
 %% Recent many-core architectures seem to be a viable complement to the aforementioned processing platforms.
 %% GPUs provide more processing power and are more power-efficient than CPUs,
 %% while GPUs are more flexible and easier to program than FPGAs.
@@ -214,7 +237,7 @@ We describe LOFAR in more detail below.
 %% extensive performance comparison between the architectures of popular GPUs
 %% for signal-processing purposes, particularly, for correlation
 %% purposes~\cite{Nieuwpoort:09}.
-%@@@ Cell
+
 \subsection{The LOFAR telescope}