diff --git a/.gitignore b/.gitignore
index d43b50f85c0a9fbdbcbcafda7e75d3f1749fed95..22d7ca277721656d0839be949651f421aabcb557 100644
--- a/.gitignore
+++ b/.gitignore
@@ -11,5 +11,6 @@ SDC-006.out
 SDC-006.pdf
 SDC-006.toc
 SDC-006.run.xml
+changes.tex
 meta.tex
 texput.log
diff --git a/4-product.tex b/4-product.tex
index 6bfd7c24e6a2353b27a3f55cccebd5c42c8c1259..7f9e5019096904af761a97e7a9836344e1a2520d 100644
--- a/4-product.tex
+++ b/4-product.tex
@@ -34,7 +34,7 @@ A non-exhaustive list includes:
 \begin{itemize}
 \item The Application Repository, which will store and make available software components for use in Data Processing Service and Portal.
 \item The Staging, Data Transfer, and Archiver services, which make data available to the Data Processing Service and ingest results back into the Data Product Repository.
-\item The Science Data Repository and Virtual Observatory services, which provide metadata and interfaces that enable the Data Repository to be searched, indexed, and made accessible using standard tooling, enforcing the data access policies.
+\item The Science Data Repository and Virtual Observatory services, which provide metadata and interfaces that enable the Data Repository to be searched, indexed, and made accessible using standard tooling, while ensuring that appropriate data access policies are enforced.
 \item The Federated Authentication and Authorization Infrastructure and the Community Management Service, which provide control over user access rights.
 \end{itemize}
@@ -75,13 +75,13 @@ In general, however, the Repository is expected to take an hierarchical approach
 The distributed nature of the Repository represents a challenge in terms of data locality: in general, it is cheaper and more efficient to process data close to where it is stored, rather than transmitting it over a long-haul network for analysis.
 The Repository will therefore cooperate with the Data Processing Service and related ancillary services to route processing and analytics jobs to compute systems which are as close to the data as possible.
-To facilitate data sharing (\Cref{sec:features:sharing}), the will provide support for automatically associating \glspl{PID} with published data products.
+To facilitate data sharing (\Cref{sec:features:sharing}), the Repository will provide support for automatically associating \glspl{PID} with published data products.
 Ancillary services will ensure that appropriate data published to the Repository is made available to the \gls{VO} (\Cref{sec:features:vo}).
-The Repository will provide data management functionality capable of implementing the capabilities described in \Cref{sec:features:drm}.
+The Repository will provide the data management functionality required to implement the capabilities described in \Cref{sec:features:drm}.
 In particular, this will include support for the concept of ownership and rights to data.
 That is, when appropriate, it will be clear what organization produced and is responsible for a piece of data (for example, the \gls{ILT}), and it will be possible to define and enforce policies regarding which users have permission to access it.
-These capabilities will be generic: it should be possible to use them to implement the policies of multiple data owners where appropriate.
+These capabilities will be generic: it will be possible to use them to implement the policies of multiple data owners where appropriate.
 \subsubsection{Data Processing Service}
@@ -209,7 +209,7 @@ Providing documentation appropriate to all levels of users is a core goal of the
 Documentation will be tightly integrated with and published through the Portal.
 Responsibility for generating documentation is shared by all aspects of the \gls{SDC}.
-In particular, developers are responsible for providing \gls{API} and code documentation to accompany their software, while the operations team should provide provide higher-level guides and descriptions of the instruments, available datasets, processing pipelines, and analysis techniques.
+In particular, developers are responsible for providing \gls{API} and code documentation to accompany their software, while the operations team should provide higher-level guides and descriptions of the instruments, available datasets, processing pipelines, and analysis techniques.
 \subsubsection{Communications Channels}
diff --git a/5-sizing.tex b/5-sizing.tex
index 0ba4e9602ef1d04a3fd541fda55602a3e91e3391..9bb8d0a84bcc1ee4e58ffbeda895508e8ab04343 100644
--- a/5-sizing.tex
+++ b/5-sizing.tex
@@ -6,11 +6,11 @@ This section aims to present a high-level overview of our current understanding
 \subsection{Data Storage}
-The \gls{SDC} will store and offer to the community multiple petabytes of data (\Cref{sec:features:raw,sec:features:srdp,sec:features:simple}).
-Our current data holdings, and their predicted growth rates, are:
+The \gls{SDC} will store and offer to the community multiple petabytes of data (as described in \Cref{sec:features:raw,sec:features:srdp,sec:features:simple}).
+Current data holdings, and their predicted growth rates, are estimated as:
 \begin{itemize}
-\item 50\,PB of \gls{LOFAR} data distributed over three archive sites, growing at a rate of about 7\,PB a year
+\item 50\,PB of \gls{LOFAR} data distributed over three archive sites, growing at a rate of about 7\,PB a year;
 \item 4\,PB of Apertif data, which will increase to 7.5\,PB by the end of the Apertif surveys.
 \end{itemize}
diff --git a/Makefile b/Makefile
index 675adce92e309a146bd04661a4fc7a214778beb9..4a20820acf64c6775ce5f24770de11efe0a7f69d 100644
--- a/Makefile
+++ b/Makefile
@@ -1,11 +1,12 @@
 DOCNAME=SDC-006
 export TEXMFHOME ?= astron-texmf/texmf
-$(DOCNAME).pdf: $(DOCNAME).tex meta.tex
+$(DOCNAME).pdf: $(DOCNAME).tex meta.tex changes.tex
 	xelatex $(DOCNAME)
 	makeglossaries $(DOCNAME)
 	biber $(DOCNAME)
 	xelatex $(DOCNAME)
 	xelatex $(DOCNAME)
-include astron-texmf/vcs-meta.make
+include astron-texmf/make/vcs-meta.make
+include astron-texmf/make/changes.make
diff --git a/SDC-006.tex b/SDC-006.tex
index 7d52a07aa4d572b8f2dfe6da2562a455925e4513..9f471ac61e98edc2c904f92a0c30ae98fc16b571 100644
--- a/SDC-006.tex
+++ b/SDC-006.tex
@@ -3,6 +3,7 @@
 \usepackage{glossary-mcols}
 \input{meta}
+\input{changes}
 \setDocTitle{ASTRON Science Data Centre Vision}
 \setDocNumber{SDC-006}
@@ -10,12 +11,6 @@
 \setDocDate{\vcsDate}
 \setDocProgram{SDC}
-\setDocChangeRecord{
- \addChangeRecord{0.3}{2021-05-07}{Detailed response to comments from Pizzo}
- \addChangeRecord{0.2}{2021-04-16}{Revised draft for distribution to A\&O}
- \addChangeRecord{0.1}{2021-02-18}{Initial draft for distribution}
-}
-
 \setDocAuthors{
  \addPerson{Roberto Pizzo}{ASTRON}{\vcsDate}
  \addPerson{John D. Swinbank}{ASTRON}{\vcsDate}
diff --git a/a-users.tex b/a-users.tex
index f25e83282fa32eecaf4ae5a247050edb54c65fe1..bdb9a2631a227066e0438c0493e87082e3e3292c 100644
--- a/a-users.tex
+++ b/a-users.tex
@@ -88,7 +88,7 @@ We hope to extend \gls{SDC} functionality to better serve this class of user in
 \paragraph{Profile}
-As discussed in \cref{sec:mission}, the \gls{SDC} will have a uniquely close relationship with LOFAR: while the\gls{SDC} will process and serve data from a range of instrumentation, it will serve as the primary data delivery mechanism for LOFAR, with responsibility for generating advanced data products --- defined as L2+ using the terminology defined by the \gls{IVOA} ObsCore standard \autocite{2017ivoa.spec.0509L} --- lying solely within the \gls{SDC}.
+As discussed in \cref{sec:mission}, the \gls{SDC} will have a uniquely close relationship with LOFAR: while the \gls{SDC} will process and serve data from a range of instrumentation, it will serve as the primary data delivery mechanism for LOFAR, with responsibility for generating advanced data products --- defined as L2+ using the terminology defined by the \gls{IVOA} ObsCore standard \autocite{2017ivoa.spec.0509L} --- lying solely within the \gls{SDC}.
 It follows that telescope operators will rely on the \gls{SDC} to provide prompt feedback on the quality of the data being collected and the results of pipeline processing to plan necessary instrument maintenance and other operational tasks.
 \paragraph{Prioritization}
@@ -118,7 +118,7 @@ They will therefore be consumers of documentation and software.
 \paragraph{Prioritization}
-Deep integration with the \gls{SRC} network is important to ASTRON's ambition of running a its own Regional Centre.
+Deep integration with the \gls{SRC} network is important to ASTRON's ambition of directly operating a Regional Centre.
 \subsubsection{Institutional Partners}
 \label{sec:goals:users:institutions}
diff --git a/astron-texmf b/astron-texmf
index 0555476206ac766cde6163a8d07ee06ba65e80d3..6764795b003822e61cfd8e62e1f55d263d919b4f 160000
--- a/astron-texmf
+++ b/astron-texmf
@@ -1 +1 @@
-Subproject commit 0555476206ac766cde6163a8d07ee06ba65e80d3
+Subproject commit 6764795b003822e61cfd8e62e1f55d263d919b4f
diff --git a/b-features.tex b/b-features.tex
index 7324237c1b586ba8885ef74ec1b3c26717b4a981..745cf42a185879d1662cf9f87ff3e3b04ddb42f1 100644
--- a/b-features.tex
+++ b/b-features.tex
@@ -164,7 +164,7 @@ As for \cref{sec:features:sharing}, all data access services --- including both
 Standard data analysis tooling are packages widely used in the wider (radio) astronomy community for data analysis.
 This might include, for example, CASA\footnote{\url{https://casa.nrao.edu}}, TOPCAT\footnote{\url{http://www.star.bris.ac.uk/~mbt/topcat/}}, Aladin\footnote{\url{https://aladin.u-strasbg.fr}}, among a wide range of other packages.
-Users are familiar with these packages and know how to use them to quickly obtain the results they nee.
+Users are familiar with these packages and know how to use them to quickly obtain the results they need.
 These tools might be access directly within the \gls{SDC} (e.g. in a web-based environment, or running on a \gls{VM}), or may be packaged for convenient download and offline use.
 The tooling should interoperate seamlessly with the data products provided by the \gls{SDC} (\cref{sec:features:srdp,sec:features:raw,sec:features:simple}): having obtained the software and the data through \gls{SDC}-sanctioned channels, they should immediately be able to load and work with the data in their tool of choice without further adaptation.
@@ -291,7 +291,7 @@ This is fundamental to enable the community to engage with the facility, and is
 \paragraph{System Design}
-The \gls{SDC} will make extensive use off-the-shelf tooling for providing these capabilities.
+The \gls{SDC} will make extensive use of off-the-shelf tooling for providing these capabilities.
 As appropriate, these will be integrated with the \portal{}.
 \subsubsection{Technical and \Acrshort{API} Documentation}
@@ -308,7 +308,7 @@ It is also essential to the \gls{SDC} development workflow.
 \paragraph{System Design}
-The \gls{SDC} will make extensive use off-the-shelf tooling for providing these capabilities.
+The \gls{SDC} will make extensive use of off-the-shelf tooling for providing these capabilities.
 As appropriate, these will be integrated with the \portal{}.
 \subsubsection{Access to Source Code and Software}