From 3838155ee9bff96f7f936c0140403a437bb7f309 Mon Sep 17 00:00:00 2001
From: Jan David Mol <mol@astron.nl>
Date: Thu, 7 Oct 2021 11:10:04 +0200
Subject: [PATCH] L2SS-424: Improved structure.

---
 docs/source/index.rst             |  4 ++--
 docs/source/installation.rst      |  2 +-
 docs/source/logs.rst              |  6 +++++-
 docs/source/monitoring.rst        | 19 +++++++++++++------
 docs/source/remote_interfaces.rst |  6 +++---
 5 files changed, 24 insertions(+), 13 deletions(-)

diff --git a/docs/source/index.rst b/docs/source/index.rst
index b5281d49b..54c7e318a 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -10,8 +10,8 @@ Welcome to LOFAR2.0 Station Control's documentation!
    :maxdepth: 2
    :caption: Contents:
 
-   usage/installation
-   usage/remote_interfaces
+   installation
+   remote_interfaces
    devices/devices
    devices/recv
    control
diff --git a/docs/source/installation.rst b/docs/source/installation.rst
index b2d8f763a..51e534f31 100644
--- a/docs/source/installation.rst
+++ b/docs/source/installation.rst
@@ -42,7 +42,7 @@ You should see the following state:
 If not, you can inspect why with `docker logs <container>`. Note that the containers will automatically be restarted on failure, and also if you reboot. Stop them explicitly to bring them down (`make stop <container>`).
 
 Post-boot Initialisation
----------------
+---------------------------
 
 The following procedure describes how to initialise the system, which is required after installation and after a system reboot.
 
diff --git a/docs/source/logs.rst b/docs/source/logs.rst
index de6ca4543..f0a92a386 100644
--- a/docs/source/logs.rst
+++ b/docs/source/logs.rst
@@ -15,10 +15,14 @@ To monitor the logs remotely, or to browse older logs, use the *ELK stack* that
 - Logs of all devices,
 - Logs of the Jupyter notebook server.
 
-If you browse to the ELK stack (actually, it is Kibana providing the GUI), your go-to is the *Discover* view at http://localhost:5601/app/discover. There, you can construct (and save, load) a dashboard that provides a custom view of the logs. For example, `this dashboard http://localhost:5601/app/discover#/?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-60m,to:now))&_a=(columns:!(extra.tango_device,level,message),filters:!(),index:'1e8ca200-1be0-11ec-a85f-b97e4206c18b',interval:auto,query:(language:kuery,query:''),sort:!())` shows the logs of the last hour, with some useful columns added to the default timestamp and message columns. Expand the time range if no logs appear, to look further back. You should see something like:
+If you browse to the ELK stack (actually, it is Kibana providing the GUI), your go-to is the *Discover* view at http://localhost:5601/app/discover. There, you can construct (and save, load) a dashboard that provides a custom view of the logs, based on the *index pattern* `logstash-*`. There is a lot to take in, and there are excellent Kibana tutorials on the web.
+
+To get going, use for example `this dashboard <http://localhost:5601/app/discover#/?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-60m,to:now))&_a=(columns:!(extra.tango_device,level,message),filters:!(),index:'1e8ca200-1be0-11ec-a85f-b97e4206c18b',interval:auto,query:(language:kuery,query:''),sort:!())>`_, which shows the logs of the last hour, with some useful columns added to the default timestamp and message columns. Expand the time range if no logs appear, to look further back. You should see something like:
 
 .. image:: elk_last_hour.png
 
 ELK allows you to filter, edit the columns, and a lot more. We enrich the log entries with several extra fields, for example the device that generated it, and stack traces if available. Click on the `>` before a log entry and the information expands, showing for example:
 
 .. image:: elk_log_fields.png
+
+Furthermore, statistics from the ELK stack, such as the number of ERROR log messages, are made available as a data source in :doc:`monitoring`.
diff --git a/docs/source/monitoring.rst b/docs/source/monitoring.rst
index faf71d364..5c5c30c52 100644
--- a/docs/source/monitoring.rst
+++ b/docs/source/monitoring.rst
@@ -6,7 +6,7 @@ Each device exposes a list of monitoring points as attributes with the `_R` pref
 Grafana
 ------------------------
 
-We offer `Grafana https://grafana.com/` dashboards on http://localhost:3000 that provide a quick overview of the station's status, including temperatures and settings. Several dashboards are included. An example::
+We offer `Grafana <https://grafana.com/>`_ dashboards on http://localhost:3000 that provide a quick overview of the station's status, including temperatures and settings. Several dashboards are included. An example:
 
 .. image:: grafana_dashboard_1.png
 .. image:: grafana_dashboard_2.png
@@ -25,17 +25,24 @@ The Grafana dashboards are configured with the following data sources:
 Prometheus
 -------------------------
 
-`Prometheus https://prometheus.io/docs/introduction/overview/` is a low-level monitoring system that allows us to periodically retrieve the values of all the attributes of all our devices, and cache them to be used in Grafana:
+`Prometheus <https://prometheus.io/docs/introduction/overview/>`_ is a low-level monitoring system that allows us to periodically retrieve the values of all the attributes of all our devices, and cache them to be used in Grafana:
 
-- Every several seconds, Prometheus scrapes our `TANGO-Grafana Exporter https://git.astron.nl/lofar2.0/ska-tango-grafana-exporter` (our local fork of https://gitlab.com/ska-telescope/TANGO-grafana.git), collecting all values of all the device attributes (except the large ones, for performance reasons).
+- Every several seconds, Prometheus scrapes our `TANGO-Grafana Exporter <https://git.astron.nl/lofar2.0/ska-tango-grafana-exporter>`_ (our fork of https://gitlab.com/ska-telescope/TANGO-grafana.git), collecting all values of all the device attributes (except the large ones, for performance reasons).
 - Prometheus can be queried directly on http://localhost:9090,
-- The query language is `PromQL https://prometheus.io/docs/prometheus/latest/querying/basics/`, which is also used in Grafana to query Prometheus,
+- The query language is `PromQL <https://prometheus.io/docs/prometheus/latest/querying/basics/>`_, which is also used in Grafana to query Prometheus,
 
 Prometheus stores attributes in the following format::
 
-    device_attribute{device="lts/recv/1", dim_x="32", dim_y="0", instance="tango-prometheus-exporter:8000", job="tango", label="RCU_temperature_R", name="RCU_temperature_R", type="float", x="00", y="0"}
+    device_attribute{device="lts/recv/1",
+                     dim_x="32", dim_y="0",
+                     instance="tango-prometheus-exporter:8000",
+                     job="tango",
+                     label="RCU_temperature_R",
+                     name="RCU_temperature_R",
+                     type="float",
+                     x="00", y="0"}
 
-The above describes a single data point and its labels. Each point furthermore has a value (integer) and a timestamp. The following transformations take place:
+The above describes a single data point and its labels. The primary identifying labels are `device` and `name`. Each point furthermore has a value (integer) and a timestamp. The following transformations take place:
 
 - For 1D and 2D attributes, each array element is its own monitoring point, with `x` and `y` labels describing the indices. The labels `dim_x` and `dim_y` describe the array dimensionality,
 - Attributes with string values get a `str_value` label describing their value.
diff --git a/docs/source/remote_interfaces.rst b/docs/source/remote_interfaces.rst
index 632147db4..8dea68f92 100644
--- a/docs/source/remote_interfaces.rst
+++ b/docs/source/remote_interfaces.rst
@@ -6,11 +6,11 @@ The station provides the following interfaces accessible through your browser (a
 +---------------------+---------+----------------------+-------------------+
 |Interface            |Subsystem|URL                   |Default credentials|
 +=====================+=========+======================+===================+
-|Interactive scripting|Jupyter  |http://localhost:8888 |                   |
+| :doc:`control`      |Jupyter  |http://localhost:8888 |                   |
 +---------------------+---------+----------------------+-------------------+
-|Monitoring           |Grafana  |http://localhost:3000 |admin/admin        |
+| :doc:`monitoring`   |Grafana  |http://localhost:3000 |admin/admin        |
 +---------------------+---------+----------------------+-------------------+
-|Logs                 |Kibana   |http://localhost:5601 |                   |
+| :doc:`logs`         |Kibana   |http://localhost:5601 |                   |
 +---------------------+---------+----------------------+-------------------+
 
 Futhermore, there are some low-level interfaces:
-- 
GitLab