Commit 3838155e authored by Jan David Mol

L2SS-424: Improved structure.

parent f42deecc
1 merge request: !150 L2SS-434: Add sphinx documentation content
......@@ -10,8 +10,8 @@ Welcome to LOFAR2.0 Station Control's documentation!
:maxdepth: 2
:caption: Contents:
usage/installation
usage/remote_interfaces
installation
remote_interfaces
devices/devices
devices/recv
control
......
......@@ -42,7 +42,7 @@ You should see the following state:
If not, you can inspect why with `docker logs <container>`. Note that the containers will automatically be restarted on failure, and also if you reboot. Stop them explicitly to bring them down (`make stop <container>`).
Post-boot Initialisation
----------------
---------------------------
The following procedure describes how to initialise the system, which is required after installation and after a system reboot.
......
......@@ -15,10 +15,14 @@ To monitor the logs remotely, or to browse older logs, use the *ELK stack* that
- Logs of all devices,
- Logs of the Jupyter notebook server.
If you browse to the ELK stack (actually, it is Kibana providing the GUI), your go-to is the *Discover* view at http://localhost:5601/app/discover. There, you can construct (and save, load) a dashboard that provides a custom view of the logs. For example, `this dashboard http://localhost:5601/app/discover#/?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-60m,to:now))&_a=(columns:!(extra.tango_device,level,message),filters:!(),index:'1e8ca200-1be0-11ec-a85f-b97e4206c18b',interval:auto,query:(language:kuery,query:''),sort:!())` shows the logs of the last hour, with some useful columns added to the default timestamp and message columns. Expand the time range if no logs appear, to look further back. You should see something like:
If you browse to the ELK stack (actually, it is Kibana providing the GUI), your go-to is the *Discover* view at http://localhost:5601/app/discover. There, you can construct (and save, load) a dashboard that provides a custom view of the logs, based on the *index pattern* `logstash-*`. There is a lot to take in, and there are excellent Kibana tutorials on the web.
To get going, use for example `this dashboard <http://localhost:5601/app/discover#/?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-60m,to:now))&_a=(columns:!(extra.tango_device,level,message),filters:!(),index:'1e8ca200-1be0-11ec-a85f-b97e4206c18b',interval:auto,query:(language:kuery,query:''),sort:!())>`_, which shows the logs of the last hour, with some useful columns added to the default timestamp and message columns. Expand the time range if no logs appear, to look further back. You should see something like:
.. image:: elk_last_hour.png
ELK allows you to filter, edit the columns, and a lot more. We enrich the log entries with several extra fields, for example the device that generated it, and stack traces if available. Click on the `>` before a log entry and the information expands, showing for example:
.. image:: elk_log_fields.png
Furthermore, statistics from the ELK stack, such as the number of ERROR log messages, are made available as a data source in :doc:`monitoring`.
......@@ -6,7 +6,7 @@ Each device exposes a list of monitoring points as attributes with the `_R` pref
Grafana
------------------------
We offer `Grafana https://grafana.com/` dashboards on http://localhost:3000 that provide a quick overview of the station's status, including temperatures and settings. Several dashboards are included. An example::
We offer `Grafana <https://grafana.com/>`_ dashboards on http://localhost:3000 that provide a quick overview of the station's status, including temperatures and settings. Several dashboards are included. An example:
.. image:: grafana_dashboard_1.png
.. image:: grafana_dashboard_2.png
......@@ -25,17 +25,24 @@ The Grafana dashboards are configured with the following data sources:
Prometheus
-------------------------
`Prometheus https://prometheus.io/docs/introduction/overview/` is a low-level monitoring system that allows us to periodically retrieve the values of all the attributes of all our devices, and cache them to be used in Grafana:
`Prometheus <https://prometheus.io/docs/introduction/overview/>`_ is a low-level monitoring system that allows us to periodically retrieve the values of all the attributes of all our devices, and cache them to be used in Grafana:
- Every several seconds, Prometheus scrapes our `TANGO-Grafana Exporter https://git.astron.nl/lofar2.0/ska-tango-grafana-exporter` (our local fork of https://gitlab.com/ska-telescope/TANGO-grafana.git), collecting all values of all the device attributes (except the large ones, for performance reasons).
- Every several seconds, Prometheus scrapes our `TANGO-Grafana Exporter <https://git.astron.nl/lofar2.0/ska-tango-grafana-exporter>`_ (our fork of https://gitlab.com/ska-telescope/TANGO-grafana.git), collecting all values of all the device attributes (except the large ones, for performance reasons).
- Prometheus can be queried directly on http://localhost:9090,
- The query language is `PromQL https://prometheus.io/docs/prometheus/latest/querying/basics/`, which is also used in Grafana to query Prometheus,
- The query language is `PromQL <https://prometheus.io/docs/prometheus/latest/querying/basics/>`_, which is also used in Grafana to query Prometheus,
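As a minimal sketch of querying Prometheus directly, the snippet below builds an instant-query URL for the standard Prometheus HTTP API endpoint `/api/v1/query` on the address given above. The helper names `build_query_url` and `run_query` are hypothetical and not part of the station software. The example expression `up` is Prometheus' built-in scrape-health metric.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Address taken from the text above; adjust if Prometheus runs elsewhere.
PROMETHEUS_URL = "http://localhost:9090"


def build_query_url(promql, base_url=PROMETHEUS_URL):
    """Return the instant-query URL for a PromQL expression (hypothetical helper)."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


def run_query(promql, base_url=PROMETHEUS_URL, timeout=5):
    """Send the query and return the list of results (requires a running Prometheus)."""
    with urlopen(build_query_url(promql, base_url), timeout=timeout) as response:
        payload = json.load(response)
    if payload.get("status") != "success":
        raise RuntimeError(f"Prometheus query failed: {payload}")
    # Each entry looks like {"metric": {labels...}, "value": [timestamp, "value"]}.
    return payload["data"]["result"]


if __name__ == "__main__":
    # Only builds and prints the URL; call run_query("up") on a live station.
    print(build_query_url("up"))
```

The URL is built separately from the network call so that the expression can be checked, or pasted into a browser, without a running Prometheus.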
Prometheus stores attributes in the following format::
device_attribute{device="lts/recv/1", dim_x="32", dim_y="0", instance="tango-prometheus-exporter:8000", job="tango", label="RCU_temperature_R", name="RCU_temperature_R", type="float", x="00", y="0"}
device_attribute{device="lts/recv/1",
dim_x="32", dim_y="0",
instance="tango-prometheus-exporter:8000",
job="tango",
label="RCU_temperature_R",
name="RCU_temperature_R",
type="float",
x="00", y="0"}
The above describes a single data point and its labels. Each point furthermore has a value (integer) and a timestamp. The following transformations take place:
The above describes a single data point and its labels. The primary identifying labels are `device` and `name`. Each point furthermore has a value (integer) and a timestamp. The following transformations take place:
- For 1D and 2D attributes, each array element is its own monitoring point, with `x` and `y` labels describing the indices. The labels `dim_x` and `dim_y` describe the array dimensionality,
- Attributes with string values get a `str_value` label describing their value.
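Because each array element is stored as its own point, a client that wants the whole attribute back has to reassemble it from the `x`, `y`, `dim_x` and `dim_y` labels. The sketch below illustrates this for points shaped like the entries in the Prometheus HTTP API `result` list. The helper names and sample values are hypothetical, and it assumes that `dim_y="0"` marks a 1D attribute, as in the example above.

```python
def make_point(x, value, dim_x=3):
    """Build a sample point for a 1D attribute (hypothetical test data)."""
    return {
        "metric": {
            "device": "lts/recv/1",
            "name": "RCU_temperature_R",
            "dim_x": str(dim_x),
            "dim_y": "0",
            "x": f"{x:02d}",
            "y": "0",
        },
        "value": [0, str(value)],  # [timestamp, value-as-string]
    }


def assemble_attribute(points):
    """Rebuild one attribute from its per-element points.

    Returns a flat list for 1D attributes and a list of rows for 2D ones.
    Elements for which no point was supplied stay None.
    """
    if not points:
        return []
    labels = points[0]["metric"]
    dim_x = int(labels["dim_x"])
    dim_y = int(labels["dim_y"])
    rows = max(dim_y, 1)  # assumption: dim_y == 0 means a single row (1D)
    grid = [[None] * dim_x for _ in range(rows)]
    for point in points:
        x = int(point["metric"]["x"])  # int() accepts zero-padded "00"
        y = int(point["metric"]["y"])
        grid[y][x] = float(point["value"][1])
    return grid[0] if dim_y == 0 else grid


# Points may arrive in any order; the x label decides the position.
sample_points = [make_point(2, 39.5), make_point(0, 40.5), make_point(1, 41.0)]
print(assemble_attribute(sample_points))
```

All points passed in are expected to belong to the same attribute, so select them on `device` and `name` first.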
......@@ -6,11 +6,11 @@ The station provides the following interfaces accessible through your browser (a
+---------------------+---------+----------------------+-------------------+
|Interface |Subsystem|URL |Default credentials|
+=====================+=========+======================+===================+
|Interactive scripting|Jupyter |http://localhost:8888 | |
| :doc:`control` |Jupyter |http://localhost:8888 | |
+---------------------+---------+----------------------+-------------------+
|Monitoring |Grafana |http://localhost:3000 |admin/admin |
| :doc:`monitoring` |Grafana |http://localhost:3000 |admin/admin |
+---------------------+---------+----------------------+-------------------+
|Logs |Kibana |http://localhost:5601 | |
| :doc:`logs` |Kibana |http://localhost:5601 | |
+---------------------+---------+----------------------+-------------------+
Furthermore, there are some low-level interfaces:
......