The Grafana system exposed on http://localhost:3001 visualises the monitoring information collected by Prometheus (and other sources). It is organised as follows, with links to the relevant Grafana documentation:
* A series of `dashboards <https://grafana.com/docs/grafana/latest/dashboards/>`_, organised into *folders*. Each dashboard is an independent page of visualisations. When you log in, you will see the configured "Home" dashboard.
* Each dashboard has a series of `panels <https://grafana.com/docs/grafana/latest/panels/>`_, often organised into collapsible *rows*. Each panel contains a specific visualisation and can have alerts configured on it. The panels are tiled.
* Each panel has a set of *queries*, which describe the data to be visualised, and a single *visualization*, which determines how the data is displayed.
The Grafana documentation will help you with using Grafana in general. Also be sure to check out the `webinars and videos <https://grafana.com/videos/>`_ that Grafana provides.
Writing Queries
------------------------------------
Most of the data will be queried from the *Prometheus* backend:
* Grafana provides a `Prometheus query editor <https://grafana.com/docs/grafana/latest/datasources/prometheus/#prometheus-query-editor>`_ to set up queries interactively.
* The queries themselves use the `PromQL <https://prometheus.io/docs/prometheus/latest/querying/basics/>`_ syntax.
* Apart from configuring panels, you can also play with queries in the Explore tab (http://localhost:3001/explore), and directly in the Prometheus backend (http://localhost:9091).
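
Queries can also be issued from a script. The following is a minimal sketch, assuming the Prometheus backend at http://localhost:9091 serves the standard Prometheus HTTP API (``/api/v1/query``). The helper names ``build_query_url``, ``parse_vector`` and ``instant_query`` are illustrative and not part of this software stack.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the Prometheus backend mentioned above serves the standard HTTP API.
PROMETHEUS_URL = "http://localhost:9091"


def build_query_url(promql, base_url=PROMETHEUS_URL):
    """Return the instant-query URL for a PromQL expression."""
    return f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": promql})


def parse_vector(payload):
    """Turn an instant-vector response into a list of (labels, float value) pairs."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    # Each sample looks like {"metric": {labels}, "value": [timestamp, "value as string"]}.
    return [(sample["metric"], float(sample["value"][1])) for sample in data["result"]]


def instant_query(promql, base_url=PROMETHEUS_URL):
    """Run a PromQL instant query; this needs a running Prometheus backend."""
    url = build_query_url(promql, base_url)
    with urllib.request.urlopen(url, timeout=10) as response:
        return parse_vector(json.load(response))
```

With the backend running, ``instant_query('up{job="stations"}')`` would list the scrape status of each station target as seen by Prometheus.
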
The Prometheus database is flat: it contains time series for metrics, each of which carries a name, a set of labels, and a float value.
The queries express selections on these entries for a given name, filtered by the given labels. For example, a single query can return all FPGA temperatures across all stations.
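
The selection semantics can be imitated in a few lines of plain Python. This is only an illustrative sketch: the entries, the host ``other-lcu``, the device ``stat/sdp/1`` and the attribute name ``FPGA_temp_R`` are made up for the example, and ``select`` is a hypothetical helper rather than a Prometheus API.

```python
# Hypothetical entries: each time series has a metric name, labels, and a float value.
SERIES = [
    {"metric": "device_attribute",
     "labels": {"host": "dts-lcu", "device": "stat/sdp/1", "name": "FPGA_temp_R", "x": "00"},
     "value": 41.5},
    {"metric": "device_attribute",
     "labels": {"host": "other-lcu", "device": "stat/sdp/1", "name": "FPGA_temp_R", "x": "00"},
     "value": 39.0},
    {"metric": "device_attribute",
     "labels": {"host": "dts-lcu", "device": "stat/recv/1", "name": "ANT_mask_RW", "x": "00"},
     "value": 1.0},
    {"metric": "device_scraping",
     "labels": {"host": "dts-lcu", "device": "stat/recv/1"},
     "value": 0.2},
]


def select(series, metric, **labels):
    """Return the entries with the given metric name whose labels all match exactly."""
    return [
        entry for entry in series
        if entry["metric"] == metric
        and all(entry["labels"].get(key) == value for key, value in labels.items())
    ]


# Roughly what the PromQL selector device_attribute{name="FPGA_temp_R"} expresses:
# one metric name, narrowed down by a label, matched across all hosts.
fpga_temperatures = select(SERIES, "device_attribute", name="FPGA_temp_R")
```

Note that the metric name (``device_attribute``) and the ``name`` label (the Tango attribute name) are two different things, which is why the sketch stores the former under the key ``metric``.
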
Furthermore, values of different metrics can be combined (added, merged, etc.). See the PromQL documentation for more details.
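
As an illustration of combining values, PromQL offers aggregation operators such as ``avg by (host) (...)``. The sketch below imitates in plain Python what such an aggregation computes. The sample values and the host ``other-lcu`` are invented, and ``average_by`` is a hypothetical helper, not part of PromQL or of this software stack.

```python
from collections import defaultdict


def average_by(samples, label):
    """Average the sample values per distinct value of one label."""
    groups = defaultdict(list)
    for labels, value in samples:
        # Samples lacking the label are grouped under the empty string.
        groups[labels.get(label, "")].append(value)
    return {key: sum(values) / len(values) for key, values in groups.items()}


# Invented samples: (labels, value) pairs as a selector might return them.
samples = [
    ({"host": "dts-lcu", "x": "00"}, 40.0),
    ({"host": "dts-lcu", "x": "01"}, 44.0),
    ({"host": "other-lcu", "x": "00"}, 38.0),
]

# One averaged value per host, similar in spirit to avg by (host) (...).
per_host = average_by(samples, "host")
```
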
Querying LOFAR Station Control
````````````````````````````````````
The `LOFAR Station Control <https://lofar20-station-control.readthedocs.io/en/latest/>`_ software exposes a series of metrics from each station:
:device_attribute: All Tango monitoring points that are configured to be exposed to Prometheus. For array attributes, each element is its own metric. Each metric carries the following labels:

  :job: `stations`
  :host: Station hostname from which the value was obtained (e.g. `dts-lcu`).
  :station: Name of the station, as reported by the station (e.g. `DTS`). NB: for now, the host is more reliable to use.
  :device: Tango device of this attribute (e.g. `stat/recv/1`).
  :name: Tango attribute name (e.g. `ANT_mask_RW`).
  :type: Data type (e.g. `string`, `float`, `bool`).
  :x: Offset in the first dimension if the attribute is a 1D or 2D array, otherwise "00".
  :y: Offset in the second dimension if the attribute is a 2D array, otherwise "00".
  :idx: Global offset in the array, combining `x` and `y`.
  :str_value: The value of the attribute, if the attribute type is a string.

:device_scraping: Time required to scrape each Tango device, in seconds. It carries the following labels:

  :job: `stations`
  :host: Station hostname from which the value was obtained (e.g. `dts-lcu`).
  :station: Name of the station, as reported by the station (e.g. `DTS`). NB: for now, the host is more reliable to use.
  :device: Tango device scraped.
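
The labels above translate directly into PromQL label matchers. A minimal sketch of composing such selectors in Python follows; ``selector`` and ``escape_label_value`` are hypothetical helpers, and the label values are the examples given in the list above.

```python
def escape_label_value(value):
    """Escape backslashes, double quotes and newlines for a double-quoted PromQL string."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def selector(metric, **labels):
    """Build an instant-vector selector of the form metric{key="value",...}."""
    if not labels:
        return metric
    matchers = ",".join(
        f'{key}="{escape_label_value(value)}"' for key, value in labels.items()
    )
    return f"{metric}{{{matchers}}}"


# All elements of one Tango attribute on one station, selected by host.
ant_mask = selector("device_attribute", host="dts-lcu", device="stat/recv/1", name="ANT_mask_RW")

# Scrape times of every Tango device on the same station.
scrape_times = selector("device_scraping", job="stations", host="dts-lcu")
```

The resulting strings can be pasted into the Grafana query editor or the Explore tab.
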
Metrics from the non-Tango services are exposed as well. See the linked documentation, or use the interactive interfaces, to explore them further:
:scrape\_\*: Metrics describing scraping (Prometheus periodically requesting the metrics), see https://prometheus.io/docs/concepts/jobs_instances/.

  :job: `stations`
  :host: Station hostname from which the value was obtained (e.g. `dts-lcu`).
  :exported_job: Original job on the station (`host`, `prometheus`, `grafana`).

:node\_\*: Metrics describing the server, see https://github.com/prometheus/node_exporter.

  :job: `stations`
  :host: Station hostname from which the value was obtained (e.g. `dts-lcu`).
  :exported_job: `host`

:go\_\*, grafana\_\*: Metrics from Grafana, see https://grafana.com/docs/grafana/latest/administration/view-server/internal-metrics/ and https://grafana.com/docs/grafana/latest/alerting/unified-alerting/fundamentals/evaluate-grafana-alerts/.

  :job: `stations`
  :host: Station hostname from which the value was obtained (e.g. `dts-lcu`).
  :exported_job: `grafana`
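
To explore a whole family of metrics at once, PromQL allows a regular-expression match on the internal ``__name__`` label. The sketch below builds such a selector; ``family_selector`` is a hypothetical helper, and it assumes the label values are plain strings that need no escaping.

```python
import re

# Characters that are legal in a Prometheus metric name, so no regex escaping is needed.
_METRIC_PREFIX = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")


def family_selector(prefix, **labels):
    """Select every metric whose name starts with `prefix`, filtered by exact labels."""
    if not _METRIC_PREFIX.fullmatch(prefix):
        raise ValueError(f"not a valid metric name prefix: {prefix!r}")
    # PromQL regex matchers are fully anchored, so "node_.*" means "starts with node_".
    matchers = [f'__name__=~"{prefix}.*"']
    matchers += [f'{key}="{value}"' for key, value in labels.items()]
    return "{" + ",".join(matchers) + "}"


# All node_exporter metrics that were collected from the stations.
station_node_metrics = family_selector("node_", job="stations", exported_job="host")
```

Such a selector can return a large number of series, so it is better suited to the Explore tab than to a dashboard panel.
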
Querying Operational Central Management
````````````````````````````````````````
This software stack itself also exposes metrics from its various services:
:scrape\_\*: Metrics describing scraping (Prometheus periodically requesting the metrics), see https://prometheus.io/docs/concepts/jobs_instances/.

  :job: `prometheus`

:node\_\*: Metrics describing the server, see https://github.com/prometheus/node_exporter.

  :job: `host`

:go\_\*, grafana\_\*: Metrics from Grafana, see https://grafana.com/docs/grafana/latest/administration/view-server/internal-metrics/ and https://grafana.com/docs/grafana/latest/alerting/unified-alerting/fundamentals/evaluate-grafana-alerts/.