We use `Grafana <https://grafana.com/docs/grafana/latest/introduction/>`_ to visualise the monitoring information through a series of *dashboards*. It allows us to:
* Interactively create sets of plots (*panels*) of monitoring points, visualised in various ways (including instrument diagrams),
* Have access to a wide variety of data sources,
* Add *alerts* that trigger when monitoring point formulas reach a certain threshold.
Configuration
---------------------------------
Grafana comes with preinstalled datasources and dashboards, provided in the ``grafana-central/`` directory. By default, the following datasources are configured:
* *Prometheus* (default), providing almost all monitoring metrics,
* *Alerta UI*, providing state from the Alerta Alertmanager (see the `Alerta ReST API <https://docs.alerta.io/api/reference.html>`_),
* *Grafana API*, providing access to Grafana's API (see, for example, the `Grafana Alerting ReST API <https://editor.swagger.io/?url=https://raw.githubusercontent.com/grafana/grafana/main/pkg/services/ngalert/api/tooling/post.json>`_).
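
The data sources configured on a running instance can also be listed through Grafana's HTTP API. The following is a minimal sketch, assuming Grafana is reachable at ``http://localhost:3001`` with the default ``admin/admin`` credentials; the helper names are illustrative and not part of this repository.

```python
import base64
import json
import urllib.request

GRAFANA_URL = "http://localhost:3001"  # assumption: local Grafana instance


def build_request(path, user="admin", password="admin", base_url=GRAFANA_URL):
    """Build a GET request for a Grafana API path, using HTTP basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(base_url.rstrip("/") + path)
    request.add_header("Authorization", f"Basic {token}")
    return request


def datasource_names(payload):
    """Extract (name, type) pairs from the JSON list returned by /api/datasources."""
    return [(entry["name"], entry["type"]) for entry in json.loads(payload)]


def list_datasources():
    """Query the running instance; needs network access to Grafana."""
    with urllib.request.urlopen(build_request("/api/datasources")) as response:
        return datasource_names(response.read())
```

Calling ``list_datasources()`` returns one ``(name, type)`` pair per configured data source. The request fails if the credentials have been changed from the defaults.
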
Using Grafana
---------------------------------
Go to http://localhost:3001 to access the Grafana instance. The default guest access allows viewing dashboards and manually inspecting the data in the datasources. To create or edit dashboards, or to change settings, you need to sign in. The default credentials are ``admin/admin``.
Adding alerts
---------------------------------
We use `Grafana 8+ alerts <https://grafana.com/docs/grafana/latest/alerting/>`_ to monitor our system. You can add alerts to panels, or add free-floating ones under the ``(alarm bell) -> Alert rules`` menu, which is also where you browse the state of existing alerts. Some tips:
* Select the *Alert groups* tab to filter alerts or apply custom grouping, for example, by station or by component.
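
The same kind of grouping can be reproduced outside the UI. The sketch below groups alerts by one label, assuming the alerts are available as a list of dictionaries that each carry a ``labels`` mapping (the shape used by Alertmanager-style APIs). The ``station`` label and the sample alerts are hypothetical.

```python
from collections import defaultdict


def group_alerts(alerts, label):
    """Group alert names by the value of one label.

    Alerts that lack the label are collected under the key None.
    """
    groups = defaultdict(list)
    for alert in alerts:
        labels = alert.get("labels", {})
        groups[labels.get(label)].append(labels.get("alertname", "<unnamed>"))
    return dict(groups)


# Hypothetical sample data, for illustration only.
sample = [
    {"labels": {"alertname": "HighTemperature", "station": "cs001"}},
    {"labels": {"alertname": "DiskFull", "station": "cs001"}},
    {"labels": {"alertname": "HighTemperature", "station": "cs002"}},
    {"labels": {"alertname": "ScrapeFailed"}},
]

grouped = group_alerts(sample, "station")
```

Here ``grouped`` maps each station to the names of its alerts, with alerts that carry no ``station`` label listed under ``None``.
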
Forwarding alerts to Alerta
---------------------------------
Alerts in Grafana come and go without leaving a record of ever having been raised. To keep track of them, we forward alerts to our Alerta instance. This forwarding has to be configured manually:
- Go to Grafana (http://localhost:3001) and sign in with an administration account (default: ``admin/admin``),
- In the left menubar, go to ``(alarm bell) -> Admin``, paste the following configuration, and press ``Save``:
.. hint:: To test whether Grafana can send alerts to Alerta, send a `test alert <http://localhost:3001/alerting/notifications/receivers/Alerta/edit?alertmanager=grafana>`_.
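
To check the Alerta side independently of Grafana, an alert can be posted directly to the Alerta API (``POST /alert``). The sketch below only builds such a request. The base URL, the ``Production`` environment, and the resource and event names are assumptions to adapt to the actual deployment, and an API key may be required if authentication is enabled.

```python
import json
import urllib.request

ALERTA_API = "http://localhost:8080/api"  # assumption: replace with your Alerta API URL


def build_alert_request(resource, event, severity="warning",
                        environment="Production", service=("test",),
                        base_url=ALERTA_API):
    """Build a POST request carrying a minimal Alerta alert."""
    body = {
        "resource": resource,        # what the alert is about
        "event": event,              # what happened
        "severity": severity,
        "environment": environment,  # must be an environment Alerta allows
        "service": list(service),
        "text": "Manual test alert",
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/alert",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical names, for illustration only.
request = build_alert_request("test-resource", "ManualTest")
# To actually send it: urllib.request.urlopen(request)
```

If the alert then shows up in Alerta, a problem with forwarding most likely lies in the Grafana configuration rather than in Alerta itself.
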
We use `Prometheus <https://prometheus.io/docs/introduction/overview/>`_ to *scrape* monitoring data ("metrics") from across the telescope, and collect it into a single time-series database. Our Prometheus instance runs as the ``prometheus-central`` Docker container, which periodically (every 10-60 s) obtains metrics from the configured endpoints. This setup has several advantages: