- The Tango Controls `hdbpp subsystem <https://tango-controls.readthedocs.io/en/latest/administration/services/hdbpp/hdb++-design-guidelines.html>`_ archives data-value changes into a TimescaleDB database,
- Grafana allows `Alert rules <https://grafana.com/docs/grafana/latest/alerting/>`_ to be configured, which poll TimescaleDB and generate an *alert* when the configured condition is met. It also maintains a list of currently firing alerts,
- `Alerta <https://alerta.io/>`_ is the *alert manager*: it receives these alerts, manages duplicates, and maintains alerts until the operator explicitly acknowledges them. It thus also has a list of alerts that fired in the past.
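The duplicate handling and acknowledgement described above can be illustrated with a toy model. This is a simplified sketch, not Alerta's actual implementation: it assumes that two alerts count as duplicates when they share the same environment, resource, event, and severity, and all class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StoredAlert:
    severity: str
    duplicate_count: int = 0
    acknowledged: bool = False


class AlertStore:
    """Toy alert store for illustration only (hypothetical, not Alerta code)."""

    def __init__(self):
        # Assumption: (environment, resource, event) identifies one alert.
        self.alerts = {}

    def receive(self, environment, resource, event, severity):
        key = (environment, resource, event)
        stored = self.alerts.get(key)
        if stored is not None and stored.severity == severity:
            # Same alert seen again: count it instead of storing a copy.
            stored.duplicate_count += 1
        else:
            # New alert, or a severity change: store a fresh entry.
            self.alerts[key] = StoredAlert(severity=severity)
        return self.alerts[key]

    def acknowledge(self, environment, resource, event):
        # The alert stays in the store until an operator acknowledges it.
        self.alerts[(environment, resource, event)].acknowledged = True
```

In this model, a repeated alert only increments a counter, which is why a resent alert does not show up as a second entry.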
Setting up alerts
```````````````````
To set up alerting, you first need to post-configure Grafana to populate it with alerting rules, and with a policy to forward alerts to Alerta:
- Go to Grafana (http://localhost:3000) and sign in with an administrator account (default: admin/admin),
...
.. hint:: Whether Grafana can send alerts to Alerta can be tested by sending a `test alert <http://localhost:3000/alerting/notifications/receivers/Alerta/edit?alertmanager=grafana>`_.
Slack integration
```````````````````
Our Alerta setup is configured to send alerts to Slack. To set this up, you need to:
- Create a Slack App: https://api.slack.com/apps?new_app=1
- Under ``OAuth & Permissions``, add the following ``OAuth Scope``: ``chat:write``,
- Install the App in your Workspace,
- Copy the ``OAuth Token``.
.. hint:: To obtain the ``OAuth Token`` later on, go to https://api.slack.com/apps, click on your App, and look under ``Install App``.
The ``SLACK_TOKEN`` is the ``OAuth Token``, and the ``SLACK_CHANNEL`` is the channel in which to post the alerts.
Any further tweaking can be done by modifying ``docker-compose/alerta-web/alertad.conf``.
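To check the token and channel independently of Alerta, a message can be posted directly through Slack's ``chat.postMessage`` Web API method. The following is a minimal sketch; the function names and the default message text are illustrative and not part of this setup.

```python
import json
import urllib.request

SLACK_POST_MESSAGE_URL = "https://slack.com/api/chat.postMessage"


def build_slack_request(token, channel, text):
    """Build (but do not send) a chat.postMessage request."""
    payload = json.dumps({"channel": channel, "text": text}).encode("utf-8")
    return urllib.request.Request(
        SLACK_POST_MESSAGE_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


def send_test_message(token, channel, text="Test message from the alerting setup"):
    """Send the message and return Slack's decoded JSON reply."""
    request = build_slack_request(token, channel, text)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling ``send_test_message`` with your ``SLACK_TOKEN`` and ``SLACK_CHANNEL`` values returns Slack's reply, whose ``ok`` field indicates whether the message was accepted. With only the ``chat:write`` scope, the App typically has to be invited to the channel before it can post there.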
Debugging hints
````````````````````````
- Grafana sends alerts to Alerta using the *Prometheus AlertManager* format, and thus uses the Prometheus webhook to do so. To see what Grafana emits, configure it to send to your custom https://hookbin.com/ endpoint,
- By default, Grafana resends firing alerts every 4 hours; we set this to 10 minutes. This means that if an alert was successfully sent but lost (or deleted), it takes that long to get it back. For debugging, you may want to lower this to, for example, 10 seconds in the ``Alerting -> Notification policies`` settings of Grafana,
- Alerta has a plugin system which allows easily modifying the attributes of an alert (see ``docker-compose/alerta-web`` and https://github.com/alerta/alerta-contrib). To see which attributes an alert has, simply go to the alert in the web GUI, press *Copy*, and paste in your editor,
- Alerta allows a ``DEBUG=True`` parameter in ``docker-compose/alerta-web/alertad.conf`` to generate debug output.
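As an alternative to an external endpoint, what Grafana emits can also be inspected with a small local receiver. The sketch below assumes the payload follows the Prometheus AlertManager webhook layout, with a top-level ``alerts`` list whose entries carry ``status`` and ``labels``. The port number and all names are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize_payload(body):
    """Return one 'status: alertname' line per alert in a webhook payload."""
    payload = json.loads(body)
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        lines.append(f"{alert.get('status', '?')}: {labels.get('alertname', '?')}")
    return lines


class DumpHandler(BaseHTTPRequestHandler):
    """Print a summary of every POSTed payload and answer with 200."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        for line in summarize_payload(body):
            print(line)
        self.send_response(200)
        self.end_headers()


def serve(port=8099):
    # Hypothetical port; blocks until interrupted.
    HTTPServer(("0.0.0.0", port), DumpHandler).serve_forever()
```

Run ``serve()`` and point a Grafana webhook contact point at the machine running it. If Grafana runs inside Docker, the address must be reachable from within the container.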