LDV Specification

The LDV Specification application fills ATDB-LDV with processing tasks for LOFAR data.

Documentation (Confluence)

Collaborate

  • create your branch from main

  • add your functionality

  • test your functionality locally

  • merge main into your branch before creating an MR

  • merge your branch into main

  • deploy in test, and test it

  • deploy in production, and test it
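The branch-and-merge steps above can be sketched end to end. The following is a hypothetical walk-through on a throwaway repository (created in a temp directory so it runs anywhere); in real use, work in a clone of ldv-specification instead.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"   # placeholder identity
git config user.name "Example Dev"
git checkout -q -b main
git commit -q --allow-empty -m "initial commit"

# create your branch from main
git checkout -q -b my-feature main

# add and test your functionality, then commit it
git commit -q --allow-empty -m "add my functionality"

# merge main into your branch before creating an MR
git merge -q main -m "merge main into my-feature"

branch=$(git rev-parse --abbrev-ref HEAD)
echo "$branch"   # -> my-feature
```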

Local update

After a colleague has made changes, update your local checkout:

  > git pull
  > pip install -r requirements/dev.txt
  > python manage.py migrate --settings=ldvspec.settings.dev

Run migration

docker exec -ti ldv-specification python manage.py migrate --settings=ldvspec.settings.docker_sdc

Environments

Local Development Environment

Start developing

  • Copy ldvspec.example.env to ldvspec.env and fill in the variables. The variables should match the local.py settings, which correspond to the docker-compose-local.yml setup.
  • Build the base image used by the Dockerfile

    docker build -f Dockerfile.base -t git.astron.nl:5000/astron-sdc/ldv-specification/base:latest .

  • Run the docker-compose build for the local environment from the folder ldvspec (note: the path from which you run docker-compose matters here!)

    docker-compose -f ./docker/docker-compose-local.yml build

  • Run docker-compose -f ./docker/docker-compose-local.yml up -d to spin up a new Postgres container, a Celery worker, and RabbitMQ.
  • Run the following python commands to start developing
    • migrate

      python manage.py migrate --settings=ldvspec.settings.local

    • create a superuser

    python manage.py createsuperuser --settings=ldvspec.settings.local

    Or use the following to create it without any user interaction (admin, admin)

    DJANGO_SUPERUSER_PASSWORD=admin python ldvspec/manage.py createsuperuser --username admin --email no-reply@example.com --noinput --settings=ldvspec.settings.local

    • Run the server

    python manage.py runserver --settings=ldvspec.settings.local

  • You can use the following fixture to fill the database with complete example data

    python manage.py loaddata fixtures/fixture_16122022.json --settings=ldvspec.settings.local

Django Application

  • clone the repo

  • open the project in Pycharm

  • create a venv (File -> Settings -> Project -> Project Interpreter -> (click cog) -> add)

  • pip install -r requirements/dev.txt

  • check and/or change the database connection in settings/dev.py. In this example it connects to a database server on 'raspiastro'; change that to the server where you run your Postgres Docker container (e.g. localhost)

DATABASES = {
    'default': {
         'ENGINE': 'django.db.backends.postgresql_psycopg2',
         'USER': 'postgres',
         'PASSWORD': 'secret',
         'NAME': 'ldv-spec-db',
         'HOST': 'raspiastro',
         'PORT': '5433',
    },
}
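For instance, pointed at a local Postgres container the same block might look like this (hypothetical values; match them to your own container):

```python
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'USER': 'postgres',
        'PASSWORD': 'secret',
        'NAME': 'ldv-spec-db',
        'HOST': 'localhost',  # the machine running your Postgres Docker container
        'PORT': '5433',
    },
}
```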
   > python manage.py migrate --settings=ldvspec.settings.dev
   > python manage.py createsuperuser --settings=ldvspec.settings.dev
   > python manage.py runserver --settings=ldvspec.settings.dev

   # In another terminal (for background tasks):
   > celery -A ldvspec worker -l INFO
   # Note: on Windows you might need to add the `--pool=solo` parameter

Test Environment

Production Environment


Usage

See also:

Add a work specification

With this URL you can specify work.

This is an example of the structure of the LOFAR data in the ldv-spec-db database; it also shows which fields can be used to filter on.

GET /ldvspec/api/v1/data/
HTTP 200 OK
Allow: GET, POST, HEAD, OPTIONS
Content-Type: application/json
Vary: Accept

{
    "count": 10010,
    "next": "http://127.0.0.1:8000/ldvspec/api/v1/data/?page=2",
    "previous": null,
    "results": [
        {
            "id": 3155,
            "obs_id": "102092",
            "oid_source": "SAS",
            "dataproduct_source": "LOFAR LTA",
            "dataproduct_type": "Correlator data",
            "project": "LC0_043",
            "activity": "Raw Observation",
            "surl": "srm://lofar-srm.fz-juelich.de:8443/pnfs/fz-juelich.de/data/lofar/ops/projects/lc0_043/102092/L102092_SAP000_SB261_uv.MS_8d9ea7c0.tar",
            "filesize": 1477939200,
            "dysco_compression": "False",
            "location": "Juelich"
        },
        ...
    ]
}

  • Enter the filter in JSON format in the filter fields, for example {"obs_id": 102092, "dysco_compression": true}

  • Choose a valid workflow, for example imaging_compress_pipeline_v02 (see the workflows endpoint in the ATDB API for an overview of valid workflows: https://sdc.astron.nl:5554/atdb/workflows/)

  • After clicking 'POST', the response should look like this.

HTTP 201 Created
Allow: GET, POST, HEAD, OPTIONS
Content-Type: application/json
Vary: Accept

{
    "id": 2,
    "created_on": "2022-08-15T07:07:39.479546Z",
    "filters": {
        "obs_id": 102092,
        "dysco_compression": true
    },
    "inputs": null,
    "selected_workflow": "imaging_compress_pipeline_v02",
    "related_tasks": null,
    "async_task_result": "99622e7b-71f0-4f05-826d-23c13846642d",
    "created_by": 1,
    "processing_site": null
}
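The filter entered in the first step selects data products by simple field equality. A minimal pure-Python illustration of that idea (the records are hypothetical, shaped like the GET results above, and the matching logic is an assumption about the implementation):

```python
import json

# The filter arrives as a JSON object, exactly as typed into the filter field:
raw = '{"obs_id": 102092, "dysco_compression": true}'
filters = json.loads(raw)

# Hypothetical records shaped like the /ldvspec/api/v1/data/ results.
records = [
    {"id": 3155, "obs_id": 102092, "dysco_compression": True, "location": "Juelich"},
    {"id": 3156, "obs_id": 102093, "dysco_compression": False, "location": "SARA"},
]

# In Django, such a dict can typically be applied as keyword arguments,
# roughly DataProduct.objects.filter(**filters); here we mimic the
# equality semantics on plain dicts.
matches = [r for r in records if all(r.get(k) == v for k, v in filters.items())]
print([r["id"] for r in matches])  # -> [3155]
```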

The workspecification endpoint now shows an overview of specified work, which is ready to be sent to ATDB-LDV:
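Creating a work specification can also be scripted against the REST API. Below is a minimal sketch using only the standard library; the endpoint path and payload fields are taken from the examples above, the local dev server address is an assumption, and authentication (if any) is omitted.

```python
import json
from urllib import request

# Assumption: local dev server as used in this README's examples.
BASE_URL = "http://127.0.0.1:8000/ldvspec/api/v1"

def build_specification(filters, workflow):
    """Build the JSON body for a work specification POST."""
    return {"filters": filters, "selected_workflow": workflow}

def submit_specification(spec):
    """POST the specification to the workspecification endpoint (expects HTTP 201)."""
    req = request.Request(
        f"{BASE_URL}/workspecification/",
        data=json.dumps(spec).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

spec = build_specification(
    {"obs_id": 102092, "dysco_compression": True},
    "imaging_compress_pipeline_v02",
)
print(json.dumps(spec, indent=2))
```

submit_specification is defined but not invoked here, since it requires a running server.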


Other

Configuration

See ~/shared/ldvspec.env for database credentials and where to find ATDB-LDV

NOTE: currently a postgres database in a Docker container is also used in production. This will change to a database on the sdc-db machine.

admin user

  • admin:admin

Build & Deploy

The CI/CD pipeline creates two Docker containers:

  • ldv-specification : The Django application
  • ldv-spec-postgres : The Postgres database

The database can also be accessed externally:

  • host : sdc-dev.astron.nl / sdc.astron.nl
  • port : 12000
  • database: ldv-spec-db

Manual steps (add them somewhere)

Log into the ldv-specification container (using the Portainer GUI or docker exec):

> cd /src
> python manage.py migrate --settings=ldvspec.settings.docker_sdc
> python manage.py createsuperuser --settings=ldvspec.settings.docker_sdc

Profiling

There are two settings files for running the application with profiling enabled:

  • ldvspec.settings.dev_profiling, which you can use for local development (instead of ldvspec.settings.dev)
  • ldvspec.settings.docker_sdc_profiling, which can be used in dev/production, by changing the DJANGO_SETTINGS_MODULE env var to point to this settings file

Subsequently, navigate to /ldvspec/silk/ to visit the profiler dashboard.

NOTE: The Silk profiler expects some database tables to be present. The migrations that create them only show up when you use the ldvspec.settings.docker_sdc_profiling settings. To use profiling, these migrations must be performed at least once in an environment:

> python manage.py migrate --settings=ldvspec.settings.docker_sdc_profiling

Caching

We use the Django Cache Framework, which is a generic API into which you can plug different types of cache. Some common ones are Memcached and Redis. This generic API is described here: https://docs.djangoproject.com/en/4.1/topics/cache/. In LDVspec, we use Memcached as caching backend. This is defined in the settings file like so:

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.PyMemcacheCache',
        'LOCATION': f'{os.environ["CACHE_HOST_SERVER"]}:{os.environ["CACHE_HOST_PORT"]}',
    }
}

When looking at documentation on how to use caching, look at the Django Cache Framework documentation listed above. Don't look at the documentation of the underlying cache, because their API will probably be different. By using the Django Cache Framework, we can abstract away the underlying implementation and consistently use the Django API.

For example, when you add something to the cache using the cache.set function, there is a parameter for the expiration timeout, for example: cache.set('my_key', 'hello, world!', 30). Use the Django Cache Framework definition of this parameter:

The timeout argument is optional and defaults to the timeout argument of the appropriate backend in the CACHES setting (explained above). It’s the number of seconds the value should be stored in the cache. Passing in None for timeout will cache the value forever. A timeout of 0 won’t cache the value.

In Memcached, you would pass the value 0 to get an infinite timeout. In the Django Cache Framework, you need to pass None (which will then be translated to 0 for Memcached).

NOTE: when you add something to the cache, always pass in a timeout explicitly, as it depends on the use case. Don't rely on the framework default (which is 5 minutes).


Troubleshooting

Q: OperationalError at /ldvspec/api/v1/workspecification/ [WinError 10061] No connection could be made because the target machine actively refused it

A: Make sure that you have a connection to a Celery broker (RabbitMQ) when running the application in development mode.

Example on Windows machine:

SET CELERY_BROKER_URL=amqp://guest@raspiastro:5672
python manage.py runserver --settings=ldvspec.settings.dev