LDV Specification
The LDV Specification application fills ATDB-LDV with processing tasks for LOFAR data.
Documentation (Confluence)
- The plan: https://support.astron.nl/confluence/pages/viewpage.action?pageId=84215267
- https://support.astron.nl/confluence/display/SDCP/LDV+Documentation
- Deployment diagram of the current situation (in production)
- Running a manual migration: Migration
- Integration testing: Integration Testing
Collaborate
- create your branch from main
- add your functionality
- test your functionality locally
- merge main into your branch before creating a MR
- merge with main
- deploy in test, and test it
- deploy in production, and test it
Local update
After a colleague has made changes, run the following locally:
> git pull
> pip install -r requirements\dev.txt
> python manage.py migrate --settings=ldvspec.settings.dev
Run migration
docker exec -ti ldv-specification python manage.py migrate --settings ldvspec.settings.docker_sdc
Environments
Local Development Environment
Start developing
- Copy the ldvspec.example.env, rename it to ldvspec.env and fill in the variables. The variables should match the local.py settings, which are coherent with the docker-compose-local.yml setup.
- Run the docker build for the base image of the Dockerfile:
  docker build -f Dockerfile.base -t git.astron.nl:5000/astron-sdc/ldv-specification/base:latest .
- Run the docker-compose build for the local environment from the folder ldvspec (note: the path from which you build the docker-compose matters here!):
  docker-compose -f ./docker/docker-compose-local.yml build
- Run docker-compose -f docker-compose-local.yml up -d with that compose file to spin up a new Postgres container, a Celery worker and RabbitMQ.
- Run the following Python commands to start developing:
  - migrate:
    python manage.py migrate --settings=ldvspec.settings.local
  - create a superuser:
    python manage.py createsuperuser --settings=ldvspec.settings.local
    Or use the following to create one without any user interaction (admin, admin):
    DJANGO_SUPERUSER_PASSWORD=admin python ldvspec/manage.py createsuperuser --username admin --email no-reply@example.com --noinput --settings=ldvspec.settings.local
  - run the server:
    python manage.py runserver --settings=ldvspec.settings.local
- You can use the following fixture to fill the database with complete example data:
  python manage.py loaddata fixtures/fixture_16122022.json --settings=ldvspec.settings.local
Django Application
- clone the repo
- open the project in PyCharm
- create a venv (File -> Settings -> Project -> Project Interpreter -> (click cog) -> add)
- pip install -r requirements\dev.txt
- check and/or change the database connection in settings/dev.py. In this example it connects to a database server on 'raspiastro'; change that to the server where you run your Postgres Docker container (localhost?):
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'USER': 'postgres',
        'PASSWORD': 'secret',
        'NAME': 'ldv-spec-db',
        'HOST': 'raspiastro',
        'PORT': '5433',
    },
}
> python manage.py migrate --settings=ldvspec.settings.dev
> python manage.py createsuperuser --settings=ldvspec.settings.dev
> python manage.py runserver --settings=ldvspec.settings.dev
# In another terminal (for background tasks):
> celery -A ldvspec worker -l INFO
# Note: on Windows you might need to add the `--pool=solo` parameter
Test Environment
Production Environment
Usage
Add a work specification
With this URL you can specify work. Below is an example of the structure of the LOFAR data in the ldv-spec-db database, which also shows which fields can be used to filter on:
GET /ldvspec/api/v1/data/
HTTP 200 OK
Allow: GET, POST, HEAD, OPTIONS
Content-Type: application/json
Vary: Accept
{
    "count": 10010,
    "next": "http://127.0.0.1:8000/ldvspec/api/v1/data/?page=2",
    "previous": null,
    "results": [
        {
            "id": 3155,
            "obs_id": "102092",
            "oid_source": "SAS",
            "dataproduct_source": "LOFAR LTA",
            "dataproduct_type": "Correlator data",
            "project": "LC0_043",
            "activity": "Raw Observation",
            "surl": "srm://lofar-srm.fz-juelich.de:8443/pnfs/fz-juelich.de/data/lofar/ops/projects/lc0_043/102092/L102092_SAP000_SB261_uv.MS_8d9ea7c0.tar",
            "filesize": 1477939200,
            "dysco_compression": "False",
            "location": "Juelich"
        },
        ...
    ]
}
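To read this endpoint programmatically rather than through the browsable API, here is a minimal sketch (the use of the requests library and the commented-out token header are assumptions about your setup; the page-based pagination is visible in the "next" link above):

import requests

BASE_URL = "http://127.0.0.1:8000/ldvspec/api/v1"  # adjust to your environment

def fetch_all_data(session):
    """Follow the paginated 'next' links of the data endpoint."""
    url = BASE_URL + "/data/"
    while url:
        response = session.get(url)
        response.raise_for_status()
        payload = response.json()
        yield from payload["results"]
        url = payload["next"]  # None on the last page

session = requests.Session()
# session.headers["Authorization"] = "Token <your-api-token>"  # if your deployment requires it
for dataproduct in fetch_all_data(session):
    print(dataproduct["obs_id"], dataproduct["surl"])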
- Enter the filter in JSON format in the filter field, for example {"obs_id": 102092, "dysco_compression": true}
- Choose a valid workflow, for example imaging_compress_pipeline_v02 (see the workflows endpoint in the ATDB API for an overview of valid workflows: https://sdc.astron.nl:5554/atdb/workflows/)
- After clicking 'POST', the response should look like this:
HTTP 201 Created
Allow: GET, POST, HEAD, OPTIONS
Content-Type: application/json
Vary: Accept
{
    "id": 2,
    "created_on": "2022-08-15T07:07:39.479546Z",
    "filters": {
        "obs_id": 102092,
        "dysco_compression": true
    },
    "inputs": null,
    "selected_workflow": "imaging_compress_pipeline_v02",
    "related_tasks": null,
    "async_task_result": "99622e7b-71f0-4f05-826d-23c13846642d",
    "created_by": 1,
    "processing_site": null
}
The workspecification endpoint now shows an overview of specified work, which is ready to be sent to ATDB-LDV.
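The same specification can also be created programmatically. A minimal sketch, assuming the workspecification endpoint accepts the same fields shown in the response above (the token header is a placeholder for whatever authentication your deployment uses):

import requests

BASE_URL = "http://127.0.0.1:8000/ldvspec/api/v1"  # adjust to your environment

# The payload mirrors the fields of the POST response shown above.
specification = {
    "filters": {"obs_id": 102092, "dysco_compression": True},
    "selected_workflow": "imaging_compress_pipeline_v02",
}
response = requests.post(
    BASE_URL + "/workspecification/",
    json=specification,
    # headers={"Authorization": "Token <your-api-token>"},  # if required
)
response.raise_for_status()
created = response.json()
print(created["id"], created["async_task_result"])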
Other
Configuration
See ~/shared/ldvspec.env for database credentials and where to find ATDB-LDV.
NOTE: currently a postgres database in a Docker container is also used in production. This will change to a database on the sdc-db machine.
admin user
- admin:admin
Build & Deploy
The CI/CD pipeline creates 2 Docker containers:
- ldv-specification : The Django application
- ldv-spec-postgres : The Postgres database
The database can also be accessed externally:
- host : sdc-dev.astron.nl / sdc.astron.nl
- port : 12000
- database: ldv-spec-db
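For example, connecting from Python could look like this (a sketch using psycopg2; the user and password are placeholders, the real credentials are in ldvspec.env as noted above):

import psycopg2  # pip install psycopg2-binary

# Host, port and database name come from the list above;
# user and password are placeholders -- see ~/shared/ldvspec.env.
connection = psycopg2.connect(
    host="sdc.astron.nl",  # or sdc-dev.astron.nl for the test environment
    port=12000,
    dbname="ldv-spec-db",
    user="postgres",
    password="<see ldvspec.env>",
)
with connection.cursor() as cursor:
    cursor.execute("SELECT version();")
    print(cursor.fetchone()[0])
connection.close()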
Manual steps (add them somewhere)
Log into the ldv-specification container (using the Portainer GUI or with docker exec):
> cd /src
> python manage.py migrate --settings=ldvspec.settings.docker_sdc
> python manage.py createsuperuser --settings=ldvspec.settings.docker_sdc
Profiling
There are two settings files for running the application with profiling enabled:
- ldvspec.settings.dev_profiling, which you can use for local development (in favor of ldvspec.settings.dev)
- ldvspec.settings.docker_sdc_profiling, which can be used in dev/production by changing the DJANGO_SETTINGS_MODULE env var to point to this settings file
Subsequently, navigate to /ldvspec/silk/ to visit the profiler dashboard.
NOTE: The silk profiler expects some database tables to be present. The corresponding migrations only show up when you use the ldvspec.settings.docker_sdc_profiling settings. In order to use profiling, these migrations must be performed at least once on an environment:
> python manage.py migrate --settings=ldvspec.settings.docker_sdc_profiling
Caching
We use the Django Cache Framework, which is a generic API into which you can plug different types of cache; common ones are Memcached and Redis. This generic API is described here: https://docs.djangoproject.com/en/4.1/topics/cache/. In LDVspec, we use Memcached as the caching backend. This is defined in the settings file like so:
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.PyMemcacheCache',
        'LOCATION': f'{os.environ["CACHE_HOST_SERVER"]}:{os.environ["CACHE_HOST_PORT"]}',
    }
}
When looking at documentation on how to use caching, look at the Django Cache Framework documentation listed above. Don't look at the documentation of the underlying cache, because their API will probably be different. By using the Django Cache Framework, we can abstract away the underlying implementation and consistently use the Django API.
For example, when you add something to the cache using the cache.set function, there is a parameter for the expiration timeout, for example: cache.set('my_key', 'hello, world!', 30). Use the Django Cache Framework definition of this parameter:
The timeout argument is optional and defaults to the timeout argument of the appropriate backend in the CACHES setting (explained above). It’s the number of seconds the value should be stored in the cache. Passing in None for timeout will cache the value forever. A timeout of 0 won’t cache the value.
In Memcached, you would pass the value 0 to get an infinite timeout. In the Django Cache Framework, you need to pass the value None (which will then subsequently pass 0 to Memcached).
NOTE: when you add something to the cache, always pass in a timeout explicitly, as it depends on the use case. Don't rely on the framework default (which is 5 minutes).
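A minimal sketch of that convention inside the application (the keys and values here are illustrative only):

from django.core.cache import cache

# Always pass a timeout explicitly, as the note above advises.
cache.set('workflow_list', ['imaging_compress_pipeline_v02'], timeout=300)  # keep for 5 minutes
cache.set('static_lookup', {'SAS': 'obs_id source'}, timeout=None)          # cache forever

workflows = cache.get('workflow_list')      # returns None if expired or absent
workflows = cache.get('workflow_list', [])  # or supply a default instead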
Troubleshooting
Q: OperationalError at /ldvspec/api/v1/workspecification/ [WinError 10061] No connection could be made because the target machine actively refused it
A: make sure that you have a connection to a celery broker (RabbitMQ) when running the application in development mode.
Example on a Windows machine:
SET CELERY_BROKER_URL=amqp://guest@raspiastro:5672
python manage.py runserver --settings=ldvspec.settings.dev