Newer
Older
# HOW TO: Migrating ASTROWISE data to the LDV Specification database without the API
For filling the ldv specification database with data dumps from the LTA (tables in schema astrowise from database ldvadmin)
### Pre-requisites
- An existing `ldv-spec-db` database with the corresponding `lofardata_dataproduct` table as defined in the Django model `DataProduct`
- A connection to the `ldvadmin` database on the database server (_Note_: accessing requires a password)
## Create combined table
First create a table in the ldvadmin database (which you should be connected to currently) in accordance with the `DataProduct` model:
> CREATE TABLE lofardata_dataproduct(
obs_id VARCHAR(15),
oid_source VARCHAR(15),
dataproduct_source VARCHAR(20),
dataproduct_type VARCHAR(50),
project VARCHAR(50),
location VARCHAR(50),
activity VARCHAR(50),
surl VARCHAR(200),
There are 3 tables in the ldvadmin database:
- raw_dataproducts
- pl_dataproducts _(pl stand for pipeline)_
- unsp_dataproducts _(unsp stands for unspecified as it's unknown what process it went through)_
They all need to be queried to retrieve the wanted information via the commands below. Note that these queries transform information to the wanted model. You should execute these one after another (NOTE: they may take some time to complete).
SELECT obsid AS obs_id, obsid_source AS oid_source, 'LOFAR LTA', dp_type AS dataproduct_type, project, location, activity, uri AS surl, size AS filesize, antenna_set, instrument_filter, dysco AS dysco_compression
pl_dataproducts:
>INSERT INTO lofardata_dataproduct
SELECT obsid AS obs_id, obsid_source AS oid_source, 'LOFAR LTA', dp_type AS dataproduct_type, project, location, activity, uri AS surl, size AS filesize, NULL, NULL, dysco AS dysco_compression
FROM astrowise.pl_dataproducts;
unsp_dataproducts:
>INSERT INTO lofardata_dataproduct
SELECT obsid AS obs_id, obsid_source AS oid_source, 'LOFAR LTA', dp_type AS dataproduct_type, project, location, activity, uri AS surl, size AS filesize, dysco AS dysco_compression
FROM astrowise.unsp_dataproducts;
### Remove duplicates
Since the surl index in our model is unique but the combined table, by above commands, does not account for this, the duplicates need to be removed after filling the table with data. For that a primary key, here 'id', needs to be constructed first (NOTE: once again, these commands may take some time due to table sizes).
#### Create unique id:
> ALTER TABLE lofardata_dataproduct ADD COLUMN id bigserial PRIMARY KEY;
#### Delete duplicate surls:
> DELETE FROM lofardata_dataproduct x USING lofardata_dataproduct y WHERE x.id < y.id AND x.surl = y.surl;
## Dump and restore
On the database server, dump the newly combined table with the correct format and the date in the filename for future reference as follows:
> pg_dump -d ldvadmin -U ldvadmin -t lofardata_dataproduct -O > lofardata_fixture_$(date +%d-%m-%y).sql
After the dump, copy the dumped .sql file to the server where ldvspec is deployed using the SCP command (for production, this is sdc.astron.nl). Subsequently, connect to this server and then to the ldv-spec-postgres docker container, which hosts the ldvspec database.
Before loading the data into the lofardata_dataproduct table, make sure to login to the database and empty the table:
> truncate table lofardata_dataproduct;
Now, exit the docker container again and load the data into the lofardata_dataproduct table created by Django (make sure you use the correct file date wise if you decide to load the data on another date):
> docker exec -i ldv-spec-postgres psql -d ldv-spec-db -U postgres -t lofardata_dataproduct < lofardata_fixture_$(date +%d-%m-%y).sql
**Note:** _the loading of the data may result in the error `ERROR: multiple primary keys for table "lofardata_dataproduct" are not allowed`. This is because of conflicting IDs for the created table w.r.t. DataProduct model table. This error can be ignored because it will take the model's setting and this is wanted behaviour._
### Validate
An inspection of the ldv-spec-db logging and simple count on both databases should suffice:
> select count(*) from lofardata_dataproduct;
### Saving the fixture
You can save the fixture on the static webserver on the sdc-dev (location server: `/var/www/ldv_spec_db_fixture`)