Commit 84ed76d9 authored by Mario Raciti

TMSS-691: Add more captions; refactoring

parent 9c219e19
1 merge request: !424 Resolve TMSS-691
%% Cell type:markdown id:9cdf5a35 tags:
# Project Report PoC - TMSS
This notebook shows how to generate a report for a project.
The data is retrieved through the *TMSS APIs* and then analysed and visualised using the *Pandas* library.
---
%% Cell type:markdown id:ec22618d tags:
### Prerequisites
Before proceeding, you need to import some modules and specify some configuration values.
%% Cell type:code id:48b766e7 tags:
%% Cell type:markdown id:e36852c6 tags:
#### Imports
The Pandas and Requests libraries are required.
%% Cell type:code id:ae6bff14 tags:
``` python
import pandas as pd
import requests
```
%% Cell type:markdown id:6ada730f tags:
#### Configs
Your authentication credentials are needed to perform HTTP requests to the TMSS APIs.
%% Cell type:code id:48b766e7 tags:
``` python
BASE_URL = 'http://localhost:8000/api' # TMSS API endpoint
auth = ('test', 'test') # username and password
```
%% Cell type:markdown id:340c050b tags:
---
%% Cell type:markdown id:9812780a tags:
## Retrieve the data
To retrieve the data, you need to perform a GET request to the following endpoint: `http://127.0.0.1:8000/api/project/<project>/report`
This can be done by using the `requests` module. To perform the request, specify your target project by setting its *id* in the `project` variable, and pass your authentication credentials in the `auth` parameter. Since the response is a JSON object, you can store the result of `response.json()` in the `result` variable.
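The request described above can also be wrapped in a small helper that fails loudly on HTTP errors. This is only a sketch: the function name `fetch_report` and the `timeout` default are assumptions made for illustration and are not part of TMSS.

```python
import requests


def fetch_report(base_url, project, auth, timeout=10):
    """Return the parsed JSON report for `project` (hypothetical helper)."""
    # Build the endpoint URL, tolerating a trailing slash in base_url.
    url = '%s/project/%s/report' % (base_url.rstrip('/'), project)
    response = requests.get(url, auth=auth, timeout=timeout)
    # Raise requests.HTTPError for 4xx/5xx responses instead of parsing an error page.
    response.raise_for_status()
    return response.json()
```

With the configuration values of this notebook, it could be called as `result = fetch_report(BASE_URL, project, auth)`.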
%% Cell type:code id:62acf8a9 tags:
``` python
project = 'high' # Specify your target project by its id
# Retrieve the data related to the project
response = requests.get(BASE_URL + '/project/%s/report' % project, auth=auth)
result = response.json()
result
```
%% Output
{'project': 'test_for_report',
'quota': [{'id': 4,
'resource_type_id': 'my_resource_type_5e1b484b-2466-4c9c-892c-73a675ebe323',
'value': 1000.0}],
'durations': {'total': 1800.000009,
'total_succeeded': 600.000003,
'total_not_cancelled': 1200.000006,
'total_failed': 600.000003,
'scheduling_unit_blueprints_finished': [{'id': 13,
'name': 'my_scheduling_unit_blueprint_0b668c68-39e7-40d5-9843-3fb1cba1b6bb',
'duration': 600.000003}],
'scheduling_unit_blueprints_failed': [{'id': 14,
'name': 'my_scheduling_unit_blueprint_2645e268-6b1b-444e-a4ee-383b208fba38',
'duration': 600.000003}]},
'LTA dataproducts': {'size__sum': 246},
'SAPs': [{'sap_name': 'placeholder', 'total_exposure': 0}]}
%% Cell type:code id:3276ce6d tags:
``` python
# TODO: Remove, just for testing purposes.
result = {
"project": "test_for_report",
"quota": [
{
"id": 4,
"resource_type_id": "LTA Storage",
"value": 1000.0
},
{
"id": 11,
"resource_type_id": "LTA Storage",
"value": 2400.0
}
],
"durations":{
"total": 2800.000012,
"total_succeeded": 1400.000006,
"total_not_cancelled": 1400.000006,
"total_failed": 800.000006,
"scheduling_unit_blueprints_finished": [
{
"id": 8,
"name": "amazing_sub",
"duration": 600.000003
},
{
"id": 21,
"name": "another_amazing_sub",
"duration": 800.000003
}
],
"scheduling_unit_blueprints_failed": [
{
"id": 12,
"name": "horrible_sub",
"duration": 600.000003
},
{
"id": 36,
"name": "another_horrible_sub",
"duration": 200.000003
}
]
},
"LTA dataproducts": {
"size__sum": 246
},
"SAPs": [
{
"sap_name": "sap_1",
"total_exposure": 340.0
},
{
"sap_name":"sap_2",
"total_exposure": 195.0
},
{
"sap_name":"sap_3",
"total_exposure": 235.0
}
]
}
```
%% Cell type:markdown id:1721b2bc tags:
### Manage the data
Once you have retrieved the data, you need to extract the relevant fields. The following snippet does this by defining some variables that will be used afterwards.
%% Cell type:code id:d1b58c3a tags:
``` python
project_id = result['project'] # Project id
quota = result['quota'] # Allocated resources
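duration_keys = ('total', 'total_succeeded', 'total_not_cancelled', 'total_failed')
durations = {key: result['durations'][key] for key in duration_keys} # Select the four duration totals by name instead of by position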
subs_finished = result['durations']['scheduling_unit_blueprints_finished'] # SUBs succeeded
subs_failed = result['durations']['scheduling_unit_blueprints_failed'] # SUBs failed
lta_dataproducts = result['LTA dataproducts'] # LTA Dataproducts sizes
saps = result['SAPs'] # SAPs
```
%% Cell type:markdown id:883d53a9 tags:
You can now use a library such as Pandas for the data analysis and visualisation parts.
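The extraction step above can also be written defensively, so that a report with missing sections does not raise a `KeyError`. The sketch below assumes that empty defaults are acceptable for missing fields; the function name `extract_report_fields` is hypothetical.

```python
def extract_report_fields(result):
    """Pull the report sections out of the JSON object, with empty defaults."""
    durations_raw = result.get('durations', {})
    duration_keys = ('total', 'total_succeeded', 'total_not_cancelled', 'total_failed')
    return {
        'project_id': result.get('project'),
        'quota': result.get('quota', []),
        # Keep only the duration totals that are actually present.
        'durations': {k: durations_raw[k] for k in duration_keys if k in durations_raw},
        'subs_finished': durations_raw.get('scheduling_unit_blueprints_finished', []),
        'subs_failed': durations_raw.get('scheduling_unit_blueprints_failed', []),
        'lta_dataproducts': result.get('LTA dataproducts', {}),
        'saps': result.get('SAPs', []),
    }
```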
---
%% Cell type:markdown id:c9b7d51e tags:
## Create a report
### Summary Table
You can create a single table containing all the data related to a project. It may be convenient to create a separate `DataFrame` for each variable from the previous step, since they can be reused in later analysis.
%% Cell type:code id:8a0a7ed9 tags:
``` python
# Create a DataFrame for each data you want to summarise
df_durations = pd.DataFrame(durations, index=[project_id])
df_lta_dataproducts = pd.DataFrame(lta_dataproducts, index=[project_id])
# Create a general DataFrame as a summary table
df = pd.concat([df_durations, df_lta_dataproducts], axis=1)
df.style.set_caption('Summary Table')
```
%% Output
<pandas.io.formats.style.Styler at 0x7f4c158faa90>
<pandas.io.formats.style.Styler at 0x7f6bb40e6f28>
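The `index=[project_id]` argument in the summary table code is needed because each value in `durations` and `lta_dataproducts` is a scalar. The sketch below uses a shortened copy of the test data to show what happens without an index and how the one-row table is assembled.

```python
import pandas as pd

project_id = 'test_for_report'
durations = {'total': 2800.000012, 'total_succeeded': 1400.000006}
lta_dataproducts = {'size__sum': 246}

# A dict of scalars has no implicit length, so pandas asks for an explicit index.
try:
    pd.DataFrame(durations)
    needs_index = False
except ValueError:
    needs_index = True

# With a one-element index, each dict becomes a single-row DataFrame;
# concatenating along axis=1 places the columns side by side.
summary = pd.concat(
    [
        pd.DataFrame(durations, index=[project_id]),
        pd.DataFrame(lta_dataproducts, index=[project_id]),
    ],
    axis=1,
)
```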
%% Cell type:markdown id:ddfa7824 tags:
%% Cell type:markdown id:8f9b1da2 tags:
For the other values, you can follow a similar procedure, as illustrated in the following sections.
%% Cell type:markdown id:c2f231ba tags:
%% Cell type:code id:aec5ebb1 tags:
### Quota table
%% Cell type:code id:1b94ce76 tags:
``` python
# Create a DataFrame for quota
df_quota = pd.DataFrame(quota)
df_quota.set_index('id')
```
%% Output
resource_type_id value
id
4 LTA Storage 1000.0
11 LTA Storage 2400.0
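Note that `set_index` returns a new `DataFrame` and leaves the original untouched, which is why `id` is still available as a regular column in later steps. A short sketch with the test quota data:

```python
import pandas as pd

quota = [
    {'id': 4, 'resource_type_id': 'LTA Storage', 'value': 1000.0},
    {'id': 11, 'resource_type_id': 'LTA Storage', 'value': 2400.0},
]
df_quota = pd.DataFrame(quota)

# set_index returns an indexed copy; df_quota itself keeps 'id' as a column.
df_quota_by_id = df_quota.set_index('id')
```

If you prefer to keep only the indexed version, assign the result back, for example `df_quota = df_quota.set_index('id')`; any later call that refers to the `id` column would then need to use the index instead.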
%% Cell type:markdown id:47f4503c tags:
### SchedulingUnitBlueprints
%% Cell type:markdown id:93b336e7 tags:
#### Finished SUBs
%% Cell type:code id:10537659 tags:
``` python
# Create a DataFrame for finished SUBs
df_subs_finished = pd.DataFrame(subs_finished)
df_subs_finished.set_index('id')
```
%% Output
name duration
id
8 amazing_sub 600.000003
21 another_amazing_sub 800.000003
%% Cell type:code id:02d2a788 tags:
%% Cell type:markdown id:c67ef360 tags:
#### Failed SUBs
%% Cell type:code id:6487f39e tags:
``` python
# Create a DataFrame for failed SUBs
df_subs_failed = pd.DataFrame(subs_failed)
df_subs_failed.set_index('id')
```
%% Output
name duration
id
12 horrible_sub 600.000003
36 another_horrible_sub 200.000003
%% Cell type:markdown id:265697ce tags:
### SAPs
%% Cell type:code id:669389c5 tags:
``` python
# Create a DataFrame for SAPs
df_saps = pd.DataFrame(saps)
df_saps.set_index('sap_name')
```
%% Output
total_exposure
sap_name
sap_1 340.0
sap_2 195.0
sap_3 235.0
%% Cell type:markdown id:193c3998 tags:
---
%% Cell type:markdown id:b7a4b0b9 tags:
## Create a plot
To better visualise the data, you can plot it in several ways. The following sections show one plot for each group of values, for example a comparison of the four durations retrieved.
%% Cell type:markdown id:3425a1bd tags:
%% Cell type:markdown id:b5a0d68a tags:
### Quota
%% Cell type:code id:c14dacaa tags:
%% Cell type:code id:8b6ad3ba tags:
``` python
# Plot a bar graph
ax_quota = df_quota.plot.bar(title='Quota', x='id', color=['#58a5f0'])
```
%% Output
%% Cell type:markdown id:cffa43f9 tags:
%% Cell type:markdown id:619ec2aa tags:
### Durations
%% Cell type:code id:da3340db tags:
``` python
# You can associate a color for each duration
colors = {'total': '#58a5f0', 'total_not_cancelled': '#ffd95a', 'total_succeeded': '#60ad5e', 'total_failed': '#ff5f52'}
# Plot a horizontal bar graph
ax_durations = df_durations.plot.barh(title='Durations', color=colors)
```
%% Output
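Outside a notebook, the same plot can be written to a file or an in-memory buffer instead of being displayed inline. The sketch below is an assumption about how you might do this: it selects the non-interactive `Agg` backend of Matplotlib and passes the colors as a list in column order.

```python
import io

import matplotlib
matplotlib.use('Agg')  # non-interactive backend, so no display is required
import pandas as pd

df_durations = pd.DataFrame(
    {
        'total': 2800.000012,
        'total_succeeded': 1400.000006,
        'total_not_cancelled': 1400.000006,
        'total_failed': 800.000006,
    },
    index=['test_for_report'],
)
# One color per column, listed in the same order as the columns.
colors = ['#58a5f0', '#60ad5e', '#ffd95a', '#ff5f52']
ax_durations = df_durations.plot.barh(title='Durations', color=colors)

# Render the figure as PNG bytes; pass a file name instead to write to disk.
buffer = io.BytesIO()
ax_durations.get_figure().savefig(buffer, format='png')
png_bytes = buffer.getvalue()
```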
%% Cell type:markdown id:f7eb2288 tags:
%% Cell type:markdown id:982e28e7 tags:
### Scheduling Unit Blueprints
You can plot either the finished or the failed SUBs. In addition, you can plot a unified bar graph. All three options are shown here.
%% Cell type:code id:c84e2c5e tags:
%% Cell type:markdown id:07b8933f tags:
#### Finished SUBs
%% Cell type:code id:4835b891 tags:
``` python
# Plot a bar graph
ax_subs_finished = df_subs_finished.plot.bar(title='Finished SUBs', x='id', color='#60ad5e')
```
%% Output
%% Cell type:code id:2ae02964 tags:
%% Cell type:markdown id:e385353c tags:
#### Failed SUBs
%% Cell type:code id:18074170 tags:
``` python
# Plot a bar graph
ax_subs_failed = df_subs_failed.plot.bar(title='Failed SUBs', x='id', color='#ff5f52')
```
%% Output
%% Cell type:markdown id:340f067d tags:
%% Cell type:markdown id:51e4faea tags:
#### SUBs Summary
To summarise both finished and failed SchedulingUnitBlueprints, you can concatenate the previous DataFrames and add a new column to distinguish them in the new DataFrame:
%% Cell type:code id:a2e8f9cb tags:
%% Cell type:code id:d709f04e tags:
``` python
# Add a status column to differentiate colors later
df_subs_finished['status'] = 'finished'
df_subs_failed['status'] = 'failed'
# Create a new DataFrame, sorted by index, by concatenating finished and failed SUBs.
df_subs = pd.concat([df_subs_finished, df_subs_failed]).set_index('id').sort_index()
df_subs
```
%% Output
name duration status
id
8 amazing_sub 600.000003 finished
12 horrible_sub 600.000003 failed
21 another_amazing_sub 800.000003 finished
36 another_horrible_sub 200.000003 failed
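The concatenation above adds the `status` column to the original DataFrames. If you would rather leave them unchanged, one possible variant (a sketch, not the notebook's own approach) uses `DataFrame.assign`, which returns a copy with the extra column:

```python
import pandas as pd

df_subs_finished = pd.DataFrame([
    {'id': 8, 'name': 'amazing_sub', 'duration': 600.000003},
    {'id': 21, 'name': 'another_amazing_sub', 'duration': 800.000003},
])
df_subs_failed = pd.DataFrame([
    {'id': 12, 'name': 'horrible_sub', 'duration': 600.000003},
    {'id': 36, 'name': 'another_horrible_sub', 'duration': 200.000003},
])

# assign() returns new DataFrames, so the inputs keep their original columns.
df_subs = pd.concat([
    df_subs_finished.assign(status='finished'),
    df_subs_failed.assign(status='failed'),
]).set_index('id').sort_index()
```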
%% Cell type:markdown id:3be9b79e tags:
%% Cell type:markdown id:b7e56fc8 tags:
Then, you can plot a bar graph that uses colors to distinguish SUBs by status:
%% Cell type:code id:117d845e tags:
%% Cell type:code id:0ed4233a tags:
``` python
# Associate colors
colors = {'finished': '#60ad5e', 'failed': '#ff5f52'}
# Plot the concatenated DataFrame
ax_subs = df_subs.plot.bar(title='Finished and Failed SUBs', y='duration', legend=False, color=list(df_subs['status'].map(colors)))
```
%% Output
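The per-bar colors in the plot above come from mapping each row's status through the `colors` dictionary. A status that is missing from the dictionary maps to `NaN`, so a fallback color can be useful. In the sketch below, the `cancelled` status and the grey fallback are hypothetical values chosen only for illustration.

```python
import pandas as pd

colors = {'finished': '#60ad5e', 'failed': '#ff5f52'}
fallback_color = '#9e9e9e'  # hypothetical neutral color for unknown statuses

# 'cancelled' is a made-up status used to show the fallback.
statuses = pd.Series(['finished', 'failed', 'cancelled'], index=[8, 12, 40], name='status')

# map() looks up each status; fillna() replaces the misses with the fallback.
bar_colors = statuses.map(colors).fillna(fallback_color).tolist()
```

The resulting list can be passed as the `color` argument of `plot.bar`, in the same way as in the cell above.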
%% Cell type:markdown id:f5be0a1d tags:
%% Cell type:markdown id:a2682239 tags:
### SAPs
%% Cell type:code id:b323083e tags:
``` python
# Plot a bar graph
ax_saps = df_saps.plot.bar(title='SAPs', x='sap_name', color=['#ffd95a'])
```
%% Output
%% Cell type:code id:91042123 tags:
%% Cell type:markdown id:9cc1a590 tags:
---
%% Cell type:code id:a5d9b0db tags:
``` python
```