## Zooniverse - Integrating Machine Learning
This directory contains resources for the _Integrating Machine Learning_ tutorial. This tutorial forms part of a series of advanced guides for managing Zooniverse projects through Python. While each guide can be completed independently, they are best worked through in the following order (all are also available as Interactive Analysis workflows in the ESAP GUI):
1. [Advanced Project Building](https://git.astron.nl/astron-sdc/escape-wp5/workflows/zooniverse-advanced-project-building)
2. [Advanced Aggregation with Caesar](https://git.astron.nl/astron-sdc/escape-wp5/workflows/zooniverse-advanced-aggregation-with-caesar)
3. [Integrating Machine Learning (current)](https://git.astron.nl/astron-sdc/escape-wp5/workflows/zooniverse-integrating-machine-learning)
Zooniverse's _Caesar_ advanced retirement and aggregation engine allows for the setup of more advanced rules for retiring subjects. _Caesar_ also provides a powerful way of collecting and analysing volunteer classifications (aggregation). Machine learning models can be used with _Caesar_ to make classification workflows more efficient, such as for implementing advanced subject retirement rules, filtering subjects prior to being shown to volunteers, or setting up an active learning cycle so that volunteer classifications help train the machine learning model.
For guides on creating a Zooniverse project through the web interface or by using Python, take a look at the _Advanced Project Building_ tutorial above and the links therein. For an introduction to _Caesar_, take a look at the _Advanced Aggregation with Caesar_ tutorial above and the links therein. Note that this tutorial does not cover the basics of machine learning, for which there are various guides online and in print.
The advanced tutorial presented here demonstrates how to use Python for the following (a minimal sketch of the first retirement option follows the list):
* Advanced retirement rules using machine learning
* Option 1: pre-classifying with machine learning in preparation for volunteer classifications
* Option 2: "on the fly" retirement decisions made after both machine learning and volunteer classifications
* Using machine learning to filter out uninteresting subjects prior to volunteer classification
* Setting up active learning: volunteer classifications train the machine learning model, which in turn handles the "boring" subjects and leaves the more challenging/interesting subjects for volunteers.
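To give a flavour of the first option, here is a minimal sketch (not the tutorial's own code) that uses `panoptes_client` to retire the subjects a machine learning model is already confident about. The workflow and subject set IDs, the credentials, and the `predict_probability` stub are hypothetical placeholders for your own project and classifier; see the tutorial notebook for the full workflow.
```
# Minimal sketch of Option 1 (not the tutorial's own code): a machine learning
# model pre-classifies subjects, and those it is very confident about are
# retired so that volunteers only see the uncertain ones.
# The IDs, credentials, and predict_probability stub are hypothetical placeholders.
import random

from panoptes_client import Panoptes, SubjectSet, Workflow


def predict_probability(subject):
    """Placeholder for a real ML model: return the predicted probability."""
    return random.random()


Panoptes.connect(username="example-user", password="example-password")

workflow = Workflow.find(12345)       # hypothetical workflow ID
subject_set = SubjectSet.find(67890)  # hypothetical subject set ID

CONFIDENCE_THRESHOLD = 0.95           # only retire very confident predictions

to_retire = [s for s in subject_set.subjects
             if predict_probability(s) > CONFIDENCE_THRESHOLD]

if to_retire:
    # Mark these subjects as retired for the workflow so they are no longer
    # shown to volunteers.
    workflow.retire_subjects(to_retire, reason="other")
```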
You can find the code for the tutorial in the `notebooks` folder and the data it uses in the `data` folder.
As with the _Advanced Project Building_ tutorial, this tutorial makes use of example material (subjects, metadata, classifications) from the [_SuperWASP Variable Stars_](https://www.zooniverse.org/projects/ajnorton/superwasp-variable-stars) Zooniverse project, which involves classifying light curves (how brightness varies over time) of stars.
A recorded walkthrough of this advanced tutorial is available [here](https://youtu.be/o9SzgsZvOCg?t=8218) as part of the [First ESCAPE Citizen Science Workshop](https://indico.in2p3.fr/event/21939/).
The ESAP Archives (accessible via the ESAP GUI) include data retrieval from the Zooniverse Classification Database using the ESAP Shopping Basket. For a tutorial on loading Zooniverse data from a saved shopping basket into a notebook and performing simple aggregation of the classification results, see [here](https://git.astron.nl/astron-sdc/escape-wp5/workflows/muon-hunters-example/-/tree/master) (also available as an Interactive Analysis workflow).
### Setup
#### Option 1: ESAP workflow as a remote notebook instance
You may need to install the `panoptes_client`, `pandas`, and `boto3` packages. `boto3` is the Amazon Web Services (AWS) SDK for Python (see the [documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html) for more detail) and is used in the second half of the tutorial.
```
!python -m pip install panoptes_client
!python -m pip install pandas
!python -m pip install boto3
```
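Once the packages are installed, a quick way to check that `panoptes_client` can reach the Zooniverse (Panoptes) API is to connect with your account credentials. The username and password below are placeholders; this is just a sanity check rather than part of the tutorial itself.
```
# Check that panoptes_client is installed and can talk to the Zooniverse API.
# Replace the placeholder credentials with your own Zooniverse account details.
from panoptes_client import Panoptes

Panoptes.connect(username="example-user", password="example-password")
print("Connected to the Zooniverse (Panoptes) API")
```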
#### Option 2: Local computer
1. Install Python 3: the easiest way to do this is to download the Anaconda build from https://www.anaconda.com/download/. This will pre-install many of the packages needed for the tutorial code.
2. Open a terminal and run: `pip install panoptes_client` and `pip install boto3`.
3. Download the [Integrating Machine Learning](https://git.astron.nl/astron-sdc/escape-wp5/workflows/zooniverse-integrating-machine-learning/) tutorial into a suitable directory.
#### Option 3: Google Colab
Google Colab is a service that runs Python code in the cloud.
1. Sign into Google Drive.
2. Make a copy of the [Integrating Machine Learning](https://git.astron.nl/astron-sdc/escape-wp5/workflows/zooniverse-integrating-machine-learning/) tutorial in your own Google Drive.
3. Right click the `MachineLearning.ipynb` file > Open with > Google Colaboratory.
1. If this is not an option, click "Connect more apps", search for "Google Colaboratory", enable it, and refresh the page.
4. Run the following in the notebook:
1. `!pip install panoptes_client` and `!pip install boto3` to install the required packages,
2. `from google.colab import drive; drive.mount('/content/drive')` to mount Google Drive, and
3. `import os; os.chdir('/content/drive/MyDrive/zooniverse-integrating-machine-learning/')` to change the current working directory to the example folder (adjust if you have renamed the example folder).
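For convenience, the three steps above can be combined into a single notebook cell (the folder path assumes the default name; adjust it if you have renamed or moved the folder):
```
# Install the required packages, mount Google Drive, and change into the
# tutorial folder (adjust the path if you renamed or moved the folder).
!pip install panoptes_client
!pip install boto3

from google.colab import drive
drive.mount('/content/drive')

import os
os.chdir('/content/drive/MyDrive/zooniverse-integrating-machine-learning/')
```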
### Other Useful Resources
Here is a list of additional resources that you may find useful when building your own Zooniverse citizen science project.
* [_Zooniverse_ website](http://zooniverse.org) - Interested in Citizen Science? Create a **free** _Zooniverse_ account, browse other projects for inspiration, contribute yourself as a citizen scientist, and build your own project.
* [Zooniverse project builder help pages](https://help.zooniverse.org) - A great resource with practical guidance, tips and advice for building great Citizen Science projects. See the ["Building a project using the project builder"](https://youtu.be/zJJjz5OEUAw?t=7633) recorded tutorial for more information.
* [_Caesar_ web interface](https://caesar.zooniverse.org) - An online interface for the _Caesar_ advanced retirement and aggregation engine. See the ["Introducing Caesar"](https://youtu.be/zJJjz5OEUAw?t=10830) recorded tutorial for tips and advice on how to use Caesar to supercharge your _Zooniverse_ project.
* [The `panoptes_client` documentation](https://panoptes-python-client.readthedocs.io/en/v1.1/) - A comprehensive reference for the Panoptes Python Client.
* [The `panoptes_aggregation` documentation](https://aggregation-caesar.zooniverse.org/docs) - A comprehensive reference for the Panoptes Aggregation tool.
* [The `aggregation-for-caesar` GitHub](https://github.com/zooniverse/aggregation-for-caesar) - A collection of external reducers for _Caesar_ and offline use.
* [Amazon Web Services (AWS)](https://aws.amazon.com) - A cloud computation provider that can be used to create your own SQS queue like the one detailed in this tutorial. You can register for a free account and **some** services are available free of charge **up to specific usage limits**. Be careful you don't exceed these limits or you may end up with a bill to pay!
* [AWS Simple Queue Service (SQS)](https://aws.amazon.com/sqs/) - Information about the SQS message queueing system that can be used together with _Caesar_ to implement computationally intensive extraction and reduction tasks and apply them to your project's classifications.
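To illustrate how `boto3` interacts with SQS, here is a minimal sketch of polling a queue for messages such as classifications forwarded by _Caesar_. The queue URL and region are placeholders, and AWS credentials are assumed to be configured separately (for example via environment variables or `~/.aws/credentials`); see the tutorial notebook for how SQS fits into the full workflow.
```
# Minimal sketch: poll an SQS queue for messages (e.g. classifications
# forwarded by Caesar). The queue URL and region are placeholders, and AWS
# credentials are assumed to be configured in the environment.
import json

import boto3

sqs = boto3.client("sqs", region_name="us-east-1")  # adjust the region
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/example-queue"  # placeholder

response = sqs.receive_message(
    QueueUrl=queue_url,
    MaxNumberOfMessages=10,  # up to 10 messages per request
    WaitTimeSeconds=10,      # long polling to reduce empty responses
)

for message in response.get("Messages", []):
    body = json.loads(message["Body"])
    print(body)
    # Delete the message once processed so it is not delivered again.
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
```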