D1. Stage the data on a given resource (in a more user-friendly way)
We want to stage data to a given compute resource such that:
- It can handle shopping baskets of 1000+ rows
- It does not require an (extra) login and uses existing credentials
- It does not require manual actions at runtime (such as running Jupyter Notebook commands or working with a plugin)
For this, an async worker could be started using a generic interface taking:
- Metadata for which data to store
- Metadata for which software to use
- Metadata for where to run it (compute infrastructure)
An initial diagram of this workflow can be seen here: https://drive.google.com/file/d/1slRgVOwpkuvBulcou9eRnoGEFIIb66yA/view?usp=sharing
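To make the proposed interface more concrete, here is a minimal Python sketch of what a request and an async worker could look like. Everything in it is an assumption made for illustration: the names `StagingRequest`, `stage_item` and `run_staging`, the field names, and the `max_concurrent` limit are hypothetical, and the actual transfer is replaced by a placeholder. It only shows how the three kinds of metadata could travel together and how 1000+ rows could be processed without manual steps.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class StagingRequest:
    """Hypothetical request object for the generic interface."""
    data_items: Sequence[str]   # which data to handle (e.g. shopping basket rows)
    software: str               # which software to use
    compute_resource: str       # where to run it (compute infrastructure)


async def stage_item(item: str, request: StagingRequest) -> str:
    """Placeholder for one transfer; a real worker would move the data here."""
    await asyncio.sleep(0)  # yield control, standing in for network I/O
    return f"{item} -> {request.compute_resource}"


async def run_staging(request: StagingRequest, max_concurrent: int = 10) -> List[str]:
    """Stage all items, running at most `max_concurrent` transfers at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(item: str) -> str:
        async with semaphore:
            return await stage_item(item, request)

    # gather() returns results in the same order as the input items
    return await asyncio.gather(*(guarded(item) for item in request.data_items))


if __name__ == "__main__":
    request = StagingRequest(
        data_items=[f"row-{i}" for i in range(1200)],
        software="example-software",
        compute_resource="example-cluster",
    )
    results = asyncio.run(run_staging(request))
    print(f"staged {len(results)} items")
```

Bounding concurrency with a semaphore is one possible way to keep a large basket from overloading the target resource. How credentials reach the worker is deliberately left out, since that is still an open topic.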
Topics to discuss:
- Authorization: Just use the token from the user? Get certificates from IAM? Use a "trusted" worker/plugin?
Edited by Klaas Kliffen