
Fix slowness and add profiling

Robbie Luijben requested to merge SDC-870/fix-slowness into main

This MR does two things:

  1. Add profiling with silk:
  • There is now a Django setting, ENABLE_PROFILING, that switches profiling-related runtime configuration on or off (for example, registering the URLs for the profiler dashboard).
  • There is a wrapper around the silk_profiler decorator (which adds detailed profiling to views). The wrapper checks ENABLE_PROFILING to decide whether to profile.
  • There are settings files for local development (dev_profiling) and for dev/prod (docker_sdc_profiling).
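The wrapper described in the bullets above could look roughly like the sketch below. This is not the code from this MR: the name `profile_if_enabled` is hypothetical, the flag is read from an environment variable so the snippet runs without Django (the real project would presumably read `settings.ENABLE_PROFILING`), and `fake_profiler` is a stand-in for the silk decorator.

```python
import functools
import os

# Assumption: stands in for django.conf.settings.ENABLE_PROFILING.
ENABLE_PROFILING = os.environ.get("ENABLE_PROFILING", "0") == "1"


def profile_if_enabled(profiler, enabled=None):
    """Apply the decorator `profiler` only when profiling is switched on.

    `profiler` is any decorator; in the real project it would be the silk
    one. `enabled` overrides the module-level flag, which is handy in tests.
    """
    def decorator(view):
        active = ENABLE_PROFILING if enabled is None else enabled
        # When profiling is off, hand back the view untouched, so the
        # profiler adds no overhead at all.
        return profiler(view) if active else view
    return decorator


profiled_calls = []


def fake_profiler(view):
    """Hypothetical stand-in for the silk decorator: records each call."""
    @functools.wraps(view)
    def wrapper(*args, **kwargs):
        profiled_calls.append(view.__name__)
        return view(*args, **kwargs)
    return wrapper


@profile_if_enabled(fake_profiler, enabled=True)
def detail_view(request):
    return f"detail for {request}"


@profile_if_enabled(fake_profiler, enabled=False)
def plain_view(request):
    return f"plain for {request}"
```

Because the decision is made once at decoration time, a disabled view is the original function object, not a thin pass-through wrapper.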

This combination also prevents warning/error messages regarding profiling. For local development, just run with dev_profiling. On dev/prod, change the settings env var from sdc_docker to sdc_docker_profiling and restart the application (e.g., redeploy). Example:

[image]
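A profiling settings module of the kind mentioned above usually only has to flip the flag and register silk. The helper below is a minimal sketch of that idea, written as a plain function so it runs without Django; the name `with_profiling` is hypothetical, and the actual settings files in this MR may be organised differently.

```python
def with_profiling(installed_apps, middleware, enabled):
    """Return copies of the app and middleware lists, extended for silk
    when `enabled` is true and unchanged otherwise."""
    apps = list(installed_apps)
    mw = list(middleware)
    if enabled:
        apps.append("silk")
        mw.append("silk.middleware.SilkyMiddleware")
    return apps, mw


# In urls.py, the dashboard route would be added under the same flag,
# along the lines of (per the django-silk README):
#   if settings.ENABLE_PROFILING:
#       urlpatterns += [path("silk/", include("silk.urls", namespace="silk"))]
```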

  2. Solve the slowness on the create/detail page by optimizing the call that determines the distinct dataproduct filter types.

The problem lies in retrieving the distinct values of dataproduct fields (e.g., location, activity, dataproduct_type) from a table with 10+ million records. In practice only a handful of distinct values exist across those millions of records, so a caching/memoization solution is in order.

Two major solutions were considered: adding an index on the relevant dataproduct columns, or caching the distinct values directly. The latter was chosen.

Two major cons of adding an index:

  • There are only a few distinct values, but an index on these fields would hold an entry for every record (potentially 10+ million), which is wasteful and costs a lot of disk space.
  • PostgreSQL is not efficient at distinct queries because it lacks a proper index skip scan; working around that requires complicated queries.
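To illustrate the "complicated querying" in the second bullet: the usual workaround for a missing skip scan is a recursive CTE that hops from one distinct value to the next through the index. The sketch below runs against an in-memory SQLite database so that it is self-contained; the table layout, index name, and sample values are assumptions, and a PostgreSQL version would have to be checked against the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dataproduct (id INTEGER PRIMARY KEY, location TEXT)")
conn.execute("CREATE INDEX idx_location ON dataproduct (location)")

# Many rows, few distinct values (hypothetical sample data).
rows = [(loc,) for loc in ("site_a", "site_b", "site_c") for _ in range(1000)]
conn.executemany("INSERT INTO dataproduct (location) VALUES (?)", rows)

# Emulated skip scan: start at the smallest value, then repeatedly ask
# for the smallest value greater than the previous one. Each step is a
# single index lookup instead of a pass over every row.
SKIP_SCAN_SQL = """
WITH RECURSIVE distinct_loc(location) AS (
    SELECT MIN(location) FROM dataproduct
    UNION ALL
    SELECT (SELECT MIN(location) FROM dataproduct
            WHERE location > d.location)
    FROM distinct_loc AS d
    WHERE d.location IS NOT NULL
)
SELECT location FROM distinct_loc WHERE location IS NOT NULL
"""

skip_values = [r[0] for r in conn.execute(SKIP_SCAN_SQL)]
plain_values = [
    r[0] for r in conn.execute("SELECT DISTINCT location FROM dataproduct")
]
```

Both queries return the same set of values; the recursive form is simply much harder to read and to express through an ORM, which is part of the argument against the index route.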

The chosen caching solution is based on Django's cache framework: https://docs.djangoproject.com/en/4.1/topics/cache/

It's used in conjunction with Memcached, which is spun up in a Docker container from the Docker Compose configuration.
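For reference, pointing Django's cache framework at a Memcached container generally comes down to a `CACHES` entry like the one below. This is a sketch, not the configuration from this MR: the choice of the `PyMemcacheCache` backend and the `memcached` host name are assumptions.

```python
CACHES = {
    "default": {
        # Assumption: the pymemcache-based backend; Django also ships
        # a pylibmc-based one.
        "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
        # Assumption: "memcached" is the Docker Compose service name;
        # 11211 is Memcached's default port.
        "LOCATION": "memcached:11211",
    }
}
```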

Edited by Robbie Luijben
