Data Workflow

Goal

To deliver an automated export of JSON files with metrics from Google Analytics to a publicly accessible page with a low amount of steps and dependencies

Requirements

  • Automated exports of data on a daily basis
  • Customisable reports for different needs
  • Read only final output without requiring authentication

Possible workflows to use

graph LR; A(Google Analytics) -- GA addon for Google Sheets --> B(Report) subgraph Google Sheets B -.- C((Publish to Web)) C -.-|Option| D(JSON) C -.-|Option| E(CSV) end D --- F[Public page] E --- F[Public page]

Early on the JSON option was preferred, but due to the complexity of the exported JSON from Google Sheets, and subsequent need to have another step to process that data into something more manageable, it was dropped in favour of CSV.

The Google Analytics addon can schedule reports to update daily automatically. The “Publish to web” will display any changes made to a report within a few minutes.

3rd party option

There are several 3rd parties that could provide simpler/easier workflows but this would probably imply costs, limitations and dependabilities that could be counterproductive as a scalable and long term service.

Limitations

  • A new EMBL account that is not linked to a specific person should be created to host the reports;
  • EBI accounts don’t share the restriction of EMBL accounts in terms of publishing public pages - a requirement for exporting a publicly accessible file;
    HD IT changed this in May 2019. All EMBL accounts are now able to publish publicly.
  • Google Analytics has restrictions in the ammount of views per property (25 I believe). This might have an impact depending on how GA is set up at EMBL. The view is the basis of the creation of reports;
  • The scheduling reporting has hit some GA call limitations. A workaround was to set them to different scheduled hours.