Skip to content

Recommendations for FAIR Data Analysis

The following practices support FAIR (Findable, Accessible, Interoperable, Reusable) data analysis. They apply regardless of workflow format or tooling, and help ensure that your work can be replicated, reproduced, or reused – by yourself or others, in the future or in new contexts.

  • Each script should ideally perform only one distinct task.
  • Avoid mixing unrelated processes in a single script.
  • This improves:
    • Reusability across projects
    • Understandability of the script’s purpose
    • Clarity around its inputs and outputs- A focused script is easier to reuse, understand, and annotate, especially regarding the inputs it consumes and the outputs it creates
  • Avoid hard-coded paths or filenames.
  • Parameterize inputs, outputs, and configuration as arguments.
  • Benefits:
    • Easier reuse across datasets and platforms
    • Smoother integration into larger workflows or pipelines
  • Record the script language (e.g. Python, R, Bash)
  • List all external packages, libraries, and software tools used
  • Specify versions where applicable
  • List runtime environments and hardware requirements (if relevant)
  • Include useful metadata:
  • Ensure that your analysis can be run anywhere, not just on your machine.
  • Use containerization (e.g., Docker, Singularity) to bundle:
    • Dependencies
    • Correct versions
    • Required system tools
  • Use existing community-maintained containers where possible, e.g.
  • This makes your analysis portable and reproducible across different systems.

This guide is adapted and includes recommendations from: