RDM fundamentals

Dominik Brilhaus
Sept 20th, 2023

Legal aspects of RDM

Different laws touched by RDM

Hartmann, Thomas. (2019). Rechtsfragen: Institutioneller Rahmen und Handlungsoptionen für universitäres FDM. Zenodo. https://doi.org/10.5281/zenodo.2654306

Open Access (OA) categories

  • Gold: Published in an open-access journal that is indexed by the DOAJ.
  • Green: Toll-access on the publisher page, but there is a free copy in an OA repository.
  • Hybrid: Free under an open license in a toll-access journal.
  • Bronze: Free to read on the publisher page, but without a clearly identifiable license.
  • Closed: All other articles, including those shared only on an Academic Social Network or in Sci-Hub.

Piwowar H et al. (2018), PeerJ https://doi.org/10.7717/peerj.4375
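
The categories above can be read as a small decision procedure. The following is a minimal sketch of that reading, not the classification code used by Piwowar et al.; the function name and the four boolean inputs are hypothetical and would have to be filled in from publisher and repository metadata.

```python
def classify_oa(*, journal_in_doaj, free_on_publisher_page,
                has_open_license, free_copy_in_repository):
    """Return the OA category for one article, following the list above.

    All four inputs are hypothetical yes/no flags (assumptions for this sketch).
    """
    if journal_in_doaj:
        return "Gold"      # published in an OA journal indexed by the DOAJ
    if free_on_publisher_page and has_open_license:
        return "Hybrid"    # openly licensed article in a toll-access journal
    if free_on_publisher_page:
        return "Bronze"    # free to read, but no clearly identifiable license
    if free_copy_in_repository:
        return "Green"     # toll-access at the publisher, free copy in a repository
    return "Closed"        # everything else


print(classify_oa(journal_in_doaj=False, free_on_publisher_page=True,
                  has_open_license=False, free_copy_in_repository=False))
```

The order of the checks matters: an article is tested for Gold first and only ends up as Green if it is not free on the publisher page.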

Open Science is more than Open Access

Okafor et al. (2022) https://doi.org/10.3389/frma.2022.855198, Figure 1

Creative Commons

Check out: https://creativecommons.org/about/cclicenses/

adapted from https://wiki.creativecommons.org/images/0/01/6licenses-folded.pdf

Data protection

GDPR: General Data Protection Regulation
DS-GVO (German): Datenschutz-Grundverordnung

Use of biological materials

FAIR and CARE

https://www.gida-global.org/care

CARE principles

https://datascience.codata.org/articles/10.5334/dsj-2020-043/

Research Data policies

Hiemenz, Bea & Kuberek, Monika (2018) http://dx.doi.org/10.14279/depositonce-7521

CEPLAS-relevant data handling guidelines & policies

The Data Management Plan (DMP)

  • Covers the full research data lifecycle
  • Frequently updated as your project develops
  • Required to different extents by funding agencies (e.g. DFG, Horizon Europe, BMBF, BMEL, ... )

DMP tools

Check out the ELIXIR RDMkit for more

Public data repositories

Domain-specific data repositories

Repository           | Description                                         | Biological data domain
EBI-ENA              | European Nucleotide Archive                         | genome / transcriptome sequences
EBI-ArrayExpress     | Archive of Functional Genomics Data                 | transcriptome
EBI-MetaboLights     | Database of Metabolomics                            | metabolome
EBI-PRIDE            | PRoteomics IDEntifications Database                 | proteome
EBI-BioImage Archive | Stores and distributes biological images            | imaging, microscopy
e!DAL-PGP            | Plant Genomics & Phenomics Research Data Repository | phenome
NCBI-GEO             | Gene Expression Omnibus                             | transcriptome
NCBI-GenBank         | Genetic Sequence Database                           | genome
NCBI-SRA             | Sequence Read Archive                               | genome / transcriptome sequences
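
To make the table easier to use, the sketch below turns it into a small lookup: given a biological data domain, it lists the matching repositories. The dictionary only restates the rows above; the function name `suggest_repositories` is hypothetical, and splitting "genome / transcriptome sequences" into two keywords is an assumption of this sketch.

```python
# Repository -> data domains, copied from the table above.
REPOSITORIES = {
    "EBI-ENA": {"genome", "transcriptome"},
    "EBI-ArrayExpress": {"transcriptome"},
    "EBI-MetaboLights": {"metabolome"},
    "EBI-PRIDE": {"proteome"},
    "EBI-BioImage Archive": {"imaging", "microscopy"},
    "e!DAL-PGP": {"phenome"},
    "NCBI-GEO": {"transcriptome"},
    "NCBI-GenBank": {"genome"},
    "NCBI-SRA": {"genome", "transcriptome"},
}


def suggest_repositories(domain):
    """Return the repositories from the table that cover the given domain."""
    key = domain.strip().lower()
    return [name for name, domains in REPOSITORIES.items() if key in domains]


print(suggest_repositories("transcriptome"))
print(suggest_repositories("proteome"))
```

An empty result means the table has no domain-specific entry for that data type.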

Choosing a data repository

Domain-specific >> Generic >> Institutional

Find repositories at:

Domain-specific data repositories

Good

  • Assign PIDs / DOIs
  • Long-term accessible
  • Data type specific
  • Apply metadata standards
  • Usually recommended / required by journals
  • Mostly accepted by the community

Intermediate

  • User-friendliness
  • Different metadata schema
  • Complex and versatile submission routines

Generic data repositories

Good

  • Allow publication of any kind of data
  • Assign PIDs / DOIs
  • Long-term accessible
  • Very simple to use

Intermediate

  • Only generic / high-level metadata schema
  • Limited reusability

Persistent Identifiers (PIDs)

Spot the PIDs

https://doi.org/10.1093/plcell/koab243
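
Spotting one kind of PID, the DOI, can also be done programmatically. The sketch below uses a heuristic regular expression (an assumption of this example, not an official DOI grammar) to pull DOIs out of free text and turn them into resolvable https://doi.org/ links. The function names are hypothetical.

```python
import re

# Heuristic pattern: "10.", a 4-9 digit registrant code, "/", then a suffix
# that runs until whitespace, a quote or an angle bracket.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+')


def find_dois(text):
    """Return the DOIs found in text, in order of first appearance."""
    # Strip trailing punctuation that usually belongs to the sentence.
    # This is a heuristic: a DOI that really ends in ")" would be cut short.
    hits = [m.group(0).rstrip(".,;:)") for m in DOI_PATTERN.finditer(text)]
    return list(dict.fromkeys(hits))  # de-duplicate, keep order


def doi_to_url(doi):
    """Build the resolver link for a DOI."""
    return f"https://doi.org/{doi}"


sample = "See https://doi.org/10.1093/plcell/koab243 (and doi:10.7717/peerj.4375)."
for doi in find_dois(sample):
    print(doi, "->", doi_to_url(doi))
```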

Globally unique, stable, persistent identifiers (PIDs)

  • Long-term findability
  • Make data, digital objects, people, … uniquely identifiable
  • Diminish “dead links”
  • Cope with name changes

Properties of a PID

Ideally, PIDs are

  • Stable and permanent
  • Location-independent
  • Globally unique and valid
  • Addressable (citable)
  • Clickable (resolvable)

Adapted from https://www.ebi.ac.uk/rdf/documentation/good_practice_uri/

Additional resources

Data stores

Backup vs. Archive


             | Backup              | Archive
Storage type | Short-, mid-term    | Long-term
Purpose      | Disaster recovery   | Long-term storage, compliance
Reason       | Duplication         | Migration
Usage        | Work in progress    | Cold, unused data
Changes      | Short-term updates  | No updates
Trend        | Cyclic, replacement | Growing
Latency      | Short / costly      | High / cheaper

3-2-1 backup rule
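
The 3-2-1 rule is commonly stated as: keep at least three copies of your data, on at least two different storage media, with at least one copy off-site. The sketch below checks a backup setup against that wording. The `Copy` class, its fields and the example locations are hypothetical and only serve as illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    location: str   # free-text label, e.g. "lab workstation"
    medium: str     # e.g. "internal disk", "NAS", "tape"
    offsite: bool   # True if stored away from the primary site


def satisfies_321(copies):
    """Check: at least 3 copies, at least 2 media, at least 1 off-site."""
    media = {c.medium for c in copies}
    return (len(copies) >= 3
            and len(media) >= 2
            and any(c.offsite for c in copies))


setup = [
    Copy("lab workstation", "internal disk", offsite=False),
    Copy("group file server", "NAS", offsite=False),
    Copy("computing centre", "tape", offsite=True),
]
print(satisfies_321(setup))
```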

Version control and track changes

It’s good practice to document:

  • What was changed?
  • Who is responsible?
  • When did it happen?
  • Why the changes?
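
One way to make these four questions routine is to record each change as a small structured entry. This is a minimal sketch with a hypothetical `ChangeRecord` class; the example values are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ChangeRecord:
    what: str   # What was changed?
    who: str    # Who is responsible?
    when: date  # When did it happen?
    why: str    # Why the changes?

    def as_line(self):
        """Format the record as one line for a plain-text change log."""
        return f"{self.when.isoformat()} | {self.who} | {self.what} | {self.why}"


record = ChangeRecord(
    what="Renamed sample columns",
    who="J. Doe",                      # hypothetical person
    when=date(2023, 9, 20),
    why="Match the metadata template",
)
print(record.as_line())
```

A version control system such as Git stores the same four pieces of information with every commit (changed content, author, timestamp, commit message).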

Types of Version Control

  • by file name (_v1, _v2)
  • cloud services
    • Dropbox, iCloud, Google Drive
  • distributed version control system
    • e.g. Git
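
Versioning by file name is the simplest of these options and is easy to get wrong by hand. The sketch below shows how the "_v1, _v2" convention could be automated; the helper name `next_version_name` is hypothetical, and the sketch assumes file names with a single extension.

```python
import re
from pathlib import Path


def next_version_name(filename):
    """Return the file name with its _vN suffix incremented (or _v1 added)."""
    path = Path(filename)
    match = re.fullmatch(r"(.*)_v(\d+)", path.stem)
    if match:
        base, number = match.group(1), int(match.group(2)) + 1
    else:
        base, number = path.stem, 1   # no version suffix yet
    return f"{base}_v{number}{path.suffix}"


print(next_version_name("results_v1.csv"))
print(next_version_name("results.csv"))
```

Unlike this naming scheme, a distributed version control system such as Git keeps the file name stable and stores the history separately.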

Data Sharing

Cloud Services

✓ Documents
✓ Small data
✓ Presentations

X Code
X Data analytical projects
X Big (“raw”) data

Overview of Institutional services at UoC and HHU

UoC

HHU

Contributors

Slides presented here include contributions by