Check-in and ARC Commander Hands-on

Dominik Brilhaus – CEPLAS Data Science

Registration

Has everyone signed up at the DataHUB?

Check your installation

Open a terminal and run the following commands one after the other:

git --version
git-lfs --version
arc --version

💡 If any of these commands shows a warning or an error, let us know.
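The three tools can also be checked in one go. A minimal sketch, assuming a POSIX-style shell (macOS Terminal, Linux, or Git Bash on Windows); check_tool is a helper name made up for this example:

```shell
# Report whether a command is available on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: MISSING"
  fi
}

# The three tools this workshop relies on.
for tool in git git-lfs arc; do
  check_tool "$tool"
done
```

A tool reported as MISSING is either not installed or not on your PATH.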

Config

git config --global --get-regexp user

💡 If this does not display your user name and email, you need to configure git.
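If nothing is displayed, one way to set your identity is sketched below. The name and email are placeholders, so replace them with your own:

```shell
# Placeholder identity: replace with your own name and email.
git config --global user.name "Jane Doe"
git config --global user.email "jane.doe@example.org"

# Verify: this should now list both entries.
git config --global --get-regexp user
```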

Have a simple text editor ready

  • Windows: Notepad
  • macOS: TextEdit

Recommended text editor with code highlighting, Git support, an integrated terminal, and more: Visual Studio Code

Create a fresh folder for your ARCs

For this workshop, create a new folder somewhere on your machine where you want to store ARCs, e.g. in your documents folder:

  • C:\Users\<username>\Documents\workshop-arcs (Windows)
  • ~/Documents/workshop-arcs (macOS)

⚠️ Ideally, this folder is not "watched" by any cloud service (Sciebo, Google Drive, iCloud, etc.).

Hands-on with demo data

First steps towards your ARC using the ARC Commander

Download the demo data

git clone "https://demo-user:1_eznikmzxzARAbUxxnF@git.nfdi4plants.org/teaching/demo-arc_level0.git"

You just received your data
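To get an overview of what you received, you can list the downloaded files. A small sketch, assuming you ran git clone in the current folder (git then creates a folder named demo-arc_level0); list_files is a helper name made up for this example:

```shell
# List every file in a folder, skipping git's internal .git directory.
list_files() {
  find "$1" -type f ! -path '*/.git/*' | sort
}

# Only list the demo data if the clone is present in the current folder.
if [ -d demo-arc_level0 ]; then
  list_files demo-arc_level0
fi
```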

Goal

  • Structure,
  • Annotate, and
  • Share your experimental data.

💡 We'll talk about data annotation later

Structure your data

Your fresh ARC folder

  1. Create a new folder, which you want to initialize as an ARC.
  2. Open a terminal inside that folder, or navigate to it on the command line.

For example:

mkdir -p ~/Documents/workshop-arcs/arc-demo
cd ~/Documents/workshop-arcs/arc-demo

Initialize the ARC folder structure

arc init
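To see what arc init created, you can check for the top-level folders. This is a sketch under the assumption that your ARC follows the layout described in the ARC specification (studies, assays, workflows, runs); check_arc_layout is a hypothetical helper, not part of the ARC Commander:

```shell
# Report which of the expected top-level ARC folders exist in a directory.
# The folder names are an assumption based on the ARC specification.
check_arc_layout() {
  for dir in studies assays workflows runs; do
    if [ -d "$1/$dir" ]; then
      echo "$dir: ok"
    else
      echo "$dir: missing"
    fi
  done
}

# Run from inside your ARC folder.
check_arc_layout .
```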

Create an investigation

arc investigation create -i TalinumPhotosynthesis --title TalinumPhotosynthesis --description "This is a very interesting investigation about life and photosynthesis"

Add (at least one) person

arc investigation person register --lastname Brilhaus --firstname Dominik --email brilhaus@hhu.de --affiliation CEPLAS

💡 For each person added, the minimum information is
lastname | firstname | email | affiliation

Add a study

arc study add -s talinum_drought

Add assays

arc assay add -s talinum_drought -a rnaseq
arc assay add -s talinum_drought -a metabolomics

Collaborate and share

Upload your local ARC to the DataHUB

arc sync -r https://git.nfdi4plants.org/<username>/arc-demo

Received two emails from "GitLab" about a failed pipeline?

🔥 Don't worry 😄

Pipeline Failed

  • A "continuous quality control" (CQC) pipeline validates your ARC.

  • The pipeline fails if one of the following metadata items is missing:

    Investigation Identifier
    Investigation Title
    Investigation Description
    Investigation Person Last Name
    Investigation Person First Name
    Investigation Person Email
    Investigation Person Affiliation
    

Sort the demo data into the ARC

Identify "raw dataset(s)" and "protocols" and move them to the proper subfolders in the ARC.
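Moving a file into place could look like this. It is a sketch only: place_file is a hypothetical helper, the example file name is made up, and the dataset and protocols subfolder names are assumed from the ARC assay layout, so check them against what arc assay add created in your ARC:

```shell
# place_file SRC ARC_ROOT ASSAY KIND
# Moves SRC into assays/ASSAY/KIND inside the ARC.
# KIND must be "dataset" (raw data) or "protocols".
place_file() {
  case "$4" in
    dataset|protocols) ;;
    *) echo "unknown kind: $4 (use dataset or protocols)" >&2; return 1 ;;
  esac
  dest="$2/assays/$3/$4"
  mkdir -p "$dest"   # does nothing if the folder already exists
  mv "$1" "$dest/"
}

# Example with a hypothetical file name, run from inside arc-demo:
#   place_file ../demo-arc_level0/reads_sample1.fastq.gz . rnaseq dataset
```

Rejecting any other KIND keeps files from landing in a misspelled subfolder.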

Sync your ARC to the DataHUB

To save your changes, sync the ARC to the DataHUB and include a commit message.

arc sync -m "sorted the demo data"
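Since an ARC is a git repository, plain git can confirm what was synced. A minimal sketch; show_arc_state is a hypothetical helper, not an ARC Commander command:

```shell
# Show uncommitted changes and the three most recent commits of a repository.
show_arc_state() {
  if ! git -C "$1" rev-parse --git-dir >/dev/null 2>&1; then
    echo "$1 is not a git repository"
    return 0
  fi
  git -C "$1" status --short
  git -C "$1" log --oneline -n 3
}

# Run from inside your ARC folder after syncing.
show_arc_state .
```

If status prints nothing and your message appears at the top of the log, all local changes were committed.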

Check the ARC in the DataHUB

  • Navigate to https://git.nfdi4plants.org/<username>/arc-demo to visit your ARC in the DataHUB

Your ARC is ready

Contributors

Slides presented here include contributions by