Step by Step

Here we wrap a minimal example script into a CWL document. We use a Python script (heatmap.py) that creates a heatmap file (heatmap.svg) from an input table containing sugar abundance assay data (sugar_result.csv). Wrapping the script in CWL makes it reusable in another ARC to generate the same type of heatmap from a different input table. For more details, check out the introduction to CWL.

This guide builds on the ARC created in the Start here guide. If you have not yet followed the guide, you can download the ARC (e.g. via ARCitect) from the DataHUB.

This is what the ARC looks like:
  • assays/
    • Proteomics_DataAnalysis/
      • README.md
      • dataset/
        • MSFraggerOutput/
          • combined_protein.csv
        • combined_protein.fasta
      • isa.assay.xlsx
      • protocols/
        • AssayTemplate_Proteomics_DataAnalysis.json
    • Proteomics_MS/
      • README.md
      • dataset/
        • MS_Raw/
          • WT_Cold_1_Measured.d
          • WT_Cold_2_Measured.d
          • WT_Cold_3_Measured.d
          • WT_RT_1_Measured.d
          • WT_RT_2_Measured.d
          • WT_RT_3_Measured.d
        • isa.assay.xlsx
      • protocols/
        • AssayTemplate_Proteomics_MS.json
    • SugarMeasurement/
      • README.md
      • dataset/
        • sugar_result.csv
      • isa.assay.xlsx
      • protocols/
        • sugar_extraction_protocol.md
    • Visualization/
      • README.md
      • dataset/
        • heatmap.svg
      • isa.assay.xlsx
      • protocols/
        • heatmap.py
  • isa.investigation.xlsx
  • runs/
  • studies/
    • AthalianaColdStress/
      • README.md
      • isa.study.xlsx
      • protocols/
        • growth_protocol.md
      • resources/
  • workflows/

Isolate run parameters and workflow

We add three files to the ARC: heatmap.py, workflow.cwl, and job.yml. You can download the files here. The script heatmap.py reads as follows:

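import sys

import pandas as pd
import plotly.express as px

# Read command line arguments: input CSV path and output file name (without extension)
MeasurementTableCSV = sys.argv[1]
FigureFileName = sys.argv[2]

# Read the CSV file; the first column becomes the row index, malformed lines are skipped
data = pd.read_csv(MeasurementTableCSV, index_col=0, on_bad_lines='skip')

# Create a heatmap with one cell per table value
fig = px.imshow(
    data,
    labels=dict(x="Columns", y="Rows", color="Value"),
    x=data.columns,
    y=data.index,
)

# Save heatmap to file (static image export in plotly requires the kaleido package)
fig.write_image(FigureFileName + ".svg")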

Briefly summarized,

  • heatmap.py is the example data analysis script, which creates a heatmap from a CSV table input
  • workflow.cwl is a CWL document that wraps heatmap.py
    • It requires two inputs
      1. MeasurementTableCSV: the file name of the CSV table
      2. FigureFileName: how the user wants to name the output file
    • And it generates one output: an .svg file named according to FigureFileName
  • the job.yml provides the required input parameters for workflow.cwl
    • the relative path to the CSV table input: sugar_result.csv
    • the desired file name: heatmap
flowchart LR
subgraph "workflow.cwl"
py@{ shape: doc, label: "heatmap.py" }
end
sugar_result.csv -.- job.yml --o workflow.cwl--> heatmap.svg
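
The data flow summarized above can be sketched in a few lines of Python. This is only an illustration of how the two job parameters end up as command line arguments of heatmap.py, not the content of workflow.cwl itself. The `job` dictionary is a hypothetical in-memory stand-in for job.yml, and the helper names `build_command` and `expected_output` are made up for this sketch.

```python
from __future__ import annotations

from pathlib import Path

# Hypothetical stand-in for job.yml. The key names follow the two inputs
# listed above; the real file may be structured differently.
job = {
    "MeasurementTableCSV": "sugar_result.csv",
    "FigureFileName": "heatmap",
}


def build_command(job: dict, script: str = "heatmap.py") -> list[str]:
    """Return the argument list that hands the job parameters to the script.

    heatmap.py reads sys.argv[1] as the CSV path and sys.argv[2] as the
    output name, so the order of the two arguments matters.
    """
    return ["python", script, job["MeasurementTableCSV"], job["FigureFileName"]]


def expected_output(job: dict) -> Path:
    """heatmap.py appends '.svg' to FigureFileName when saving the figure."""
    return Path(job["FigureFileName"] + ".svg")


command = build_command(job)
output = expected_output(job)
print(" ".join(command))
print(output)
```

Changing only the values in the job parameters is enough to produce a differently named heatmap from a different table, which is the point of separating run parameters from the workflow.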

Using the workflow in your ARC

  1. Open the example ARC
  2. Add a folder “heatmap” to workflows
    • Import workflow.cwl into workflows/heatmap
    • Import heatmap.py into workflows/heatmap
  3. Add a folder “heatmap-run” to runs
    • Import job.yml into runs/heatmap-run

The ARC should now look like this:

  • assays/
    • SugarMeasurement/
      • dataset/
        • sugar_result.csv
  • isa.investigation.xlsx
  • runs/
    • heatmap-run/
      • job.yml
  • studies/
  • workflows/
    • heatmap/
      • heatmap.py
      • workflow.cwl
  4. In the ARC, navigate to the heatmap-run folder:

    cd runs/heatmap-run
  5. Use cwltool to run the workflow, passing the CWL document and the job file:

    cwltool ../../workflows/heatmap/workflow.cwl job.yml
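
The steps above can also be scripted. The following is a minimal sketch, assuming cwltool is installed and on the PATH and that the ARC has the layout shown above. The names `REQUIRED_FILES`, `missing_files`, and `run_heatmap_workflow` are hypothetical helpers introduced for this example. It first checks that the three imported files are in place and then calls cwltool from the run folder, mirroring the two terminal commands.

```python
from __future__ import annotations

import shutil
import subprocess
from pathlib import Path

# Files the steps above place into the ARC, relative to the ARC root.
REQUIRED_FILES = [
    "workflows/heatmap/workflow.cwl",
    "workflows/heatmap/heatmap.py",
    "runs/heatmap-run/job.yml",
]


def missing_files(arc_root: Path) -> list[str]:
    """Return the required files that do not exist below arc_root."""
    return [rel for rel in REQUIRED_FILES if not (arc_root / rel).is_file()]


def run_heatmap_workflow(arc_root: Path) -> int:
    """Run cwltool from runs/heatmap-run and return its exit code."""
    missing = missing_files(arc_root)
    if missing:
        raise FileNotFoundError("Missing in ARC: " + ", ".join(missing))
    if shutil.which("cwltool") is None:
        raise RuntimeError("cwltool was not found on PATH")
    # Equivalent to: cd runs/heatmap-run, then call cwltool with the
    # CWL document and the job file.
    result = subprocess.run(
        ["cwltool", "../../workflows/heatmap/workflow.cwl", "job.yml"],
        cwd=arc_root / "runs" / "heatmap-run",
    )
    return result.returncode
```

Calling `run_heatmap_workflow(Path("."))` from the ARC root would then start the run. Checking the layout first gives a clearer error message than a failed cwltool call when a file was imported into the wrong folder.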