CWL Fragpipe
FragPipe
FragPipe is a software suite for the analysis of mass spectrometry-based proteomics data. It integrates multiple state-of-the-art software tools and algorithms into a unified, user-friendly graphical interface, allowing researchers to process raw MS data efficiently.
Setting up the FragPipe ARC-CWL workflow
The FragPipe ARC-CWL workflow requires several files and directories to be set up correctly. The following sections guide you through the setup process.
You need an ARC to run this workflow, since necessary information about the experiment is retrieved from the assay and study files by the workflow. How to set up an ARC is described in this Getting Started Guide.
Special requirements for the ARC:
- Assay and study files containing columns describing the data:
  - Experiment (e.g., hot/cold)
  - Replicate (e.g., 1, 2, 3, …)
  - Acquisition mode (DDA or DIA)
- One assay sheet named “MassSpectrometry”, which is linked to previous assays and studies containing the aforementioned columns through inputs and outputs (the standard assay/study setup, described in the Getting Started Guide):
  - The output of this sheet must be of type Data and contain the names of your mass spectrometry files or folders (just the name, not the path, e.g., MyRun.d).
- One folder in the workflows directory named “Fragpipe”.
Dockerfile
The current official Docker image does not include curl. To work around this, use your own Dockerfile that is based on the official image and adds the curl installation command.
- Download the Dockerfile here by clicking on the download button in the upper right corner.
- Place the Dockerfile in the Fragpipe folder within the workflows directory.
ARC Parameter Collection Script
This workflow uses a script to collect the required parameters for the FragPipe ARC-CWL workflow.
- Download the script here.
- Place the script in the scripts folder within the Fragpipe directory in the workflows directory.
This script will create the required manifest file and adapt the workflow file for the run.
CommandLineTool and Workflow Descriptions
- Download the FragPipe CommandLineTool.
- Download the ManifestAndWorkflow CommandLineTool.
- Download the Workflow.
- Place all three files in the Fragpipe directory within the workflows directory.
The workflow executes both CommandLineTool descriptions in the correct order.
FragPipe Tools
The FragPipe tools must be downloaded manually due to licensing restrictions. The easiest way is to download all required tools through the FragPipe GUI and mount them into the Docker container. For this, the tools must be placed in the tools directory within the workflow directory.
Downloading the Tools with the FragPipe GUI
- Download the FragPipe GUI from the FragPipe releases and unzip it.
- Navigate to the bin folder and run fragpipe.exe.
- In the Config tab, click on the Download / Update button and follow the steps.
- Copy the tools directory into your workflow directory (the Fragpipe folder within the workflows directory).
Directory structure
After obtaining all required files, your workflows directory structure should look like this:
- workflows/
  - Fragpipe/
    - Dockerfile
    - Fragpipe.cwl
    - ManifestAndWorkflow.cwl
    - Workflow.cwl
    - tools/
    - scripts/
      - manifestAndWorkflow.fsx
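As a quick sanity check, the layout above can be verified with a few lines of Python. This is a minimal sketch and not part of the workflow itself: the function name `missing_entries` is hypothetical, and the list of expected entries is copied from the listing above, with the scripts folder assumed to sit inside the Fragpipe folder as described in the script section.

```python
from pathlib import Path

# Expected entries relative to the ARC's workflows directory, copied from the
# directory listing above (assumption: adjust if your file names differ).
REQUIRED = [
    "Fragpipe/Dockerfile",
    "Fragpipe/Fragpipe.cwl",
    "Fragpipe/ManifestAndWorkflow.cwl",
    "Fragpipe/Workflow.cwl",
    "Fragpipe/tools",
    "Fragpipe/scripts/manifestAndWorkflow.fsx",
]


def missing_entries(workflows_dir):
    """Return the expected entries that do not exist under workflows_dir."""
    root = Path(workflows_dir)
    return [rel for rel in REQUIRED if not (root / rel).exists()]
```

Calling `missing_entries("path/to/your/arc/workflows")` returns an empty list when every file and folder is in place, and otherwise the relative paths that still need to be added.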
Setting up the run
The run requires a run.cwl and a run.yml file. The run.cwl file content could look like this:

    cwlVersion: v1.2
    class: Workflow

    requirements:
      - class: MultipleInputFeatureRequirement
      - class: SubworkflowFeatureRequirement
    # you can specify resources here:
    #  - class: ResourceRequirement
    #    coresMin: 20
    #    ramMin: 40960
    inputs:
      arcDirectory: Directory
      runName: string
      assayName: string
      experimentColumn: string
      replicateColumn: string
      acquisitionColumn: string
      fastAPath: string
      ddaOnly: string
      headless: boolean
      workdir: string
      toolsFolder: string
      threads: int

    steps:
      FragpipeAll:
        # the file name must match the workflow file in workflows/Fragpipe exactly
        # (file names are case-sensitive on Linux)
        run: ../../workflows/Fragpipe/workflow.cwl
        in:
          arcDirectory: arcDirectory
          runName: runName
          assayName: assayName
          experimentColumn: experimentColumn
          replicateColumn: replicateColumn
          acquisitionColumn: acquisitionColumn
          fastAPath: fastAPath
          ddaOnly: ddaOnly
          headless: headless
          workdir: workdir
          toolsFolder: toolsFolder
          threads: threads
        out: [result]

    outputs:
      FragpipeAllResult:
        type: Directory
        outputSource: FragpipeAll/result

    # The following is metadata for the workflow
    arc:author:
      - class: arc:Person
        arc:first name: "Caroline"
        arc:last name: "Ott"
        arc:email: "caroline.ott@rptu.de"
        arc:affiliation: "RPTU Kaiserslautern/Landau"

The run.yml file contains all the necessary information for your run.
An example run.yml file could look like this:

    arcDirectory:
      class: Directory
      path: ../../
    runName: MyRun
    assayName: MyAssay
    experimentColumn: "Condition, Parameter"
    replicateColumn: "Replicate, Characteristic"
    acquisitionColumn: "scan mode, Parameter"
    fastAPath: ./arc/studies/MyOrganism/resources/ProteinsFastaWithDecoys.fasta
    ddaOnly: "TRUE"
    headless: true
    workdir: ./arc/runs/MyRun
    toolsFolder: ./arc/workflows/Fragpipe/tools
    threads: 20

    # The following is metadata for the workflow
    arc:performer:
      - class: arc:Person
        arc:first name: "Your"
        arc:last name: "Name"
        arc:email: "your@email"
        arc:affiliation: "Your institution"

- Copy the content of the run.cwl and run.yml blocks into a text editor (e.g., Notepad).
- Save them as run.cwl and run.yml in the MyRun (example name) folder within the runs directory.
- Update the paths and names in the run.yml file to match your ARC:
  - arcDirectory doesn’t change.
  - Paths always start with ./arc, since the ARC is mounted into the Docker container with that name.
  - The runName is the name you want to give your run.
    - The workdir name is usually the same as the runName.
  - The assayName is the name of the assay you want to analyze.
- Update the experimentColumn, replicateColumn, and acquisitionColumn values to match your assay or study files:
  - The first part is the column name, the second part is the column type (Characteristic, Parameter or Factor).
  - These three columns must exist in your assay or study files.
  - These are the columns listed under the special requirements for the ARC.
- ddaOnly can be set to “TRUE” or “FALSE”:
  - If you are analyzing DIA files, set it to “FALSE”.
- The number of threads can be set to the number of files you want to analyze in parallel:
  - You also need to specify the number of threads in the run.cwl file under ResourceRequirement.
    - Remove the # in front of the lines and set the number of cores and RAM you want to use.
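The rules above can be turned into a small pre-flight check for the values in run.yml. The following is a minimal sketch under stated assumptions: the function name `check_run_settings` is hypothetical, the settings are passed in as a plain dictionary (loading the YAML file is left out), and the checks cover only the rules stated in this section.

```python
# Column types named in this guide.
COLUMN_TYPES = {"Characteristic", "Parameter", "Factor"}
COLUMN_KEYS = ("experimentColumn", "replicateColumn", "acquisitionColumn")
# arcDirectory is deliberately excluded: it does not change and is not an ./arc path.
PATH_KEYS = ("fastAPath", "workdir", "toolsFolder")


def check_run_settings(settings):
    """Return a list of human-readable problems found in the run.yml values."""
    problems = []
    for key in COLUMN_KEYS:
        # Split "column name, column type" at the last comma.
        name, sep, col_type = str(settings.get(key, "")).rpartition(",")
        if not sep or not name.strip() or col_type.strip() not in COLUMN_TYPES:
            problems.append(f"{key}: expected 'column name, column type'")
    for key in PATH_KEYS:
        if not str(settings.get(key, "")).startswith("./arc"):
            problems.append(f"{key}: path should start with ./arc")
    if settings.get("ddaOnly") not in ("TRUE", "FALSE"):
        problems.append('ddaOnly: expected the string "TRUE" or "FALSE"')
    threads = settings.get("threads")
    if isinstance(threads, bool) or not isinstance(threads, int) or threads < 1:
        problems.append("threads: expected a positive integer")
    return problems


# Values taken from the example run.yml in this guide.
example = {
    "runName": "MyRun",
    "assayName": "MyAssay",
    "experimentColumn": "Condition, Parameter",
    "replicateColumn": "Replicate, Characteristic",
    "acquisitionColumn": "scan mode, Parameter",
    "fastAPath": "./arc/studies/MyOrganism/resources/ProteinsFastaWithDecoys.fasta",
    "ddaOnly": "TRUE",
    "headless": True,
    "workdir": "./arc/runs/MyRun",
    "toolsFolder": "./arc/workflows/Fragpipe/tools",
    "threads": 20,
}
print(check_run_settings(example))  # prints []
```

The check does not look into the assay or study files, so it cannot confirm that the named columns actually exist there.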
FragPipe Workflow
FragPipe requires a fragpipe.workflow file specifying the workflow settings. You can use a predesigned workflow from the FragPipe repository or create your own (this is easier with the GUI).
Using the GUI
- Open the FragPipe GUI and navigate to the Workflow tab.
- Select a workflow and press the Load workflow button.
- Follow the instructions on the FragPipe website for your use case, or keep the preset workflow you selected.
- Click on the Save to custom folder button in the Workflow tab and save the workflow under the name fragpipe.workflow in the MyRun (example name) folder within the runs directory.
- If you have a protein FASTA without decoys, you can add them in the Database tab:
  - Click on the Browse button and select your FASTA file.
  - Click on the Add decoys button to add decoys.
  - The FASTA file with decoys will be saved in the same folder as the original FASTA file.
Directory structure
After you have obtained those files, the folder structure should look like this:
- runs/
  - MyRun/
    - run.cwl
    - run.yml
    - fragpipe.workflow
- workflows/
  - Fragpipe/
    - Dockerfile
    - Fragpipe.cwl
    - ManifestAndWorkflow.cwl
    - Workflow.cwl
    - tools/
    - scripts/
      - manifestAndWorkflow.fsx
Running the Workflow
A CWL runner is required to run the workflow.
- Install the CWL reference runner as described here: CWL Runner Installation.
- Open a terminal (command line).
- Activate the environment in which you installed the CWL runner:
- Instructions for this can be found here.
- Navigate to the runs folder:
  - Use the cd command to navigate there:

        cd path/to/your/runs

- Run the workflow with the following command:

      cwltool ./MyRun/run.cwl ./MyRun/run.yml
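If you prefer to start the run from a script, the command above can be assembled after a quick check that the run folder is complete. This is a minimal sketch: the function name `cwltool_command` is hypothetical, and the list of required files is taken from the directory structure shown earlier.

```python
from pathlib import Path

# Files the run folder is expected to contain, per the directory structure above.
RUN_FILES = ("run.cwl", "run.yml", "fragpipe.workflow")


def cwltool_command(runs_dir, run_name):
    """Return the cwltool argument list for a run; raise if files are missing."""
    run_dir = Path(runs_dir) / run_name
    missing = [name for name in RUN_FILES if not (run_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"{run_dir} is missing: {', '.join(missing)}")
    # Same command as above, with paths relative to the runs folder.
    return ["cwltool", f"./{run_name}/run.cwl", f"./{run_name}/run.yml"]
```

The returned list can be passed to `subprocess.run(..., cwd=runs_dir, check=True)` from the environment in which the CWL runner is installed, so that the relative paths resolve against the runs folder.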