Dev Containers with CWL

Here we introduce Dev Containers and how to use them for data analysis with Common Workflow Language (CWL).

A container is a standardised software unit that bundles program code and all necessary dependencies so that applications can run quickly and reliably in different IT environments. You can think of it as a box that includes everything needed for your project.

A Docker container image is a lightweight, self-contained, executable software package that contains everything necessary to run an application: the code, the runtime environment, system tools, libraries, and configurations. Using Docker containers can, for example, improve reproducibility, which is a significant advantage over non-isolated projects.

A Dev Container builds on a container technology such as Docker. Docker itself is not strictly required, but you do need a container engine that is compatible with the Dev Container standard, such as Docker or Podman.

How to benefit from using containers

Dev or Docker containers create a clean, uniform development environment, regardless of the operating system. They prevent classic “it works for me” problems, reliably isolate projects, and enable the parallel use of different software versions. Thanks to their portability, they run the same everywhere, locally, on the server, or in the cloud. They also support the FAIR principles: developments become findable, accessible, interoperable, and reusable. In short, containers save time, minimize sources of error, and promote a transparent, sustainable workflow.

  1. Install VS Code.
  2. Install the Dev Containers extension.
  3. Install any other extensions your project needs (for example, the .NET Install Tool or Ionide for F#).
  4. Install Docker Desktop.
  1. Create a new folder with your project name and open it in VS Code.
  2. In the VS Code menu bar, click View > Command Palette and run the following command: Dev Containers: Add Dev Container Configuration Files...

There are two options for setting up your environment:

Option A - specified Dockerfile

  1. Start with “Add configuration to workspace”
  2. Choose .NET (C#), Node.js (TypeScript) & MS SQL
  3. Select 8.0-bookworm (default)
  4. Select -lts (default)
  5. Then press OK, or select additional features if desired

Option B - creating your own Dockerfile

  1. Start with “Add configuration to workspace”
  2. Choose F# (.NET) and press OK
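
With Option B, the Dev Container configuration has to point at your own Dockerfile. As a minimal sketch, the commands below write a `.devcontainer/devcontainer.json` that does this. The container name, the Dockerfile location (project root, one level above `.devcontainer`), and the Ionide extension ID are assumptions for illustration; adjust them to your project.

```shell
# Sketch: write a minimal devcontainer.json that builds from ./Dockerfile.
# Name, paths, and the extension list are assumptions, not fixed by the tutorial.
mkdir -p .devcontainer
cat > .devcontainer/devcontainer.json <<'EOF'
{
  "name": "devcontainerTutorial",
  "build": {
    "dockerfile": "../Dockerfile",
    "context": ".."
  },
  "customizations": {
    "vscode": {
      "extensions": ["Ionide.Ionide-fsharp"]
    }
  }
}
EOF
```

Both `dockerfile` and `context` are resolved relative to the location of devcontainer.json, which is why they point one level up here.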

Example

This example uses Option B, which requires creating a Dockerfile. You can create it by running the command touch Dockerfile in the VS Code terminal. A simple Dockerfile looks like this:

Dockerfile
FROM mcr.microsoft.com/dotnet/sdk:8.0
ENV DOTNET.DockerScoutOptOut=1
COPY *.devcontainer/settings.vscode.json /root/.vscode-remote/data/Machine/settings.json
RUN apt-get update && apt-get -y install git procps
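
Before wiring the Dockerfile into CWL, it can help to build the image once by hand so that Dockerfile errors show up early. The following is a sketch under a few assumptions: Docker is installed, the command runs in the folder that contains the Dockerfile, and the files referenced by COPY exist. The tag is chosen to match the dockerImageId used by the CWL tools later in this tutorial.

```shell
# Sketch: optional test build of the image described by ./Dockerfile.
# The tag mirrors the dockerImageId used in the CWL files.
IMAGE="devcontainertutorial"
if command -v docker >/dev/null 2>&1 && [ -f Dockerfile ]; then
  docker build -t "$IMAGE" .
else
  echo "docker or Dockerfile not found; skipping the test build"
fi
```

If the build succeeds, `docker images` should list the new tag.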

Now we have everything we need to start our project.

This tutorial starts with two scripts, each of which creates a column chart and a target folder to store the results. To better understand the file structure and the CWL workflow, the example with all files is available in an ARC. You can find it here

  1. Create your scripts

    exampleChart1.fsx
    // First example: create a simple column chart and save it with
    // Chart.saveHtml, which lets you define where the chart is stored.
    #r "nuget: Plotly.NET, 5.0.0"
    #r "nuget: Plotly.NET.Interactive, 5.0.0"
    #r "nuget: Plotly.NET.ImageExport, 6.1.0"
    open Plotly.NET
    open Plotly.NET.Interactive
    open System
    open System.IO
    // Example: simple column chart for the Dev Container tutorial
    let x = 25
    let y = 85
    // A single (key, value) pair, i.e. one column
    let combine = [| x, y |]
    let buildChart =
        Chart.Column (
            keysValues = combine,
            Name = "Example Chart for Devcontainer Tutorial"
        )
    buildChart |> Chart.saveHtml (String.concat "" [| "./devcontainerTutorial/runs/execution"; "/Results/exampleChart1" |])
  2. The second example script is identical apart from its values and the output file name.

    exampleChart2.fsx
    // Second example chart: same structure as the first, with different values
    #r "nuget: Plotly.NET, 5.0.0"
    #r "nuget: Plotly.NET.Interactive, 5.0.0"
    #r "nuget: Plotly.NET.ImageExport, 6.1.0"
    open Plotly.NET
    open Plotly.NET.Interactive
    open System
    open System.IO
    // Example chart 2 for the Dev Container tutorial
    let x = 34
    let y = 26
    // A single (key, value) pair, i.e. one column
    let combine = [| x, y |]
    let buildChart =
        Chart.Column (
            keysValues = combine,
            Name = "Example Chart for Devcontainer Tutorial"
        )
    buildChart |> Chart.saveHtml (String.concat "" [| "./devcontainerTutorial/runs/execution"; "/Results/exampleChart2" |])
  3. After that, we need the CWL Command-Line-Tools that call the scripts.

    exampleChart1.cwl
    cwlVersion: v1.2
    class: CommandLineTool
    hints:
      DockerRequirement:
        dockerImageId: "devcontainertutorial"
        dockerFile: {$include: "./Dockerfile"}
    requirements:
      - class: InitialWorkDirRequirement
        listing:
          - entryname: devcontainerTutorial
            entry: $(inputs.devcontainerTutorial)
            writable: true
      - class: EnvVarRequirement
        envDef:
          - envName: DOTNET_NOLOGO
            envValue: "true"
      - class: NetworkAccess
        networkAccess: true
    baseCommand: [dotnet, fsi, "./devcontainerTutorial/workflows/exampleChart1.fsx"]
    inputs:
      devcontainerTutorial:
        type: Directory
    outputs:
      output_exampleChart1:
        type: Directory
        outputBinding:
          glob: ./devcontainerTutorial/runs/execution/Results

    This was the Command-Line-Tool for the first script (exampleChart1.fsx). The Command-Line-Tool for the second script (exampleChart2.fsx) follows.

    exampleChart2.cwl
    cwlVersion: v1.2
    class: CommandLineTool
    hints:
      DockerRequirement:
        dockerImageId: "devcontainertutorial"
        dockerFile: {$include: "./Dockerfile"}
    requirements:
      - class: InitialWorkDirRequirement
        listing:
          - entryname: devcontainerTutorial
            entry: $(inputs.devcontainerTutorial)
            writable: true
      - class: EnvVarRequirement
        envDef:
          - envName: DOTNET_NOLOGO
            envValue: "true"
      - class: NetworkAccess
        networkAccess: true
    baseCommand: [dotnet, fsi, "./devcontainerTutorial/workflows/exampleChart2.fsx"]
    inputs:
      devcontainerTutorial:
        type: Directory
    outputs:
      output_exampleChart2:
        type: Directory
        outputBinding:
          glob: ./devcontainerTutorial/runs/execution/Results
  4. With the Command-Line-Tools in place, we can create the CWL workflow that calls both of them.

    workflow.cwl
    cwlVersion: v1.2
    class: Workflow
    hints:
      DockerRequirement:
        dockerImageId: "devcontainertutorial"
        dockerFile: {$include: "./Dockerfile"}
    requirements:
      - class: InitialWorkDirRequirement
        listing:
          - entryname: devcontainerTutorial
            entry: $(inputs.devcontainerTutorial)
            writable: true
      - class: MultipleInputFeatureRequirement
    inputs:
      devcontainerTutorial: Directory
      folder: string
    steps:
      exampleChart1:
        run: exampleChart1.cwl
        in:
          devcontainerTutorial: devcontainerTutorial
        out: [output_exampleChart1]
      exampleChart2:
        run: exampleChart2.cwl
        in:
          devcontainerTutorial: devcontainerTutorial
        out: [output_exampleChart2]
      expressionTool:
        run: exTool.cwl
        in:
          directory_array: [exampleChart1/output_exampleChart1, exampleChart2/output_exampleChart2]
          newname: folder
        out: [pool_directory]
    outputs:
      outputMain:
        type: Directory
        outputSource: expressionTool/pool_directory

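    The five files shown above are located in the workflows directory of the ARC. The ExpressionTool exTool.cwl that the workflow calls is not listed here; it is part of the example ARC. The next step is to create a run.cwl, which is stored together with the run.yml in the runs directory.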

  5. To follow the ARC specification, we need to create a run.cwl in the runs directory that calls our main workflow. The files are then stored in the ARC as follows:

    • workflows/
      • scripts/
        • exampleChart1.fsx
        • exampleChart2.fsx
      • exampleChart1.cwl
      • exampleChart2.cwl
      • workflow.cwl
    • runs/
      • execution/
        • run.cwl
        • run.yml

    The run.cwl looks like this:

    run.cwl
    cwlVersion: v1.2
    class: Workflow
    hints:
      DockerRequirement:
        dockerImageId: "devcontainertutorial"
        dockerFile: {$include: "./Dockerfile"}
    requirements:
      - class: InitialWorkDirRequirement
        listing:
          - entryname: devcontainerTutorial
            entry: $(inputs.devcontainerTutorial)
            writable: true
      - class: MultipleInputFeatureRequirement
      - class: SubworkflowFeatureRequirement
    inputs:
      devcontainerTutorial: Directory
      folder: string
    steps:
      mainWorkflow:
        run: ../../workflows/workflow.cwl
        in:
          devcontainerTutorial: devcontainerTutorial
          folder: folder
        out: [outputMain]
    outputs:
      outputMain:
        type: Directory
        outputSource: mainWorkflow/outputMain
  6. Finally, to execute our CWL workflow, we need the run.yml, which is also located in the runs directory.

    run.yml
    devcontainerTutorial:
      class: Directory
      path: ../../
    folder: "Results"
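
The workflow in step 4 runs exTool.cwl, which is not listed in this tutorial. To give a rough idea of what such a tool could look like, the sketch below writes a small ExpressionTool that wraps the incoming directories in a new directory named after `newname`. Only the names `directory_array`, `newname`, and `pool_directory` come from the workflow; everything else is an assumption, and the file in the example ARC may differ. Because both Command-Line-Tools emit a directory called Results, a real implementation may also need to merge or rename the entries.

```shell
# Sketch of a possible workflows/exTool.cwl (hypothetical content).
mkdir -p workflows
cat > workflows/exTool.cwl <<'EOF'
cwlVersion: v1.2
class: ExpressionTool
requirements:
  - class: InlineJavascriptRequirement
inputs:
  directory_array:
    type: Directory[]
  newname:
    type: string
outputs:
  pool_directory:
    type: Directory
# Return a new Directory named `newname` whose listing is the input directories.
expression: |
  ${
    return {
      "pool_directory": {
        "class": "Directory",
        "basename": inputs.newname,
        "listing": inputs.directory_array
      }
    };
  }
EOF
```

The quoted heredoc delimiter (`'EOF'`) keeps the shell from expanding the `${ ... }` JavaScript block while the file is written.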

And now you have everything you need to run a CWL workflow within a Docker container. Open PowerShell with WSL (or a terminal, if you’re using Linux) and run the following command:

Terminal window
cwltool ./run.cwl ./run.yml
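
Before starting a run, it can save time to confirm that the CWL files from the directory tree in step 5 are where the workflow expects them. The helper below is a hypothetical sketch, not part of the ARC tooling; the function name `check_arc_layout` and the file list are assumptions based on that tree.

```shell
# Hypothetical helper: report which of the tutorial's CWL files are missing.
# Returns 0 if all files exist below the given ARC root, 1 otherwise.
check_arc_layout() {
  root="$1"
  missing=0
  for f in \
    workflows/exampleChart1.cwl \
    workflows/exampleChart2.cwl \
    workflows/workflow.cwl \
    runs/execution/run.cwl \
    runs/execution/run.yml
  do
    if [ ! -f "$root/$f" ]; then
      echo "missing: $f"
      missing=1
    fi
  done
  return "$missing"
}

# Usage: run from the ARC root before calling cwltool.
if check_arc_layout .; then
  echo "layout looks complete"
fi
```

cwltool also has a `--validate` flag that checks a CWL document without executing it, which complements this file-level check.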