automating workflows in Life Sciences
CWL is a way to describe command line tools and connect them together to create workflows. Because CWL is a specification and not a specific piece of software, tools and workflows described using CWL are portable across a variety of platforms that support the CWL standard.
CWL has roots in “make” and many similar tools that determine order of execution based on dependencies between tasks. However unlike “make”, CWL tasks are isolated and you must be explicit about your inputs and outputs. The benefit of explicitness and isolation are flexibility, portability, and scalability: tools and workflows described with CWL can transparently leverage technologies such as Docker and be used with CWL implementations from different vendors. CWL is well suited for describing large-scale workflows in cluster, cloud and high performance computing environments where tasks are scheduled in parallel across many nodes.
The simplest “hello world” program. This accepts one input parameter, writes a message to the terminal or job log, and produces no permanent output. CWL documents are written in [JSON][json] or [YAML][yaml], or a mix of the two. We will use YAML throughout this guide. If you are not familiar with YAML, you may find it helpful to refer to this quick tutorial for the subset of YAML used in CWL.
First, create a file called echo.cwl
, containing the boxed text below. It will help you to use a text editor that can be specified to produce text in YAML or JSON. Whatever text editor you use, the indents you see should not be created using tabs.
echo.cwl
#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: CommandLineTool
baseCommand: echo
inputs:
message:
type: string
inputBinding:
position: 1
outputs: []
Next, create a file called echo-job.yml
, containing the following boxed text, which will describe the input of a run:
echo-job.yml
message: Hello world!
Now, invoke cwl-runner
(or cwltool
) with the tool wrapper echo.cwl
and the input object echo-job.yml on the command line. The command
is cwl-runner echo.cwl echo-job.yml
. The boxed text below shows this command and the expected output.
$ cwl-runner echo.cwl echo-job.yml
[job echo.cwl] /tmp/tmpmM5S_1$ echo \
'Hello world!'
Hello world!
[job echo.cwl] completed success
{}
Final process status is success