final_workflow_outputs_dir ignored for CWL workflow run in server mode

On Jira as BA-5890.

Environment: Ubuntu 18.04 in a Docker container (on an Ubuntu 18.04 host). Server mode with the Local backend.

I’m listing this as low priority because it might be too specific to my use case.

When I specify final_workflow_outputs_dir in the workflowOptions file in my request to the API, the value is ignored and the output files remain in their default location relative to the execution path after the workflow execution succeeds.
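For readers who want to reproduce the request, here is a minimal sketch of how such a submission might be assembled. It assumes the standard Cromwell submission endpoint (POST /api/workflows/v1) with its multipart form fields; the function name `build_submission`, the file names, and the host in the usage comment are hypothetical and not taken from the report.

```python
import json


def build_submission(workflow_cwl: str, inputs: dict, outputs_dir: str) -> dict:
    """Build multipart form parts for a Cromwell workflow submission.

    The returned dict is shaped for the `files=` argument of requests.post.
    """
    options = {
        "final_workflow_outputs_dir": outputs_dir,
        "use_relative_output_paths": True,
    }
    return {
        # (filename, content) tuples are sent as file parts
        "workflowSource": ("workflow.cwl", workflow_cwl),
        "workflowInputs": ("inputs.json", json.dumps(inputs)),
        "workflowOptions": ("options.json", json.dumps(options)),
        # (None, value) tuples are sent as plain form fields
        "workflowType": (None, "CWL"),
        "workflowTypeVersion": (None, "v1.0"),
    }


# Usage (hypothetical host; assumes the requests library is installed):
#   import requests
#   parts = build_submission(cwl_text, {"collection_id": 217}, "/data/external/workflow_1")
#   response = requests.post("http://cromwell:8000/api/workflows/v1", files=parts)
```

Keeping the options in a dict until the last moment makes it easy to log exactly what was sent when checking whether the server honoured the option.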

For example, these are the results of a workflow with one File output named csvFile.

workflowOptions file (I have tried this with and without “file://”, with the same results):

{
  "final_workflow_outputs_dir": "file:///data/external/workflow_1/4fd344c8-a228-421b-b561-9ed516e2316c",
  "use_relative_output_paths": true
}

Final outputs

{
 "cwl_temp_file_ad3d3e78-d6a6-421a-9111-86fdefe14b80.cwl.csvFile": "\"/cromwell-executions/cwl_temp_file_ad3d3e78-d6a6-421a-9111-86fdefe14b80.cwl/ad3d3e78-d6a6-421a-9111-86fdefe14b80/call-getdataframe/execution/glob-aae5e4d226234858387812bc5d30218c/217.csv\""
}

After the workflow successfully executes, the specified output directory remains empty.

Environment notes: I’m using Cromwell in server mode in a Docker container, as a service consumed by other Docker applications on the same host. The client applications communicate with the Cromwell container using the Python requests library. The specified final_workflow_outputs_dir is located in a bind mount accessible from both containers at the same location (e.g. /data/external is the default directory “external” to the containers, and it is mounted on all containers at that location).

I have a workaround: a workflow step that makes a request back to the client service. This is not ideal because it requires users to modify their workflows. The client software includes an Angular application for editing workflows using Rabix’s cwl-svg.
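A client-side alternative that leaves the workflows untouched could be to copy the reported outputs after the run finishes. The sketch below is only an illustration, not something from the issue: it assumes the client can read the cromwell-executions paths that appear in the outputs (for example through another shared mount), and the helper names `unwrap` and `copy_outputs` are hypothetical. It handles only plain File outputs like csvFile above.

```python
import json
import shutil
from pathlib import Path


def unwrap(value: str) -> str:
    """Strip the extra layer of JSON quoting seen in the reported outputs."""
    try:
        decoded = json.loads(value)
    except json.JSONDecodeError:
        return value  # already a bare path
    return decoded if isinstance(decoded, str) else value


def copy_outputs(outputs: dict, dest_dir: str) -> list:
    """Copy every output that points at an existing file into dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for value in outputs.values():
        if not isinstance(value, str):
            continue  # arrays and records are out of scope for this sketch
        src = Path(unwrap(value))
        if src.is_file():
            copied.append(str(shutil.copy2(src, dest / src.name)))
    return copied
```

Files are copied flat by base name, so two outputs with the same file name would overwrite each other; a real client would need to decide how to keep them apart.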

Full workflow

$namespaces: {sbg: https://www.sevenbridges.com}
class: Workflow
cwlVersion: v1.0
doc: A test workflow to demonstrate the editor.
id: workflow1
inputs:
- {id: omics_url, sbg:x: -158.51063537597656, sbg:y: 29.940061569213867, type: string}
- {id: omics_auth_token, sbg:x: -214.89361572265625, sbg:y: 170.31314086914062, type: string}
- {id: collection_id, sbg:x: -152.0425567626953, sbg:y: 306.3538818359375, type: int}
label: Test Workflow
outputs:
- id: csvFile
  outputSource: [getdataframe/csvFile]
  sbg:x: 523.4833374023438
  sbg:y: 191.5
  type: File
requirements:
- {class: MultipleInputFeatureRequirement}
steps:
- id: getcollection
  in:
  - {id: collection_id, source: collection_id}
  - id: omics_url
    source: [omics_url, omics_url, omics_url, omics_url]
  - id: omics_auth_token
    source: [omics_auth_token, omics_auth_token, omics_auth_token, omics_auth_token]
  label: Get Collection
  out:
  - {id: collection_file}
  run:
    baseCommand: [getcollection.py]
    class: CommandLineTool
    cwlVersion: v1.0
    doc: Get a collection as an HDF5 file.
    id: getcollection
    inputs:
    - id: collection_id
      inputBinding: {position: 0}
      type: int
    - id: omics_url
      inputBinding: {position: 1}
      type: string
    - id: omics_auth_token
      inputBinding: {position: 2}
      type: string
    label: Get Collection
    outputs:
    - id: collection_file
      outputBinding: {glob: '*.h5'}
      type: File
  sbg:x: 32.978721618652344
  sbg:y: 166.41757202148438
- id: getdataframe
  in:
  - {id: inputFile, source: getcollection/collection_file}
  label: Get DataFrame
  out:
  - {id: csvFile}
  run:
    baseCommand: [getdataframe.py]
    class: CommandLineTool
    cwlVersion: v1.0
    doc: Get a Pandas DataFrame as a CSV file.
    id: getdataframe
    inputs:
    - doc: A collection.
      id: inputFile
      inputBinding: {position: 0}
      type: File
    - default: true
      doc: Whether the column names should be just the x value or Y_x
      id: numericColumns
      inputBinding: {position: 1}
      type: boolean
    - default: true
      doc: Whether to include label columns
      id: includeLabels
      inputBinding: {position: 2}
      type: boolean
    - default: false
      doc: Whether to only include label columns. Overrides includeLabels.
      id: includeOnlyLabels
      inputBinding: {position: 3}
      type: boolean
    label: Get DataFrame
    outputs:
    - doc: A CSV file containing a Pandas DataFrame.
      id: csvFile
      outputBinding: {glob: '*.csv'}
      type: File
  sbg:x: 340
  sbg:y: 190

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 10 (6 by maintainers)

Top GitHub Comments

2 reactions
azzaea commented, Jan 11, 2020

I’d like to report that this is also an issue when running CWL scripts via Cromwell in run mode: the options argument is ignored.

$ java -jar ${crom} --version
cromwell 47
$
$ cat workflow.options.json
{
    "final_workflow_outputs_dir": "results.cromwell",
    "use_relative_output_paths": true
}
$
$ java -jar ${crom} run example.cwl -i inputs.yml --type cwl -o workflow.options.json
:
: # workflow runs normally, logs and other files in `cromwell-executions` folder as expected
: 
$ ls results.cromwell
$ # folder is empty
$

1 reaction
bolton-lab commented, May 27, 2021

Same here. If you run with -t wdl it works, but not with -t cwl. How does changing the type change the code for final_workflow_outputs_dir, or is this completely removed from that engine?
