Toil launches infinite jobs in Slurm

Dear devs,

I have been running a CWL pipeline with Toil. I recently moved to a Slurm cluster and expected the same pipeline to run without changes or errors, as it did on LSF.

When I run the pipeline, one of the jobs seems to be resubmitted over and over and is never marked as “done”. I was using the stable version of Toil (5.2.0) when I noticed this because my disk was filling up, so I upgraded to 5.3.0a and observed the same behavior.

Here is one example of the logs. As you can see at the bottom, the same task, “prodigal”, was sent to the queue 55 times. I had to kill the process to avoid filling the disk again.


CWL: /data/juan/mgnify-lr/cwl/workflows/long_read_assembly.cwl
YML: /data/juan/runs/results/PRJNA487927_SRR7768934/PRJNA487927_SRR7768934.yml
Toil start: Mon Mar 22 18:26:22 UTC 2021
launching TOIL/CWL job with Docker as PRJNA487927_SRR7768934
[2021-03-22T18:26:25+0000] [MainThread] [I] [cwltool] Resolved '/data/juan/mgnify-lr/cwl/workflows/long_read_assembly.cwl' to 'file:///data/juan/mgnify-lr/cwl/workflows/
long_read_assembly.cwl'
[2021-03-22T18:26:40+0000] [MainThread] [W] [cwltool] Workflow checker warning:
../../mgnify-lr/cwl/workflows/mgnify_lr_preprocessing_long.cwl:22:5: Source 'raw_reads_report' of
                                                                     type ["null", "string"] may be
                                                                     incompatible
../../mgnify-lr/cwl/workflows/mgnify_lr_preprocessing_long.cwl:63:7:   with sink 'outReport' of
                                                                       type "string"
../../mgnify-lr/cwl/workflows/mgnify_lr_preprocessing_long.cwl:18:5: Source 'min_read_size' of type
                                                                     ["null", "int"] may be
                                                                     incompatible
../../mgnify-lr/cwl/workflows/mgnify_lr_preprocessing_long.cwl:71:7:   with sink 'minLength' of
                                                                       type "int"
../../mgnify-lr/cwl/workflows/mgnify_lr_preprocessing_long.cwl:30:5: Source 'reads_filter_bysize'
                                                                     of type ["null", "string"] may
                                                                     be incompatible
../../mgnify-lr/cwl/workflows/mgnify_lr_preprocessing_long.cwl:72:7:   with sink 'name' of type
                                                                       "string"
../../mgnify-lr/cwl/workflows/mgnify_lr_preprocessing_long.cwl:26:5: Source 'align_preset' of type
                                                                     ["null", "string"] may be
                                                                     incompatible
../../mgnify-lr/cwl/workflows/mgnify_lr_preprocessing_long.cwl:82:7:   with sink 'alignMode' of
                                                                       type "string"
../../mgnify-lr/cwl/workflows/mgnify_lr_preprocessing_long.cwl:38:5: Source 'host_unmapped_reads'
                                                                     of type ["null", "string"] may
                                                                     be incompatible
../../mgnify-lr/cwl/workflows/mgnify_lr_preprocessing_long.cwl:83:7:   with sink 'outReadsName'
                                                                       of type "string"
[2021-03-22T18:26:45+0000] [MainThread] [W] [cwltool] Workflow checker warning:
../../mgnify-lr/cwl/workflows/mgnify_lr_assembly.cwl:31:5:  Source 'align_polish' of type ["null",
                                                            "string"] may be incompatible
../../mgnify-lr/cwl/workflows/mgnify_lr_assembly.cwl:81:7:    with sink 'alignMode' of type
                                                              "string"
../../mgnify-lr/cwl/workflows/mgnify_lr_assembly.cwl:35:5:  Source 'polish_paf' of type ["null",
                                                            "string"] may be incompatible
../../mgnify-lr/cwl/workflows/mgnify_lr_assembly.cwl:84:7:    with sink 'outPAFname' of type
                                                              "string"
../../mgnify-lr/cwl/workflows/mgnify_lr_assembly.cwl:19:5:  Source 'long_read_tech' of type
                                                            ["null", "string"] may be incompatible
../../mgnify-lr/cwl/workflows/mgnify_lr_assembly.cwl:73:7:    with sink 'readType' of type
                                                              "string"
../../mgnify-lr/cwl/workflows/mgnify_lr_assembly.cwl:43:5:  Source 'medaka_model' of type ["null",
                                                            "string"] may be incompatible
../../mgnify-lr/cwl/workflows/mgnify_lr_assembly.cwl:103:7:   with sink 'medakaModel' of type
                                                              "string"
../../mgnify-lr/cwl/workflows/mgnify_lr_assembly.cwl:19:5:  Source 'long_read_tech' of type
                                                            ["null", "string"] may be incompatible
../../mgnify-lr/cwl/workflows/mgnify_lr_assembly.cwl:104:7:   with sink 'tech' of type "string"
../../mgnify-lr/cwl/workflows/mgnify_lr_assembly.cwl:51:5:  Source 'min_contig_size' of type
                                                            ["null", "int"] may be incompatible
../../mgnify-lr/cwl/workflows/mgnify_lr_assembly.cwl:121:7:   with sink 'minSize' of type "int"
../../mgnify-lr/cwl/workflows/mgnify_lr_assembly.cwl:55:5:  Source 'final_assembly' of type
                                                            ["null", "string"] may be incompatible
../../mgnify-lr/cwl/workflows/mgnify_lr_assembly.cwl:123:7:   with sink 'outName' of type "string"
../../mgnify-lr/cwl/workflows/mgnify_lr_assembly.cwl:27:5:  Source 'align_preset' of type ["null",
                                                            "string"] may be incompatible
../../mgnify-lr/cwl/workflows/mgnify_lr_assembly.cwl:111:7:   with sink 'alignMode' of type
                                                              "string"
../../mgnify-lr/cwl/workflows/mgnify_lr_assembly.cwl:47:5:  Source 'host_unmapped_contigs' of type
                                                            ["null", "string"] may be incompatible
../../mgnify-lr/cwl/workflows/mgnify_lr_assembly.cwl:112:7:   with sink 'outReadsName' of type
                                                              "string"
../../mgnify-lr/cwl/workflows/mgnify_lr_assembly.cwl:39:5:  Source 'polish_assembly_racon' of type
                                                            ["null", "string"] may be incompatible
../../mgnify-lr/cwl/workflows/mgnify_lr_assembly.cwl:94:7:    with sink 'outName' of type "string"
../../mgnify-lr/cwl/workflows/mgnify_lr_assembly.cwl:31:5:  Source 'align_polish' of type ["null",
                                                            "string"] may be incompatible
../../mgnify-lr/cwl/workflows/mgnify_lr_assembly.cwl:130:7:   with sink 'alignMode' of type
                                                              "string"
[2021-03-22T18:26:59+0000] [MainThread] [W] [cwltool] Workflow checker warning:
../../mgnify-lr/cwl/workflows/mgnify_lr_postprocessing.cwl:29:5: Source 'ideel_out' of type
                                                                 ["null", "string"] may be
                                                                 incompatible
../../mgnify-lr/cwl/workflows/mgnify_lr_postprocessing.cwl:69:7:   with sink 'outFigName' of type
                                                                   "string"
../../mgnify-lr/cwl/workflows/mgnify_lr_postprocessing.cwl:25:5: Source 'diamond_out' of type
                                                                 ["null", "string"] may be
                                                                 incompatible
../../mgnify-lr/cwl/workflows/mgnify_lr_postprocessing.cwl:59:7:   with sink 'outName' of type
                                                                   "string"
../../mgnify-lr/cwl/workflows/mgnify_lr_postprocessing.cwl:18:5: Source 'predict_proteins' of type
                                                                 ["null", "string"] may be
                                                                 incompatible
../../mgnify-lr/cwl/workflows/mgnify_lr_postprocessing.cwl:52:7:   with sink 'outProtName' of
                                                                   type "string"
[2021-03-22T18:27:00+0000] [MainThread] [I] [toil.job] Saving graph of 2 jobs, 2 new
[2021-03-22T18:27:00+0000] [MainThread] [I] [toil.job] Processing job 'ResolveIndirect' kind-ResolveIndirect/instance-d5rpnfnu
[2021-03-22T18:27:00+0000] [MainThread] [I] [toil.job] Processing job 'CWLWorkflow' kind-CWLWorkflow/instance-dbmqzdum
[2021-03-22T18:27:00+0000] [MainThread] [I] [toil] Running Toil version 5.3.0a1-1f0930b7b9ecc31ca556d15ab07ff836ba85eb23 on host frontend001.
[2021-03-22T18:27:02+0000] [MainThread] [I] [toil.worker] Redirecting logging to /data/juan/runs/work-dir/tmp/PRJNA487927_SRR7768934/node-65f25cc1-ffcf-42c0-bbe0-c25c2a5
f9665-f5e2056f480b4dd881408fa7cd0ec9ac/tmpdqrou_jj/worker_log.txt
[2021-03-22T18:27:02+0000] [MainThread] [W] [toil.leader] Job failed with exit value 139: 'CWLWorkflow' kind-CWLWorkflow/instance-dbmqzdum
Exit reason: None
[2021-03-22T18:27:02+0000] [MainThread] [W] [toil.leader] No log file is present, despite job failing: 'CWLWorkflow' kind-CWLWorkflow/instance-dbmqzdum
[2021-03-22T18:27:02+0000] [MainThread] [W] [toil.job] Due to failure we are reducing the remaining try count of job 'CWLWorkflow' kind-CWLWorkflow/instance-dbmqzdum wit
h ID kind-CWLWorkflow/instance-dbmqzdum to 5
[2021-03-22T18:27:02+0000] [MainThread] [W] [toil.job] We have increased the disk of the failed job 'CWLWorkflow' kind-CWLWorkflow/instance-dbmqzdum to the default of 21
47483648 bytes
[2021-03-22T18:27:02+0000] [MainThread] [I] [toil.leader] 1 jobs are running, -1 jobs are issued and waiting to run
[2021-03-22T18:27:03+0000] [Thread-8  ] [W] [toil.statsAndLogging] Got message from job at time 03-22-2021 18:27:03: Job used more disk than requested. For CWL, consider
 increasing the outdirMin requirement, otherwise, consider increasing the disk requirement. Job files/for-job/kind-CWLWorkflow/instance-dbmqzdum/cleanup/file-c5060c32dd2
c4bf9ad6fb81f4b57000e/stream used 0.00% disk (8.0 KB [8192B] used, 0.0 B [0B] requested).
[2021-03-22T18:27:04+0000] [MainThread] [I] [toil.worker] Redirecting logging to /data/juan/runs/work-dir/tmp/PRJNA487927_SRR7768934/node-65f25cc1-ffcf-42c0-bbe0-c25c2a5
f9665-f5e2056f480b4dd881408fa7cd0ec9ac/tmpt7ruf_v5/worker_log.txt
[2021-03-22T18:27:04+0000] [MainThread] [W] [toil.leader] A result seems to already have been processed for job 0
[2021-03-22T18:27:05+0000] [MainThread] [I] [toil.worker] Redirecting logging to /data/juan/runs/work-dir/tmp/PRJNA487927_SRR7768934/node-65f25cc1-ffcf-42c0-bbe0-c25c2a5
f9665-f5e2056f480b4dd881408fa7cd0ec9ac/tmpu_1mkofd/worker_log.txt
[2021-03-22T18:27:06+0000] [Thread-8  ] [W] [toil.statsAndLogging] Got message from job at time 03-22-2021 18:27:06: Job used more disk than requested. For CWL, consider
 increasing the outdirMin requirement, otherwise, consider increasing the disk requirement. Job files/for-job/kind-CWLWorkflow/instance-lwrqrjzo/cleanup/file-7d7abc739e8
7469fbd8874c4b119ef7e/stream used 0.00% disk (8.0 KB [8192B] used, 0.0 B [0B] requested).
[2021-03-22T18:27:08+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' calc_stats.pl kind-CWLJob/instance-mq3oxqzj with job batch system ID: 4 and cores: 1, disk:
 2.0 G, and memory: 1.0 G
[2021-03-22T18:27:08+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' fastp kind-CWLJob/instance-ehh2ub5k with job batch system ID: 5 and cores: 8, disk: 2.0 G, 
and memory: 7.0 G
[2021-03-22T18:27:40+0000] [MainThread] [W] [toil.leader] Job failed with exit value 1: 'CWLJob' fastp kind-CWLJob/instance-ehh2ub5k
Exit reason: None
[2021-03-22T18:27:40+0000] [MainThread] [W] [toil.leader] No log file is present, despite job failing: 'CWLJob' fastp kind-CWLJob/instance-ehh2ub5k
[2021-03-22T18:27:40+0000] [MainThread] [W] [toil.job] Due to failure we are reducing the remaining try count of job 'CWLJob' fastp kind-CWLJob/instance-ehh2ub5k with ID
 kind-CWLJob/instance-ehh2ub5k to 5
[2021-03-22T18:27:40+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' fastp kind-CWLJob/instance-ehh2ub5k with job batch system ID: 6 and cores: 8, disk: 2.0 G, 
and memory: 7.0 G
[2021-03-22T18:27:46+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' minimap2_filter.sh kind-CWLJob/instance-rmepy7h4 with job batch system ID: 7 and cores: 8, 
disk: 2.0 G, and memory: 1.0 G
[2021-03-22T18:27:53+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' fastp kind-CWLJob/instance-ehh2ub5k with job batch system ID: 8 and cores: 8, disk: 2.0 G, 
and memory: 7.0 G
[2021-03-22T18:27:59+0000] [MainThread] [I] [toil.worker] Redirecting logging to /data/juan/runs/work-dir/tmp/PRJNA487927_SRR7768934/node-65f25cc1-ffcf-42c0-bbe0-c25c2a5
f9665-f5e2056f480b4dd881408fa7cd0ec9ac/tmp7sjk4hyn/worker_log.txt
[2021-03-22T18:27:59+0000] [Thread-8  ] [W] [toil.statsAndLogging] Got message from job at time 03-22-2021 18:27:59: Job used more disk than requested. For CWL, consider
 increasing the outdirMin requirement, otherwise, consider increasing the disk requirement. Job files/for-job/kind-ResolveIndirect/instance-sqpvv5em/cleanup/file-9ec348d
827c94187a67e6d8e9b4db179/stream used 0.00% disk (8.0 KB [8192B] used, 0.0 B [0B] requested).
[2021-03-22T18:28:01+0000] [MainThread] [I] [toil.worker] Redirecting logging to /data/juan/runs/work-dir/tmp/PRJNA487927_SRR7768934/node-65f25cc1-ffcf-42c0-bbe0-c25c2a5
f9665-f5e2056f480b4dd881408fa7cd0ec9ac/tmpvy_esv42/worker_log.txt
[2021-03-22T18:28:03+0000] [Thread-8  ] [W] [toil.statsAndLogging] Got message from job at time 03-22-2021 18:28:03: Job used more disk than requested. For CWL, consider
 increasing the outdirMin requirement, otherwise, consider increasing the disk requirement. Job files/for-job/kind-CWLWorkflow/instance-qotcb8ti/cleanup/file-e1c8acb0f4d
c46249ff03bc691492608/stream used 0.00% disk (8.0 KB [8192B] used, 0.0 B [0B] requested).
[2021-03-22T18:28:05+0000] [MainThread] [I] [toil.worker] Redirecting logging to /data/juan/runs/work-dir/tmp/PRJNA487927_SRR7768934/node-65f25cc1-ffcf-42c0-bbe0-c25c2a5
f9665-f5e2056f480b4dd881408fa7cd0ec9ac/tmpbvrkyc1u/worker_log.txt
[2021-03-22T18:28:06+0000] [Thread-8  ] [W] [toil.statsAndLogging] Got message from job at time 03-22-2021 18:28:06: Job used more disk than requested. For CWL, consider
 increasing the outdirMin requirement, otherwise, consider increasing the disk requirement. Job files/for-job/kind-CWLWorkflow/instance-kxz4tzsb/cleanup/file-ed3c08a68fb
54fe2b594356b9bc87817/stream used 0.00% disk (8.0 KB [8192B] used, 0.0 B [0B] requested).
[2021-03-22T18:28:08+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' flye kind-CWLJob/instance-ujy3gggb with job batch system ID: 12 and cores: 8, disk: 2.0 G, 
and memory: 195.0 G
[2021-03-22T18:28:08+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' flye kind-CWLJob/instance-14jwry8k with job batch system ID: 13 and cores: 8, disk: 2.0 G, 
and memory: 195.0 G
[2021-03-22T18:54:18+0000] [MainThread] [W] [toil.leader] Job failed with exit value 1: 'CWLJob' flye kind-CWLJob/instance-ujy3gggb
Exit reason: None
[2021-03-22T18:54:18+0000] [MainThread] [W] [toil.leader] Despite the batch system claiming failure the job 'CWLJob' flye kind-CWLJob/instance-ujy3gggb seems to have fin
ished and been removed
[2021-03-22T18:54:21+0000] [MainThread] [I] [toil.worker] Redirecting logging to /data/juan/runs/work-dir/tmp/PRJNA487927_SRR7768934/node-65f25cc1-ffcf-42c0-bbe0-c25c2a5
f9665-f5e2056f480b4dd881408fa7cd0ec9ac/tmprlkcrale/worker_log.txt
[2021-03-22T18:54:22+0000] [Thread-8  ] [W] [toil.statsAndLogging] Got message from job at time 03-22-2021 18:54:22: Job used more disk than requested. For CWL, consider
 increasing the outdirMin requirement, otherwise, consider increasing the disk requirement. Job files/for-job/kind-ResolveIndirect/instance-humvqdau/cleanup/file-e6c22d4
9c86246eda3ac6e227f84ca04/stream used 0.00% disk (8.0 KB [8192B] used, 0.0 B [0B] requested).
[2021-03-22T18:54:22+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' minimap2 kind-CWLJob/instance-hdiajgj4 with job batch system ID: 15 and cores: 8, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T18:54:46+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' racon kind-CWLJob/instance-sjcwvoh_ with job batch system ID: 16 and cores: 8, disk: 2.0 G,
 and memory: 7.0 G
[2021-03-22T18:56:15+0000] [MainThread] [I] [toil.worker] Redirecting logging to /data/juan/runs/work-dir/tmp/PRJNA487927_SRR7768934/node-65f25cc1-ffcf-42c0-bbe0-c25c2a5
f9665-f5e2056f480b4dd881408fa7cd0ec9ac/tmpkbak50hu/worker_log.txt
[2021-03-22T18:56:16+0000] [Thread-8  ] [W] [toil.statsAndLogging] Got message from job at time 03-22-2021 18:56:16: Job used more disk than requested. For CWL, consider
 increasing the outdirMin requirement, otherwise, consider increasing the disk requirement. Job files/for-job/kind-CWLWorkflow/instance-t41i_e59/cleanup/file-83b842e54c1
743b094f1f2697b092976/stream used 0.00% disk (8.0 KB [8192B] used, 0.0 B [0B] requested).
[2021-03-22T18:56:17+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' medaka_consensus kind-CWLJob/instance-u4s7bukj with job batch system ID: 18 and cores: 8, d
isk: 2.0 G, and memory: 7.0 G
[2021-03-22T18:56:17+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' cp kind-CWLJob/instance-emlw1dcg with job batch system ID: 19 and cores: 1, disk: 2.0 G, an
d memory: 1000.0 M
[2021-03-22T19:20:01+0000] [MainThread] [W] [toil.leader] Job failed with exit value 1: 'CWLJob' medaka_consensus kind-CWLJob/instance-u4s7bukj
Exit reason: None
[2021-03-22T19:20:01+0000] [MainThread] [W] [toil.leader] Despite the batch system claiming failure the job 'CWLJob' medaka_consensus kind-CWLJob/instance-u4s7bukj seems
 to have finished and been removed
[2021-03-22T19:20:03+0000] [MainThread] [I] [toil.worker] Redirecting logging to /data/juan/runs/work-dir/tmp/PRJNA487927_SRR7768934/node-65f25cc1-ffcf-42c0-bbe0-c25c2a5
f9665-f5e2056f480b4dd881408fa7cd0ec9ac/tmp9ihuse__/worker_log.txt
[2021-03-22T19:20:03+0000] [Thread-8  ] [W] [toil.statsAndLogging] Got message from job at time 03-22-2021 19:20:03: Job used more disk than requested. For CWL, consider
 increasing the outdirMin requirement, otherwise, consider increasing the disk requirement. Job files/for-job/kind-ResolveIndirect/instance-y_y8qz_c/cleanup/file-38d8bec
de3b8442fb44f76f5ed6fe4ac/stream used 0.00% disk (8.0 KB [8192B] used, 0.0 B [0B] requested).
[2021-03-22T19:20:05+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' minimap2_filter.sh kind-CWLJob/instance-kmmij8gg with job batch system ID: 21 and cores: 8,
 disk: 2.0 G, and memory: 1.0 G
[2021-03-22T19:20:12+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' filterContigs.pl kind-CWLJob/instance-n1vzba93 with job batch system ID: 22 and cores: 1, d
isk: 2.0 G, and memory: 1.0 G
[2021-03-22T19:20:18+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' gen_stats_json.pl kind-CWLJob/instance-5ifx01bc with job batch system ID: 23 and cores: 1, 
disk: 2.0 G, and memory: 7.0 G
[2021-03-22T19:25:54+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' filterContigs.pl kind-CWLJob/instance-n1vzba93 with job batch system ID: 24 and cores: 1, d
isk: 2.0 G, and memory: 1.0 G
[2021-03-22T19:25:59+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' minimap2_filter.sh kind-CWLJob/instance-kmmij8gg with job batch system ID: 25 and cores: 8,
 disk: 2.0 G, and memory: 1.0 G
[2021-03-22T19:26:06+0000] [MainThread] [I] [toil.worker] Redirecting logging to /data/juan/runs/work-dir/tmp/PRJNA487927_SRR7768934/node-65f25cc1-ffcf-42c0-bbe0-c25c2a5
f9665-f5e2056f480b4dd881408fa7cd0ec9ac/tmp51t_nfmk/worker_log.txt
[2021-03-22T19:26:08+0000] [MainThread] [I] [toil.worker] Redirecting logging to /data/juan/runs/work-dir/tmp/PRJNA487927_SRR7768934/node-65f25cc1-ffcf-42c0-bbe0-c25c2a5
f9665-f5e2056f480b4dd881408fa7cd0ec9ac/tmpdz8unwnt/worker_log.txt
[2021-03-22T19:26:08+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' racon kind-CWLJob/instance-sjcwvoh_ with job batch system ID: 28 and cores: 8, disk: 2.0 G,
 and memory: 7.0 G
[2021-03-22T19:26:12+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' minimap2 kind-CWLJob/instance-hdiajgj4 with job batch system ID: 29 and cores: 8, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:26:17+0000] [MainThread] [I] [toil.worker] Redirecting logging to /data/juan/runs/work-dir/tmp/PRJNA487927_SRR7768934/node-65f25cc1-ffcf-42c0-bbe0-c25c2a5
f9665-f5e2056f480b4dd881408fa7cd0ec9ac/tmprba30nen/worker_log.txt
[2021-03-22T19:26:19+0000] [MainThread] [I] [toil.worker] Redirecting logging to /data/juan/runs/work-dir/tmp/PRJNA487927_SRR7768934/node-65f25cc1-ffcf-42c0-bbe0-c25c2a5
f9665-f5e2056f480b4dd881408fa7cd0ec9ac/tmp8gsl9hyk/worker_log.txt
[2021-03-22T19:26:23+0000] [MainThread] [I] [toil.worker] Redirecting logging to /data/juan/runs/work-dir/tmp/PRJNA487927_SRR7768934/node-65f25cc1-ffcf-42c0-bbe0-c25c2a5
f9665-f5e2056f480b4dd881408fa7cd0ec9ac/tmpd252bb3f/worker_log.txt
[2021-03-22T19:26:24+0000] [Thread-8  ] [W] [toil.statsAndLogging] Got message from job at time 03-22-2021 19:26:24: Job used more disk than requested. For CWL, consider
 increasing the outdirMin requirement, otherwise, consider increasing the disk requirement. Job files/for-job/kind-ResolveIndirect/instance-5ez9tpia/cleanup/file-3b228eb
38ebc4b409ed17b94380f6a1a/stream used 0.00% disk (8.0 KB [8192B] used, 0.0 B [0B] requested).
[2021-03-22T19:26:25+0000] [MainThread] [I] [toil.worker] Redirecting logging to /data/juan/runs/work-dir/tmp/PRJNA487927_SRR7768934/node-65f25cc1-ffcf-42c0-bbe0-c25c2a5
f9665-f5e2056f480b4dd881408fa7cd0ec9ac/tmpet8h9sus/worker_log.txt
[2021-03-22T19:26:27+0000] [Thread-8  ] [W] [toil.statsAndLogging] Got message from job at time 03-22-2021 19:26:27: Job used more disk than requested. For CWL, consider
 increasing the outdirMin requirement, otherwise, consider increasing the disk requirement. Job files/for-job/kind-CWLWorkflow/instance-w52j270m/cleanup/file-aec1f2943da
e4a8e847c8a6a63f8dcf9/stream used 0.00% disk (8.0 KB [8192B] used, 0.0 B [0B] requested).
[2021-03-22T19:26:28+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 34 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:27:04+0000] [MainThread] [I] [toil.leader] 1 jobs are running, 0 jobs are issued and waiting to run
[2021-03-22T19:27:30+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 35 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:28:32+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 36 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:29:34+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 37 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:30:34+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 38 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:31:35+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 39 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:32:37+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 40 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:33:38+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 41 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:34:38+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 42 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:35:40+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 43 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:36:41+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 44 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:37:41+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 45 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:38:43+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 46 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:39:44+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 47 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:40:45+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 48 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:41:47+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 49 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:42:48+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 50 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:43:49+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 51 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:44:51+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 52 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:45:53+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 53 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:46:53+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 54 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:47:55+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 55 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:48:56+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 56 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:49:58+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 57 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:50:58+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 58 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:51:59+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 59 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:53:01+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 60 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:54:01+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 61 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:55:03+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 62 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:56:04+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 63 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:57:06+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 64 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:58:06+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 65 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T19:59:08+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 66 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T20:00:10+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 67 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T20:01:12+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 68 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T20:02:12+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 69 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T20:03:13+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 70 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T20:04:15+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 71 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T20:05:16+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 72 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T20:06:16+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 73 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T20:07:18+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 74 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T20:08:19+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 75 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T20:09:20+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 76 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T20:10:21+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 77 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T20:11:23+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 78 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T20:12:24+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 79 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T20:13:26+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 80 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T20:14:27+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 81 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T20:15:28+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 82 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T20:16:30+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 83 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T20:17:30+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 84 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T20:18:31+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 85 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T20:19:33+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 86 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T20:20:34+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 87 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T20:21:34+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 88 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
[2021-03-22T20:22:36+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 89 and cores: 1, disk: 2.0
 G, and memory: 1.0 G
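
For anyone wanting to quantify the resubmissions in a log like the one above, here is a minimal sketch in Python that counts how many times each job store ID was issued. It is not part of Toil; the function names `count_issues` and `repeated_jobs` are hypothetical, and the regular expression assumes the `toil.leader` message format exactly as it appears in these lines, which may differ in other Toil versions.

```python
import re
from collections import Counter

# Pattern inferred from the "[toil.leader] Issued job ..." lines in this log.
# Assumption: other Toil versions may word this message differently.
ISSUED = re.compile(
    r"Issued job '(?P<kind>[^']+)' (?P<name>\S+) (?P<store_id>\S+) "
    r"with job batch system ID: (?P<batch_id>\d+)"
)


def count_issues(log_lines):
    """Count 'Issued job' messages per (job name, job store ID) pair."""
    counts = Counter()
    for line in log_lines:
        match = ISSUED.search(line)
        if match:
            counts[(match.group("name"), match.group("store_id"))] += 1
    return counts


def repeated_jobs(log_lines, threshold=2):
    """Return only the jobs that were issued at least `threshold` times."""
    return {
        key: n for key, n in count_issues(log_lines).items() if n >= threshold
    }


# Three entries copied from the log above, including the wrapped
# continuation lines, which the pattern simply ignores.
SAMPLE_LOG = [
    "[2021-03-22T20:03:13+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 70 and cores: 1, disk: 2.0",
    " G, and memory: 1.0 G",
    "[2021-03-22T20:04:15+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 71 and cores: 1, disk: 2.0",
    " G, and memory: 1.0 G",
    "[2021-03-22T20:05:16+0000] [MainThread] [I] [toil.leader] Issued job 'CWLJob' prodigal kind-CWLJob/instance-8k5mk5d3 with job batch system ID: 72 and cores: 1, disk: 2.0",
    " G, and memory: 1.0 G",
]

if __name__ == "__main__":
    for (name, store_id), n in repeated_jobs(SAMPLE_LOG).items():
        print(f"{name} ({store_id}) was issued {n} times")
```

To run it over a full log, pass an open file handle instead of `SAMPLE_LOG`. A job store ID that shows up under many different batch system IDs, as `instance-8k5mk5d3` does here, is the pattern described in this report.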

Slurm was running the jobs:

$ qstat
Job id              Name             Username        Time Use S Queue          
------------------- ---------------- --------------- -------- - ---------------
793                 toil_job_83_CWLJ jcaballero      00:00:01 C main           
794                 toil_job_84_CWLJ jcaballero      00:00:01 C main           
795                 toil_job_85_CWLJ jcaballero      00:00:01 C main           
796                 toil_job_86_CWLJ jcaballero      00:00:01 C main           
797                 toil_job_87_CWLJ jcaballero      00:00:01 C main           
798                 toil_job_88_CWLJ jcaballero      00:00:01 C main           
799                 toil_job_89_CWLJ jcaballero      00:00:00 R main

Please let me know if there is any additional information you need to debug this.

JC

┆Issue is synchronized with this Jira Task ┆Issue Number: TOIL-828

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 7 (3 by maintainers)

Top GitHub Comments

1 reaction
caballero commented, Jan 4, 2022

Hi @adamnovak, yes, 5.5.0 launches jobs correctly. We’re still using 5.4.0 in production, but not for this issue. You can close this ticket.

1 reaction
caballero commented, Oct 8, 2021

Hi, I need to check it. I will report back with any results.


