'File is missing' error accessing a file in a Directory, Toil expects a file:// IRI prefix which failed to be added

We are experiencing failures when using Toil to run our CWL workflow, and suspect a bug in Toil’s handling of Directory inputs. The symptom is the error message “File is missing: /the/path/to/some/file” (note that this is a plain file path, without an IRI scheme prefix).

On inspecting uploadFile() in src/toil/cwl/cwltoil.py, it is clear that the logic expects uf["location"] to hold an IRI with a file:// scheme prefix: the check if not os.path.isfile(uf["location"][7:]): trims the first 7 characters. That prefix should have been added when the IRI was resolved relative to the CWL document on the filesystem, and the error message shows it was not. Because the first 7 characters are stripped whether or not the prefix is present, the resulting path no longer names a valid file.
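The effect of the unconditional slice can be shown in isolation. The following is a minimal sketch, not Toil's actual code: the helper names strip_by_slice and strip_if_prefixed are hypothetical, and only the [7:] slice itself comes from the check quoted above.

```python
FILE_PREFIX = "file://"  # 7 characters, matching the [7:] slice


def strip_by_slice(location):
    # Mirrors the slice quoted above: always drop the first 7 characters,
    # whether or not they are actually "file://".
    return location[7:]


def strip_if_prefixed(location):
    # Hypothetical safer variant: only strip when the prefix is present.
    if location.startswith(FILE_PREFIX):
        return location[len(FILE_PREFIX):]
    return location


with_prefix = "file:///the/path/to/some/file"
without_prefix = "/the/path/to/some/file"

# With the prefix, both approaches recover the filesystem path.
print(strip_by_slice(with_prefix))       # /the/path/to/some/file
print(strip_if_prefixed(with_prefix))    # /the/path/to/some/file

# Without the prefix, the slice eats the start of the real path.
print(strip_by_slice(without_prefix))    # th/to/some/file
print(strip_if_prefixed(without_prefix)) # /the/path/to/some/file
```

The mangled result is what os.path.isfile() would then be asked about, which would explain why the check fails even though the file exists.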

The code paths for resolving IRIs for File and Directory objects are different, and I suspect a bug in the latter. In particular, the File code path resolves schema-relative locations via a call to schema_salad.ref_resolver.file_uri(); suspiciously, no such call is present in the code path that resolves Directory objects.
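To illustrate the kind of normalization the Directory path seems to be missing, here is a rough sketch using only the standard library. It is not Toil's or schema_salad's implementation: ensure_file_uri and normalize_listing are hypothetical helpers, and urllib.parse.quote stands in for schema_salad.ref_resolver.file_uri(). It assumes absolute POSIX paths and CWL-style dicts with "class", "location" and "listing" keys.

```python
from urllib.parse import quote, urlsplit


def ensure_file_uri(location):
    """Return location as a file:// URI; leave existing URIs untouched."""
    if urlsplit(location).scheme:
        return location  # already has a scheme such as file:// or http://
    if not location.startswith("/"):
        raise ValueError("expected an absolute POSIX path: %r" % location)
    # quote() percent-encodes characters such as spaces but keeps "/" intact.
    return "file://" + quote(location)


def normalize_listing(obj):
    """Recursively add file:// to File and Directory locations in place."""
    if obj.get("class") in ("File", "Directory") and "location" in obj:
        obj["location"] = ensure_file_uri(obj["location"])
    for child in obj.get("listing", []):
        normalize_listing(child)
    return obj


directory = {
    "class": "Directory",
    "location": "/data/sequence_cache",
    "listing": [
        {"class": "File", "location": "/data/sequence_cache/asn_cache.idx"},
        {"class": "File", "location": "file:///data/sequence_cache/other.idx"},
    ],
}

normalize_listing(directory)
for entry in directory["listing"]:
    print(entry["location"])
```

After a pass like this, every File nested inside a Directory would carry the prefix that the uploadFile() check expects.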

To reproduce this failure, you may try to run the NCBI PGAP workflow. Please note that this pipeline is still a work in progress, and has not yet been formally announced or released; the URL is subject to move: https://github.com/ncbi-gpipe/pgap

┆Issue is synchronized with this Jira Story ┆Issue Number: TOIL-276

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Reactions: 1
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

whlavina commented, Jul 26, 2018 (2 reactions)

I’ll try to come up with a minimalist example that reproduces the failure. If there’s any specific debugging/logging you’d like me to add, feel free to ask. Here’s the Python stack trace at the point of failure; if you want full logs, I can attach those too.

WARNING:toil.leader:D/u/job_ugwte    Traceback (most recent call last):
2018-07-25 23:06:29,871 - toil.leader - WARNING - D/u/job_ugwte    Traceback (most recent call last):
WARNING:toil.leader:D/u/job_ugwte      File "/Users/whlavina/pgap/venv2.7-pristine/lib/python2.7/site-packages/toil/worker.py", line 313, in workerScript
2018-07-25 23:06:29,871 - toil.leader - WARNING - D/u/job_ugwte      File "/Users/whlavina/pgap/venv2.7-pristine/lib/python2.7/site-packages/toil/worker.py", line 313, in workerScript
WARNING:toil.leader:D/u/job_ugwte        job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
2018-07-25 23:06:29,871 - toil.leader - WARNING - D/u/job_ugwte        job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
WARNING:toil.leader:D/u/job_ugwte      File "/Users/whlavina/pgap/venv2.7-pristine/lib/python2.7/site-packages/toil/job.py", line 1337, in _runner
2018-07-25 23:06:29,871 - toil.leader - WARNING - D/u/job_ugwte      File "/Users/whlavina/pgap/venv2.7-pristine/lib/python2.7/site-packages/toil/job.py", line 1337, in _runner
WARNING:toil.leader:D/u/job_ugwte        returnValues = self._run(jobGraph, fileStore)
2018-07-25 23:06:29,871 - toil.leader - WARNING - D/u/job_ugwte        returnValues = self._run(jobGraph, fileStore)
WARNING:toil.leader:D/u/job_ugwte      File "/Users/whlavina/pgap/venv2.7-pristine/lib/python2.7/site-packages/toil/job.py", line 1282, in _run
2018-07-25 23:06:29,871 - toil.leader - WARNING - D/u/job_ugwte      File "/Users/whlavina/pgap/venv2.7-pristine/lib/python2.7/site-packages/toil/job.py", line 1282, in _run
WARNING:toil.leader:D/u/job_ugwte        return self.run(fileStore)
2018-07-25 23:06:29,871 - toil.leader - WARNING - D/u/job_ugwte        return self.run(fileStore)
WARNING:toil.leader:D/u/job_ugwte      File "/Users/whlavina/pgap/venv2.7-pristine/lib/python2.7/site-packages/toil/cwl/cwltoil.py", line 480, in run
2018-07-25 23:06:29,871 - toil.leader - WARNING - D/u/job_ugwte      File "/Users/whlavina/pgap/venv2.7-pristine/lib/python2.7/site-packages/toil/cwl/cwltoil.py", line 480, in run
WARNING:toil.leader:D/u/job_ugwte        index, existing))
2018-07-25 23:06:29,871 - toil.leader - WARNING - D/u/job_ugwte        index, existing))
WARNING:toil.leader:D/u/job_ugwte      File "/Users/whlavina/pgap/venv2.7-pristine/lib/python2.7/site-packages/cwltool/pathmapper.py", line 55, in adjustFileObjs
2018-07-25 23:06:29,871 - toil.leader - WARNING - D/u/job_ugwte      File "/Users/whlavina/pgap/venv2.7-pristine/lib/python2.7/site-packages/cwltool/pathmapper.py", line 55, in adjustFileObjs
WARNING:toil.leader:D/u/job_ugwte        visit_class(rec, ("File",), op)
2018-07-25 23:06:29,871 - toil.leader - WARNING - D/u/job_ugwte        visit_class(rec, ("File",), op)
WARNING:toil.leader:D/u/job_ugwte      File "/Users/whlavina/pgap/venv2.7-pristine/lib/python2.7/site-packages/cwltool/pathmapper.py", line 48, in visit_class
2018-07-25 23:06:29,871 - toil.leader - WARNING - D/u/job_ugwte      File "/Users/whlavina/pgap/venv2.7-pristine/lib/python2.7/site-packages/cwltool/pathmapper.py", line 48, in visit_class
WARNING:toil.leader:D/u/job_ugwte        visit_class(rec[d], cls, op)
2018-07-25 23:06:29,872 - toil.leader - WARNING - D/u/job_ugwte        visit_class(rec[d], cls, op)
WARNING:toil.leader:D/u/job_ugwte      File "/Users/whlavina/pgap/venv2.7-pristine/lib/python2.7/site-packages/cwltool/pathmapper.py", line 48, in visit_class
2018-07-25 23:06:29,872 - toil.leader - WARNING - D/u/job_ugwte      File "/Users/whlavina/pgap/venv2.7-pristine/lib/python2.7/site-packages/cwltool/pathmapper.py", line 48, in visit_class
WARNING:toil.leader:D/u/job_ugwte        visit_class(rec[d], cls, op)
2018-07-25 23:06:29,872 - toil.leader - WARNING - D/u/job_ugwte        visit_class(rec[d], cls, op)
WARNING:toil.leader:D/u/job_ugwte      File "/Users/whlavina/pgap/venv2.7-pristine/lib/python2.7/site-packages/cwltool/pathmapper.py", line 51, in visit_class
2018-07-25 23:06:29,872 - toil.leader - WARNING - D/u/job_ugwte      File "/Users/whlavina/pgap/venv2.7-pristine/lib/python2.7/site-packages/cwltool/pathmapper.py", line 51, in visit_class
WARNING:toil.leader:D/u/job_ugwte        visit_class(d, cls, op)
2018-07-25 23:06:29,872 - toil.leader - WARNING - D/u/job_ugwte        visit_class(d, cls, op)
WARNING:toil.leader:D/u/job_ugwte      File "/Users/whlavina/pgap/venv2.7-pristine/lib/python2.7/site-packages/cwltool/pathmapper.py", line 46, in visit_class
2018-07-25 23:06:29,872 - toil.leader - WARNING - D/u/job_ugwte      File "/Users/whlavina/pgap/venv2.7-pristine/lib/python2.7/site-packages/cwltool/pathmapper.py", line 46, in visit_class
WARNING:toil.leader:D/u/job_ugwte        op(rec)
2018-07-25 23:06:29,872 - toil.leader - WARNING - D/u/job_ugwte        op(rec)
WARNING:toil.leader:D/u/job_ugwte      File "/Users/whlavina/pgap/venv2.7-pristine/lib/python2.7/site-packages/toil/cwl/cwltoil.py", line 334, in uploadFile
2018-07-25 23:06:29,872 - toil.leader - WARNING - D/u/job_ugwte      File "/Users/whlavina/pgap/venv2.7-pristine/lib/python2.7/site-packages/toil/cwl/cwltoil.py", line 334, in uploadFile
WARNING:toil.leader:D/u/job_ugwte        raise cwltool.errors.WorkflowException("File is missing: %s" % uf["location"])
2018-07-25 23:06:29,872 - toil.leader - WARNING - D/u/job_ugwte        raise cwltool.errors.WorkflowException("File is missing: %s" % uf["location"])
WARNING:toil.leader:D/u/job_ugwte    WorkflowException: File is missing: /Users/whlavina/pgap/pgap-2018-07-05.build2884/out_tmpdirInEt_Q/sequence_cache/asn_cache.idx
2018-07-25 23:06:29,872 - toil.leader - WARNING - D/u/job_ugwte    WorkflowException: File is missing: /Users/whlavina/pgap/pgap-2018-07-05.build2884/out_tmpdirInEt_Q/sequence_cache/asn_cache.idx
WARNING:toil.leader:D/u/job_ugwte    ERROR:toil.worker:Exiting the worker because of a failed job on host Wratkos-Mac-Pro.home
2018-07-25 23:06:29,872 - toil.leader - WARNING - D/u/job_ugwte    ERROR:toil.worker:Exiting the worker because of a failed job on host Wratkos-Mac-Pro.home

mr-c commented, Oct 5, 2021 (0 reactions)

This bug is no longer present in toil-cwl-runner 5.5.0 (and was probably fixed in an earlier release).
