Toil is unable to import HPRC assembly

I can download it just fine from the command line

aws s3 cp s3://human-pangenomics/submissions/fc721e22-1ce2-4d38-bf88-72cf231bcb90--UCSC_2020AUG_assembly/2020AUG26_upload/HG02109/HG02109.mat.fa.gz .

but Toil can’t import it. My script:

from toil.common import Toil
from toil.job import Job
if __name__=="__main__":
    options = Job.Runner.getDefaultOptions("./toilWorkflowRun")
    options.logLevel = "OFF"
    options.clean = "always"

    with Toil(options) as toil:
        toil.importFile("s3://human-pangenomics/submissions/fc721e22-1ce2-4d38-bf88-72cf231bcb90--UCSC_2020AUG_assembly/2020AUG26_upload/HG02109/HG02109.mat.fa.gz")

My command:

python import_test.py

The result:

Traceback (most recent call last):
  File "import_test.py", line 12, in <module>
    toil.importFile("s3://human-pangenomics/submissions/fc721e22-1ce2-4d38-bf88-72cf231bcb90--UCSC_2020AUG_assembly/2020AUG26_upload/HG02109/HG02109.mat.fa.gz")
  File "/home/hickey/dev/work/toil-import/venv/lib/python3.6/site-packages/toil/common.py", line 1028, in importFile
    return self._jobStore.importFile(srcUrl, sharedFileName=sharedFileName)
  File "/home/hickey/dev/work/toil-import/venv/lib/python3.6/site-packages/toil/jobStores/abstractJobStore.py", line 301, in importFile
    return self._importFile(otherCls, srcUrl, sharedFileName=sharedFileName, hardlink=hardlink)
  File "/home/hickey/dev/work/toil-import/venv/lib/python3.6/site-packages/toil/jobStores/fileJobStore.py", line 302, in _importFile
    sharedFileName=sharedFileName)
  File "/home/hickey/dev/work/toil-import/venv/lib/python3.6/site-packages/toil/jobStores/abstractJobStore.py", line 322, in _importFile
    size = otherCls._readFromUrl(url, writable)
  File "/home/hickey/dev/work/toil-import/venv/lib/python3.6/site-packages/toil/jobStores/aws/jobStore.py", line 444, in _readFromUrl
    srcKey.get_contents_to_file(writable)
  File "/home/hickey/dev/work/toil-import/venv/lib/python3.6/site-packages/boto/s3/key.py", line 1662, in get_contents_to_file
    response_headers=response_headers)
  File "/home/hickey/dev/work/toil-import/venv/lib/python3.6/site-packages/boto/s3/key.py", line 1494, in get_file
    query_args=None)
  File "/home/hickey/dev/work/toil-import/venv/lib/python3.6/site-packages/boto/s3/key.py", line 1526, in _get_file_internal
    override_num_retries=override_num_retries)
  File "/home/hickey/dev/work/toil-import/venv/lib/python3.6/site-packages/boto/s3/key.py", line 355, in open
    override_num_retries=override_num_retries)
  File "/home/hickey/dev/work/toil-import/venv/lib/python3.6/site-packages/boto/s3/key.py", line 315, in open_read
    self.resp.reason, body)
boto.exception.S3ResponseError: S3ResponseError: 403 Forbidden
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>822E12E6DAE082D9</RequestId><HostId>e0oz7J3ehvimfEgUiYZshZmze8xBNyz+QacFq/LJ1jG9hsZWsR3WtOko6rMTGb8AoPqmLLaSMtk=</HostId></Error>

┆Issue is synchronized with this [Jira Task](https://ucsc-cgl.atlassian.net/browse/TOIL-743)
┆Issue Number: TOIL-743

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 8 (8 by maintainers)

Top GitHub Comments

1 reaction
DailyDreaming commented, Dec 16, 2020

This and #3367 should both be fixed by the move from boto to boto3 in the AWS jobstore. Will be addressed next sprint.
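
One way to tell whether the 403 comes from the credentials or from Toil's boto-based code path is to request the same object directly with boto3, outside Toil. The following is a minimal sketch, not part of Toil: `split_s3_url` and `can_read_with_boto3` are hypothetical helper names, and it assumes boto3 is installed and picks up the same credentials the `aws` CLI uses.

```python
from urllib.parse import urlparse


def split_s3_url(url):
    """Split 's3://bucket/some/key' into (bucket, key). Hypothetical helper."""
    parsed = urlparse(url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URL: {url!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def can_read_with_boto3(url):
    """Return True if boto3's default credential chain can HEAD the object."""
    # Imported lazily so split_s3_url works even without boto3 installed.
    import boto3
    from botocore.exceptions import ClientError

    bucket, key = split_s3_url(url)
    try:
        # head_object fetches only metadata, so nothing is downloaded.
        boto3.client("s3").head_object(Bucket=bucket, Key=key)
        return True
    except ClientError as err:
        # ClientError covers HTTP error responses from S3 (e.g. 403, 404).
        print(f"S3 returned error code {err.response['Error']['Code']}")
        return False
```

Calling `can_read_with_boto3` with the `s3://human-pangenomics/...` URL from the script above would then show whether boto3 alone can reach the object. If it returns `True` while `toil.importFile` still fails, that would point at the import path rather than at the credentials.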

0 reactions
adamnovak commented, Jan 19, 2022

@glennhickey It sounds like there’s no actual issue here anymore. I just tried running import_file.py, and it prompted me for MFA and then finished without complaining. You also say import in 5.3+ isn’t actually notably slower at it compared to 4.2. So I’m going to close this out.
