[BUG] Unable to reproduce results in running.md


Current Behavior
I downloaded the sequencing data for SARS-CoV-2 from GISAID and I’m trying to use the metadata from this repo to reproduce the results (I was able to reproduce the results of the Zika tutorial). I made a directory called data with the fasta file and the metadata file and tried running

snakemake -p -s Snakefile --cores 2 auspice/ncov.json

but I’m getting the error

Error: Snakefile "Snakefile" not found.
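The `-s Snakefile` argument is a path resolved relative to the directory where snakemake is invoked, so this error usually means the command was run somewhere other than the root of a checkout of the nextstrain/ncov repository. A minimal sketch of a pre-flight check follows; the helper name `has_snakefile` is hypothetical and not part of Nextstrain or Snakemake.

```shell
# Hypothetical helper (not part of Nextstrain or Snakemake): report whether
# a file named "Snakefile" exists in the given directory. Defaults to the
# current directory, which is where "snakemake -s Snakefile" would look.
has_snakefile() {
  dir="${1:-.}"
  if [ -f "$dir/Snakefile" ]; then
    echo "found: $dir/Snakefile"
  else
    echo "missing: $dir/Snakefile"
    return 1
  fi
}
```

Running `has_snakefile` in the directory that holds `data/` shows whether the original command can find its workflow file. If it reports `missing`, the likely fix is to place `data/` inside a clone of the ncov repository and run snakemake from there.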

Expected behavior
Reproduce the results described here https://github.com/nextstrain/ncov/blob/master/docs/running.md

How to reproduce
Steps to reproduce the current behavior:

  1. Download fasta sequences from GISAID
  2. Run command from above

Possible solution
(optional)

Your environment: if browsing Nextstrain online

  • Operating system: MacOS
  • Browser: Chrome

Your environment: if running Nextstrain locally

  • Operating system:
  • Browser:
  • Version (e.g. auspice 2.7.0):
➜  nextstrain_local_ncov nextstrain check-setup                
nextstrain-cli is up to date!

Testing your setup…

# docker is supported
✔ yes: docker is installed
✔ yes: docker run works
⚑ warning: containers have access to >2 GiB of memory

  Containers appear to be limited to 2.0 GiB of memory. This
  may not be enough for some Nextstrain builds.  On Windows or
  a Mac, you can increase the memory available to containers
  in the Docker preferences.                        
✔ yes: image is new enough for this CLI version

# native is not supported
✔ yes: snakemake is installed
✔ yes: augur is installed
✘ no: auspice is installed

# aws-batch is not supported
✘ no: job description "nextstrain-job" exists
✘ no: job queue "nextstrain-job-queue" exists
✘ no: S3 bucket "nextstrain-jobs" exists

Supported Nextstrain environments: docker


Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 15 (10 by maintainers)

Top GitHub Comments

1 reaction
brianpardy commented, Apr 12, 2020

Is mafft installed/accessible by the user running snakemake? I had a few problems with old versions, but it is working for me now with:

% mafft --version
v7.453 (2019/Nov/8)

I don’t recall that truncated-looking error message, though. You may also need to confirm the version of nextstrain/augur you are running, just in case.

% augur --version
augur 7.0.2

It doesn’t seem like error messages are making it to your screen so you may also have to dig through the logs generated in ./.snakemake/log/ to see the root causes of some of these.
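As a rough sketch of that log check, the function below prints the most recently modified file in the log directory. The helper name `latest_snakemake_log` is hypothetical, and the default path assumes the command is run from the directory where snakemake was invoked.

```shell
# Hypothetical helper: print the path of the most recently modified file
# in a Snakemake log directory (default: .snakemake/log).
latest_snakemake_log() {
  log_dir="${1:-.snakemake/log}"
  # "ls -t" sorts newest first; "head -n 1" keeps only the newest entry.
  newest=$(ls -t "$log_dir" 2>/dev/null | head -n 1)
  # Print nothing (and return non-zero) if the directory is empty or missing.
  [ -n "$newest" ] && echo "$log_dir/$newest"
}
```

For example, `grep -n -i "error" "$(latest_snakemake_log)"` lists the lines mentioning an error in the latest run's log, which may surface the message that never reached the screen.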

1 reaction
brianpardy commented, Apr 12, 2020

Hi @cornhundred, it’s not clear from your last post what errors occurred, as none seem to appear in the log. But for your prior post with the indication of a duplicate key “hCoV-19/Hong”, it looks like you are using the GISAID sequences download file and have not normalized the sequence names from the format provided by GISAID to the format expected by the nextstrain/ncov pipeline. Specifically, the GISAID strain names contain embedded spaces, and those spaces are stripped out of the strain names in metadata.tsv.

If you run scripts/normalize_gisaid_fasta.sh path-to-GISAID-download-file data/sequences.fasta then the normalize_gisaid_fasta.sh script should make all the necessary adjustments to strain names in the GISAID fasta file and place the results in data/sequences.fasta.
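To see whether a FASTA file still needs normalizing, a quick check along these lines may help. Both function names are hypothetical. The second function reflects one plausible reading of the duplicate-key error, namely that a name cut at its first space collides with another name cut the same way. That reading is an assumption and is not confirmed in this thread.

```shell
# Hypothetical helper: list FASTA header lines that contain a space.
fasta_names_with_spaces() {
  # Header lines start with ">"; keep only those with an embedded space.
  grep '^>' "$1" | grep ' '
}

# Hypothetical helper: list names that become duplicates when each header
# is cut at its first space (assumed cause of the duplicate-key error).
truncated_duplicate_names() {
  grep '^>' "$1" | sed 's/^>//' | cut -d' ' -f1 | sort | uniq -d
}
```

If `fasta_names_with_spaces data/sequences.fasta` prints anything, the file has probably not been through the normalization script yet.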

You may need to run snakemake clean to clean up your working directory from these errored-out attempts.

Regarding the ‘download’ Snakefile rule, that rule is only run if no sequences.fasta file exists, so once you have produced the sequences.fasta file, either by hand or with the normalize_gisaid_fasta.sh script, the download rule will not be run.
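A small sketch of that existence check is shown below. The helper name `sequences_present` is hypothetical, and the default path `data/sequences.fasta` is taken from the command above.

```shell
# Hypothetical helper: report whether the sequences file already exists.
# Per the comment above, the download rule is only run when it is absent.
sequences_present() {
  fasta="${1:-data/sequences.fasta}"
  if [ -f "$fasta" ]; then
    echo "present: $fasta"
  else
    echo "absent: $fasta"
    return 1
  fi
}
```

Snakemake's dry-run flag gives a fuller preview: `snakemake -n auspice/ncov.json` lists the rules that would run without executing them.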
