[BUG] Unable to reproduce results in running.md
Current Behavior
I downloaded the sequencing data for SARS-CoV-2 from GISAID and I’m trying to use the metadata from this repo to reproduce the results (I was able to reproduce the results of the Zika tutorial). I made a directory called data with the fasta file and the metadata file and tried running
snakemake -p -s Snakefile --cores 2 auspice/ncov.json
but I’m getting the error
Error: Snakefile "Snakefile" not found.
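The error says that snakemake could not find a file named Snakefile at the path given to -s, which is resolved relative to the directory the command is run from. Below is a minimal sketch of a pre-flight check, assuming the command is meant to be run from the top level of a checkout of the nextstrain/ncov repository. The helper name missing_inputs is hypothetical and is not part of snakemake or Nextstrain.

```python
from pathlib import Path


def missing_inputs(workdir, required=("Snakefile",)):
    """Return the entries of `required` that do not exist under `workdir`.

    Hypothetical helper for illustration; the default list only covers the
    Snakefile named in the error message.
    """
    base = Path(workdir)
    return [name for name in required if not (base / name).exists()]


if __name__ == "__main__":
    # "." stands for the directory snakemake would be launched from.
    missing = missing_inputs(".")
    if missing:
        print("Missing from the current directory:", ", ".join(missing))
    else:
        print("Snakefile found in the current directory.")
```

If the Snakefile is reported missing, the working directory is probably not the repository checkout, or the repository has not been cloned yet.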
Expected behavior
Reproduce the results described here https://github.com/nextstrain/ncov/blob/master/docs/running.md
How to reproduce
Steps to reproduce the current behavior:
- Download fasta sequences from GISAID
- Run command from above
Possible solution
(optional)
Your environment: if browsing Nextstrain online
- Operating system: MacOS
- Browser: Chrome
Your environment: if running Nextstrain locally
- Operating system:
- Browser:
- Version (e.g. auspice 2.7.0):
➜ nextstrain_local_ncov nextstrain check-setup
nextstrain-cli is up to date!
Testing your setup…
# docker is supported
✔ yes: docker is installed
✔ yes: docker run works
⚑ warning: containers have access to >2 GiB of memory
Containers appear to be limited to 2.0 GiB of memory. This
may not be enough for some Nextstrain builds. On Windows or
a Mac, you can increase the memory available to containers
in the Docker preferences.
✔ yes: image is new enough for this CLI version
# native is not supported
✔ yes: snakemake is installed
✔ yes: augur is installed
✘ no: auspice is installed
# aws-batch is not supported
✘ no: job description "nextstrain-job" exists
✘ no: job queue "nextstrain-job-queue" exists
✘ no: S3 bucket "nextstrain-jobs" exists
Supported Nextstrain environments: docker
Additional context
Add any other context about the problem here.
Issue Analytics
- State:
- Created 3 years ago
- Comments:15 (10 by maintainers)
Is mafft installed/accessible by the user running snakemake? I had a few problems with old versions, but it is working for me now with:
I don’t recall that truncated-looking error message, though. You may also need to confirm the version of nextstrain/augur you are running, just in case.
It doesn’t seem like error messages are making it to your screen, so you may also have to dig through the logs generated in
./.snakemake/log/
to see the root causes of some of these.
Hi @cornhundred, it’s not clear from your last post what errors occurred, as none seem to appear in the log. But for your prior post with the indication of a duplicate key “hCoV-19/Hong”, it looks like you are using the GISAID sequences download file and have not normalized the sequence names from the format provided by GISAID to the format expected by the nextstrain/ncov pipeline. Specifically, what you are hitting there are embedded spaces in the strain names from GISAID that are stripped out in the metadata.tsv strain names.
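To illustrate how embedded spaces can produce a duplicate key such as “hCoV-19/Hong”, here is a minimal sketch. It assumes the FASTA reader takes everything up to the first whitespace in a header as the sequence ID, which is a common parser convention. The strain names in the example are made up, and the function names are hypothetical. This is not a reimplementation of normalize_gisaid_fasta.sh, which may make further adjustments.

```python
from collections import Counter


def fasta_ids(lines):
    """Yield the ID a whitespace-splitting parser would see for each header."""
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            yield fields[0] if fields else ""


def colliding_ids(lines):
    """Return IDs that occur more than once after truncation at whitespace."""
    counts = Counter(fasta_ids(lines))
    return sorted(name for name, n in counts.items() if n > 1)


def strip_spaces(header):
    """Remove embedded spaces from a header line (illustrative only)."""
    return header.replace(" ", "")


# Made-up headers in a GISAID-like style, with a space inside the name.
example = [
    ">hCoV-19/Hong Kong/example-1/2020",
    "ACGT",
    ">hCoV-19/Hong Kong/example-2/2020",
    "ACGT",
]
print(colliding_ids(example))  # ['hCoV-19/Hong']

# After removing the spaces, the two names no longer collide.
normalized = [strip_spaces(l) if l.startswith(">") else l for l in example]
print(colliding_ids(normalized))  # []
```

Running colliding_ids over the header lines of a downloaded file is a quick way to see whether names need normalizing before the build is started.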
If you run
scripts/normalize_gisaid_fasta.sh path-to-GISAID-download-file data/sequences.fasta
then the normalize_gisaid_fasta.sh script should make all the necessary adjustments to strain names in the GISAID fasta file and place the results in data/sequences.fasta.
You may need to run
snakemake clean
to clean up your working directory from these errored-out attempts.
Regarding the ‘download’ Snakefile rule, that rule is only run if no sequences.fasta file exists, so once you have produced the sequences.fasta file, either by hand or with the normalize_gisaid_fasta.sh script, the download rule will not be run.