Incompatibility issues with custom reference genome

MACS2 peak calling fails if the FASTA file used to build a custom genome database does not follow the chr[\dXY] naming convention. For example, I am using the Ensembl masked version of the rat genome (rn6, release 94), which does not prepend ‘chr’ to chromosome names. The pipeline's grep filter then matches no lines in the chrom.sizes file and exits with status 1, producing the following error:

[2018-10-31 18:06:20,887 ERROR] Unknown exception caught. Killing process group 72093...
Traceback (most recent call last):
  File "/users/nicolerg/anaconda2/envs/encode-atac-seq-pipeline/bin/", line 224, in run_shell_cmd
    p.returncode, cmd)
CalledProcessError: Command 'cat /mnt/lab_data/montgomery/nicolerg/motrpac/atac/pipeline-output/cromwell-executions/atac/03a28d2a-f364-4ce1-bfd5-f10488cf42a9/call-macs2/shard-0/inputs/-78707573/rn6_masked.chrom.sizes | grep -P 'chr[\dXY]+[ \t]' > 20180725_2_Gastroc_002_powder_S1_L001_R1_001.trim.merged.nodup.tn5.pval0.01.300K.bfilt.chrsz.tmp' returned non-zero exit status 1
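The failure can be reproduced outside the pipeline. grep exits with status 1 when no input line matches, and the pattern in the command above requires a literal "chr" prefix. The following is a minimal Python sketch of that filter, not the pipeline's own code: the function names and the chromosome sizes are hypothetical, and `add_chr_prefix` is only one possible workaround.

```python
import re

# Pattern copied from the failing grep command in the traceback.
CHR_PATTERN = re.compile(r"chr[\dXY]+[ \t]")


def filter_chrom_sizes(lines):
    """Keep only the lines that the grep pattern would keep."""
    return [line for line in lines if CHR_PATTERN.search(line)]


def add_chr_prefix(lines):
    """Hypothetical workaround: prepend 'chr' to names that lack it."""
    renamed = []
    for line in lines:
        name, rest = line.split("\t", 1)
        if not name.startswith("chr"):
            name = "chr" + name
        renamed.append(name + "\t" + rest)
    return renamed


# Illustrative chrom.sizes rows; the sizes are made up for the example.
ensembl_style = ["1\t1000\n", "2\t900\n", "X\t800\n", "MT\t16\n"]
ucsc_style = add_chr_prefix(ensembl_style)

# Ensembl-style names match nothing, which is what makes grep exit with 1.
print(filter_chrom_sizes(ensembl_style))
# After renaming, the numbered and X rows pass; "chrMT" still does not match.
print(filter_chrom_sizes(ucsc_style))
```

Renaming only the chrom.sizes file would probably not be enough in practice, since the FASTA and any alignments built from it need to use the same chromosome names.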

OS/Platform and dependencies

  • Platform: Ubuntu 16.04.4
  • Cromwell: cromwell-34
  • Conda version: conda 4.5.11

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 8 (3 by maintainers)

Top GitHub Comments

leepc12 commented, Nov 7, 2018

Sure, working on it and will be fixed in the next release.

leepc12 commented, Jan 22, 2019

Sure, I am working on parameterizing "atac.mito_chr_name" : "any_mito_chr_name" in the input JSON. This will be fixed in the next release.

BTW regex-filter-reads is sort of a wrapper variable for --regex-grep-v-ta in bam2ta.
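If the parameter is released as described in the comment above, setting it might look like the sketch below. The key name is taken from the maintainer's comment. The value "MT" (the name Ensembl uses for the mitochondrial chromosome) is an assumption for this genome, and a real input JSON would contain other keys as well.

```python
import json

# Assumption: key name as proposed in the maintainer's comment; the value
# "MT" is only an example for an Ensembl-style genome.
inputs = {"atac.mito_chr_name": "MT"}

# Serialize to the text that would go into the pipeline's input JSON file.
text = json.dumps(inputs, indent=4)
print(text)
```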
