building hg38 genome

Hi,

I previously had no issues building the hg38 genome with the provided builder.

Now I am running into an issue where the build abruptly stops at the same point. Below is the tail end of the output. It stops with the message tar: unrecognized option '--sort=name', which I don't know how to interpret. I am running the builder with encode-atac-seq-pipeline activated.

Any insight would be greatly appreciated. Thanks!

Sorting block of length 248290277 for bucket 13
(Using difference cover)
Sorting block time: 00:01:17
Returning block of 268462555 for bucket 12
Getting block 14 of 15
Reserving size (275144667) for bucket 14
Calculating Z arrays for bucket 14
Entering block accumulator loop for bucket 14:
bucket 14: 10%
bucket 14: 20%
bucket 14: 30%
bucket 14: 40%
bucket 14: 50%
bucket 14: 60%
bucket 14: 70%
bucket 14: 80%
Sorting block time: 00:01:12
Returning block of 248290278 for bucket 13
bucket 14: 90%
Getting block 15 of 15
Reserving size (275144667) for bucket 15
Calculating Z arrays for bucket 15
Entering block accumulator loop for bucket 15:
bucket 14: 100%
Sorting block of length 252608371 for bucket 14
(Using difference cover)
bucket 15: 10%
bucket 15: 20%
bucket 15: 30%
bucket 15: 40%
bucket 15: 50%
bucket 15: 60%
bucket 15: 70%
bucket 15: 80%
bucket 15: 90%
bucket 15: 100%
Sorting block of length 76591070 for bucket 15
(Using difference cover)
Sorting block time: 00:00:22
Returning block of 76591071 for bucket 15
Sorting block time: 00:01:13
Returning block of 252608372 for bucket 14
Exited Ebwt loop
fchr[A]: 0
fchr[C]: 866420001
fchr[G]: 1465103434
fchr[T]: 2065958374
fchr[$]: 2934876451
Exiting Ebwt::buildToDisk()
Returning from initFromVector
Wrote 982525150 bytes to primary EBWT file: GRCh38_no_alt_analysis_set_GCA_000001405.15.fasta.rev.1.bt2
Wrote 733719120 bytes to secondary EBWT file: GRCh38_no_alt_analysis_set_GCA_000001405.15.fasta.rev.2.bt2
Re-opening _in1 and _in2 as input streams
Returning from Ebwt constructor
Headers:
len: 2934876451
bwtLen: 2934876452
sz: 733719113
bwtSz: 733719113
lineRate: 6
offRate: 4
offMask: 0xfffffff0
ftabChars: 10
eftabLen: 20
eftabSz: 80
ftabLen: 1048577
ftabSz: 4194308
offsLen: 183429779
offsSz: 733719116
lineSz: 64
sideSz: 64
sideBwtSz: 48
sideBwtLen: 192
numSides: 15285815
numLines: 15285815
ebwtTotLen: 978292160
ebwtTotSz: 978292160
color: 0
reverse: 1
Total time for backward call to driver() for mirror index: 00:19:08
tar: unrecognized option '--sort=name'
Try 'tar --help' or 'tar --usage' for more information.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 10 (5 by maintainers)

Top GitHub Comments

1 reaction
leepc12 commented, Nov 25, 2019

I will add both GNU tar and grep to the Conda environment.
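
The error message and the fix above suggest that the tar found on the PATH does not implement --sort (for example a non-GNU tar or an older GNU tar). As a rough way to check a given machine, here is a minimal Python sketch. It assumes that --sort was introduced in GNU tar 1.28, which is my recollection of the GNU tar release notes and not something stated in this thread. The helper names supports_sort_option and local_tar_supports_sort are hypothetical and are not part of the pipeline.

```python
import re
import subprocess

# Assumption: GNU tar gained --sort=ORDER in release 1.28.
MIN_GNU_TAR = (1, 28)


def supports_sort_option(version_text):
    """Return True if `tar --version` output looks like GNU tar >= 1.28."""
    lines = version_text.splitlines()
    first_line = lines[0] if lines else ""
    # GNU tar prints a first line such as "tar (GNU tar) 1.30".
    match = re.search(r"\(GNU tar\)\s+(\d+)\.(\d+)", first_line)
    if match is None:
        # Not recognisably GNU tar (e.g. bsdtar), so assume no --sort support.
        return False
    version = (int(match.group(1)), int(match.group(2)))
    return version >= MIN_GNU_TAR


def local_tar_supports_sort(tar_path="tar"):
    """Run `tar --version` for the given executable and apply the check above."""
    try:
        result = subprocess.run(
            [tar_path, "--version"],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        # tar is missing or could not be executed.
        return False
    return supports_sort_option(result.stdout)
```

Calling local_tar_supports_sort() inside the activated environment before starting the builder should return False when the tar on the PATH would reject --sort=name, which saves waiting for the index build to finish before the failure appears.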

0 reactions
leepc12 commented, Feb 20, 2020

Fixed.

