Low sum in ragoo.fasta

Hello,

I am attempting to run Ragoo using a long-read assembly as the ‘reference’. After running Ragoo with the following command:

ragoo.py -t 4 -b -C ${assembly} ${ref}

My output ragoo.fasta file seems to be missing a lot of bases. The original assembly is ~2.7Gb, but the output fasta file has ~736 Mb only.

Any idea about what is happening to the outstanding sequences, or is this expected behaviour? The chimera.broken.fa file is the correct size, so it seems that things are being lost after that stage somewhere.

Thanks! Lauren
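
To make the size comparison in the question reproducible, a minimal sketch such as the following could report the number of records and the total number of bases in a FASTA file. The function name `fasta_stats` is hypothetical and is not part of RaGOO; the sketch assumes plain, uncompressed FASTA input.

```python
def fasta_stats(path):
    """Return (number of records, total bases) for an uncompressed FASTA file."""
    n_records = 0
    total_bases = 0
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                n_records += 1  # header line starts a new record
            else:
                total_bases += len(line)  # sequence line
    return n_records, total_bases
```

Running it on the input assembly, on chimera.broken.fa, and on ragoo.fasta (with whatever paths your run produced) shows at which stage the sequence total drops. Note that ragoo.fasta may also contain padding characters between ordered contigs, so its total is not expected to match the input exactly.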

Issue Analytics

  • State: open
  • Created 4 years ago
  • Comments: 16 (9 by maintainers)

Top GitHub Comments

1 reaction
malonge commented, Mar 2, 2020

Hi there,

After testing the code with your data, I believe I understand the problem.

When -C is invoked, a separate file is written to the intermediate output directory for each unplaced contig. Since your assembly had roughly 1M unplaced contigs, I assume this became a problem for your file system, thus leading to the truncated ragoo.fasta file.

Indeed, if one does not use -C, ragoo.fasta contains the expected amount of sequence.
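
One way to check whether a run hit this situation is to count the regular files in the intermediate output directory and compare that number with the number of unplaced contigs. The sketch below is only an illustration: `count_files` is a hypothetical helper, and the location of the intermediate directory is not stated in this thread, so the path must be supplied by the user.

```python
import os


def count_files(directory):
    """Count regular files directly inside `directory` (not recursive)."""
    with os.scandir(directory) as entries:
        # scandir avoids building a full list, which matters with ~1M entries
        return sum(1 for entry in entries if entry.is_file())
```

A count in the hundreds of thousands or more would be consistent with the explanation above, since many file systems slow down or run out of inodes with that many files in a single directory.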

In future versions of RaGOO, the intermediate output will be restricted to exactly 2 files regardless of the -C option. I believe that should solve the “low sum” problem.

Additionally, it is true that RaGOO was not designed for more fragmented assemblies of larger genomes. To address this, future versions of RaGOO will allow the user to lower the minimum alignment length, thus allowing more contigs to be placed.

I will test out your data again when these features are implemented.

Thanks

0 reactions
lcoombe commented, Sep 23, 2019

My ‘reference’ is another human assembly produced with a different assembler. The original ragoo.fasta file has ~727 Mbp in it, vs ~2.4 Gbp in the file produced by manually concatenating the sequences.
