Chimeric contigs still present?

Following on from another issue (Issue #29), I thought I would add this here.

I’m finding that RagTag is not able to resolve chimeric contigs even after running the ragtag correct step. The attached image shows different versions of one of the genome’s chromosomes AFTER scaffolding.

The pertinent rows are the 3rd, 4th and 5th. The 3rd row is the RaGOO-only process, the 4th is RagTag only, and the 5th is RagTag followed by RaGOO.

As you can see, RaGOO was able to resolve the chromosome somewhat while RagTag could not. Admittedly, this particular chromosome is weirdly tricky for unknown reasons, but in other chromosomes I found that the combination of both gave the best consensus scaffolding. The downside is that using RaGOO lost some of what I’m assuming are telomeric regions, which RagTag often seems to retain.

Commands used:

RaGOO:
  ragoo.py isolate_masked.fasta reference.fasta.gz -t 24 -b

RagTag:
  ragtag correct reference.fasta isolate_masked.fasta -t 24
  ragtag scaffold reference.fasta isolated_masked.corrected.fasta -t 24

Combined: same commands as above, but RagTag was run first and RaGOO was then run on the output of RagTag.

[image]

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 19 (8 by maintainers)

Top GitHub Comments

3 reactions
damioresegun commented, Jan 6, 2021

Very late update on my end here. Apologies. Thanks to @malonge, we more or less figured out what happened. It seems to be an issue with minimap2: for some reason there were repetitive alignments between the ‘correct’ and ‘scaffold’ functions, so sequences were retained and placed in the wrong scaffolds. @malonge suggested switching to the nucmer aligner, which ended up working great. No issues downstream, at least none that I can see. The nucmer parameters I used were “-maxmatch -l 100 -c 500”.

So all sorted on this end. Closing this issue now.
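
For anyone retracing the fix above, here is a rough sketch of what rerunning both steps with nucmer might look like. It only prints the two commands so they can be checked before anything runs. Several details are assumptions rather than something stated in the thread: the executable name (`ragtag.py` here, while the thread writes `ragtag`), the `--aligner`, `--nucmer-params` and `-o` options as I understand RagTag's command line, the two output directory names, and the name of the corrected FASTA that `correct` writes, which may differ between RagTag versions. The nucmer parameters are the ones quoted in the comment, written with a double dash on `--maxmatch`. The `-t 24` option from the original commands is left out because, as far as I know, it controls minimap2 threads.

```shell
# Print (do not run) RagTag correct + scaffold commands that use nucmer.
# RAGTAG and CORRECTED are hypothetical override variables for this sketch.
ragtag_nucmer_cmds() {
  local ref="$1" contigs="$2"
  # Executable name is an assumption; override with RAGTAG=ragtag if needed.
  local ragtag="${RAGTAG:-ragtag.py}"
  # nucmer parameters reported as working in the thread.
  local params='--maxmatch -l 100 -c 500'
  # Assumed name of the corrected assembly; check your RagTag version's output.
  local corrected="${CORRECTED:-ragtag_correct_nucmer/ragtag.correct.fasta}"

  printf '%s correct %s %s --aligner nucmer --nucmer-params "%s" -o ragtag_correct_nucmer\n' \
    "$ragtag" "$ref" "$contigs" "$params"
  printf '%s scaffold %s %s --aligner nucmer --nucmer-params "%s" -o ragtag_scaffold_nucmer\n' \
    "$ragtag" "$ref" "$corrected" "$params"
}
```

Calling `ragtag_nucmer_cmds reference.fasta isolate_masked.fasta` prints the two commands; once they look right for your installation, they can be pasted into a terminal or piped to `bash`. File names containing spaces would need extra quoting.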

0 reactions
damioresegun commented, Oct 22, 2020

Update: I’ve rerun the same dataset I sent you using different tools and parameters. See the screenshot below of one of my chromosomes. This chromosome has proven particularly problematic, so I think it’s an apt example. To explain the rows:

Row 1: Reference chromosome 2
Row 2: Another reference chromosome 2
Row 3: Prior RagTag-generated chromosome 2, generated July 2020. Default parameters for the correct and scaffold functions
Row 4: Prior idea of carrying out RaGOO then RagTag, generated July 2020
Row 5: Newly generated RaGOO-only chromosome 2
Row 6: Idea of using the output of the chimera_break step for my manual scaffolding rather than the RaGOO scaffolded output
Row 7: Ragout, another similar tool that takes SIGNIFICANTLY longer. Default parameters and concatenated unplaced sequences
Row 8: Ragout, default parameters without unplaced sequences
Row 9: Newly generated RagTag, inferred gap size estimation off
Row 10: Newly generated RagTag, inferred gap size estimation on
Row 11: New idea of running RaGOO first and using the output of chimera_break as RagTag input, as a way of retaining the smaller sequences which RaGOO would lose

[image]

I also tried changing the aligner and the minimap2 parameters, and in all cases the outputs were worse than with the defaults, in many cases adding >500kb of Ns! So I decided not to carry them forward. Do you have any ideas on that front?

Generally, across these chromosomes, RagTag alone is just not working. The combination of RaGOO’s chimeric break followed by RagTag gets the best outcomes, though I think Ragout is slightly better at retaining the smaller sequences. The speed of RaGOO and RagTag completely blows it out of the water, though.

In my particular use case, it just seems like the chimeric detection/breaking within RagTag isn’t working as expected, but RaGOO’s is.

Did you manage to find anything on your end?
