Bug in --info-file with --revcomp

$ cutadapt 2>&1 | head -n1
This is cutadapt 3.3 with Python 3.8.5

There seems to be a problem with --info-file when the adapter is found in the reverse complement of a read:

$ cat test.fna 
>test
TATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCG
>test_rv
CGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACATGTTCTTTCCTGCGTTATCCCCTGATTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTGATA
$
$ cutadapt --report=minimal --info-file test.info --revcomp -g TATCAGCTCACT test.fna -o /dev/null
[8<----------] 00:00:00             2 reads  @    237.0 µs/read;   0.25 M reads/minute
status	in_reads	in_bp	too_short	too_long	too_many_n	out_reads	w/adapters	qualtrim_bp	out_bp
OK	2	200	0	0	0	2	2	0	176
$
$ cat test.info 
test		0	0	12		TATCAGCTCACT	CAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCG	1			
test_rv rc	0	0	12		CGGTTCCTGGCC	TTTTGCTGGCCTTTTGCTCACATGTTCTTTCCTGCGTTATCCCCTGATTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTGATA	1	

The two reads above are identical except for their orientation. cutadapt correctly identifies the adapter in both, but in the info file the sequences in columns 5-7 are incorrect for the read that matches in the reverse complement: they contain substrings extracted from the wrong strand. The coordinates in columns 3-4 are usable in principle, because a parser can interpret them as referring to the opposite strand based on the " rc" flag. That flag can produce false positives, though, if an input read already has the string "rc" at the end of its header for some other reason, e.g. >my-read read with rc. Columns 5-7 are plainly unreliable.
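To make the mismatch concrete, here is a minimal Python sketch that checks which strand columns 5-7 were cut from. It is not part of cutadapt: the helper names (reverse_complement, split_at, flagged_rc, strand_of_reported_columns) are made up for illustration, the column positions are assumed from the description above (columns 3-4 are coordinates, columns 5-7 are sequences), and it only handles info-file lines that report a match.

```python
# Sketch only: helper names are hypothetical; column positions follow the
# description above (columns 3-4 = coordinates, columns 5-7 = sequences).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a plain A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def split_at(seq, start, end):
    """Split seq into (left of match, match, right of match)."""
    return seq[:start], seq[start:end], seq[end:]


def flagged_rc(name):
    """True if the read name ends in " rc" (also true for unlucky headers)."""
    return name.endswith(" rc")


def strand_of_reported_columns(info_line, read_seq):
    """Say which strand of read_seq columns 5-7 of an info-file line were cut from."""
    fields = info_line.rstrip("\n").split("\t")
    start, end = int(fields[2]), int(fields[3])
    reported = tuple(fields[4:7])
    if reported == split_at(reverse_complement(read_seq), start, end):
        return "reverse complement"
    if reported == split_at(read_seq, start, end):
        return "input strand"
    return "neither"


# test_rv from test.fna and its line from test.info, rebuilt with explicit tabs.
TEST_RV = "CGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACATGTTCTTTCCTGCGTTATCCCCTGATTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTGATA"
INFO_LINE = "\t".join([
    "test_rv rc", "0", "0", "12", "",
    "CGGTTCCTGGCC",
    "TTTTGCTGGCCTTTTGCTCACATGTTCTTTCCTGCGTTATCCCCTGATTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTGATA",
    "1",
])

print(strand_of_reported_columns(INFO_LINE, TEST_RV))
print(flagged_rc("my-read read with rc"))
```

For the test_rv line this should report "input strand", i.e. the coordinates refer to the reverse-complemented read while the sequences were sliced from the read as given in the input. The second call shows the false positive mentioned above: a header that merely ends in " rc" is indistinguishable from a flagged read.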

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

alephreish commented, Mar 16, 2021 (1 reaction)

I’m personally totally happy. My idea is actually to locate spliced leader RNA, which is in a way merely a (potentially truncated) adapter added at the 5’ end of transcripts, e.g. in dinoflagellates.

alephreish commented, Mar 16, 2021 (1 reaction)

@marcelm Thanks a lot!


Top Results From Across the Web

Changes — Cutadapt 3.4 documentation
#438: The info file now contains the " rc" suffix that is added to the names of reverse-complemented reads (with --revcomp).