Sequence dictionaries differ
I’m running:
java -Xmx4g -jar picard.jar CollectRnaSeqMetrics \
REFERENCE_SEQUENCE=hg19.genome.fa \
REF_FLAT=gencode.v19.annotation.refFlat \
RIBOSOMAL_INTERVALS=gencode.v19.rRNA.interval_list \
STRAND_SPECIFICITY=NONE \
INPUT=file.bam \
ASSUME_SORTED=true \
OUTPUT=file.rnaseq_metrics \
CHART_OUTPUT=file.rnaseq.pdf
I get this error:
Exception in thread "main" picard.PicardException: Sequence dictionaries differ in file.bam and gencode.v19.rRNA.interval_list
However, I copied the sequence dictionary from the BAM file. So, the sequence dictionaries are identical.
Here’s the sequence dictionary along with the first 5 lines of the ribosomal intervals file:
@HD VN:1.4 SO:coordinate
@SQ SN:chrM LN:16571
@SQ SN:chr1 LN:249250621
@SQ SN:chr2 LN:243199373
@SQ SN:chr3 LN:198022430
@SQ SN:chr4 LN:191154276
@SQ SN:chr5 LN:180915260
@SQ SN:chr6 LN:171115067
@SQ SN:chr7 LN:159138663
@SQ SN:chr8 LN:146364022
@SQ SN:chr9 LN:141213431
@SQ SN:chr10 LN:135534747
@SQ SN:chr11 LN:135006516
@SQ SN:chr12 LN:133851895
@SQ SN:chr13 LN:115169878
@SQ SN:chr14 LN:107349540
@SQ SN:chr15 LN:102531392
@SQ SN:chr16 LN:90354753
@SQ SN:chr17 LN:81195210
@SQ SN:chr18 LN:78077248
@SQ SN:chr19 LN:59128983
@SQ SN:chr20 LN:63025520
@SQ SN:chr21 LN:48129895
@SQ SN:chr22 LN:51304566
@SQ SN:chrX LN:155270560
@SQ SN:chrY LN:59373566
chr1 9497728 9497837 - ENST00000517147.1
chr1 13949679 13949779 - ENST00000411020.1
chr1 34578550 34578664 + ENST00000364278.1
chr1 37730278 37730387 - ENST00000516559.1
chr1 39619836 39619968 - ENST00000410446.1
Issue Analytics
- State:
- Created 9 years ago
- Comments:11 (8 by maintainers)
Top GitHub Comments
The problem: my sequence dictionary lines are space-delimited, and they must be tab-delimited, i.e. separated by '\t'.

I apologize for the confusion, and for wasting your time. I recommend that you document the format of the interval list file, or refer the user to an example of a properly-formatted file. It would be useful to point them to a ready-to-use file like this one.

Strangely, the current documentation for CollectRnaSeqMetrics refers the user to the IntervalList javadoc. The comment at the top of that page does not mention that the SAM-style header must be tab-delimited. In my opinion, developers, not users, should be referred to javadoc documentation.

Wrong: each item is separated by a space (' ').
Right: each item is separated by a tab ('\t').

https://github.com/broadinstitute/picard/issues/126#issuecomment-66618588
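For anyone who hits the same error, here is a minimal Python sketch of how the delimiter could be checked and repaired. It is not part of Picard, and the function names (find_untabbed_lines, retab_line, retab_interval_list) are made up for illustration. It assumes header field values contain no spaces (except the free text of @CO lines) and that interval rows have the five columns shown in the listing above: sequence, start, end, strand, name.

```python
def find_untabbed_lines(lines):
    """Return 1-based numbers of non-empty lines that contain no tab at all."""
    return [
        number
        for number, line in enumerate(lines, start=1)
        if line.strip() and "\t" not in line
    ]


def retab_line(line):
    """Rewrite one whitespace-delimited interval list line with tab separators."""
    text = line.strip()
    if not text:
        return text
    if text.startswith("@CO"):
        # Comment lines: split off the tag only, keep the free text intact.
        parts = text.split(None, 1)
    elif text.startswith("@"):
        # Header lines such as @HD and @SQ.
        # Assumption: no field value contains whitespace.
        parts = text.split()
    else:
        # Interval rows: sequence, start, end, strand, name.
        # Splitting at most 4 times leaves any spaces inside the name alone.
        parts = text.split(None, 4)
    return "\t".join(parts)


def retab_interval_list(lines):
    """Apply retab_line to every line and return the new list of lines."""
    return [retab_line(line) for line in lines]


# Three lines copied from the space-delimited listing in the question.
sample = [
    "@HD VN:1.4 SO:coordinate",
    "@SQ SN:chrM LN:16571",
    "chr1 9497728 9497837 - ENST00000517147.1",
]
fixed = retab_interval_list(sample)
```

Running find_untabbed_lines on the original file's lines lists the suspicious line numbers, and writing the output of retab_interval_list back to a new file gives a tab-delimited copy to try with RIBOSOMAL_INTERVALS. Note that the check only flags lines with no tab at all, so a line that mixes tabs and spaces would slip through.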