Multicooler matrix view does not reflect chromosome ordering
Hi,
I am trying to visualize a multicooler with HiGlass, including the addition of the search position. For this we first generated contact matrices with HiCExplorer (https://hicexplorer.readthedocs.io/en/latest/index.html) at different resolutions and converted them to a multicooler. We expected the chromosomes to appear in semantic order, and the view does draw the chromosome grid correctly, but the matrix itself is displayed as if the chromosomes were ordered as chr10, chr11, chr12, …, chr1, chr2, …, chrX, chrY.
Steps to reproduce
- aligning paired-end reads with the HiCUP pipeline (using a bowtie2 index generated from an mm9 FASTA that is not semantically ordered)
- generating a 1 kb matrix with HiCExplorer's hicBuildMatrix
hicBuildMatrix -s first.bam second.bam -o hicmatrix_1kb.h5 --skipDuplicationCheck --binSize 1000 --QCfolder hicQC -ga mm9 --minDistance $MIN --maxLibraryInsertSize $MAX --threads 16
- reordering chromosomes to reflect semantic order
hicAdjustMatrix -m hicmatrix_1kb.h5 --chromosomes chr1 chr2 chr3 chr4 chr5 chr6 chr7 chr8 chr9 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19 chrX chrY --action keep -o hicmatrix_1kb_canonical.h5
- generating different resolutions (resolutions chosen according to the documentation of HiGlass)
for i in 5 10 25 50 100 500 1000;
do
hicMergeMatrixBins -m hicmatrix_1kb_canonical.h5 -o hicmatrix_${i}kb.h5 -nb ${i}
hicCorrectMatrix correct -m hicmatrix_${i}kb.h5 --correctionMethod KR -o hicmatrix_${i}kb_KR.h5
done
- converting to multicooler
h5toCool.py -m *KR.h5 -o hicmatrix.mcool --merge
where h5toCool.py is a lightweight version of hicConvertFormat that fixes a bug which would make the resulting multicooler file unreadable by HiGlass due to binary header entries (see also https://github.com/deeptools/HiCMatrix/issues/29)
- loading the files to the HiGlass server hosted by our institution with curl
curl -u user:password \
-F "datafile=@chromosomes_mm9.tsv" \
-F "filetype=chromsizes-tsv" \
-F "datatype=chromsizes" \
-F "uid=chromosomes_mm9" \
-F "coordSystem=mm9" \
-F "name=Chromosomes (mm9)" \
<url2server>
curl -u user:password \
-F "datafile=@hicmatrix.mcool" \
-F "filetype=cooler" \
-F "datatype=matrix" \
-F "uid=CH12_shLacZ" \
-F "coordSystem=mm9" \
<url2server>
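The chromosome list passed to hicAdjustMatrix in the steps above was typed by hand. As a minimal sketch, the same order could be derived from the index order with a sort key. The function name chrom_sort_key and the short example list are illustrative assumptions, not part of HiCExplorer or HiGlass.

```python
def chrom_sort_key(name):
    """Sort key: numbered chromosomes numerically, then X, Y, M, then anything else."""
    suffix = name[3:] if name.startswith("chr") else name
    if suffix.isdigit():
        return (0, int(suffix), "")
    special = {"X": 0, "Y": 1, "M": 2}
    if suffix in special:
        return (1, special[suffix], "")
    # Unplaced contigs and other names go last, in alphabetical order.
    return (2, 0, suffix)


# Hypothetical excerpt of the order found in the alignment index.
index_order = ["chr10", "chr11", "chr12", "chr1", "chr2", "chrX", "chrY"]
canonical = sorted(index_order, key=chrom_sort_key)
print(" ".join(canonical))
```

The printed list can be pasted after --chromosomes, or written to the chromsizes file so that both use the same order.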
Observed behavior
We reordered the generated matrix so that the chromosomes in the file are in semantic order, and the HiGlass view displays the chromosome grid correctly. The matrix itself, however, is displayed as if the chromosomes were ordered as chr10, chr11, chr12, …, chr1, chr2, …, chrX, chrY, although examination of the multicooler indicates that the order is correct (see file attached).
import cooler
c = cooler.Cooler('hicmatrix.mcool::resolutions/1000000')
c.matrix(balance = False, as_pixels = True, join = True)[:10, :10]
chrom1 start1 end1 chrom2 start2 end2 count
0 chr1 3000000 4000000 chr1 3000000 4000000 16494.401695
1 chr1 3000000 4000000 chr1 4000000 5000000 5958.713133
2 chr1 3000000 4000000 chr1 5000000 6000000 1839.512867
3 chr1 3000000 4000000 chr1 6000000 7000000 638.394715
4 chr1 3000000 4000000 chr1 7000000 8000000 773.033207
5 chr1 3000000 4000000 chr1 8000000 9000000 940.248242
6 chr1 3000000 4000000 chr1 9000000 10000000 474.213159
7 chr1 4000000 5000000 chr1 4000000 5000000 12823.700529
8 chr1 4000000 5000000 chr1 5000000 6000000 5314.330105
9 chr1 4000000 5000000 chr1 6000000 7000000 1552.412464
10 chr1 4000000 5000000 chr1 7000000 8000000 1036.752391
11 chr1 4000000 5000000 chr1 8000000 9000000 817.722504
12 chr1 4000000 5000000 chr1 9000000 10000000 761.359299
13 chr1 5000000 6000000 chr1 5000000 6000000 14334.216266
14 chr1 5000000 6000000 chr1 6000000 7000000 3877.311889
15 chr1 5000000 6000000 chr1 7000000 8000000 1529.875814
16 chr1 5000000 6000000 chr1 8000000 9000000 1144.271438
17 chr1 5000000 6000000 chr1 9000000 10000000 828.638743
18 chr1 6000000 7000000 chr1 6000000 7000000 15439.594502
19 chr1 6000000 7000000 chr1 7000000 8000000 4808.961148
20 chr1 6000000 7000000 chr1 8000000 9000000 980.361631
21 chr1 6000000 7000000 chr1 9000000 10000000 1414.547768
22 chr1 7000000 8000000 chr1 7000000 8000000 12438.445233
23 chr1 7000000 8000000 chr1 8000000 9000000 5837.967332
24 chr1 7000000 8000000 chr1 9000000 10000000 2996.634297
25 chr1 8000000 9000000 chr1 8000000 9000000 14551.815508
26 chr1 8000000 9000000 chr1 9000000 10000000 3897.124364
27 chr1 9000000 10000000 chr1 9000000 10000000 13381.554611
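The pixel dump above only shows the first chromosome. A more direct check, sketched here under assumptions, compares the chromosome order stored in the cooler with the order in the uploaded chromsizes file. The helper names are hypothetical; the file names and resolution are the ones used in this report, and the cooler part only runs if both files exist.

```python
import csv
import os


def read_chromsizes_order(path):
    """Return chromosome names from a tab-separated chromsizes file, in file order."""
    with open(path, newline="") as handle:
        return [row[0] for row in csv.reader(handle, delimiter="\t") if row]


def first_mismatch(cooler_order, tsv_order):
    """Return (index, cooler_name, tsv_name) at the first difference, or None."""
    for i, (a, b) in enumerate(zip(cooler_order, tsv_order)):
        if a != b:
            return (i, a, b)
    if len(cooler_order) != len(tsv_order):
        i = min(len(cooler_order), len(tsv_order))
        a = cooler_order[i] if i < len(cooler_order) else None
        b = tsv_order[i] if i < len(tsv_order) else None
        return (i, a, b)
    return None


if os.path.exists("hicmatrix.mcool") and os.path.exists("chromosomes_mm9.tsv"):
    import cooler  # third-party; imported only when the files are present

    c = cooler.Cooler("hicmatrix.mcool::resolutions/1000000")
    print(first_mismatch(list(c.chromnames), read_chromsizes_order("chromosomes_mm9.tsv")))
```

A result of None means both orders agree at that resolution.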
Expected behavior
We figured that the ordering of the matrix as chr10, chr11, chr12, …, chr1, chr2, …, chrX, chrY is due to the ordering in the index used to align the paired-end reads. Therefore, we manually reordered the matrices produced by HiCExplorer and expected this to fix the problem (plotting the matrix with different methods also indicated that the reordering worked fine).
Do you have a clue why this could be?
The view was generated by first adding hicmatrix.mcool as heatmap in the center and then overlaying it with the 2d-chromosome-grid information of the same file.
Issue Analytics
- Created 3 years ago
- Comments: 11 (4 by maintainers)
Top GitHub Comments
Yes, that's exactly what I wanted to know. Thanks, that totally explains why it looks so weird, on top of the caching issue with the uuid. Thank you very much again.
I’m not entirely sure what you mean, but I think the answer is yes.
e.g. the command
will make sure that the data is loaded such that the internal chromosome order is identical to what is given in the {chrom_sizes} file. HiGlass respects this order. You can check what the chromosome order in a cooler file is with cooler dump -t chroms.
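For a multicooler, the same check may be worth repeating for every resolution, since each one stores its own chromosome table. The following is a rough sketch using the cooler Python package; the helper names chrom_orders and resolutions_disagreeing are assumptions made for illustration, and the file name is the one from this issue.

```python
import os


def chrom_orders(mcool_path):
    """Map each cooler group in a multicooler file to its list of chromosome names."""
    import cooler  # third-party
    from cooler.fileops import list_coolers

    return {
        group: list(cooler.Cooler(f"{mcool_path}::{group}").chromnames)
        for group in list_coolers(mcool_path)
    }


def resolutions_disagreeing(orders):
    """Return the groups whose chromosome order differs from the first group's."""
    items = list(orders.items())
    if not items:
        return []
    reference = items[0][1]
    return [group for group, names in items[1:] if names != reference]


if os.path.exists("hicmatrix.mcool"):
    orders = chrom_orders("hicmatrix.mcool")
    for group, names in orders.items():
        print(group, " ".join(names))
    print("differing groups:", resolutions_disagreeing(orders))
```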