Multicooler matrix view does not reflect chromosome ordering

Hi,

I am trying to visualize a multicooler with HiGlass, including the addition of the position search. For this, we first generated contact matrices with HiCExplorer (https://hicexplorer.readthedocs.io/en/latest/index.html) at different resolutions and converted them to a multicooler. We expected the chromosomes to appear in semantic order, and the view does display the chromosome grid correctly. However, the matrix itself is displayed as if the chromosomes were ordered chr10, chr11, chr12, …, chr1, chr2, …, chrX, chrY.

Steps to reproduce

  1. Align paired-end reads with the HiCUP pipeline (using a bowtie2 index generated from a FASTA of mm9 that is not semantically ordered)
  2. Generate a 1 kb matrix with HiCExplorer's hicBuildMatrix
hicBuildMatrix -s first.bam second.bam -o hicmatrix_1kb.h5 --skipDuplicationCheck --binSize 1000 --QCfolder hicQC -ga mm9 --minDistance $MIN --maxLibraryInsertSize $MAX --threads 16
  3. Reorder the chromosomes to reflect semantic order
hicAdjustMatrix -m hicmatrix_1kb.h5 --chromosomes chr1 chr2 chr3 chr4 chr5 chr6 chr7 chr8 chr9 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19 chrX chrY --action keep -o hicmatrix_1kb_canonical.h5
  4. Generate the different resolutions (chosen according to the HiGlass documentation)
for i in 5 10 25 50 100 500 1000;
do
    hicMergeMatrixBins -m hicmatrix_1kb_canonical.h5 -o hicmatrix_${i}kb.h5 -nb ${i}
    hicCorrectMatrix correct -m hicmatrix_${i}kb.h5 --correctionMethod KR -o hicmatrix_${i}kb_KR.h5
done
  5. Convert to multicooler
h5toCool.py -m *KR.h5 -o hicmatrix.mcool --merge

where h5toCool.py is a lightweight version of hicConvertFormat that fixes a bug which would otherwise make the resulting multicooler file unreadable by HiGlass due to binary header entries (see also https://github.com/deeptools/HiCMatrix/issues/29).

  6. Load the files to the HiGlass server hosted by our institution with curl
curl -u user:password \
    -F "datafile=@chromosomes_mm9.tsv" \
    -F "filetype=chromsizes-tsv" \
    -F "datatype=chromsizes" \
    -F "uid=chromosomes_mm9" \
    -F "coordSystem=mm9" \
    -F "name=Chromosomes (mm9)" \
    <url2server>

curl -u user:password \
    -F "datafile=@hicmatrix.mcool" \
    -F "filetype=cooler" \
    -F "datatype=matrix" \
    -F "uid=CH12_shLacZ"  \
    -F "coordSystem=mm9" \
    <url2server>
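
The reordering in step 3 relies on what we call semantic order (chr1, chr2, …, chr19, chrX, chrY). As a minimal sketch of that ordering, the hypothetical helper `semantic_key` below sorts chromosome names numerically first and then places X, Y and M. It is not part of HiCExplorer, cooler or HiGlass, and the input list is only a shortened illustration of a FASTA-like order.

```python
def semantic_key(name):
    """Hypothetical sort key: numbered chromosomes first, then X, Y, M, then anything else."""
    suffix = name[3:] if name.startswith("chr") else name
    if suffix.isdigit():
        return (0, int(suffix), "")
    special = {"X": 0, "Y": 1, "M": 2}
    if suffix in special:
        return (1, special[suffix], "")
    # Unplaced contigs and other names go last, in alphabetical order
    return (2, 0, suffix)


# Shortened illustration of an order as it might appear in an unsorted FASTA
fasta_order = ["chr10", "chr11", "chr12", "chr1", "chr2", "chrX", "chrY"]
semantic_order = sorted(fasta_order, key=semantic_key)
print(semantic_order)
```

The same key could be used to sort the rows of a chromsizes file such as chromosomes_mm9.tsv before uploading it, so that the grid and the matrix are built from one ordering.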

Observed behavior

We reordered the generated matrix to obtain the semantic chromosome order, and the HiGlass view displays the chromosome grid correctly. The matrix itself, however, is displayed as if the chromosomes were ordered chr10, chr11, chr12, …, chr1, chr2, …, chrX, chrY, although examination of the multicooler indicates that the order is correct (see attached file).

import cooler

# Open the 1 Mb resolution of the multicooler and list the pixels of the first 10 x 10 bins
c = cooler.Cooler('hicmatrix.mcool::resolutions/1000000')
c.matrix(balance=False, as_pixels=True, join=True)[:10, :10]

   chrom1   start1      end1 chrom2   start2      end2         count
0    chr1  3000000   4000000   chr1  3000000   4000000  16494.401695
1    chr1  3000000   4000000   chr1  4000000   5000000   5958.713133
2    chr1  3000000   4000000   chr1  5000000   6000000   1839.512867
3    chr1  3000000   4000000   chr1  6000000   7000000    638.394715
4    chr1  3000000   4000000   chr1  7000000   8000000    773.033207
5    chr1  3000000   4000000   chr1  8000000   9000000    940.248242
6    chr1  3000000   4000000   chr1  9000000  10000000    474.213159
7    chr1  4000000   5000000   chr1  4000000   5000000  12823.700529
8    chr1  4000000   5000000   chr1  5000000   6000000   5314.330105
9    chr1  4000000   5000000   chr1  6000000   7000000   1552.412464
10   chr1  4000000   5000000   chr1  7000000   8000000   1036.752391
11   chr1  4000000   5000000   chr1  8000000   9000000    817.722504
12   chr1  4000000   5000000   chr1  9000000  10000000    761.359299
13   chr1  5000000   6000000   chr1  5000000   6000000  14334.216266
14   chr1  5000000   6000000   chr1  6000000   7000000   3877.311889
15   chr1  5000000   6000000   chr1  7000000   8000000   1529.875814
16   chr1  5000000   6000000   chr1  8000000   9000000   1144.271438
17   chr1  5000000   6000000   chr1  9000000  10000000    828.638743
18   chr1  6000000   7000000   chr1  6000000   7000000  15439.594502
19   chr1  6000000   7000000   chr1  7000000   8000000   4808.961148
20   chr1  6000000   7000000   chr1  8000000   9000000    980.361631
21   chr1  6000000   7000000   chr1  9000000  10000000   1414.547768
22   chr1  7000000   8000000   chr1  7000000   8000000  12438.445233
23   chr1  7000000   8000000   chr1  8000000   9000000   5837.967332
24   chr1  7000000   8000000   chr1  9000000  10000000   2996.634297
25   chr1  8000000   9000000   chr1  8000000   9000000  14551.815508
26   chr1  8000000   9000000   chr1  9000000  10000000   3897.124364
27   chr1  9000000  10000000   chr1  9000000  10000000  13381.554611
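
The listing above only covers chr1, so it cannot confirm the order of the remaining chromosomes on its own. A rough sketch of a fuller check follows, assuming the chrom1 column of the whole pixel table is available as a plain list. Pixels in a cooler are sorted by their first bin, so the order in which chromosome names first appear should follow the internal chromosome order. The helpers `order_of_first_appearance` and `first_mismatch` are hypothetical names, and the sample column is made up.

```python
def order_of_first_appearance(chroms):
    """Return the chromosome names in the order they first occur in the column."""
    return list(dict.fromkeys(chroms))


def first_mismatch(observed, expected):
    """Return (index, observed, expected) for the first differing position, or None."""
    for index, (obs, exp) in enumerate(zip(observed, expected)):
        if obs != exp:
            return index, obs, exp
    return None


# Made-up chrom1 column standing in for the real pixel table
chrom1_column = ["chr1", "chr1", "chr2", "chr2", "chr10", "chrX"]
observed = order_of_first_appearance(chrom1_column)
expected = ["chr1", "chr2", "chr10", "chrX"]
print(observed, first_mismatch(observed, expected))
```

With cooler installed, the list of names in `c.chromnames` could be compared against the expected order in the same way.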

Expected behavior

We figured that the ordering of the matrix as chr10, chr11, chr12, …, chr1, chr2, …, chrX, chrY is due to the ordering in the index used to align the paired-end reads. Therefore, we manually reordered the matrices produced by HiCExplorer and expected this to fix the problem (plotting the matrix with other methods also indicated that the reordering worked fine).

Do you have a clue why this could be?

The view was generated by first adding hicmatrix.mcool as a heatmap in the center and then overlaying it with the 2d-chromosome-grid information of the same file.

[Image: screenshot of the HiGlass view described above]

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 11 (4 by maintainers)

Top GitHub Comments

dmalzl commented, Jun 18, 2020 (1 reaction)

Yes, that's exactly what I wanted to know. Thanks, that fully explains why it looks so weird, on top of the caching issue of the uuid. Thank you very much again.

nvictus commented, Jun 18, 2020 (1 reaction)

I’m not entirely sure what you mean, but I think the answer is yes.

e.g. the command

cooler cload pairs  -c1 2 -p1 3 -c2 4 -p2 5  {chrom_sizes}:1000 {mypairs} {output_cool}

will make sure that the data is loaded such that the internal chromosome order is identical to what is given in the {chrom_sizes} file. HiGlass respects this order.

You can check what the chromosome order in a cooler file is with cooler dump -t chroms.
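
To illustrate why the internal order matters, here is a small sketch of how a chromosome order translates into genome-wide bin offsets: bins are laid out chromosome by chromosome, so each chromosome starts at the cumulative bin count of those before it. The function `bin_offsets` and the toy chromosome lengths are assumptions made for this example (they are not the real mm9 sizes), and the code does not call cooler or HiGlass.

```python
import math


def bin_offsets(chromsizes, binsize):
    """Map each chromosome to the index of its first bin, given an ordered list of (name, length)."""
    offsets = {}
    total_bins = 0
    for name, length in chromsizes:
        offsets[name] = total_bins
        # The last bin of a chromosome may be shorter than binsize, hence the ceiling
        total_bins += math.ceil(length / binsize)
    return offsets


# Toy lengths for illustration only, not the real mm9 chromosome sizes
toy_sizes = {"chr1": 5_000_000, "chr2": 4_000_000, "chr10": 3_000_000}
semantic = [(name, toy_sizes[name]) for name in ("chr1", "chr2", "chr10")]
fasta_like = [(name, toy_sizes[name]) for name in ("chr10", "chr1", "chr2")]

semantic_offsets = bin_offsets(semantic, 1_000_000)
fasta_like_offsets = bin_offsets(fasta_like, 1_000_000)
print(semantic_offsets)
print(fasta_like_offsets)
```

In this toy case chr1 starts at bin 0 under one order and at bin 3 under the other, so a grid drawn from one order and a matrix stored in the other may not line up.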
