
testing cooler on capture Hi-C data

Hi @nvictus, as we briefly discussed, I tried to build a cooler object for a 3 Mb region from capture Hi-C data. I ran into several issues in the tests I ran …

I have a BED file of 1 kb genomic intervals covering chrX:150125000-153125000. Accordingly, I extracted the pairs that fall within the same genomic range:

>>zcat contacts.txt.gz | head -1
NB501764:1043:HM5FKBGXF:2:23302:4878:10125	chrX	150125462	chrX	150163625	+	+
>>zcat contacts.txt.gz | tail -1
NB501764:1043:HM5FKBGXF:3:21505:23621:19037	chrX	153121783	chrX	153123912	-	-
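
For readers who want to reproduce the setup, here is a minimal sketch of how a 1 kb bin table for that region could be generated. The function name `make_bins` and the variable names are hypothetical, and the real target.bed from this issue may have been produced differently.

```python
def make_bins(chrom, start, end, binsize):
    """Tile the half-open interval [start, end) with fixed-size bins."""
    bins = []
    for lo in range(start, end, binsize):
        # The last bin is clipped to the region end if it does not divide evenly.
        bins.append((chrom, lo, min(lo + binsize, end)))
    return bins


# Region and resolution as described in the issue text.
bins = make_bins("chrX", 150_125_000, 153_125_000, 1_000)

# Format as 3-column, tab-separated BED lines (writing to disk is left out).
bed_lines = ["{}\t{}\t{}".format(chrom, lo, hi) for chrom, lo, hi in bins]

print(len(bins))
print(bed_lines[0])
print(bed_lines[-1])
```

The region spans 3,000,000 bp, so at 1 kb resolution the table has 3000 rows.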

Then, I simply tried to ingest the data with cload pairs:

>>cooler cload pairs -c1 2 -p1 3 -c2 4 -p2 5 target.bed contacts.txt.gz test.cool
INFO:cooler.create:Writing chunk 0: /data/kdi_prod/.kdi/project_workspace_0/1309/acl/01.00/downstreamAnalysis/scripts/tmpenmswn78.multi.cool::0
INFO:cooler.create:Creating cooler at "/data/kdi_prod/.kdi/project_workspace_0/1309/acl/01.00/downstreamAnalysis/scripts/tmpenmswn78.multi.cool::/0"
INFO:cooler.create:Writing chroms
INFO:cooler.create:Writing bins
INFO:cooler.create:Writing pixels
INFO:cooler.create:Writing indexes
INFO:cooler.create:Writing info
INFO:cooler.create:Done
INFO:cooler.create:Merging into test.cool
INFO:cooler.create:Creating cooler at "test.cool::/"
INFO:cooler.create:Writing chroms
INFO:cooler.create:Writing bins
INFO:cooler.create:Writing pixels
INFO:cooler.reduce:nnzs: [377101]
INFO:cooler.reduce:current: [0]
Traceback (most recent call last):
  File "/data/users/nservant/projects_analysis/kdi_home/conda/nf-core-hic/bin/cooler", line 10, in <module>
    sys.exit(cli())
  File "/data/users/nservant/projects_analysis/kdi_home/conda/nf-core-hic/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/data/users/nservant/projects_analysis/kdi_home/conda/nf-core-hic/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/data/users/nservant/projects_analysis/kdi_home/conda/nf-core-hic/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/data/users/nservant/projects_analysis/kdi_home/conda/nf-core-hic/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/data/users/nservant/projects_analysis/kdi_home/conda/nf-core-hic/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "conda/nf-core-hic/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "conda/nf-core-hic/lib/python3.7/site-packages/cooler/cli/cload.py", line 492, in pairs
    ordered=False
  File "conda/nf-core-hic/lib/python3.7/site-packages/cooler/create/_create.py", line 944, in create_cooler
    max_merge=max_merge)
  File "conda/nf-core-hic/lib/python3.7/site-packages/cooler/create/_create.py", line 687, in create_from_unordered
    **kwargs)
  File "conda/nf-core-hic/lib/python3.7/site-packages/cooler/create/_create.py", line 577, in create
    file_path, target, meta.columns, iterable, h5opts, lock)
  File "conda/nf-core-hic/lib/python3.7/site-packages/cooler/create/_create.py", line 213, in write_pixels
    for i, chunk in enumerate(iterable):
  File "conda/nf-core-hic/lib/python3.7/site-packages/cooler/reduce.py", line 163, in __iter__
    ignore_index=True)
  File "conda/nf-core-hic/lib/python3.7/site-packages/pandas/core/reshape/concat.py", line 255, in concat
    sort=sort,
  File "conda/nf-core-hic/lib/python3.7/site-packages/pandas/core/reshape/concat.py", line 304, in __init__
    raise ValueError("No objects to concatenate")
ValueError: No objects to concatenate

Then, I tried cooler cload pairix:

cooler csort -c1 2 -p1 3 -s1 6 -c2 4 -p2 5 -s2 7 -o contacts.csort.txt.gz contacts.txt.gz chrX.sizes
cooler cload pairix target.bed contacts.csort.txt.gz test.cool       
INFO:cooler.cli.csort:Enumerating requested chromosomes...
INFO:cooler.cli.csort:chrX	1
INFO:cooler.cli.csort:Input: '../Jarid_NPC_B1/cooler/Jarid_NPC_B1_contacts.txt.gz'
INFO:cooler.cli.csort:Output: '../Jarid_NPC_B1/cooler/Jarid_NPC_B1_contacts.csort.txt.gz'
INFO:cooler.cli.csort:Reordering pair mates and sorting pair records...
INFO:cooler.cli.csort:Sort order: block (chrom1, chrom2, pos1, pos2)
INFO:cooler.cli.csort:sort -k2,2 -k4,4 -k3,3n -k5,5n --parallel=8 --buffer-size=50%
INFO:cooler.cli.csort:Indexing...
INFO:cooler.cli.csort:Indexer: pairix
INFO:cooler.cli.csort:pairix -f -s2 -d4 -b3 -e3 -u5 -v5 ../Jarid_NPC_B1/cooler/Jarid_NPC_B1_contacts.csort.txt.gz
INFO:cooler.cli.cload:Using 8 cores
INFO:cooler.create:Creating cooler at "../Jarid_NPC_B1/cooler/Jarid_NPC_B1_1000.cool::/"
INFO:cooler.create:Writing chroms
INFO:cooler.create:Writing bins
INFO:cooler.create:Writing pixels
INFO:cooler.create:Binning chrX:150125000-151625000|*
INFO:cooler.create:Binning chrX:151625000-153125000|*
INFO:cooler.create:Finished chrX:151625000-153125000|*
INFO:cooler.create:Finished chrX:150125000-151625000|*
Traceback (most recent call last):
  File "conda/nf-core-hic/bin/cooler", line 10, in <module>
    sys.exit(cli())
  File "conda/nf-core-hic/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "conda/nf-core-hic/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "conda/nf-core-hic/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "conda/nf-core-hic/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "conda/nf-core-hic/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "conda/nf-core-hic/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "conda/nf-core-hic/lib/python3.7/site-packages/cooler/cli/cload.py", line 238, in pairix
    ordered=True)
  File "conda/nf-core-hic/lib/python3.7/site-packages/cooler/create/_create.py", line 925, in create_cooler
    lock=lock)
  File "conda/nf-core-hic/lib/python3.7/site-packages/cooler/create/_create.py", line 577, in create
    file_path, target, meta.columns, iterable, h5opts, lock)
  File "conda/nf-core-hic/lib/python3.7/site-packages/cooler/create/_create.py", line 213, in write_pixels
    for i, chunk in enumerate(iterable):
  File "conda/nf-core-hic/lib/python3.7/site-packages/cooler/create/_ingest.py", line 292, in _validate_pixels
    "Found a bin ID that exceeds the declared number of bins. "
cooler.create._ingest.BadInputError: Found a bin ID that exceeds the declared number of bins. Check whether your bin table is correct.

The csort command works (although I passed the size of the entire chrX here; I am not sure what else to provide), but cload pairix also crashes …
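
One way a "bin ID exceeds the declared number of bins" error could arise is sketched below. This is only arithmetic under an assumption: that bin IDs are computed from the chromosome origin (position // bin size) while the bin table only declares the bins of the captured region. Whether cooler 0.8.6 actually works this way is not confirmed anywhere in this thread, and the helper names are hypothetical.

```python
REGION_START = 150_125_000
REGION_END = 153_125_000
BINSIZE = 1_000

# Number of bins declared by a bin table that only covers the region.
n_bins = (REGION_END - REGION_START) // BINSIZE


def bin_id_region(pos):
    """Bin index counted from the start of the captured region."""
    return (pos - REGION_START) // BINSIZE


def bin_id_chrom(pos):
    """Bin index counted from the start of the chromosome (assumed behaviour)."""
    return pos // BINSIZE


# Positions taken from the first and last pair shown in the issue.
positions = [150_125_462, 150_163_625, 153_121_783, 153_123_912]

for pos in positions:
    print(pos, bin_id_region(pos), bin_id_chrom(pos), bin_id_chrom(pos) < n_bins)
```

Under that assumption every chromosome-based ID is far larger than the 3000 declared bins, while the region-based IDs all fit.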

Of note, I also reported the same error in cooltools earlier, when trying to bin a small genome https://github.com/open2c/cooltools/issues/237

>>cooler -V
cooler, version 0.8.6.post0

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 9 (4 by maintainers)

Top GitHub Comments

Phlya commented, Apr 19, 2021 (1 reaction)

You don’t even need to use them for balancing; you can just store them in the cooler so you can see them if you want… In my limited experience, balancing small regions is unfortunately not 100% reliable in general, and the filtering sometimes needs to be modified. But it would be good to assemble some test data to check how the tools work with it.

Phlya commented, Aug 9, 2021 (0 reactions)

I just wanted to say that I always use whole-genome binning, as if it was whole-genome Hi-C, and never had any issues like that. Just provide a blacklist to balancing, which would make it ignore most of the genome.
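
As a rough sketch of the blacklist idea above: if the matrix is binned genome-wide, the regions to ignore during balancing are simply everything outside the captured interval. The function name `complement` is hypothetical, and the chromosome size is a placeholder that should be replaced with the value from your own chrX.sizes file.

```python
def complement(chrom, chrom_size, start, end):
    """Return the parts of a chromosome that lie outside [start, end)."""
    regions = []
    if start > 0:
        regions.append((chrom, 0, start))
    if end < chrom_size:
        regions.append((chrom, end, chrom_size))
    return regions


# Placeholder size, NOT the real chrX length; take it from chrX.sizes.
CHRX_SIZE = 160_000_000

blacklist = complement("chrX", CHRX_SIZE, 150_125_000, 153_125_000)

# Format as 3-column, tab-separated BED lines for a blacklist file.
for chrom, lo, hi in blacklist:
    print("{}\t{}\t{}".format(chrom, lo, hi))
```

For a whole genome, every other chromosome would also need one line covering its full length; that part is left out here.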
