Interest in HTO/ADT demultiplexing and analysis?

I’m starting to regularly analyze samples multiplexed with hashtag oligos (HTOs) à la Stoeckius et al. and multimodal CITE-seq antibody-derived tags (ADTs). I have some rough ports of Seurat’s HTODemux and some other functionality.

Is anyone else analyzing these kinds of experiments with scanpy, and is there interest in bringing this functionality into the main scanpy branch?

Issue Analytics

  • State: open
  • Created 5 years ago
  • Reactions: 10
  • Comments: 22 (9 by maintainers)

Top GitHub Comments

1 reaction
njbernstein commented, Jan 13, 2021

@wflynny On my to-do list is to add solo to scVI directly, and then scanpy will be a nice place to do both types of doublet finding.

1 reaction
wflynny commented, Oct 17, 2019

@gokceneraslan Hey, sorry for my long silence on this.

I’ve been using @Hoohm’s https://github.com/Hoohm/CITE-seq-Count for ADT/HTO tag counting which produces (in recent versions) a 10X v3 style mtx directory for both reads and UMIs. In some cases, I’ll load these in as their own AnnData object with reads and counts as different layers which is helpful in computing per-cell or per-tag “sequencing saturation” and other metrics involving both reads and counts. This is especially helpful for investigating some pilot experiments (lipid tags, cholesterol tags, etc.) we’ve been doing.
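
As a rough illustration of the read/UMI comparison described above, here is a minimal sketch that computes per-cell and per-tag saturation from two cell x tag matrices. It assumes one common definition of saturation, 1 - (UMIs / reads); the function name `sequencing_saturation` and the toy matrices are hypothetical, and in practice the two arrays would come from the reads and counts layers of the AnnData object.

```python
import numpy as np


def sequencing_saturation(reads, umis, axis):
    """Saturation = 1 - (total UMIs / total reads), summed along `axis`.

    `reads` and `umis` are cell x tag matrices of the same shape.
    axis=1 sums over tags (one value per cell);
    axis=0 sums over cells (one value per tag).
    Entries with zero reads are returned as NaN.
    """
    read_totals = np.asarray(reads, dtype=float).sum(axis=axis)
    umi_totals = np.asarray(umis, dtype=float).sum(axis=axis)
    ratio = np.full(read_totals.shape, np.nan)
    # Only divide where there is at least one read; other slots stay NaN.
    np.divide(umi_totals, read_totals, out=ratio, where=read_totals > 0)
    return 1.0 - ratio


# Toy data (hypothetical): 3 cells x 2 tags.
reads = np.array([[10, 0], [4, 4], [0, 0]])
umis = np.array([[5, 0], [4, 2], [0, 0]])

per_cell = sequencing_saturation(reads, umis, axis=1)  # 0.5, 0.25, nan
per_tag = sequencing_saturation(reads, umis, axis=0)
```

Values near 1 suggest that most reads are duplicates of UMIs already seen, while values near 0 suggest the tag library could be sequenced deeper.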

However, most of the time I’ll just load the tags matrix in as a pandas dataframe and run them through a demuxing function that’ll modify adata.obs.
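
A minimal sketch of that DataFrame-based workflow might look like the following. It is not a port of HTODemux: it assumes a deliberately naive rule (a tag is "positive" when its count reaches a fixed `min_count`), and the names `classify_tags`, `min_count`, the `"Missing"` label and the toy barcodes are all hypothetical. A plain DataFrame stands in for `adata.obs`.

```python
import pandas as pd


def classify_tags(tag_counts, min_count=10):
    """Label each barcode as a tag name, "Doublet" or "Negative".

    `tag_counts` is a barcode x tag DataFrame of UMI counts.
    A tag counts as positive when its count is >= `min_count`
    (a naive fixed threshold, assumed here for illustration).
    """
    positive = tag_counts >= min_count
    n_positive = positive.sum(axis=1)
    top_tag = tag_counts.idxmax(axis=1)

    labels = pd.Series("Negative", index=tag_counts.index, dtype=object)
    # With exactly one positive tag, that tag is necessarily the maximum.
    labels[n_positive == 1] = top_tag[n_positive == 1]
    labels[n_positive > 1] = "Doublet"
    return labels


# Stand-in for adata.obs: cells called from the transcriptome.
obs = pd.DataFrame(index=["AAAC", "AAAG", "AAAT", "AACA"])

# Barcode x tag counts; "TTTT" is not a called cell, "AACA" has no tag data.
tags = pd.DataFrame(
    {"tag1": [50, 0, 40, 2], "tag2": [1, 30, 35, 0]},
    index=["AAAC", "AAAG", "AAAT", "TTTT"],
)

# Align classifications to the called cells; flag cells absent from the tag matrix.
obs["hashtag"] = classify_tags(tags).reindex(obs.index, fill_value="Missing")
```

Because the result is aligned to `obs.index`, the transcriptome remains the reference for what counts as a cell, and barcodes seen only in the tag library are dropped.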

A couple of challenges/ideas to consider:

  • at our facility, we’re typically building the same Illumina i7 index (ATTACTCG) into all tag libraries. This leads to some tricky situations when sequencing on a NovaSeq, since multiple tag libraries (with disjoint sets of tags) may be run on the same flowcell lane. This results in a single set of FASTQ files and thus a single barcode-tag matrix for all tag libraries on that lane. Therefore, the mapping between transcriptome AnnData objects <-> tag library matrices is not always 1-to-1.
  • in my experience, HTO libraries have a large variance in quality, so for the most part I’ve been using the transcriptome as my “ground truth” as to what is a cell. However, I imagine others use HTOs to “rescue” cells that were not called by their pipeline of choice (and I hope to do this once I build enough trust in the data). In that case, one would want to intersect the HTO classifications with the raw cell-gene matrix.
  • not all tags are antibody based, so I’d vote for naming all related functions *hashtags().

I’d therefore vote for something like the following design:

# htos is an AnnData object
htos = sc.read_hashtags(filename)

# classify_hashtags adds a classification to the hto AnnData object
# kwargs might involve things like `use_tags=["tag1", "tag2", "tag3"]`
sc.pp.classify_hashtags(htos, **kwargs)
print(htos.obs.classification) 

# demuxing cell-gene matrix(es) could then be done like
rna1 = sc.read_10x_h5(...)
rna2 = sc.read_10x_h5(...)
# sc.pp.demux_by_hashtag(adata_hto, *adata_rna, tag_groups=None, ...)
sc.pp.demux_by_hashtag(
    htos, 
    rna1, rna2, 
    tag_groups=[("tag1", "tag3", "tag5"), ("tag2", "tag4", "tag6")]
)

@gokceneraslan This is more complex than what you suggested, but I think it is sufficiently general to cover my needs as listed above. Let me know what you think. I’ll have some development time next week to possibly contribute to this.
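
To make the intent of `tag_groups` concrete, here is a rough pandas-only sketch of the semantics proposed above; it is not an existing scanpy API. It assumes each RNA library is paired with one group of tags, restricts the shared barcode-tag matrix to that group, and classifies only the barcodes called in that library. The names `demux_by_tag_groups` and `top_tag_or_negative`, the `min_count` threshold and the toy data are hypothetical, and plain DataFrames stand in for each `adata.obs`.

```python
import pandas as pd


def top_tag_or_negative(counts, min_count=10):
    """Naive classifier: the highest tag if it reaches `min_count`, else "Negative"."""
    best = counts.idxmax(axis=1)
    return best.where(counts.max(axis=1) >= min_count, "Negative")


def demux_by_tag_groups(tag_counts, obs_tables, tag_groups, classify=top_tag_or_negative):
    """Write a "hashtag" column into each obs table, one tag group per table.

    `tag_counts` is the shared barcode x tag matrix for the whole lane.
    `obs_tables[i]` is annotated using only the tags in `tag_groups[i]`.
    """
    if len(obs_tables) != len(tag_groups):
        raise ValueError("need exactly one tag group per RNA library")
    for obs, group in zip(obs_tables, tag_groups):
        # .loc raises KeyError for unknown tags; barcodes without tag data get zeros.
        sub = tag_counts.loc[:, list(group)].reindex(obs.index, fill_value=0)
        obs["hashtag"] = classify(sub)


# One lane, two tag libraries with disjoint tags (toy data).
tags = pd.DataFrame(
    {"tag1": [40, 0, 1], "tag2": [0, 30, 0], "tag3": [2, 0, 50], "tag4": [25, 0, 0]},
    index=["AAAC", "AAAG", "AAAT"],
)
rna1 = pd.DataFrame(index=["AAAC", "AAAT"])          # stand-in for rna1.obs
rna2 = pd.DataFrame(index=["AAAC", "AAAG", "CCCC"])  # stand-in for rna2.obs

demux_by_tag_groups(tags, [rna1, rna2], [("tag1", "tag3"), ("tag2", "tag4")])
```

In this toy example the barcode `AAAC` is called in both libraries and receives a different label in each, because each library only consults its own tag group.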
