Is there a way to merge adata objects with vars ordered differently

I am trying to integrate and batch correct scRNA data from multiple experiments. I have found a package called scanorama that provides a convenient wrapper for anndata objects. The method generates low-dimensional embeddings that suggest it is performing relatively well. The problem is that scanorama re-orders the var axis so that it is sorted alphabetically in the corrected anndata objects it produces, while the original anndata objects are NOT sorted.

I have a ton of important information in the original anndata objects that is NOT retained in the corrected anndata objects scanorama produces, as discussed here (https://github.com/brianhie/scanorama/issues/57). In addition, the index of the original's obs table is obliterated and replaced with numbers (but the order here is NOT changed).

I believe that I can simply do this to faithfully regain the annotations on the original obs axis:

corrected.obs = original.obs

because the obs order is not altered. However, I also have important data about the genes saved in var, as well as other attributes in the original, that I really want/need in the corrected anndata object. The author suggested sorting the var axis before passing the objects to the scanorama function, but I must admit that I am not sure how to do even that. My explorations lead me to believe that I do not understand the “guts” of these objects well enough to know if I am on the right track.

The following

adata[:,["OR4F5", "FAM138A"]].copy().X != adata[:,["FAM138A","OR4F5"]].X.copy()

shows no alteration to the X attributes but the corresponding var attributes reflect the altered order.

Is there a way to import the corrected anndata object as a “layer” or “view” accessible from the original? Or perhaps is there a better way than I have thought of so far to integrate these two very important versions of my data?

Thank you for your tool and any advice/guidance you are able to provide!

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 11 (2 by maintainers)

Top GitHub Comments

xguse commented, Nov 19, 2020 (1 reaction)

@ivirshup

OK, that was the problem. I just happened to choose two genes that are zero across all cells, so reordering them is undetectable via equality testing. Choosing populated genes shows the expected behavior.
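
The effect can be reproduced with a small dense array. This is an illustrative sketch with made-up values, using plain NumPy in place of an anndata `X` matrix:

```python
import numpy as np

# Columns 0 and 1 are all-zero "genes"; column 2 is populated.
X = np.array([[0, 0, 5],
              [0, 0, 7]])

# Swapping the two all-zero columns leaves the matrix element-wise identical,
# so an inequality test cannot see the reordering.
zero_swap = X[:, [1, 0, 2]]
zero_swap_detected = bool((X != zero_swap).any())

# Swapping a populated column with a zero column changes the values,
# so the same test does detect it.
populated_swap = X[:, [2, 1, 0]]
populated_swap_detected = bool((X != populated_swap).any())

print(zero_swap_detected, populated_swap_detected)
```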

brianhie commented, Nov 18, 2020 (1 reaction)

I just wrote up a new version of Scanorama that preserves .obs in the corrected output.

You can also now just do integration (which benchmarks better than “batch correction”) if you are only concerned with clustering/visualization. Scanorama v1.7 just adds this to the adatas as adata.obsm['X_scanorama'].

Hope that helps!
