indexing of AnnData


The task is to write the following preprocessing sequence using an AnnData instance adata.

meanFilter = 0.01
cvFilter = 2
nr_pcs = 50

ddata = adata.to_dict()
X = ddata['X']
# row normalize                                                                                                                                                                  
X = row_norm(X, max_fraction=0.05, mult_with_mean=True)
# filter out genes with mean expression < meanFilter and coefficient of
# variation < cvFilter
X, gene_filter = filter_genes_cv(X, meanFilter, cvFilter)
# compute zscore of filtered matrix                                                                                                                                              
Xz = zscore(X)
# PCA                                                                                                                                                                            
Xpca = pca(Xz, nr_comps=nr_pcs)
# update dictionary                                                                                                                                                              
ddata['X'] = X
ddata['Xpca'] = Xpca
ddata['var_names'] = ddata['var_names'][gene_filter]
sett.m(0, 'Xpca has shape',
    ddata['Xpca'].shape[0], 'x', ddata['Xpca'].shape[1])
from ..ann_data import AnnData
adata = AnnData(ddata)
print(adata.X)

While the previous snippet works just as expected, doing the same without a ddata object leads to unexpected behavior: indexing no longer works as expected. @flying-sheep: could you have a look at why adata['Xpca'] = Xpca throws the following error

>>> adata['Xpca'] = Xpca
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

in this snippet?

X = adata.X
# row normalize                                                                                                                                                                  
X = row_norm(X, max_fraction=0.05, mult_with_mean=True)
# filter out genes with mean expression < meanFilter and coefficient of
# variation < cvFilter
X, gene_filter = filter_genes_cv(X, meanFilter, cvFilter)
# compute zscore of filtered matrix                                                                                                                                              
Xz = zscore(X)
# PCA                                                                                                                                                                            
Xpca = pca(Xz, nr_comps=nr_pcs)
# update adata                                                                                                                                                                   
adata.X = X
adata = adata.var_names[gene_filter] # filter genes                                                                                                                              
adata['Xpca'] = Xpca
sett.m(0, 'Xpca has shape',
    adata['Xpca'].shape[0], 'x', adata['Xpca'].shape[1])
print(adata.X)

I played around quite a bit, but the only solution I got running gave a numerically incorrect result. It’s quite hard to keep this sequence of steps nicely organized.

PS: the snippet appears in scanpy/preprocess/advanced.py and an example would be ./scanpy.py nestorowa16 diffmap -r pp.

Issue Analytics

  • State: closed
  • Created 7 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

falexwolf commented, Feb 9, 2017

ok, of course this is right, let’s discuss in person.

flying-sheep commented, Feb 9, 2017

“something like adata.var = adata.var[gene_filter] should work”

i disagree. adata = adata[:, gene_filter] should work. the object shouldn’t be able to be in an invalid state.
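
The idea behind adata = adata[:, gene_filter] can be illustrated with a minimal sketch. It does not use the real AnnData class: TinyAnnData, its extra dictionary, and the example values are all hypothetical, chosen only to show how a single indexing operation can filter the matrix and the gene names together so that the two never disagree.

```python
from dataclasses import dataclass, field


@dataclass
class TinyAnnData:
    """Hypothetical stand-in for AnnData: a matrix plus per-gene names."""
    X: list            # rows = cells, columns = genes
    var_names: list    # one name per column of X
    extra: dict = field(default_factory=dict)  # per-cell results, e.g. 'Xpca'

    def __post_init__(self):
        # Refuse to build an object whose parts disagree in shape.
        if any(len(row) != len(self.var_names) for row in self.X):
            raise ValueError("every row of X needs one value per var_name")

    def __getitem__(self, index):
        # Only the adata[:, gene_filter] form is sketched here.
        rows, gene_filter = index
        if rows != slice(None):
            raise NotImplementedError("only ':' is supported for rows")
        if len(gene_filter) != len(self.var_names):
            raise IndexError("gene_filter needs one boolean per gene")
        keep = [j for j, flag in enumerate(gene_filter) if flag]
        # X and var_names are subset in the same step and returned together.
        return TinyAnnData(
            X=[[row[j] for j in keep] for row in self.X],
            var_names=[self.var_names[j] for j in keep],
            extra=dict(self.extra),
        )


adata = TinyAnnData(
    X=[[1.0, 0.0, 3.0], [2.0, 0.0, 1.0]],
    var_names=["geneA", "geneB", "geneC"],
)
gene_filter = [True, False, True]

adata = adata[:, gene_filter]          # still a TinyAnnData, now with 2 genes
adata.extra["Xpca"] = [[0.5], [-0.5]]  # placeholder values, not a real PCA
print(adata.var_names)
print(adata.X)
```

By contrast, adata = adata.var_names[gene_filter] in the issue's second snippet rebinds adata to the filtered names rather than to a filtered data object, which is one plausible reading of why the later adata['Xpca'] = Xpca fails with an index error.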
