indexing of AnnData
The task is to write the following preprocessing sequence using an AnnData instance adata.
meanFilter = 0.01
cvFilter = 2
nr_pcs = 50
ddata = adata.to_dict()
X = ddata['X']
# row normalize
X = row_norm(X, max_fraction=0.05, mult_with_mean=True)
# filter out genes with mean expression < meanFilter and coefficient of
# variation < cvFilter
X, gene_filter = filter_genes_cv(X, meanFilter, cvFilter)
# compute zscore of filtered matrix
Xz = zscore(X)
# PCA
Xpca = pca(Xz, nr_comps=nr_pcs)
# update dictionary
ddata['X'] = X
ddata['Xpca'] = Xpca
ddata['var_names'] = ddata['var_names'][gene_filter]
sett.m(0, 'Xpca has shape',
       ddata['Xpca'].shape[0], 'x', ddata['Xpca'].shape[1])
from ..ann_data import AnnData
adata = AnnData(ddata)
print(adata.X)
While the previous snippet works just as expected, doing the same without a ddata object leads to unexpected behavior: indexing no longer works as expected. @flying-sheep: could you have a look at why adata['Xpca'] = Xpca throws an IndexError
>>> adata['Xpca'] = Xpca
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
in the following snippet?
X = adata.X
# row normalize
X = row_norm(X, max_fraction=0.05, mult_with_mean=True)
# filter out genes with mean expression < meanFilter and coefficient of
# variation < cvFilter
X, gene_filter = filter_genes_cv(X, meanFilter, cvFilter)
# compute zscore of filtered matrix
Xz = zscore(X)
# PCA
Xpca = pca(Xz, nr_comps=nr_pcs)
# update adata
adata.X = X
adata = adata.var_names[gene_filter] # filter genes
adata['Xpca'] = Xpca
sett.m(0, 'Xpca has shape',
       adata['Xpca'].shape[0], 'x', adata['Xpca'].shape[1])
print(adata.X)
I played around quite a bit, but the only solution that I got running produced a numerically incorrect result. It’s quite hard to keep this sequence of steps nicely organized.
PS: the snippet appears in scanpy/preprocess/advanced.py and an example would be ./scanpy.py nestorowa16 diffmap -r pp.
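One plausible reading of the traceback, offered as an assumption rather than a confirmed diagnosis: the line adata = adata.var_names[gene_filter] rebinds adata to the filtered names, which is presumably a plain NumPy array, and NumPy raises exactly this IndexError when a non-structured array is indexed with a string key. The sketch below reproduces that with toy arrays standing in for adata.X and adata.var_names; the data and the name not_adata are made up for illustration.

```python
import numpy as np

# Toy stand-ins for adata.X and adata.var_names (hypothetical data).
X = np.arange(12.0).reshape(3, 4)  # 3 observations x 4 genes
var_names = np.array(["g0", "g1", "g2", "g3"])
gene_filter = np.array([True, False, True, False])

# Rebinding the name to the filtered var_names leaves a plain ndarray,
# so a string key is no longer a valid index.
not_adata = var_names[gene_filter]
try:
    not_adata["Xpca"] = np.zeros((3, 2))
    error = None
except IndexError as exc:
    error = exc

# Applying the same mask to the columns of X and to var_names keeps
# the data matrix and its annotation aligned.
X_sub = X[:, gene_filter]
var_sub = var_names[gene_filter]
```

After this runs, error holds the IndexError, while X_sub and var_sub describe the same two genes.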
ok, of course this is right, let’s discuss in person.
i disagree.
adata = adata[:, gene_filter]
should work. the object shouldn’t be able to be in an invalid state.
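The idea that adata[:, gene_filter] should subset the data matrix and the variable names together, so that the object can never hold mismatched pieces, can be illustrated with a small sketch. ToyAnnData is a hypothetical stand-in written for this example and is not the real AnnData class; it only supports the adata[:, mask] form and uses plain Python lists.

```python
class ToyAnnData:
    """Hypothetical, minimal stand-in for AnnData (not the real class)."""

    def __init__(self, X, var_names):
        # Refuse to construct an object whose matrix and names disagree.
        if any(len(row) != len(var_names) for row in X):
            raise ValueError("each row of X needs one value per variable")
        self.X = [list(row) for row in X]
        self.var_names = list(var_names)

    def __getitem__(self, index):
        # Only the adata[:, mask] form is sketched here.
        rows, mask = index
        if rows != slice(None):
            raise NotImplementedError("only adata[:, mask] is sketched")
        if len(mask) != len(self.var_names):
            raise IndexError("mask length must match the number of variables")
        keep = [j for j, flag in enumerate(mask) if flag]
        # X and var_names are filtered in one step and returned as a new object.
        return ToyAnnData(
            [[row[j] for j in keep] for row in self.X],
            [self.var_names[j] for j in keep],
        )


adata = ToyAnnData([[1, 2, 3], [4, 5, 6]], ["g0", "g1", "g2"])
gene_filter = [True, False, True]
adata = adata[:, gene_filter]  # still a ToyAnnData, with two variables left
```

Because the subset is built through the constructor, the consistency check runs on every slice, which is one way to keep such an object out of an invalid state.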