ValueError: Can only assign numpy ndarrays to .obsm['X_scanorama'], not objects of class <class 'scipy.sparse.csr.csr_matrix'>
Hi! I would like to perform batch correction on data that has already been pre-processed with library-size normalization and log-transformation. The input is a list of AnnData objects whose adata.X field stores this data as an np.ndarray. Yet when I run the list through scanorama.correct_scanpy(), I get this ValueError:
"ValueError: Can only assign numpy ndarrays to .obsm['X_scanorama'], not objects of class <class 'scipy.sparse.csr.csr_matrix'>".
Code:
import scanorama
import scanpy as sc
from scanpy import *
import numpy as np
# increase width of cells
from IPython.core.display import display, HTML
display(HTML("<style>.container { width:70% !important; }</style>"))
from scipy import sparse

# import data
train4ds_adata = sc.read("./tests/data/pancreas.h5ad", backup_url="https://goo.gl/V29FNk")

# scanorama takes separate adata objects, therefore separate the data into subsets.
def subsetBatches(adata):
    """
    Arguments:
        adata(obj): Annotation data object from scanpy package containing all batches, cell-labels and batch labels.
            note: batches should be stored under adata.obs['sample']. Celltype should be stored under adata.obs['celltype'].
    returns:
        batches(dict): A dict of {batchName: batchSubset}.
    """
    print("Seperating into Batches...")
    batches = {}
    samples = np.unique(adata.obs['sample'])
    for sample in samples:
        # subset the data
        data = adata[adata.obs['sample'] == sample]
        # deepcopy
        data.uns = data._adata_ref._uns
        batches.update({sample: data})
    for batchName, batchData in batches.items():
        print(batchName)
        print(batchData)
        print()
    print("Complete.")
    return batches

# subset
batches = subsetBatches(train4ds_adata)
print()

# generate list of adatas
adatas = [batch for batch in batches.values()]

# check input type is ndarray
print("The data type for each adata.X field is: ")
for adata in adatas:
    print(adata)
    print()
    print(type(adata.X))
    print()
print(adatas[0].X)

# Batch correction
# important note, gene order is not preserved in scanorama
corrected = scanorama.correct_scanpy(adatas)
Output:
Seperating into Batches...
Baron
AnnData object with n_obs × n_vars = 8569 × 2448
    obs: 'celltype', 'sample', 'n_genes', 'batch', 'n_counts', 'louvain'
    var: 'n_cells-0', 'n_cells-1', 'n_cells-2', 'n_cells-3'
    uns: 'celltype_colors', 'louvain', 'neighbors', 'pca', 'sample_colors'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'

Muraro
AnnData object with n_obs × n_vars = 2126 × 2448
    obs: 'celltype', 'sample', 'n_genes', 'batch', 'n_counts', 'louvain'
    var: 'n_cells-0', 'n_cells-1', 'n_cells-2', 'n_cells-3'
    uns: 'celltype_colors', 'louvain', 'neighbors', 'pca', 'sample_colors'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'

Segerstolpe
AnnData object with n_obs × n_vars = 3363 × 2448
    obs: 'celltype', 'sample', 'n_genes', 'batch', 'n_counts', 'louvain'
    var: 'n_cells-0', 'n_cells-1', 'n_cells-2', 'n_cells-3'
    uns: 'celltype_colors', 'louvain', 'neighbors', 'pca', 'sample_colors'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'

Wang
AnnData object with n_obs × n_vars = 635 × 2448
    obs: 'celltype', 'sample', 'n_genes', 'batch', 'n_counts', 'louvain'
    var: 'n_cells-0', 'n_cells-1', 'n_cells-2', 'n_cells-3'
    uns: 'celltype_colors', 'louvain', 'neighbors', 'pca', 'sample_colors'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'

Complete.

The data type for each adata.X field is: 
AnnData object with n_obs × n_vars = 8569 × 2448
    obs: 'celltype', 'sample', 'n_genes', 'batch', 'n_counts', 'louvain'
    var: 'n_cells-0', 'n_cells-1', 'n_cells-2', 'n_cells-3'
    uns: 'celltype_colors', 'louvain', 'neighbors', 'pca', 'sample_colors'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'

<class 'numpy.ndarray'>

AnnData object with n_obs × n_vars = 2126 × 2448
    obs: 'celltype', 'sample', 'n_genes', 'batch', 'n_counts', 'louvain'
    var: 'n_cells-0', 'n_cells-1', 'n_cells-2', 'n_cells-3'
    uns: 'celltype_colors', 'louvain', 'neighbors', 'pca', 'sample_colors'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'

<class 'numpy.ndarray'>

AnnData object with n_obs × n_vars = 3363 × 2448
    obs: 'celltype', 'sample', 'n_genes', 'batch', 'n_counts', 'louvain'
    var: 'n_cells-0', 'n_cells-1', 'n_cells-2', 'n_cells-3'
    uns: 'celltype_colors', 'louvain', 'neighbors', 'pca', 'sample_colors'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'

<class 'numpy.ndarray'>

AnnData object with n_obs × n_vars = 635 × 2448
    obs: 'celltype', 'sample', 'n_genes', 'batch', 'n_counts', 'louvain'
    var: 'n_cells-0', 'n_cells-1', 'n_cells-2', 'n_cells-3'
    uns: 'celltype_colors', 'louvain', 'neighbors', 'pca', 'sample_colors'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'

<class 'numpy.ndarray'>

[[-0.18548188 1.2636875 -0.27988526 ... -0.1822604 -0.12235923 -0.11705018]
 [-0.18548188 -0.41321668 -0.27988526 ... -0.1822604 -0.12235923 -0.11705018]
 [-0.18548188 -0.41321668 -0.27988526 ... -0.1822604 -0.12235923 -0.11705018]
 ...
 [-0.18548188 -0.41321668 -0.27988526 ... -0.1822604 -0.12235923 -0.11705018]
 [-0.18548188 -0.41321668 -0.27988526 ... -0.1822604 -0.12235923 -0.11705018]
 [-0.18548188 -0.41321668 -0.27988526 ... -0.1822604 -0.12235923 -0.11705018]]
Found 2448 genes among all datasets
[[0. 0.10959548 0.17633066 0.02362205]
 [0. 0. 0.55644403 0.06771654]
 [0. 0. 0. 0.36062992]
 [0. 0. 0. 0. ]]
Processing datasets (1, 2)
Processing datasets (2, 3)
Processing datasets (0, 2)
Processing datasets (0, 1)
ValueError                                Traceback (most recent call last)
<ipython-input-69-d7805f11908e> in <module>
     53 #Bacth correction
     54 #important note, gene order is not preserved in scanorama
---> 55 corrected = scanorama.correct_scanpy(adatas)

~/miniconda3/envs/keras2/lib/python3.6/site-packages/scanorama/scanorama.py in correct_scanpy(adatas, **kwargs)
    216     new_adatas = []
    217     for i, adata in enumerate(adatas):
--> 218         adata.obsm['X_scanorama'] = datasets[i]
    219         new_adatas.append(adata)
    220

~/miniconda3/envs/keras2/lib/python3.6/site-packages/anndata/base.py in __setitem__(self, key, arr)
    117             raise ValueError(
    118                 'Can only assign numpy ndarrays to .{}[{!r}], not objects of class {}'
--> 119                 .format(self._attr, key, type(arr))
    120             )
    121         if arr.ndim == 1:

ValueError: Can only assign numpy ndarrays to .obsm['X_scanorama'], not objects of class <class 'scipy.sparse.csr.csr_matrix'>
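
The traceback shows anndata's check rejecting a scipy.sparse matrix where it expects a NumPy array. For anyone on a version without the fix, one possible workaround is to densify the matrix before it is stored. The helper below is only a minimal sketch under that assumption, not code from scanorama or anndata: to_dense_array and _FakeSparse are hypothetical names, and the stand-in class exists only so the example runs without SciPy installed. It assumes the sparse object exposes toarray(), as SciPy sparse matrices do.

```python
import numpy as np


def to_dense_array(matrix):
    """Return `matrix` as a numpy.ndarray.

    Hypothetical helper for illustration only. Objects exposing a
    ``toarray()`` method (as SciPy sparse matrices do) are densified;
    anything else is passed through ``np.asarray``.
    """
    if hasattr(matrix, "toarray"):
        # Sparse input: materialise every entry, including the zeros.
        return np.asarray(matrix.toarray())
    # Dense input (ndarray or nested lists): asarray is enough.
    return np.asarray(matrix)


class _FakeSparse:
    """Tiny stand-in for a sparse matrix (assumption: real code would
    pass a scipy.sparse.csr_matrix here instead)."""

    def __init__(self, dense):
        self._dense = np.asarray(dense)

    def toarray(self):
        return self._dense.copy()


embedding = to_dense_array(_FakeSparse([[0.0, 1.5], [2.0, 0.0]]))
print(type(embedding).__name__, embedding.shape)
```

With real data, the same call could wrap each matrix before assignment, for example adata.obsm['X_scanorama'] = to_dense_array(datasets[i]). Keep in mind that densifying a large matrix can take substantially more memory than the sparse form.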
Top GitHub Comments
Hi brianhie,
Apologies for the late reply. I shut down the kernel and reloaded everything, and the package has been working correctly since the patch.
Thank you! Best,
Dean.
Thank you! When I saved my 10x data as .txt files and loaded with read.table as in your script it actually worked. Interesting.
Thank you for the quick reply!
On Wed, Oct 2, 2019 at 8:00 AM brianhie notifications@github.com wrote:
– Edroaldo