ValueError: Can only assign numpy ndarrays to .obsm['X_scanorama'], not objects of class <class 'scipy.sparse.csr.csr_matrix'>


Hi! I would like to perform batch correction on data that has already been pre-processed with library-size normalization and log-transformation. The input is a list of AnnData objects, each with its adata.X field storing the pre-processed data as an np.ndarray. Yet when I run this list through scanorama.correct_scanpy(), I get the following ValueError:

"ValueError: Can only assign numpy ndarrays to .obsm['X_scanorama'], not objects of class <class 'scipy.sparse.csr.csr_matrix'>".

Code:

import scanorama
import scanpy as sc
from scanpy import *
import numpy as np
#increase width of cells
from IPython.core.display import display, HTML
display(HTML("<style>.container { width:70% !important; }</style>"))
from scipy import sparse


#import data
train4ds_adata = sc.read("./tests/data/pancreas.h5ad", backup_url="https://goo.gl/V29FNk")

#scanorama takes separate adata objects, therefore split the data into per-batch subsets.
def subsetBatches(adata):
    """
        Arguments:

        adata(obj): Annotation data object from scanpy package containing all batches, cell-labels and batch labels. 
                    note: batches should be stored under adata.obs['sample']. Celltype should be stored under adata.obs['celltype'].

        returns: 

        batches(dict): A dict of {batchName: batchSubset}.
    
    """
    print("Seperating into Batches...")
    batches = {}
    samples = np.unique(adata.obs['sample'])
    
    for sample in samples:
        #subset the data
        data = adata[adata.obs['sample'] == sample]
        #deepcopy
        data.uns = data._adata_ref._uns
        batches.update({sample: data})
    
    for batchName, batchData in  batches.items():
        print(batchName)
        print(batchData)
    print()
    print("Complete.")    
    return batches

#subset
batches = subsetBatches(train4ds_adata)
print()
#generate list of adatas
adatas = [batch for batch in batches.values()]

#check input type is ndarray
print("The data type for each adata.X field is: ")

for adata in adatas:
    print(adata)
    print()
    print(type(adata.X))
    print()

print(adatas[0].X)

#Batch correction
#important note, gene order is not preserved in scanorama
corrected = scanorama.correct_scanpy(adatas)

Output:

Seperating into Batches...
Baron
AnnData object with n_obs × n_vars = 8569 × 2448
    obs: 'celltype', 'sample', 'n_genes', 'batch', 'n_counts', 'louvain'
    var: 'n_cells-0', 'n_cells-1', 'n_cells-2', 'n_cells-3'
    uns: 'celltype_colors', 'louvain', 'neighbors', 'pca', 'sample_colors'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'
Muraro
AnnData object with n_obs × n_vars = 2126 × 2448
    obs: 'celltype', 'sample', 'n_genes', 'batch', 'n_counts', 'louvain'
    var: 'n_cells-0', 'n_cells-1', 'n_cells-2', 'n_cells-3'
    uns: 'celltype_colors', 'louvain', 'neighbors', 'pca', 'sample_colors'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'
Segerstolpe
AnnData object with n_obs × n_vars = 3363 × 2448
    obs: 'celltype', 'sample', 'n_genes', 'batch', 'n_counts', 'louvain'
    var: 'n_cells-0', 'n_cells-1', 'n_cells-2', 'n_cells-3'
    uns: 'celltype_colors', 'louvain', 'neighbors', 'pca', 'sample_colors'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'
Wang
AnnData object with n_obs × n_vars = 635 × 2448
    obs: 'celltype', 'sample', 'n_genes', 'batch', 'n_counts', 'louvain'
    var: 'n_cells-0', 'n_cells-1', 'n_cells-2', 'n_cells-3'
    uns: 'celltype_colors', 'louvain', 'neighbors', 'pca', 'sample_colors'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'

Complete.

The data type for each adata.X field is:
AnnData object with n_obs × n_vars = 8569 × 2448
    obs: 'celltype', 'sample', 'n_genes', 'batch', 'n_counts', 'louvain'
    var: 'n_cells-0', 'n_cells-1', 'n_cells-2', 'n_cells-3'
    uns: 'celltype_colors', 'louvain', 'neighbors', 'pca', 'sample_colors'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'

<class 'numpy.ndarray'>

AnnData object with n_obs × n_vars = 2126 × 2448
    obs: 'celltype', 'sample', 'n_genes', 'batch', 'n_counts', 'louvain'
    var: 'n_cells-0', 'n_cells-1', 'n_cells-2', 'n_cells-3'
    uns: 'celltype_colors', 'louvain', 'neighbors', 'pca', 'sample_colors'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'

<class 'numpy.ndarray'>

AnnData object with n_obs × n_vars = 3363 × 2448
    obs: 'celltype', 'sample', 'n_genes', 'batch', 'n_counts', 'louvain'
    var: 'n_cells-0', 'n_cells-1', 'n_cells-2', 'n_cells-3'
    uns: 'celltype_colors', 'louvain', 'neighbors', 'pca', 'sample_colors'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'

<class 'numpy.ndarray'>

AnnData object with n_obs × n_vars = 635 × 2448
    obs: 'celltype', 'sample', 'n_genes', 'batch', 'n_counts', 'louvain'
    var: 'n_cells-0', 'n_cells-1', 'n_cells-2', 'n_cells-3'
    uns: 'celltype_colors', 'louvain', 'neighbors', 'pca', 'sample_colors'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'

<class 'numpy.ndarray'>

[[-0.18548188 1.2636875 -0.27988526 ... -0.1822604 -0.12235923 -0.11705018]
 [-0.18548188 -0.41321668 -0.27988526 ... -0.1822604 -0.12235923 -0.11705018]
 [-0.18548188 -0.41321668 -0.27988526 ... -0.1822604 -0.12235923 -0.11705018]
 ...
 [-0.18548188 -0.41321668 -0.27988526 ... -0.1822604 -0.12235923 -0.11705018]
 [-0.18548188 -0.41321668 -0.27988526 ... -0.1822604 -0.12235923 -0.11705018]
 [-0.18548188 -0.41321668 -0.27988526 ... -0.1822604 -0.12235923 -0.11705018]]
Found 2448 genes among all datasets
[[0. 0.10959548 0.17633066 0.02362205]
 [0. 0. 0.55644403 0.06771654]
 [0. 0. 0. 0.36062992]
 [0. 0. 0. 0. ]]
Processing datasets (1, 2)
Processing datasets (2, 3)
Processing datasets (0, 2)
Processing datasets (0, 1)


ValueError                                Traceback (most recent call last)
<ipython-input-69-d7805f11908e> in <module>
     53 #Bacth correction
     54 #important note, gene order is not preserved in scanorama
---> 55 corrected = scanorama.correct_scanpy(adatas)

~/miniconda3/envs/keras2/lib/python3.6/site-packages/scanorama/scanorama.py in correct_scanpy(adatas, **kwargs)
    216     new_adatas = []
    217     for i, adata in enumerate(adatas):
--> 218         adata.obsm['X_scanorama'] = datasets[i]
    219         new_adatas.append(adata)
    220

~/miniconda3/envs/keras2/lib/python3.6/site-packages/anndata/base.py in __setitem__(self, key, arr)
    117             raise ValueError(
    118                 'Can only assign numpy ndarrays to .{}[{!r}], not objects of class {}'
--> 119                 .format(self._attr, key, type(arr))
    120             )
    121         if arr.ndim == 1:

ValueError: Can only assign numpy ndarrays to .obsm['X_scanorama'], not objects of class <class 'scipy.sparse.csr.csr_matrix'>
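
The traceback shows that the value rejected by .obsm is a scipy.sparse CSR matrix produced inside correct_scanpy, even though every input adata.X was a dense ndarray. The maintainers' actual patch is not reproduced in this thread, so the following is only a minimal sketch of the general idea of densifying a sparse matrix before handing it to code that insists on a numpy ndarray. The helper name to_dense_ndarray and the tiny example matrix are hypothetical and are not part of scanorama or anndata.

```python
import numpy as np
from scipy import sparse


def to_dense_ndarray(matrix):
    """Return `matrix` as a plain numpy.ndarray (hypothetical helper)."""
    if sparse.issparse(matrix):
        # .toarray() yields an ndarray; .todense() would yield np.matrix instead.
        return matrix.toarray()
    # Dense inputs (lists, ndarrays, np.matrix) are normalized to a base ndarray.
    return np.asarray(matrix)


# Illustrative stand-in for one sparse block of corrected output (made-up values).
corrected_block = sparse.csr_matrix(np.array([[0.0, 1.5], [2.0, 0.0]]))
dense_block = to_dense_ndarray(corrected_block)
print(type(dense_block).__name__, dense_block.shape)
```

Densifying trades memory for compatibility, so on large datasets it may be preferable to upgrade to a package version in which the assignment is handled for you.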

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 9 (4 by maintainers)

Top GitHub Comments

1 reaction
Sum02dean commented, Jul 9, 2019

Hi brianhie,

Apologies for the late reply. I shut down the kernel and reloaded everything, and since the patch the package is working correctly.

Thank you! best,

Dean.
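
The kernel restart mentioned above matters because a running Python session keeps already-imported modules cached in sys.modules, so upgrading a package on disk does not change the code a live kernel is executing. As a rough sketch (assuming Python 3.8+ for importlib.metadata, and assuming the distribution and module are both named "scanorama"), one can check which version is installed and whether the module is already loaded in the current session. The helper names are hypothetical.

```python
import sys
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def already_imported(module_name):
    """Return True if this interpreter session has already loaded the module."""
    return module_name in sys.modules


# "scanorama" as the distribution/module name is an assumption for illustration.
print("installed version:", installed_version("scanorama"))
print("already loaded in this session:", already_imported("scanorama"))
```

If the module is already loaded when the package is upgraded, restarting the kernel is the simplest way to be sure the new code is the one being run.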

0 reactions
edroaldo commented, Oct 2, 2019

Thank you! When I saved my 10x data as .txt files and loaded with read.table as in your script it actually worked. Interesting.

Thank you for the quick reply!

On Wed, Oct 2, 2019 at 8:00 AM brianhie notifications@github.com wrote:

Hi @edroaldo https://github.com/edroaldo, I remember there being an issue with some of the R matrix types being incompatible. Here is an R script that does work: https://github.com/brianhie/scanorama/blob/master/bin/R/scanorama.R Let me know if there’s still an issue!


– Edroaldo
