setting gene_symbol to select symbol from adata.var fails in sc.pl.umap()

I would like to color the UMAP representation by gene expression values. For ease of use, I'd like to display the gene name instead of the gene_id, which is what adata.var_names contains in my case. Setting gene_symbols = 'Symbol' doesn't seem to work for me, or I am using it the wrong way.

When running sc.pl.umap(adata, gene_symbols = 'Symbol', color = ['Tnnt2'])

I get the following error message:

 ---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-116-e09d49f2528c> in <module>
----> 1 sc.pl.umap(adata, gene_symbols = 'Symbol', color = ['Tnnt2'])

/anaconda3/envs/scanpy/lib/python3.6/site-packages/scanpy/plotting/tools/scatterplots.py in umap(adata, **kwargs)
     27     If `show==False` a `matplotlib.Axis` or a list of it.
     28     """
---> 29     return plot_scatter(adata, basis='umap', **kwargs)
     30 
     31 

/anaconda3/envs/scanpy/lib/python3.6/site-packages/scanpy/plotting/tools/scatterplots.py in plot_scatter(adata, color, use_raw, sort_order, edges, edges_width, edges_color, arrows, arrows_kwds, basis, groups, components, projection, color_map, palette, size, frameon, legend_fontsize, legend_fontweight, legend_loc, ncols, hspace, wspace, title, show, save, ax, return_fig, **kwargs)
    275         color_vector, categorical = _get_color_values(adata, value_to_plot,
    276                                                       groups=groups, palette=palette,
--> 277                                                       use_raw=use_raw)
    278 
    279         # check if higher value points should be plot on top

/anaconda3/envs/scanpy/lib/python3.6/site-packages/scanpy/plotting/tools/scatterplots.py in _get_color_values(adata, value_to_plot, groups, palette, use_raw)
    665         raise ValueError("The passed `color` {} is not a valid observation annotation "
    666                          "or variable name. Valid observation annotation keys are: {}"
--> 667                          .format(value_to_plot, adata.obs.columns))
    668 
    669     return color_vector, categorical

ValueError: The passed `color` Tnnt2 is not a valid observation annotation or variable name. Valid observation annotation keys are: Index(['Sample', 'n_counts', 'n_genes', 'percent_mito', 'log_counts',
       'louvain'],
      dtype='object')

adata.var contains the column “Symbol” and “Tnnt2” is present:

adata.var[adata.var['Symbol'] == 'Tnnt2']

Symbol	type	highly_variable	means	dispersions	dispersions_norm
Tnnt2	protein_coding	True	0.923869	4.090601	11.370244

run with: scanpy==1.3.7 anndata==0.6.17 numpy==1.14.6 scipy==1.1.0 pandas==0.23.4 scikit-learn==0.19.1 statsmodels==0.9.0 python-igraph==0.7.1 louvain==0.6.1

Top GitHub Comments

ivirshup commented, Mar 16, 2019

Yes (definitely) and yes (I think)

Passing an argument for gene_symbols means that instead of searching .var_names, the column of .var whose name was passed will be searched.

For example, if you had an AnnData object adata with Ensembl IDs as adata.var_names, and HGNC symbols under the column adata.var["gene_name"], the following calls should plot similar things (with different titles):

sc.pl.umap(adata, color=["ENSG00000261371"])
sc.pl.umap(adata, color=["PECAM1"], gene_symbols="gene_name")
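
To make the lookup described above concrete, here is a minimal sketch using only pandas. It is not scanpy's actual implementation. The DataFrame var stands in for adata.var, the helper symbol_to_var_name is hypothetical, and the second row (ID and symbol) is a made-up placeholder. Only the PECAM1 / ENSG00000261371 pair comes from the example in this comment.

```python
import pandas as pd

# Stand-in for adata.var: the index plays the role of adata.var_names,
# and "gene_name" is the (assumed) column that stores gene symbols.
var = pd.DataFrame(
    {"gene_name": ["PECAM1", "GENE_B"]},          # "GENE_B" is a placeholder symbol
    index=["ENSG00000261371", "ID_PLACEHOLDER"],  # second ID is a placeholder
)


def symbol_to_var_name(var, symbol, gene_symbols):
    """Hypothetical helper: find the index label whose `gene_symbols` column equals `symbol`."""
    if gene_symbols not in var.columns:
        raise KeyError(
            f"{gene_symbols!r} is not a column of var; available: {list(var.columns)}"
        )
    matches = var.index[var[gene_symbols] == symbol]
    if len(matches) == 0:
        raise ValueError(f"{symbol!r} not found in column {gene_symbols!r}")
    return matches[0]


# Searching the "gene_name" column instead of the index
var_name = symbol_to_var_name(var, "PECAM1", "gene_name")
print(var_name)
```

With a real AnnData object, the resolved ID could also be passed straight to color without gene_symbols, at the cost of the plot title showing the ID rather than the symbol.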

hemantgujar commented, May 4, 2022

adata.var["gene_name"]

Traceback (most recent call last):
  File "/sc/arion/work/gujarh01/software/anaconda3/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3361, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'gene_name'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/sc/arion/work/gujarh01/software/anaconda3/lib/python3.9/site-packages/pandas/core/frame.py", line 3458, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/sc/arion/work/gujarh01/software/anaconda3/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3363, in get_loc
    raise KeyError(key) from err
KeyError: 'gene_name'

Did something change in scanpy?
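
For what it's worth, a KeyError like this comes from pandas and only says that the .var DataFrame has no column with that name. The "gene_name" column in the earlier comment was an example, so it may simply not exist in a given object. A small sketch for checking which column to pass as gene_symbols, assuming a stand-in var DataFrame modeled on the original post (the helper find_symbol_column, the candidate names, and the index label are all hypothetical):

```python
import pandas as pd

# Stand-in for adata.var, modeled on the original post where the
# symbol column is called "Symbol". The index label is a placeholder.
var = pd.DataFrame(
    {"Symbol": ["Tnnt2"], "type": ["protein_coding"]},
    index=["GENE_ID_PLACEHOLDER"],
)


def find_symbol_column(var, candidates=("gene_name", "Symbol", "gene_symbols")):
    """Hypothetical helper: return the first candidate column present in var, else None."""
    for name in candidates:
        if name in var.columns:
            return name
    return None


print(list(var.columns))  # inspect what is actually available
column = find_symbol_column(var)
print(column)
```

Whatever name this finds is what would go into gene_symbols=... for that particular object.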
