setting gene_symbol to select symbol from adata.var fails in sc.pl.umap()

I would like to color the UMAP representation by gene expression values. For ease of use, I'd like to display the gene name instead of the gene_id, which is what adata.var_names contains in my case. Setting gene_symbols = 'Symbol' doesn't seem to work for me, or I am using it the wrong way.

When running sc.pl.umap(adata, gene_symbols = 'Symbol', color = ['Tnnt2'])

I get the following error message:

 ---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-116-e09d49f2528c> in <module>
----> 1 sc.pl.umap(adata, gene_symbols = 'Symbol', color = ['Tnnt2'])

/anaconda3/envs/scanpy/lib/python3.6/site-packages/scanpy/plotting/tools/scatterplots.py in umap(adata, **kwargs)
     27     If `show==False` a `matplotlib.Axis` or a list of it.
     28     """
---> 29     return plot_scatter(adata, basis='umap', **kwargs)
     30 
     31 

/anaconda3/envs/scanpy/lib/python3.6/site-packages/scanpy/plotting/tools/scatterplots.py in plot_scatter(adata, color, use_raw, sort_order, edges, edges_width, edges_color, arrows, arrows_kwds, basis, groups, components, projection, color_map, palette, size, frameon, legend_fontsize, legend_fontweight, legend_loc, ncols, hspace, wspace, title, show, save, ax, return_fig, **kwargs)
    275         color_vector, categorical = _get_color_values(adata, value_to_plot,
    276                                                       groups=groups, palette=palette,
--> 277                                                       use_raw=use_raw)
    278 
    279         # check if higher value points should be plot on top

/anaconda3/envs/scanpy/lib/python3.6/site-packages/scanpy/plotting/tools/scatterplots.py in _get_color_values(adata, value_to_plot, groups, palette, use_raw)
    665         raise ValueError("The passed `color` {} is not a valid observation annotation "
    666                          "or variable name. Valid observation annotation keys are: {}"
--> 667                          .format(value_to_plot, adata.obs.columns))
    668 
    669     return color_vector, categorical

ValueError: The passed `color` Tnnt2 is not a valid observation annotation or variable name. Valid observation annotation keys are: Index(['Sample', 'n_counts', 'n_genes', 'percent_mito', 'log_counts',
       'louvain'],
      dtype='object')

adata.var contains the column “Symbol” and “Tnnt2” is present:

adata.var[adata.var['Symbol'] == 'Tnnt2']

Symbol  type            highly_variable  means     dispersions  dispersions_norm
Tnnt2   protein_coding  True             0.923869  4.090601     11.370244

run with: scanpy==1.3.7 anndata==0.6.17 numpy==1.14.6 scipy==1.1.0 pandas==0.23.4 scikit-learn==0.19.1 statsmodels==0.9.0 python-igraph==0.7.1 louvain==0.6.1

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 17 (10 by maintainers)

Top GitHub Comments

ivirshup commented, Mar 16, 2019 (3 reactions)

Yes (definitely) and yes (I think)

Passing an argument for gene_symbols means that instead of searching .var_names, the column of .var whose name was passed will be searched.

For example, if you had an AnnData object adata with Ensembl IDs as adata.var_names, and HGNC symbols under the column adata.var["gene_name"], the following calls should plot similar things (with different titles):

sc.pl.umap(adata, color=["ENSG00000261371"])
sc.pl.umap(adata, color=["PECAM1"], gene_symbols="gene_name")
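
The lookup described in this comment can be sketched with plain pandas. This is only an illustration of the described behavior, not scanpy's actual implementation: `resolve_var_name` is a hypothetical helper, the `var` table is a toy stand-in for `adata.var`, and the second row ("GENE_ID_2" / "GENE2") is made up.

```python
import pandas as pd

# Toy stand-in for adata.var; the index plays the role of adata.var_names.
# The first row reuses the ID/symbol pair from the comment above,
# the second row is a made-up placeholder.
var = pd.DataFrame(
    {"gene_name": ["PECAM1", "GENE2"]},
    index=["ENSG00000261371", "GENE_ID_2"],
)


def resolve_var_name(var, key, gene_symbols=None):
    """Hypothetical helper (not scanpy API): map `key` to an entry of var.index.

    Without `gene_symbols`, `key` is searched in the index (var_names).
    With `gene_symbols`, `key` is searched in that column of `var` instead.
    """
    if gene_symbols is None:
        if key not in var.index:
            raise KeyError(f"{key!r} not found in var_names")
        return key
    if gene_symbols not in var.columns:
        raise KeyError(f"{gene_symbols!r} is not a column of .var")
    # Boolean mask over rows whose symbol equals the requested key.
    mask = (var[gene_symbols] == key).to_numpy()
    matches = var.index[mask]
    if len(matches) == 0:
        raise KeyError(f"{key!r} not found in .var[{gene_symbols!r}]")
    return matches[0]


gene_id = resolve_var_name(var, "PECAM1", gene_symbols="gene_name")
print(gene_id)
```

If `gene_symbols` does not work in a given installation, the resolved ID could be passed directly, as in the first call above (`sc.pl.umap(adata, color=[gene_id])`), at the cost of the plot title showing the ID rather than the symbol.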

hemantgujar commented, May 4, 2022 (1 reaction)

adata.var["gene_name"]

Traceback (most recent call last):
  File "/sc/arion/work/gujarh01/software/anaconda3/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3361, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'gene_name'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/sc/arion/work/gujarh01/software/anaconda3/lib/python3.9/site-packages/pandas/core/frame.py", line 3458, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/sc/arion/work/gujarh01/software/anaconda3/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3363, in get_loc
    raise KeyError(key) from err
KeyError: 'gene_name'

Did something change in scanpy?
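
The KeyError in this traceback is raised by pandas when the requested column is not present in the DataFrame, and "gene_name" in the earlier comment was only an example column name. A minimal sketch of checking which columns are available before passing one as gene_symbols, assuming a toy `var` table in place of `adata.var` and a hypothetical helper `get_symbol_column`:

```python
import pandas as pd

# Toy stand-in for adata.var with a "Symbol" column, as in the original question.
# The index value "gene_id_1" is a made-up placeholder.
var = pd.DataFrame({"Symbol": ["Tnnt2"]}, index=["gene_id_1"])


def get_symbol_column(var, column):
    """Hypothetical helper: return var[column], or fail with the available column names."""
    if column not in var.columns:
        raise KeyError(
            f"{column!r} is not a column of .var; available columns: {list(var.columns)}"
        )
    return var[column]


print(list(var.columns))                 # inspect what the table actually contains
symbols = get_symbol_column(var, "Symbol")
print(symbols.tolist())
```

With a real AnnData object, the same check would be `adata.var.columns`; the name passed as gene_symbols has to be one of those columns.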
