recipe_monocle on kb_python output

Hi Xiaojie!

When running dyn.pp.recipe_monocle on the output from kb_python (for conventional scRNAseq), some input ENSEMBL IDs find duplicate hits for gene symbols, or find no hit, as shown in the error below:

45 input query terms found dup hits:
	[('ENSG00000229425', 2), ('ENSG00000182378', 2), ('ENSG00000178605', 2), ('ENSG00000226179', 2), ('E
192 input query terms found no hit:
	['ENSG00000278198', 'ENSG00000273496', 'ENSG00000278782', 'ENSG00000277761', 'ENSG00000277726', 'ENS
Pass "returnall=True" to return complete lists of duplicate or missing query terms.

To avoid errors from dynamo in subsequent steps, a workaround is to use adata[:, adata.var_names.notnull()] after running dyn.pp.recipe_monocle. But I was wondering if there is a more permanent solution? It looks like the issue comes from the mygene dependency, which seems to use a different ENSEMBL ID dataset than the one used in kb_python. @Lioscro might have some thoughts too from kb_python?

Thanks, Jorge
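
For readers following along, the workaround above can be sketched on toy data. This is only an illustration of the masking logic, assuming that IDs with no hit end up as missing values in var_names; the gene names, the matrix, and the variable names are all made up, and pandas and NumPy stand in for the real AnnData object.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for adata.var_names after symbol conversion:
# IDs with no hit are ASSUMED to show up as missing values (NaN).
var_names = pd.Index(["TP53", np.nan, "GAPDH", np.nan, "ACTB"])

# Toy expression matrix: 3 cells x 5 genes.
X = np.arange(15).reshape(3, 5)

# Boolean mask that is True wherever a gene name is present.
keep = var_names.notnull()

# Keep only the columns (genes) that still have a name.
X_kept = X[:, keep]
names_kept = var_names[keep]

print(list(names_kept))
print(X_kept.shape)
```

On a real AnnData object the same mask would be applied as in the post, with the result assigned back (for example adata = adata[:, adata.var_names.notnull()].copy()), since the subsetting expression on its own does not modify adata.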

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 5

Top GitHub Comments

2 reactions
Lioscro commented, Feb 26, 2022

@jmartinrufino Let’s discuss this on the kb-python repo, since it seems it may be a bug. Here’s the issue: https://github.com/pachterlab/kb_python/issues/160

2 reactions
Lioscro commented, Feb 22, 2022

Is working with gene names directly an option? You can provide --gene-names to kb count to have all the vars be gene names (gene names that have multiple gene IDs will be summed).
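
To make the "will be summed" behaviour concrete, here is a small sketch of what collapsing several gene IDs onto one gene name could look like. It is not kb-python's implementation, only an illustration with pandas; the IDs, the names, and the id_to_name mapping are hypothetical.

```python
import pandas as pd

# Toy count matrix: rows are cells, columns are hypothetical gene IDs.
counts = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],
    index=["cell1", "cell2"],
    columns=["ENSG_A1", "ENSG_A2", "ENSG_B"],
)

# Hypothetical ID -> gene name mapping; two IDs share the name GENE_A.
id_to_name = {"ENSG_A1": "GENE_A", "ENSG_A2": "GENE_A", "ENSG_B": "GENE_B"}

# Translate each column ID to its gene name.
names = counts.columns.map(id_to_name)

# Group the gene axis by name and sum, so IDs sharing a name are merged.
by_name = counts.T.groupby(names).sum().T

print(by_name)
```

In this toy case the two GENE_A columns are added together per cell, leaving one column per gene name.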
