recipe_monocle on kb_python output
Hi Xiaojie!
When running dyn.pp.recipe_monocle on the output from kb_python (for conventional scRNAseq), some input ENSEMBL IDs find duplicate hits for gene symbols, or find no hit, as shown in the error below:
45 input query terms found dup hits:
[('ENSG00000229425', 2), ('ENSG00000182378', 2), ('ENSG00000178605', 2), ('ENSG00000226179', 2), ('E
192 input query terms found no hit:
['ENSG00000278198', 'ENSG00000273496', 'ENSG00000278782', 'ENSG00000277761', 'ENSG00000277726', 'ENS
Pass "returnall=True" to return complete lists of duplicate or missing query terms.
To avoid errors from dynamo in subsequent steps, a workaround is to use
adata[:, adata.var_names.notnull()]
after running dyn.pp.recipe_monocle. But I was wondering if there is a more permanent solution? It looks like this is an issue with the mygene dependency, which appears to use a different ENSEMBL ID dataset than the one used in kb_python. @Lioscro might have some thoughts too from kb_python?
Thanks, Jorge
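For readers who want to try the workaround described above, here is a minimal sketch of the masking logic using pandas only. The helper name usable_gene_mask and the example symbols are hypothetical. Dropping repeated symbols is an extra step assumed here for illustration; the original workaround only removes null names.

```python
import numpy as np
import pandas as pd


def usable_gene_mask(var_names):
    """Return a boolean mask that keeps genes with a non-null symbol.

    Repeated symbols are also dropped (first occurrence kept); this is an
    assumption added for illustration, not part of the original workaround.
    """
    idx = pd.Index(var_names)
    return np.asarray(idx.notnull() & ~idx.duplicated(keep="first"))


# Hypothetical var_names as they might look after ID-to-symbol conversion:
# some IDs found no hit (null) and some mapped to the same symbol.
symbols = pd.Index(["TP53", None, "ASMTL", "ASMTL", np.nan, "GAPDH"])
mask = usable_gene_mask(symbols)

# With an AnnData object, the mask would be applied along the gene axis:
#   adata = adata[:, usable_gene_mask(adata.var_names)].copy()
```

Keeping only the first occurrence of a repeated symbol discards counts from the other IDs, so this is a stopgap rather than a permanent fix.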
Issue Analytics
- State:
- Created 2 years ago
- Comments:5
Top GitHub Comments
@jmartinrufino Let’s discuss this on the kb-python repo, since it seems it may be a bug. Here’s the issue: https://github.com/pachterlab/kb_python/issues/160
Is working with gene names directly an option? You can provide --gene-names to kb count to have all the vars be gene names (gene names that have multiple gene IDs will be summed).
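To illustrate what "summed" means here, the following is a rough sketch in pandas of collapsing several gene IDs onto one gene name. It is not kb_python's implementation; the IDs, gene names, and counts are made up for the example.

```python
import pandas as pd

# Hypothetical cells x gene-ID count matrix.
counts = pd.DataFrame(
    [[1, 2, 0], [0, 3, 4]],
    index=["cell_1", "cell_2"],
    columns=["ENSG_A", "ENSG_B", "ENSG_C"],
)

# Hypothetical ID-to-name mapping in which two IDs share one gene name.
id_to_name = {"ENSG_A": "GENE1", "ENSG_B": "GENE1", "ENSG_C": "GENE2"}

# Transpose so genes are rows, group rows by gene name, sum, transpose back.
by_name = counts.T.groupby(id_to_name).sum().T
```

After this step every column is a unique gene name, so no ID-to-symbol lookup is needed downstream.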