UMI Barcode Entry and Samtools error on 10X Cell Ranger GTF and BAM files

Hello all! I think I have issues with the BAM and GTF files but do not know how to resolve them. I only started working with Python this fortnight so any help would be useful!

I have scRNA-seq data that was processed with the 10X Genomics Cell Ranger pipeline. The resulting BAM file and the GTF file provided for the mm10 genome (10X reference version 2.1.0, GRCm38, Ensembl 84) do not seem to be compatible with velocyto.

After running the following command (file paths are shortened with ... for readability):

velocyto run -b /.../barcodes.tsv -o /.../b2het_out -m .../mm10_repeat_mask.gtf .../possorted_genome_bam.bam .../genes.gtf

I get the error below, saying that the UMI barcode entry is missing from the BAM file and that I need to run samtools on the file to sort it. According to the tutorial on the velocyto website, this should not be necessary since Cell Ranger already does the sorting.

Any idea how to resolve this issue? Or how to format my GTF files if needed?

2021-07-28 16:37:44,746 - INFO - No SAMPLEID specified, the sample will be called possorted_genome_bam_S19RV (last 5 digits are a random-id to avoid overwriting some other file by mistake)
2021-07-28 16:37:44,746 - DEBUG - Using logic: Default
2021-07-28 16:37:44,749 - INFO - Read 3281 cell barcodes from /home/ali/Dokumente/RPractise/Run_alle_features/Velocity/PythonCodes/B2_HET/barcodes.tsv
2021-07-28 16:37:44,749 - DEBUG - Example of barcode: AAACCTGAGGCTATCT and cell_id: possorted_genome_bam_S19RV:AAACCTGAGGCTATCT-1
2021-07-28 16:37:44,774 - DEBUG - Peeking into /home/ali/Dokumente/RPractise/Run_alle_features/Velocity/PythonCodes/B2_HET/possorted_genome_bam.bam
[E::idx_find_and_load] Could not retrieve index file for '/home/ali/Dokumente/RPractise/Run_alle_features/Velocity/PythonCodes/B2_HET/possorted_genome_bam.bam'
2021-07-28 16:37:44,776 - WARNING - Not found cell and umi barcode in entry 12 of the bam file
2021-07-28 16:37:44,776 - WARNING - Not found cell and umi barcode in entry 19 of the bam file
2021-07-28 16:37:44,776 - WARNING - Not found cell and umi barcode in entry 23 of the bam file
2021-07-28 16:37:44,777 - WARNING - Not found cell and umi barcode in entry 25 of the bam file
2021-07-28 16:37:44,777 - WARNING - Not found cell and umi barcode in entry 82 of the bam file
2021-07-28 16:37:44,777 - WARNING - Not found cell and umi barcode in entry 137 of the bam file
2021-07-28 16:37:44,777 - WARNING - Not found cell and umi barcode in entry 138 of the bam file
2021-07-28 16:37:44,778 - WARNING - Not found cell and umi barcode in entry 218 of the bam file
2021-07-28 16:37:44,778 - WARNING - Not found cell and umi barcode in entry 280 of the bam file
2021-07-28 16:37:44,778 - WARNING - Not found cell and umi barcode in entry 281 of the bam file
2021-07-28 16:37:44,778 - WARNING - Not found cell and umi barcode in entry 282 of the bam file
2021-07-28 16:37:44,778 - WARNING - Not found cell and umi barcode in entry 283 of the bam file
2021-07-28 16:37:44,779 - WARNING - Not found cell and umi barcode in entry 349 of the bam file
2021-07-28 16:37:44,780 - WARNING - Not found cell and umi barcode in entry 558 of the bam file
2021-07-28 16:37:44,780 - WARNING - Not found cell and umi barcode in entry 564 of the bam file
2021-07-28 16:37:44,781 - WARNING - Not found cell and umi barcode in entry 649 of the bam file
2021-07-28 16:37:44,781 - WARNING - Not found cell and umi barcode in entry 654 of the bam file
2021-07-28 16:37:44,781 - WARNING - Not found cell and umi barcode in entry 696 of the bam file
2021-07-28 16:37:44,781 - WARNING - Not found cell and umi barcode in entry 697 of the bam file
2021-07-28 16:37:44,782 - WARNING - Not found cell and umi barcode in entry 796 of the bam file
2021-07-28 16:37:44,782 - WARNING - Not found cell and umi barcode in entry 818 of the bam file
2021-07-28 16:37:44,782 - WARNING - Not found cell and umi barcode in entry 819 of the bam file
2021-07-28 16:37:44,782 - WARNING - Not found cell and umi barcode in entry 821 of the bam file
2021-07-28 16:37:44,782 - WARNING - Not found cell and umi barcode in entry 905 of the bam file
2021-07-28 16:37:44,782 - WARNING - Not found cell and umi barcode in entry 906 of the bam file
2021-07-28 16:37:44,782 - WARNING - Not found cell and umi barcode in entry 907 of the bam file
2021-07-28 16:37:44,782 - WARNING - Not found cell and umi barcode in entry 909 of the bam file
Traceback (most recent call last):
  File "/home/ali/anaconda3/bin/velocyto", line 8, in <module>
    sys.exit(cli())
  File "/home/ali/anaconda3/lib/python3.8/site-packages/click/core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "/home/ali/anaconda3/lib/python3.8/site-packages/click/core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "/home/ali/anaconda3/lib/python3.8/site-packages/click/core.py", line 1668, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ali/anaconda3/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ali/anaconda3/lib/python3.8/site-packages/click/core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "/home/ali/anaconda3/lib/python3.8/site-packages/velocyto/commands/run.py", line 113, in run
    return _run(bamfile=bamfile, gtffile=gtffile, bcfile=bcfile, outputfolder=outputfolder,
  File "/home/ali/anaconda3/lib/python3.8/site-packages/velocyto/commands/_run.py", line 178, in _run
    sorting_process[ni] = subprocess.Popen(command.split(), stdout=subprocess.PIPE)
  File "/home/ali/anaconda3/lib/python3.8/subprocess.py", line 858, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/home/ali/anaconda3/lib/python3.8/subprocess.py", line 1704, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'samtools'
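
The traceback above ends in a FileNotFoundError for 'samtools', which is what Python's subprocess module raises when the executable cannot be found on PATH. A minimal sketch for checking this before launching velocyto, using only the standard library (the helper name `require_tool` is hypothetical and not part of velocyto):

```python
import shutil


def require_tool(name: str) -> str:
    """Return the full path of an executable, or raise if it is not on PATH."""
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(
            f"{name!r} was not found on PATH; install it or activate the "
            "environment that provides it before running velocyto"
        )
    return path


if __name__ == "__main__":
    try:
        print("samtools found at", require_tool("samtools"))
    except RuntimeError as err:
        print(err)
```

If the check fails, the sorting step that velocyto starts through subprocess.Popen would be expected to fail in the same way as in the traceback.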

msalaciak commented, Aug 8, 2021

Looking through the output, I noticed I also had this error: [E::idx_find_and_load] Could not retrieve index file for 'cellsorted_possorted_genome_bam.bam'. I’m not 100% sure it’s important, though.

I found this in the samtools manual for the sort function.

Note that if the sorted output file is to be indexed with samtools index, the default coordinate sort must be used. Thus the -n and -t options are incompatible with samtools index.

Since we’re sorting by CB I don’t think we need to index it.
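
The sort by cell barcode discussed here can also be run by hand ahead of time. A sketch, assuming samtools is installed and that velocyto picks up a file with a `cellsorted_` prefix next to the input BAM (as the file name in the error message suggests). The function names and the thread and memory defaults are illustrative and not taken from the thread:

```python
from pathlib import Path
from typing import List


def cellsorted_path(bam: str) -> Path:
    """Path of the barcode-sorted BAM, placed next to the input file."""
    p = Path(bam)
    return p.with_name("cellsorted_" + p.name)


def build_sort_command(bam: str, threads: int = 4, mem_per_thread: str = "2G") -> List[str]:
    """Build a `samtools sort` command that orders records by the CB tag."""
    return [
        "samtools", "sort",
        "-t", "CB",             # sort by the value of the CB (cell barcode) tag
        "-@", str(threads),     # number of additional sorting threads
        "-m", mem_per_thread,   # memory limit per thread
        "-O", "BAM",            # write BAM output
        "-o", str(cellsorted_path(bam)),
        bam,
    ]
```

Passing the returned list to `subprocess.run(..., check=True)` would run the sort; the list form avoids shell quoting problems with paths that contain spaces.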

As for the “Not found cell and umi barcode in entry __ of the bam file” warning, which I also get: I think it’s because we’re passing the filtered barcodes as input alongside the BAM file, right? So I would expect some to be absent.

I’ll wait for someone else to chime in, but I think this makes the most sense!
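
One way to test the guess about the barcode warnings is to count how many alignments actually lack the tags. A sketch, assuming the Cell Ranger convention of `CB` for the cell barcode and `UB` for the UMI, and working on SAM text lines such as the output of `samtools view`. The function name `count_untagged` is hypothetical:

```python
from typing import Iterable, Tuple


def count_untagged(
    sam_lines: Iterable[str], tags: Tuple[str, ...] = ("CB", "UB")
) -> Tuple[int, int]:
    """Return (records_missing_a_tag, total_records) for SAM text lines."""
    missing = 0
    total = 0
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        total += 1
        # Optional fields start at column 12 and look like TAG:TYPE:VALUE.
        fields = line.rstrip("\n").split("\t")[11:]
        present = {field[:2] for field in fields}
        if not all(tag in present for tag in tags):
            missing += 1
    return missing, total
```

If records without these tags show up in the BAM itself, the warnings may reflect reads that never received a barcode or UMI tag rather than anything about the filtered barcode list, though that would need confirming against the velocyto source.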

denvercal1234GitHub commented, Nov 17, 2021

@AAA-3, did you end up using run or run10x? And did you have to run samtools sort on the BAM file separately, then index the resulting cellsorted_possorted BAM file, and then run the “run” command?

I ran into the out-of-memory issue described in issue #320. Any thoughts would be really appreciated!

Thank you so much!
