UMI Barcode Entry and Samtools error on 10X Cell Ranger GTF and BAM files
Hello all! I think I have issues with the BAM and GTF files but do not know how to resolve them. I only started working with Python this fortnight, so any help would be useful!
I have scRNA-seq data that was processed with the 10X Genomics Cell Ranger pipeline. The resulting BAM file and the provided GTF file for the mm10 genome (10X reference version 2.1.0, GRCm38, Ensembl 84) do not seem to be compatible with Velocyto.
After running the following command (... stands in for the full file paths, to keep the command readable):
velocyto run -b /.../barcodes.tsv -o /.../b2het_out -m .../mm10_repeat_mask.gtf .../possorted_genome_bam.bam .../genes.gtf
I get the error below, saying that the UMI barcode entry is missing from the BAM file and that I need to run samtools on the file to sort it. According to the tutorial on the Velocyto website, this should not be necessary since Cell Ranger already does the sorting.
Any idea how to resolve this issue? Or how to format my GTF files if needed?
2021-07-28 16:37:44,746 - INFO - No SAMPLEID specified, the sample will be called possorted_genome_bam_S19RV (last 5 digits are a random-id to avoid overwriting some other file by mistake)
2021-07-28 16:37:44,746 - DEBUG - Using logic: Default
2021-07-28 16:37:44,749 - INFO - Read 3281 cell barcodes from /home/ali/Dokumente/RPractise/Run_alle_features/Velocity/PythonCodes/B2_HET/barcodes.tsv
2021-07-28 16:37:44,749 - DEBUG - Example of barcode: AAACCTGAGGCTATCT and cell_id: possorted_genome_bam_S19RV:AAACCTGAGGCTATCT-1
2021-07-28 16:37:44,774 - DEBUG - Peeking into /home/ali/Dokumente/RPractise/Run_alle_features/Velocity/PythonCodes/B2_HET/possorted_genome_bam.bam
[E::idx_find_and_load] Could not retrieve index file for '/home/ali/Dokumente/RPractise/Run_alle_features/Velocity/PythonCodes/B2_HET/possorted_genome_bam.bam'
2021-07-28 16:37:44,776 - WARNING - Not found cell and umi barcode in entry 12 of the bam file
2021-07-28 16:37:44,776 - WARNING - Not found cell and umi barcode in entry 19 of the bam file
2021-07-28 16:37:44,776 - WARNING - Not found cell and umi barcode in entry 23 of the bam file
2021-07-28 16:37:44,777 - WARNING - Not found cell and umi barcode in entry 25 of the bam file
2021-07-28 16:37:44,777 - WARNING - Not found cell and umi barcode in entry 82 of the bam file
2021-07-28 16:37:44,777 - WARNING - Not found cell and umi barcode in entry 137 of the bam file
2021-07-28 16:37:44,777 - WARNING - Not found cell and umi barcode in entry 138 of the bam file
2021-07-28 16:37:44,778 - WARNING - Not found cell and umi barcode in entry 218 of the bam file
2021-07-28 16:37:44,778 - WARNING - Not found cell and umi barcode in entry 280 of the bam file
2021-07-28 16:37:44,778 - WARNING - Not found cell and umi barcode in entry 281 of the bam file
2021-07-28 16:37:44,778 - WARNING - Not found cell and umi barcode in entry 282 of the bam file
2021-07-28 16:37:44,778 - WARNING - Not found cell and umi barcode in entry 283 of the bam file
2021-07-28 16:37:44,779 - WARNING - Not found cell and umi barcode in entry 349 of the bam file
2021-07-28 16:37:44,780 - WARNING - Not found cell and umi barcode in entry 558 of the bam file
2021-07-28 16:37:44,780 - WARNING - Not found cell and umi barcode in entry 564 of the bam file
2021-07-28 16:37:44,781 - WARNING - Not found cell and umi barcode in entry 649 of the bam file
2021-07-28 16:37:44,781 - WARNING - Not found cell and umi barcode in entry 654 of the bam file
2021-07-28 16:37:44,781 - WARNING - Not found cell and umi barcode in entry 696 of the bam file
2021-07-28 16:37:44,781 - WARNING - Not found cell and umi barcode in entry 697 of the bam file
2021-07-28 16:37:44,782 - WARNING - Not found cell and umi barcode in entry 796 of the bam file
2021-07-28 16:37:44,782 - WARNING - Not found cell and umi barcode in entry 818 of the bam file
2021-07-28 16:37:44,782 - WARNING - Not found cell and umi barcode in entry 819 of the bam file
2021-07-28 16:37:44,782 - WARNING - Not found cell and umi barcode in entry 821 of the bam file
2021-07-28 16:37:44,782 - WARNING - Not found cell and umi barcode in entry 905 of the bam file
2021-07-28 16:37:44,782 - WARNING - Not found cell and umi barcode in entry 906 of the bam file
2021-07-28 16:37:44,782 - WARNING - Not found cell and umi barcode in entry 907 of the bam file
2021-07-28 16:37:44,782 - WARNING - Not found cell and umi barcode in entry 909 of the bam file
Traceback (most recent call last):
File "/home/ali/anaconda3/bin/velocyto", line 8, in <module>
sys.exit(cli())
File "/home/ali/anaconda3/lib/python3.8/site-packages/click/core.py", line 1137, in __call__
return self.main(*args, **kwargs)
File "/home/ali/anaconda3/lib/python3.8/site-packages/click/core.py", line 1062, in main
rv = self.invoke(ctx)
File "/home/ali/anaconda3/lib/python3.8/site-packages/click/core.py", line 1668, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/ali/anaconda3/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/ali/anaconda3/lib/python3.8/site-packages/click/core.py", line 763, in invoke
return __callback(*args, **kwargs)
File "/home/ali/anaconda3/lib/python3.8/site-packages/velocyto/commands/run.py", line 113, in run
return _run(bamfile=bamfile, gtffile=gtffile, bcfile=bcfile, outputfolder=outputfolder,
File "/home/ali/anaconda3/lib/python3.8/site-packages/velocyto/commands/_run.py", line 178, in _run
sorting_process[ni] = subprocess.Popen(command.split(), stdout=subprocess.PIPE)
File "/home/ali/anaconda3/lib/python3.8/subprocess.py", line 858, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "/home/ali/anaconda3/lib/python3.8/subprocess.py", line 1704, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'samtools'
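The last line of the traceback is a FileNotFoundError raised by subprocess.Popen for 'samtools', which suggests that the samtools executable could not be found on PATH in the environment velocyto was run from. A minimal sketch for checking this from the same Python environment is shown here; the helper name find_tool is hypothetical and not part of velocyto.

```python
import shutil
from typing import Optional


def find_tool(name: str = "samtools") -> Optional[str]:
    """Return the absolute path of `name` if it is on PATH, otherwise None.

    Hypothetical helper for illustration; velocyto itself does not provide it.
    """
    return shutil.which(name)


path = find_tool("samtools")
if path is None:
    # This is the situation that would make subprocess.Popen raise
    # FileNotFoundError: [Errno 2] No such file or directory: 'samtools'
    print("samtools was not found on PATH; install it or activate the environment that has it")
else:
    print(f"samtools found at {path}")
```

If this reports that samtools is missing, installing samtools into the environment that holds velocyto and re-running the same command would be the first thing to try.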
Issue Analytics
- Created 2 years ago
- Comments:11
Top GitHub Comments
Looking through the output, I noticed I also had this error:
[E::idx_find_and_load] Could not retrieve index file for 'cellsorted_possorted genome bam.bam'
I'm not 100% sure it's important, though. I found this in the samtools manual for the sort function:
Note that if the sorted output file is to be indexed with samtools index, the default coordinate sort must be used. Thus the -n and -t options are incompatible with samtools index.
Since we’re sorting by CB I don’t think we need to index it.
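For reference, the tag-based sort discussed here can be sketched as a command built in Python. This is an illustration under assumptions, not velocyto's own code: the helper name cell_sort_command, the thread count, and the per-thread memory value are hypothetical, and the cellsorted_ output prefix is assumed from the file names mentioned in this thread. The samtools sort options used (-t, -O, -@, -m, -o) are standard.

```python
import shlex
from pathlib import Path
from typing import List


def cell_sort_command(bam_path: str, threads: int = 4, mem_per_thread: str = "2048M") -> List[str]:
    """Build a `samtools sort` command that sorts a BAM file by the CB tag.

    Hypothetical helper: the output is written next to the input with a
    `cellsorted_` prefix (assumed naming). A tag-sorted BAM is not
    coordinate-sorted, so it cannot be indexed with `samtools index`.
    """
    bam = Path(bam_path)
    out = bam.with_name(f"cellsorted_{bam.name}")
    return [
        "samtools", "sort",
        "-t", "CB",            # sort by the value of the CB (cell barcode) tag
        "-O", "BAM",           # write BAM output
        "-@", str(threads),    # number of additional threads
        "-m", mem_per_thread,  # maximum memory per thread
        "-o", str(out),
        str(bam),
    ]


cmd = cell_sort_command("possorted_genome_bam.bam")
print(shlex.join(cmd))
# To actually run it (requires samtools on PATH):
# import subprocess; subprocess.run(cmd, check=True)
```

Because -m is a per-thread limit, the total memory used grows with the thread count, so lowering either value is one thing to try when memory is tight.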
Also, regarding the
Not found cell and umi barcode in entry __ of the bam file
warning, which I also get: I think it's because we're using the filtered barcodes as input along with the BAM file, right? So I would imagine some being absent. I'll wait for someone else to chime in, but I think this makes the most sense!
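One way to look into this warning is to count how many of the first reads in the BAM file carry no cell barcode or UMI tag at all. The sketch below assumes the Cell Ranger tag names CB and UB and that pysam is installed; the function names are hypothetical. The check looks only at the tags on each read and does not consult barcodes.tsv.

```python
from typing import Iterable, Tuple


def count_untagged(reads: Iterable, tags: Tuple[str, ...] = ("CB", "UB"), limit: int = 1000) -> Tuple[int, int]:
    """Count reads among the first `limit` that lack at least one of `tags`.

    `reads` can be any iterable of objects with a `has_tag(name)` method,
    such as pysam.AlignedSegment. Returns (missing, inspected).
    """
    missing = 0
    inspected = 0
    for read in reads:
        if inspected >= limit:
            break
        inspected += 1
        if not all(read.has_tag(tag) for tag in tags):
            missing += 1
    return missing, inspected


def count_untagged_in_bam(bam_path: str, limit: int = 1000) -> Tuple[int, int]:
    """Apply count_untagged to a BAM file (pysam is imported only when called)."""
    import pysam  # third-party dependency, assumed to be installed

    # Plain iteration reads the file sequentially and does not need an index.
    with pysam.AlignmentFile(bam_path, "rb") as bam:
        return count_untagged(bam, limit=limit)


# Example call (the path is a placeholder):
# missing, inspected = count_untagged_in_bam("possorted_genome_bam.bam")
# print(f"{missing} of the first {inspected} reads lack a CB or UB tag")
```

If only a small share of reads lack the tags, the warnings by themselves are probably not what stops the run.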
@AAA-3: Did you end up using run or run10x? And did you have to samtools sort the BAM file separately, then index that cellsorted_possorted BAM file to get the index, and then run the “run” command?
I ran into the out-of-memory issue described in issue #320. Any thoughts would be really appreciated!
Thank you so much!