Error in whitelist
Hello, thank you for such a fantastic tool. I am trying to extract the counts from one of our 10X Genomics datasets.
However, when I run the whitelist command I get the following error:
# output generated by whitelist --stdin d1_2_S2_L005_R1_001_short.fastq --bc-pattern=CCCCCCCCCCCCCCCCNNNNNNNNNN --method=umis --log2stderr
# job started at Wed Nov 22 10:47:21 2017 on L-UL-C02QK5VLG8WN.local -- 7d4facbe-56fe-43b5-bbf3-ba91bb0a5ab9
# pid: 3172, system: Darwin 17.2.0 Darwin Kernel Version 17.2.0: Fri Sep 29 18:27:05 PDT 2017; root:xnu-4570.20.62~3/RELEASE_X86_64 x86_64
# blacklist_tsv : None
# cell_number : False
# compresslevel : 6
# error_correct_threshold : 1
# expect_cells : False
# extract_method : string
# filter_cell_barcodes : False
# log2stderr : True
# loglevel : 1
# method : umis
# pattern : CCCCCCCCCCCCCCCCNNNNNNNNNN
# pattern2 : None
# plot_prefix : None
# prime3 : None
# random_seed : None
# read2_in : None
# short_help : None
# stderr : <_io.TextIOWrapper name='<stderr>' mode='w' encoding='UTF-8'>
# stdin : <_io.TextIOWrapper name='d1_2_S2_L005_R1_001_short.fastq' mode='r' encoding='UTF-8'>
# stdlog : <_io.TextIOWrapper name='<stderr>' mode='w' encoding='UTF-8'>
# stdout : <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
# subset_reads : 100000000
# timeit_file : None
# timeit_header : None
# timeit_name : all
# whitelist_tsv : None
2017-11-22 10:47:21,331 INFO Starting barcode extraction
2017-11-22 10:47:21,385 INFO Parsed 0 reads
2017-11-22 10:47:24,746 INFO Parsed 100000 reads
Traceback (most recent call last):
File "myHome/anaconda/bin/umi_tools", line 6, in <module>
sys.exit(umi_tools.umi_tools.main())
File "myHome/anaconda/lib/python3.5/site-packages/umi_tools/umi_tools.py", line 59, in main
module.main(sys.argv)
File "myHome/anaconda/lib/python3.5/site-packages/umi_tools/whitelist.py", line 310, in main
barcode_values = ReadExtractor.getBarcodes(read1)
File "myHome/anaconda/lib/python3.5/site-packages/umi_tools/umi_methods.py", line 688, in _getBarcodesString
umi_quals = [bc_qual1[x] for x in self.umi_bases]
File "myHome/anaconda/lib/python3.5/site-packages/umi_tools/umi_methods.py", line 688, in <listcomp>
umi_quals = [bc_qual1[x] for x in self.umi_bases]
IndexError: string index out of range
The reads in Read 1 are 26 bp long.
Could you please let me know how I can fix this?
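The traceback ends with an index running past the end of the quality string, so one quick check is whether every read in the file really reaches the 26 positions the barcode pattern expects (16 cell barcode bases plus 10 UMI bases). The following is a minimal Python sketch of that check, not part of UMI-tools. The helper name read_length_counts and the in-memory example reads are hypothetical, and it assumes a plain four-line-per-record FASTQ.

```python
from collections import Counter
import io


def read_length_counts(handle):
    """Count sequence lengths in a FASTQ stream (assumes 4 lines per record)."""
    counts = Counter()
    for i, line in enumerate(handle):
        if i % 4 == 1:  # the second line of each record holds the sequence
            counts[len(line.strip())] += 1
    return counts


# Made-up reads standing in for the real file: one of 26 bp, one of 20 bp.
records = [("r1", "A" * 26), ("r2", "A" * 20)]
example = io.StringIO(
    "".join(f"@{name}\n{seq}\n+\n{'I' * len(seq)}\n" for name, seq in records)
)

# Length of the --bc-pattern from the command: 16 C positions + 10 N positions.
pattern_length = len("C" * 16 + "N" * 10)

counts = read_length_counts(example)
short_reads = sum(n for length, n in counts.items() if length < pattern_length)
print("length counts:", dict(counts))
print("reads shorter than the pattern:", short_reads)
```

To run it on the real data, the io.StringIO object could be replaced with open("d1_2_S2_L005_R1_001_short.fastq"). Any non-zero count of short reads would be consistent with the IndexError above.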
Top GitHub Comments
I think the best solution here is to change the extract method to regex with
--extract-method=regex
and provide the following regex pattern:
--bc-pattern="(?P<cell_1>.{16})(?P<umi_1>.{10})"
Reads shorter than 26 bp will not match the regex, but they will also not throw an error. The logfile will contain lines describing how many reads matched and how many did not.

Yes. Some reads are too short in our dataset.
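To illustrate why the regex approach skips short reads instead of crashing, here is a small Python sketch. It uses the standard-library re module purely for demonstration and is not UMI-tools' own extraction code. The split_read helper and the example sequences are hypothetical.

```python
import re

# Same pattern as suggested for --bc-pattern: 16 cell barcode bases, 10 UMI bases.
pattern = re.compile(r"(?P<cell_1>.{16})(?P<umi_1>.{10})")


def split_read(seq):
    """Return (cell barcode, UMI) if the read matches the pattern, else None."""
    match = pattern.match(seq)
    if match is None:
        return None  # read too short: skipped rather than raising an error
    return match.group("cell_1"), match.group("umi_1")


full_read = "ACGTACGTACGTACGT" + "TTTTTTTTTT"  # made-up 26 bp read
short_read = full_read[:20]                    # made-up truncated read

print(split_read(full_read))
print(split_read(short_read))

# Fixed-position indexing, by contrast, fails on the truncated read.
try:
    short_read[25]
except IndexError as err:
    print("IndexError:", err)
```

The full-length read is split into its two named groups, while the truncated read simply yields no match. Indexing position 25 of the truncated read raises the same kind of IndexError seen in the traceback.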