Error in CollectGcBiasMetrics during ATAQC module
Hello,
I am running the pipeline starting from nodup.bam files. This worked with the previous version of the pipeline. However, after I downloaded the latest version a couple of days ago, the pipeline breaks at Picard's CollectGcBiasMetrics command during the ATAQC module with the following error:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Index 130695116 out of bounds for length 130694994
After some investigation, I found where the out-of-bounds numbers come from. The length of chr10 is 130694993, and there is a read that maps to chr2 at position 130695116. What is not clear is why Picard uses the length of chr10 for a read mapping to chr2. I investigated a little further to find the scripts that call CollectGcBiasMetrics. It looks like encode_ataqc.py processes the input BAM file to remove read groups before it calls the get_gc function, which runs CollectGcBiasMetrics. However, this step seems to be missing in the older version of encode_ataqc.py. I am not sure whether this is the culprit or whether I am doing something wrong. Please advise.
Thank you for your time.
Issue Analytics
- State:
- Created 4 years ago
- Comments: 7 (3 by maintainers)
Top GitHub Comments
Yes, you can keep using the TSV file mm10.tsv. In your input JSON file, define the following to override the reference FASTA defined in the TSV file.

That worked! Thanks @leepc12 and @vervacity.
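For anyone hitting the same error: since overriding the reference FASTA fixed it, a likely (but unconfirmed) cause is that the contigs in the BAM header and in the reference FASTA were not in the same order, so a read on chr2 was checked against the sequence of chr10. The sketch below is one way to test that assumption. It compares the @SQ lines of a SAM/BAM header (as printed by `samtools view -H`) with a FASTA index (the `.fai` file written by `samtools faidx`). The function names and the sample data are hypothetical, and every value other than the chr10 length is an illustrative placeholder.

```python
def parse_sam_header_contigs(header_text):
    """Return [(name, length), ...] from the @SQ lines of a SAM header, in order."""
    contigs = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field after the record type looks like TAG:VALUE, e.g. SN:chr1
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        contigs.append((fields["SN"], int(fields["LN"])))
    return contigs


def parse_fai_contigs(fai_text):
    """Return [(name, length), ...] from a .fai index (columns: name, length, ...)."""
    contigs = []
    for line in fai_text.splitlines():
        if not line.strip():
            continue
        name, length = line.split("\t")[:2]
        contigs.append((name, int(length)))
    return contigs


def find_mismatches(bam_contigs, ref_contigs):
    """Compare contigs position by position and describe every disagreement."""
    problems = []
    for i, (name, length) in enumerate(bam_contigs):
        if i >= len(ref_contigs):
            problems.append(f"index {i}: {name} has no counterpart in the reference")
            continue
        ref_name, ref_len = ref_contigs[i]
        if (name, length) != (ref_name, ref_len):
            problems.append(
                f"index {i}: BAM has {name}:{length}, reference has {ref_name}:{ref_len}"
            )
    return problems


# Illustrative data only: the BAM lists chr1, chr2, chr10, while the
# reference index lists chr1, chr10, chr2 (a plain string sort order).
SAM_HEADER = "\n".join([
    "@HD\tVN:1.6\tSO:coordinate",
    "@SQ\tSN:chr1\tLN:195471971",
    "@SQ\tSN:chr2\tLN:182113224",
    "@SQ\tSN:chr10\tLN:130694993",
])
FAI = "\n".join([
    "chr1\t195471971\t6\t50\t51",
    "chr10\t130694993\t199381424\t50\t51",
    "chr2\t182113224\t332690325\t50\t51",
])

for problem in find_mismatches(parse_sam_header_contigs(SAM_HEADER),
                               parse_fai_contigs(FAI)):
    print(problem)
```

If this reports mismatches for your own header and index, pointing the pipeline at the FASTA that the BAM was actually aligned to, as suggested above, is the safer fix.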