Mean of windowed popgen stats
Currently the windowed aggregation of statistics in popgen.py is hard-coded to use np.sum [1, 2, 3, 4]. Would it be possible to make this aggregation optional, or to have a span_normalise argument as in Tskit?
Another option would be to record the number of loci in each window so that the user can manually average the sums, or to return both a stat_sum and a stat_mean variable by default.
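A minimal sketch of that last option, assuming each window is described by half-open index bounds into a per-variant array. The function name and its arguments are hypothetical and are not part of popgen.py; only stat_sum and stat_mean echo the names suggested above.

```python
import numpy as np

def window_sum_count_mean(values, window_start, window_stop):
    """Aggregate per-variant values over windows [start, stop) of variant indices.

    Hypothetical helper for illustration only; returns the per-window sum,
    the number of loci per window, and the per-window mean.
    """
    values = np.asarray(values, dtype=float)
    window_start = np.asarray(window_start)
    window_stop = np.asarray(window_stop)

    # Prefix sums let every window sum be read off with two lookups.
    csum = np.concatenate([[0.0], np.cumsum(values)])
    stat_sum = csum[window_stop] - csum[window_start]

    # Number of loci in each window, so users can normalise the sums themselves.
    n_loci = window_stop - window_start

    # Empty windows get NaN rather than raising a division warning.
    with np.errstate(invalid="ignore", divide="ignore"):
        stat_mean = np.where(n_loci > 0, stat_sum / n_loci, np.nan)
    return stat_sum, n_loci, stat_mean
```

Returning the count alongside the sum keeps the current behaviour available while letting the caller choose the normalisation (per locus here; per base pair would need the window span instead).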
Issue Analytics
- State:
- Created 2 years ago
- Comments:8
Top GitHub Comments
Yes, that’s better. Perhaps window_start_position and window_stop_position, to echo the variant_position variable?

This will take us into coordinate system territory (#434). (I realised the code I posted also needs to clip start positions to be at least 0 or 1, depending on the coordinate system in use.)
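The posted code is not reproduced in this thread, so the following is only a sketch of the clipping step under the assumption that window start positions are computed arithmetically and can therefore fall below the first valid coordinate. The helper name and the one_based flag are hypothetical.

```python
import numpy as np

def clip_window_starts(window_start_position, one_based=True):
    """Clip window start positions to the lowest valid coordinate.

    Hypothetical helper: the lower bound is 1 for a 1-based coordinate
    system and 0 for a 0-based one.
    """
    lowest = 1 if one_based else 0
    return np.maximum(np.asarray(window_start_position), lowest)
```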
Just a thought, would it be worth adding window_base_start and window_base_stop instead? That would allow a more direct translation to and from BED, GFF, etc. in future, if that’s useful?
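For context on what such a translation involves: BED intervals are 0-based and half-open, while GFF intervals are 1-based and inclusive. A minimal sketch, assuming the window bounds are stored as 1-based inclusive base positions (that storage convention is an assumption, not something settled in this thread, and the function name is hypothetical):

```python
import numpy as np

def one_based_inclusive_to_bed(window_base_start, window_base_stop):
    """Convert 1-based inclusive [start, stop] to BED-style 0-based half-open.

    Hypothetical helper: only the start shifts down by one; the inclusive
    1-based stop equals the exclusive 0-based end.
    """
    bed_start = np.asarray(window_base_start) - 1
    bed_end = np.asarray(window_base_stop)
    return bed_start, bed_end
```

Under the same assumption, GFF output would need no arithmetic at all, since GFF already uses 1-based inclusive coordinates.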