Mean of windowed popgen stats


Currently the windowed aggregation of statistics in popgen.py is hard-coded to use np.sum [1, 2, 3, 4]. Would it be possible to make this aggregation optional or have a span_normalise argument as in Tskit?

Another option would be to record the number of loci in each window so that the user can manually average the sums, or to return both a stat_sum and a stat_mean variable by default.
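As a rough illustration of the request above, the sketch below computes a per-window sum, the number of loci per window, and the resulting mean with plain NumPy. It is not the popgen.py implementation: the function name `windowed_sum_and_mean` is hypothetical, and it assumes each window is given as a half-open range of variant indices (`window_start`, `window_stop`). Note that this averages over the number of loci in the window, which is not the same as Tskit's span_normalise, where the result is divided by the span of the window along the sequence.

```python
import numpy as np


def windowed_sum_and_mean(values, window_start, window_stop):
    """Hypothetical helper: per-window sum, locus count and mean.

    Assumes window i covers variant indices window_start[i]..window_stop[i]-1.
    """
    values = np.asarray(values, dtype=float)
    # Per-window sum, analogous to aggregating with np.sum
    sums = np.array(
        [values[a:b].sum() for a, b in zip(window_start, window_stop)],
        dtype=float,
    )
    # Number of loci in each window, so sums can be averaged afterwards
    counts = np.asarray(window_stop) - np.asarray(window_start)
    # Mean per window; empty windows yield NaN rather than a division by zero
    means = np.divide(sums, counts, out=np.full_like(sums, np.nan), where=counts > 0)
    return sums, counts, means


sums, counts, means = windowed_sum_and_mean(
    [1, 2, 3, 4, 5, 6], window_start=[0, 2, 6], window_stop=[2, 6, 6]
)
```

Returning the counts alongside the sums keeps both options from the issue open: a caller can use the sums directly or divide by the counts to get means.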

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 8

Top GitHub Comments

1 reaction
tomwhite commented, Oct 5, 2021

"Just a thought, would it be worth adding window_base_start and window_base_stop instead?"

Yes, that’s better. Perhaps window_start_position and window_stop_position to echo the variant_position variable?

This will take us into coordinate system territory (#434). (I realised the code I posted also needs to clip start positions to be at least 0 or 1, depending on the coordinate system in use.)

0 reactions
timothymillar commented, Oct 4, 2021

"Should the window functions add a window_base_length variable?"

Just a thought, would it be worth adding window_base_start and window_base_stop instead? That would allow a more direct translation to and from BED, GFF, etc. in future, if that’s useful?
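To make the BED point concrete, here is a minimal sketch of how per-window start and stop positions could be turned into BED-style intervals. It is only one possible interpretation, not code from the thread: the function name `windows_to_bed` is hypothetical, variant positions are assumed to be 1-based, windows are assumed to be half-open ranges of variant indices, and each window is taken to span from its first to its last variant. BED itself uses 0-based, half-open coordinates, which is where the coordinate-system question comes in.

```python
import numpy as np


def windows_to_bed(contig, variant_position, window_start, window_stop):
    """Hypothetical helper: BED-style (contig, start, end) rows per window.

    Assumes 1-based variant positions and half-open variant-index windows.
    """
    pos = np.asarray(variant_position)
    rows = []
    for a, b in zip(window_start, window_stop):
        if b <= a:
            continue  # skip empty windows, which have no defined position
        # 1-based position of the first variant -> 0-based BED start
        start0 = int(pos[a]) - 1
        # 1-based inclusive position of the last variant equals the
        # 0-based exclusive BED end
        end = int(pos[b - 1])
        rows.append((contig, start0, end))
    return rows


rows = windows_to_bed("chr1", [5, 10, 20, 35], window_start=[0, 2], window_stop=[2, 4])
```

If windows were instead defined by fixed base-pair boundaries, the start and stop positions would come from those boundaries rather than from the first and last variant.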

