sample frequency spectrum
Add support for site frequency spectrum. Needs a specific C algorithm for efficiency.
I think we should call it the “sample_frequency_spectrum”, since it’s counting samples. site_frequency_spectrum is actively misleading I think, as well as leading to absurd function names like site_site_frequency_spectrum and branch_site_frequency_spectrum. I’m open to other names though.
See also #196.
Issue Analytics
- State:
- Created 4 years ago
- Reactions:1
- Comments:28 (26 by maintainers)
Uh-oh, you are right. I didn’t think about this hard enough, and suggested above a folding method that collapses [i1, i2] + [n1-i1, i2] + [i1, n2-i2] + [n1-i1, n2-i2]; but this is not correct, because (for instance) [i, i] and [n1-i, i] are clearly different, since the two pops agree in the first case but disagree in the second.
But, about nonequivalence: if we don’t know the ancestral allele, all that can be said is that we can’t distinguish [i1, i2, ..., ik] from [n1-i1, n2-i2, ..., nk-ik]; so isn’t the right thing to do to return a “lower triangular” array, e.g. with the upper triangle zeroed? I.e., the output has out[i1, ..., ik] = 0 if i1 + ... + ik > (n1 + ... + nk)/2? (need to check that)
Thanks for catching this, @KLohse!
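To make the “lower triangular” proposal above concrete, here is a minimal NumPy sketch. It is not the library’s implementation: the function name fold_joint_sfs is hypothetical, and the handling of entries that sit exactly on the boundary (i1 + ... + ik equal to (n1 + ... + nk)/2) is an assumption, since the comment above leaves that case open. The sketch splits the pooled count evenly between such an entry and its mirror image so that the total count is preserved.

```python
import numpy as np


def fold_joint_sfs(sfs):
    """Fold a joint SFS of shape (n1+1, ..., nk+1) (hypothetical helper).

    Entry [i1, ..., ik] is pooled with its mirror [n1-i1, ..., nk-ik], and
    every entry with i1 + ... + ik > (n1 + ... + nk)/2 is set to zero.
    """
    sfs = np.asarray(sfs, dtype=float)
    # Flip every axis: mirrored[i1, ..., ik] == sfs[n1-i1, ..., nk-ik].
    mirrored = sfs[tuple(slice(None, None, -1) for _ in sfs.shape)]
    # index_sum[i1, ..., ik] == i1 + ... + ik.
    index_sum = np.indices(sfs.shape).sum(axis=0)
    half = sum(dim - 1 for dim in sfs.shape) / 2

    folded = sfs + mirrored
    # Zero the "upper triangle".
    folded[index_sum > half] = 0
    # Assumption: boundary entries share the pooled count with their mirror
    # (which is also on the boundary), so each keeps half of it.
    folded[index_sum == half] /= 2
    return folded


# Two pops with n1 = n2 = 2 samples each.
example = np.arange(9).reshape(3, 3)
folded_example = fold_joint_sfs(example)
```

Note that [i, i] and [n1-i, i] stay in separate cells here; only the pair that is genuinely indistinguishable without the ancestral allele is merged.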
I agree that both the 0th and the nth categories need to be included for consistency; both become relevant for the general case of the AFS for multiple pops. Thinking about implementing the folding of the AFS in the most general way, it seems that one would have to specify a reference population with respect to whose AFS the folding happens. In other words, with n pops, there are n ways of folding (and the various folded spectra are not equivalent).