Add variant/sample summary statistic methods
At a minimum, we need to be able to calculate these axis-wise aggregations (for GWAS QC):
- Variants (across samples)
  - genotype counts
  - allele counts and frequency
  - call count and rate
  - HWE p-value (https://github.com/pystatgen/sgkit/issues/28)
- Samples (across variants)
  - call count and rate
  - genotype counts
I would propose that we start by making methods for each that take a single Dataset and return Dataset instances (it will be easier to define the frequencies/rates when the counts are defined in the same functions).
This should consider whatever the solution to https://github.com/pystatgen/sgkit/issues/3 ends up being.
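The variant-wise aggregations listed above can be sketched with plain NumPy. This is only an illustration under stated assumptions, not sgkit's implementation: the function name `variant_summary`, the array layout (variants, samples, ploidy), the use of -1 for a missing allele, the biallelic genotype classes, and returning NaN when no alleles are called are all choices made for this example.

```python
import numpy as np

# Hypothetical toy data: 3 variants x 4 samples x ploidy 2; -1 marks a missing allele.
calls = np.array([
    [[0, 0], [0, 1], [1, 1], [-1, -1]],
    [[0, 1], [0, 1], [0, 0], [0, 0]],
    [[-1, -1], [-1, -1], [-1, -1], [-1, -1]],
])


def variant_summary(calls, n_alleles=2):
    """Per-variant summaries, aggregating across the sample axis (axis 1)."""
    ploidy = calls.shape[-1]
    # A genotype counts as called only if every allele in it is non-missing.
    called = (calls >= 0).all(axis=-1)
    call_count = called.sum(axis=1)
    call_rate = call_count / calls.shape[1]

    # Count each allele over samples and ploidy; missing alleles match nothing.
    allele_count = np.stack(
        [(calls == a).sum(axis=(1, 2)) for a in range(n_alleles)], axis=-1
    )
    # AN: total number of called alleles per variant.
    an = allele_count.sum(axis=-1, keepdims=True)
    # Assumption: frequency is NaN when AN is 0 rather than raising or returning 0.
    allele_frequency = np.where(an > 0, allele_count / np.maximum(an, 1), np.nan)

    # Genotype classes, assuming biallelic variants.
    alt_dosage = (calls > 0).sum(axis=-1)
    n_hom_ref = (called & (alt_dosage == 0)).sum(axis=1)
    n_hom_alt = (called & (alt_dosage == ploidy)).sum(axis=1)
    n_het = call_count - n_hom_ref - n_hom_alt

    return {
        "call_count": call_count,
        "call_rate": call_rate,
        "allele_count": allele_count,
        "allele_frequency": allele_frequency,
        "n_hom_ref": n_hom_ref,
        "n_het": n_het,
        "n_hom_alt": n_hom_alt,
    }


stats = variant_summary(calls)
```

Computing the counts and the derived rates and frequencies in one function, as proposed, means the denominators are already at hand when the ratios are formed.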
Issue Analytics
- State:
- Created 3 years ago
- Reactions:1
- Comments:11 (1 by maintainers)
Top GitHub Comments
It would be best to focus on getting those ones merged first, and at a glance some remaining todos are:
- AN is 0 in allele_frequency
I think #282 can stay on its own since it’s not an aggregation. I turned the original bullet points into a checklist. All that’s left is to add functions that call some of the same internal functions with a different dimension for the sample-wise stats, i.e. we need a sample_stats function like variant_stats.
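The idea of reusing one internal function with a different dimension can be sketched as follows. This is a rough NumPy illustration, not the sgkit code: the helper `_call_stats`, the two wrapper names, and the (variants, samples, ploidy) layout with -1 for missing alleles are all hypothetical.

```python
import numpy as np


def _call_stats(calls, axis):
    """Hypothetical shared helper: call count and rate along `axis`."""
    # Collapse the ploidy axis first, leaving a (variants, samples) mask.
    called = (calls >= 0).all(axis=-1)
    count = called.sum(axis=axis)
    rate = count / called.shape[axis]
    return count, rate


def variant_call_stats(calls):
    # Variant-wise: aggregate across samples (axis 1).
    return _call_stats(calls, axis=1)


def sample_call_stats(calls):
    # Sample-wise: aggregate across variants (axis 0).
    return _call_stats(calls, axis=0)


# Hypothetical toy data: 3 variants x 4 samples x ploidy 2.
calls = np.array([
    [[0, 0], [0, 1], [1, 1], [-1, -1]],
    [[0, 1], [0, 1], [0, 0], [0, 0]],
    [[-1, -1], [-1, -1], [-1, -1], [-1, -1]],
])

variant_count, variant_rate = variant_call_stats(calls)
sample_count, sample_rate = sample_call_stats(calls)
```

Only the aggregation axis differs between the two wrappers, so the definition of a "called" genotype stays in one place.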