Add variant/sample summary statistic methods

At a minimum, we need to be able to calculate these axis-wise aggregations (for GWAS QC):

I would propose that we start by making methods for each that take a single Dataset and return Dataset instances (it will be easier to define the frequencies/rates when the counts are defined in the same functions).

This should consider whatever the solution to https://github.com/pystatgen/sgkit/issues/3 ends up being.

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 11 (1 by maintainers)

Top GitHub Comments

2 reactions
eric-czech commented, Aug 26, 2020

What still needs doing to merge #102 and are there any other aggregations we should add?

It would be best to focus on getting those merged first. At a glance, some remaining todos are:

  • Count alleles should be based on https://github.com/pystatgen/sgkit/pull/114 instead
  • This line should just sum instead of stack
  • Some thought should be put into what happens when AN is 0 in allele_frequency
  • Those methods could use more testing
  • They also need documentation and examples
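
One way to handle the AN = 0 case mentioned above is to return NaN for variants with no called alleles instead of dividing by zero. This is only a minimal NumPy sketch, not the code in #102 or #114: the function body, the `(n_variants, n_alleles)` count layout, and the choice of NaN as the fill value are all assumptions made for illustration.

```python
import numpy as np

def allele_frequency(allele_count):
    """Per-variant allele frequencies from an (n_variants, n_alleles) count array.

    Hypothetical helper for illustration; not sgkit's implementation.
    """
    ac = np.asarray(allele_count, dtype=float)
    # AN: total number of called alleles per variant, kept 2-D for broadcasting
    an = ac.sum(axis=1, keepdims=True)
    # Start from NaN so variants with AN == 0 stay undefined
    af = np.full_like(ac, np.nan)
    # Divide only where AN > 0; other entries keep their NaN fill value
    np.divide(ac, an, out=af, where=an > 0)
    return af

# Three variants, two alleles each; the second variant has no called alleles
counts = np.array([[3, 1], [0, 0], [2, 2]])
freqs = allele_frequency(counts)
```

Returning NaN keeps "no data" distinguishable from a true frequency of 0, but whether that is the right convention for the library is exactly the open question in the bullet above.
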

0 reactions
eric-czech commented, Oct 5, 2020

I believe #282 is related; @eric-czech as part of issue triage is it worth enumerating what’s left to close this issue out?

I think #282 can stay on its own since it’s not an aggregation. I turned the original bullet points into a checklist. All that’s left is to add functions that call some of the same internal functions with a different dimension for the sample-wise stats, i.e. we need a sample_stats function like variant_stats.
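
To illustrate the "same internal function, different dimension" idea, here is a rough NumPy sketch. The names `variant_stats` and `sample_stats` come from the comment above, but their bodies here, the shared helper `_call_stats`, the `(variants, samples, ploidy)` array layout, and the use of negative values to mark missing alleles are assumptions for the example rather than the library's actual code.

```python
import numpy as np

def _call_stats(genotypes, axis):
    """Hypothetical shared helper: called-genotype count and call rate along `axis`."""
    gt = np.asarray(genotypes)
    # Assumption: a genotype call is missing if any of its alleles is negative
    called = (gt >= 0).all(axis=-1)
    n_called = called.sum(axis=axis)
    return {"n_called": n_called, "call_rate": n_called / called.shape[axis]}

def variant_stats(genotypes):
    # Aggregate over samples (axis 1): one value per variant
    return _call_stats(genotypes, axis=1)

def sample_stats(genotypes):
    # Aggregate over variants (axis 0): one value per sample
    return _call_stats(genotypes, axis=0)

# 2 variants x 3 samples x ploidy 2; -1 marks a missing allele
gt = np.array([
    [[0, 1], [-1, -1], [1, 1]],
    [[0, 0], [0, 1], [1, -1]],
])
per_variant = variant_stats(gt)
per_sample = sample_stats(gt)
```

Keeping the count and the rate in one helper means both public functions stay thin wrappers that differ only in the axis they reduce over.
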
