Summary statistics IO and methods
The MRC IEU at Bristol has a specification for storing GWAS summary statistics in a VCF file.
While I certainly have mixed feelings about using VCF files as a container format, they have done the hard work of providing tens of thousands of GWAS summary statistics VCFs at the OpenGWAS project.
There are more details in:
- The MRC IEU OpenGWAS data infrastructure (2020)
- The variant call format provides efficient and robust storage of GWAS summary statistics (2021)
It would be great to figure out how to map the data in these GWAS VCF files to the sgkit
data model and to write some methods on top of them.
Issue Analytics
- State:
- Created 3 years ago
- Comments: 7 (6 by maintainers)
Top GitHub Comments
So, looking at a few example GWAS-VCF files, they’re just putting per-variant sumstats into the SAMPLE fields. It appears some files use the INFO field for variant-specific metadata like minor allele frequency that we might want to pick up as well, but otherwise, I don’t think parsing is going to be too challenging.
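To make the parsing point concrete, here is a minimal sketch of reading one GWAS-VCF data line with only the Python standard library. It assumes a single sample column per file (one trait per file) and takes the field keys from the FORMAT column of the line itself instead of hard-coding them. The function names `parse_gwas_vcf_line` and `_convert`, the keys shown (`ES`, `SE`, `LP`, `AF`, `ID`), and the example line are illustrative assumptions, not taken from a real file or from sgkit.

```python
from typing import Dict, Union

Value = Union[float, str, bool, None]


def _convert(raw: str) -> Value:
    """Turn a raw VCF token into None (missing), a float, or a string."""
    if raw == ".":
        return None
    try:
        return float(raw)
    except ValueError:
        return raw


def parse_gwas_vcf_line(line: str) -> Dict[str, object]:
    """Parse one tab-separated VCF data line carrying per-variant sumstats.

    Assumption: the summary statistics live in the first sample column,
    keyed by the FORMAT column, as described in the comment above.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 10:
        raise ValueError("expected at least 10 tab-separated columns")
    chrom, pos, vid, ref, alt, _qual, _filter, info, fmt = fields[:9]

    keys = fmt.split(":")
    values = fields[9].split(":")
    # VCF allows trailing sample fields to be dropped; treat them as missing.
    stats = {
        key: _convert(values[i]) if i < len(values) else None
        for i, key in enumerate(keys)
    }

    # INFO holds variant-level metadata; flags without "=" become True.
    info_dict: Dict[str, Value] = {}
    if info != ".":
        for item in info.split(";"):
            key, sep, raw = item.partition("=")
            info_dict[key] = _convert(raw) if sep else True

    return {
        "chrom": chrom,
        "pos": int(pos),
        "id": vid,
        "ref": ref,
        "alt": alt,
        "info": info_dict,
        "stats": stats,
    }


# Made-up line for illustration only; the values are not from a real study.
example_line = (
    "1\t49298\trs10399793\tT\tC\t.\tPASS\tAF=0.6235\t"
    "ES:SE:LP:AF:ID\t0.0071:0.0092:0.36:0.6235:rs10399793"
)
record = parse_gwas_vcf_line(example_line)
```

In practice a VCF library would handle headers, compression, and indexing; this only shows that the per-variant record maps onto a flat set of named fields.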
The hard part for us is figuring out if we want to define a blessed data model for sumstats and start adding operations that operate upon it.
New standard for summary statistics: https://ebispot.github.io/gwas-blog/new-standard-for-gwas-summary-statistics