Add sgkit-tskit repo?
Does anyone object to us adding tskit as an import format, so that we have an sgkit-tskit repo? I’m happy to do the coding here, and I think it’ll be a useful way to crystallise our general data import strategy.
We could also do export to tskit, in principle, using tsinfer, but I’m not sure there’s much point.
Issue Analytics
- Created 3 years ago
- Comments: 6
Top GitHub Comments
BTW I pushed another commit to https://github.com/pystatgen/sgkit/compare/main...tomwhite:ts_to_zarr for reading genotypes from a TreeSequence in parallel using the approach I sketched out above. It passes tests, but I don’t know how well it performs for larger datasets. It’s probably worth having both the sequential and parallel versions for the moment.
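For readers without the branch to hand, here is a minimal sketch of what reading genotypes in parallel by chunks of variants could look like. It is not the code from the ts_to_zarr branch. The helper names (chunk_bounds, read_chunk, read_genotypes_parallel) are hypothetical, and the only things assumed of the input are that it behaves like a tskit TreeSequence: it has a num_sites attribute and a variants() method yielding objects with a genotypes array. Threads are used only to keep the example short; whether they give any speed-up on real data is untested here.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def chunk_bounds(num_sites, chunk_size):
    """Split the site indices [0, num_sites) into half-open (start, stop) ranges."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (start, min(start + chunk_size, num_sites))
        for start in range(0, num_sites, chunk_size)
    ]


def read_chunk(ts, start, stop):
    """Return one genotype row per site in [start, stop).

    Assumes `ts` behaves like a tskit.TreeSequence: `ts.variants()` yields
    objects with a `genotypes` array. islice still walks the skipped
    variants, so every chunk pays for iterating from site 0.
    """
    return [var.genotypes.copy() for var in islice(ts.variants(), start, stop)]


def read_genotypes_parallel(ts, chunk_size, max_workers=4):
    """Read all genotype rows chunk by chunk, preserving site order."""
    bounds = chunk_bounds(ts.num_sites, chunk_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the order of `bounds`, so rows stay sorted by site.
        chunks = pool.map(lambda b: read_chunk(ts, *b), bounds)
        return [row for chunk in chunks for row in chunk]
```

Each worker opens its own variant iterator and skips ahead to its chunk, so later chunks repeat the work of earlier ones. That cost is why keeping a sequential version alongside the parallel one seems sensible until both have been measured on larger datasets.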
Thanks a million for this @tomwhite, it’s super helpful. I’ll take a good look at the code tomorrow.
In the short term, I think the simplest thing is for me to create a standalone “ga4gh-variant-sim” repo or something, where we just put the code to do this. We’ll want to add extra fields and other data that are extrinsic to the tree sequence, so it’s just as easy to put all the code in one place for now. I might make a start on this tomorrow, and maybe we could add your code for doing the translation to sgkit in there? Over time, we can see what a more mature interface might look like, and perhaps add it to tsconvert.
In terms of parallelism by variants, this is definitely a weakness in the tskit API at the moment, and we want to add some way of doing this well.