Add sgkit-tskit repo?

Anyone object to us adding tskit as an import format, so we have an sgkit-tskit repo? I’m happy to do the coding here, and I think it’ll be a useful way to crystallise our general data import strategy.

In principle, we could also export to tskit using tsinfer, but I’m not sure there’s much point.

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Comments: 6

Top GitHub Comments

1 reaction
tomwhite commented, Jul 29, 2021

“In terms of parallelism by variants, this is definitely a weakness in the tskit API at the moment, we want to add some way of doing this well.”

BTW I pushed another commit to https://github.com/pystatgen/sgkit/compare/main...tomwhite:ts_to_zarr for reading genotypes from a TreeSequence in parallel using the approach I sketched out above. It passes tests, but I don’t know how well it performs for larger datasets. It’s probably worth having both the sequential and parallel versions for the moment.
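As a rough illustration of the "parallel by variants" idea discussed here, the sketch below splits the variant index range into contiguous chunks and reads each chunk independently, then stitches the rows back together in order. It is not the code from the linked branch: `chunk_bounds`, `read_parallel`, and `fake_read_chunk` are hypothetical names, and `fake_read_chunk` is only a stand-in for whatever routine would decode genotypes for a range of sites from a TreeSequence.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

# One row per variant, one column per sample.
Genotypes = List[List[int]]


def chunk_bounds(num_variants: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Split [0, num_variants) into contiguous half-open (start, stop) ranges."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (start, min(start + chunk_size, num_variants))
        for start in range(0, num_variants, chunk_size)
    ]


def read_parallel(
    read_chunk: Callable[[int, int], Genotypes],
    num_variants: int,
    chunk_size: int,
    max_workers: int = 4,
) -> Genotypes:
    """Read every chunk concurrently and concatenate rows in variant order."""
    bounds = chunk_bounds(num_variants, chunk_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in submission order, so the
        # concatenated rows keep the original variant ordering.
        chunks = list(pool.map(lambda b: read_chunk(*b), bounds))
    return [row for chunk in chunks for row in chunk]


def fake_read_chunk(start: int, stop: int) -> Genotypes:
    """Hypothetical stand-in: a real reader would decode variants
    [start, stop) from a tree sequence. Here we fabricate two samples."""
    return [[v % 2, (v + 1) % 2] for v in range(start, stop)]


if __name__ == "__main__":
    genotypes = read_parallel(fake_read_chunk, num_variants=10, chunk_size=4)
    print(len(genotypes), "variants read")
```

Keeping a sequential reader alongside something like this makes it easy to check that both paths produce identical output before comparing their performance on larger datasets.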

1 reaction
jeromekelleher commented, Jul 28, 2021

Thanks a million for this @tomwhite, it’s super helpful. I’ll take a good look at the code tomorrow.

In the short term, I think the simplest thing is for me to create a standalone “ga4gh-variant-sim” repo or something, where we just put the code to do this. We’ll want to add extra fields and stuff that are extrinsic to the tree sequence, so it’s just as easy to put all the code in one place for now. I might make a start on this tomorrow, and maybe we could add your code for doing the translation to sgkit in there? Over time, we can see what a more mature interface might look like, and perhaps add it to tsconvert.

In terms of parallelism by variants, this is definitely a weakness in the tskit API at the moment; we want to add some way of doing this well.

