Edge case issue with simplify and missing data

I’ve just hit an edge case that is preventing me from round-tripping some missing data examples. Suppose that, at an extreme edge of the genome, only a single sample has non-missing data. This can be represented by a tree at that point with only a single branch, connecting that sample to the root. However, if we run simplify() on such a tree sequence, the edge is removed (as it only contains unary nodes). That leaves the sample as an “isolated node”, so the missing data code in https://github.com/tskit-dev/tskit/pull/272/ flags it as a case where the genotype should be set to -1, even though here we do have the information to encode the genotype properly.
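
To make the scenario concrete, here is a minimal toy sketch in plain Python. It does not call tskit at all: the edge tuples, the `is_isolated` and `genotype` helpers, and the node numbering are all hypothetical stand-ins, and the "after" edge list is written by hand to mimic the outcome described above rather than produced by a real simplify() run.

```python
from typing import List, Tuple

# Toy stand-in for an edge table row: (left, right, parent, child).
# This is an illustrative model, not tskit's data structure.
Edge = Tuple[float, float, int, int]

MISSING = -1  # value used to flag a missing genotype


def is_isolated(edges: List[Edge], node: int, position: float) -> bool:
    """Return True if no edge covering `position` touches `node`."""
    for left, right, parent, child in edges:
        if left <= position < right and node in (parent, child):
            return False
    return True


def genotype(edges: List[Edge], sample: int, position: float, allele: int) -> int:
    """Report `allele` unless the sample is isolated at `position`."""
    return MISSING if is_isolated(edges, sample, position) else allele


# Genome is [0, 10). Nodes 0 and 1 are samples, node 2 is the root.
# Sample 1 only has data on [0, 8), so on [8, 10) the tree is a
# single branch joining sample 0 to the root.
before = [(0, 10, 2, 0), (0, 8, 2, 1)]

# Hand-written mimic of the reported outcome: on [8, 10) the
# single-branch edge above sample 0 has been dropped.
after = [(0, 8, 2, 0), (0, 8, 2, 1)]

print(genotype(before, 0, 9, allele=1))  # sample 0 still has a branch at position 9
print(genotype(after, 0, 9, allele=1))   # sample 0 is now isolated at position 9
```

In this toy model, the genotype of sample 0 at position 9 is recoverable from the first edge list but is reported as missing from the second, which is the round-trip failure being described.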

I’m wondering whether this is an issue with the missing data code or with the simplify() code. For example, in simplify() it might be considered reasonable not to drop unary nodes from a sample if they connect that sample to the root? But I’m not sure how the root would be identified in this case.

Ping @jeromekelleher and @petrelharp as they are the simplifying and missing data gurus 😃

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 17 (17 by maintainers)

Top GitHub Comments

2 reactions
petrelharp commented, Aug 8, 2019

Hm: I think that simplify is definitely doing the right thing, as originally defined. That edge isn’t reflecting a genealogical relationship between the samples, which is how we’ve defined things.

1 reaction
petrelharp commented, Aug 8, 2019

I agree, although it was useful to think through.
