Mutating Annotation data / JamsFrames
Before making sawdust and creating a PR, I wanted to discuss the (my?) problem, and how I plan to address it. In the following, I use `ann` to refer to an instance of an `Annotation`, and `obs` to refer to an instance of `Observation`.
The situation I’m facing:
- I would like to transform the observations in an annotation; either in-place or returning a copy would do.
- AFAICT, the best (only?) way to change the data in an annotation is to (1) get a copy of the JamsFrame, (2) modify the object separately, and (3) reassign via `ann.data.from_dataframe`.
- Modifying the contents of a pandas DataFrame (view vs copy) is perhaps one of the most frustrating / non-intuitive facets of the library.
A few observations:
- We don’t actually have an observation iterator (?), e.g. there’s no way to cleanly say `[fx(obs) for obs in ann]`.
- The `data` object is persistent and owned by the annotation, which makes mutating it directly seem like an option.
- It’s not obvious to a user that `ann.data` is a subclassed `JamsFrame` and not a `DataFrame`. However, direct reassignment is currently allowed (`ann.data = ann.data[:20]`), which changes the object type (`JamsFrame -> DataFrame`) and breaks things later.
My proposal:
- Either (a) make an `Annotation` object directly iterable (with an `__iter__` & `__next__`), such that it returns references to `Observation` objects like above, or (b) add an explicit `ann.observations(copy=False)` method that allows for one to iterate over the content in an `Annotation`.
- Leave the internal (private) data structure of observations as a pure-Python object, i.e. a list.
- Make `data` an on-demand view of `ann._observations` rather than an attribute, achieved by decorating `data` as a `@property`. We could also add a setter, which would coerce `DataFrame`-like objects to be JamsFrames (via `from_dataframe`), helping to avoid user error.
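
To make the three points above concrete, here is a rough, pure-Python sketch of the shape I have in mind. It is not the actual jams code: the `Observation` namedtuple and its field names are stand-ins I made up for illustration, and `data` returns a plain list of dicts where the real thing would build a JamsFrame (and the setter would go through `from_dataframe`).

```python
from collections import namedtuple

# Hypothetical stand-in for an observation; the field names are an assumption.
Observation = namedtuple("Observation", ["time", "duration", "value", "confidence"])


class Annotation:
    """Toy annotation that stores its observations in a plain list."""

    def __init__(self, observations=None):
        self._observations = list(observations or [])

    def __iter__(self):
        # Option (a): iterate directly over the stored observations.
        return iter(self._observations)

    def observations(self, copy=False):
        # Option (b): explicit accessor. copy=True returns a new list, so the
        # caller cannot add to or remove from the internal one.
        return list(self._observations) if copy else self._observations

    @property
    def data(self):
        # On-demand view, rebuilt from _observations on every access.
        # A list of dicts stands in for the JamsFrame here.
        return [obs._asdict() for obs in self._observations]

    @data.setter
    def data(self, rows):
        # Coerce row-like mappings back into Observation objects, which is
        # the role from_dataframe would play for DataFrame-like input.
        self._observations = [Observation(**row) for row in rows]


ann = Annotation([
    Observation(0.0, 1.0, "C:maj", 1.0),
    Observation(1.0, 1.0, "G:maj", 0.9),
])
values = [obs.value for obs in ann]  # the comprehension from above now works
ann.data = ann.data[:1]              # reassignment is coerced, not stored as-is
```

With this layout, a sliced reassignment cannot silently swap the underlying type, because the setter always rebuilds the private list.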
thoughts?
update: I see now there are two options: update a dataframe and reassign, or step through the iterrows of the JamsFrame and create a new annotation, like in Annotation.trim.
Issue Analytics
- Created 7 years ago
- Comments:16 (8 by maintainers)
I’ll be honest, I failed to appreciate the true depth of the situation, and yea, there are some major architectural considerations here. The Right Answer™️ is the one that only extends the current interface (because I agree it is powerful and quick for many things).
Like I said, I am unblocked, but will continue mulling this over and look forward to chatting about this in the future (Rochester?).
Alright. I’ll close this one out then, but reopen it if something specific to data mutation crops up.