Next generation jams
This issue is intended to consolidate many of the long-standing issues and offline discussions we’ve had around revising the JAMS specification for a variety of applications and use cases.
Goals of revising the schema
- Migrate to a fully json-schema compliant spec #178 (instead of hybrid / dynamic namespaces).
- Add versioning to the schema definitions. This way, old files can still validate according to their specified jams version. This in turn makes it easier to evolve the schema without breaking compatibility.
- Simplify (and accelerate) the validation code on the Python side.
Revision phase 1: full json-schema definition
The first step is to move all namespace definitions into full jsonschema definitions. In the proposed change, a namespace definition becomes a secondary schema for the `Annotation` object. `Annotation` objects must validate against both the template schema (our current annotation schema definition) and exactly one of the pre-defined namespace schemas. Each namespace schema defines an exact match on the `Annotation.namespace` field, in addition to whatever constraints are placed on the `value` and `confidence` fields.
The `is_sparse` flag will be removed, as it is not part of jsonschema. (We’ll come back to this later.)
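As a rough sketch of what a namespace schema might look like under this scheme (the layout and the `tag_open` example are illustrative, not final, and assume draft-06+ `const` support), a tag-style namespace could pin `namespace` exactly and constrain `value` to strings:

```json
{
  "type": "object",
  "properties": {
    "namespace": { "const": "tag_open" },
    "data": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "value": { "type": "string" },
          "confidence": { "type": ["number", "null"] }
        }
      }
    }
  }
}
```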
This phase will complete #178.
Revision phase 2: hosted and versioned schema
Completing phase 1 will result in a fully json-schema compatible implementation of our specification, against which all current JAMS files should validate.
The next step (phase 2) is to place this schema under version control and host it remotely (e.g. `jams.github.io/schema/v0.3/schema.json` or something). We can then revise the schema to include a version number in its definition, so that JAMS files can self-identify which version they are valid under.
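For example, a JAMS file might self-identify with a `$schema`-style reference at the top level (a sketch only; the exact field name and hosting URL are still to be decided):

```json
{
  "$schema": "https://jams.github.io/schema/v0.3/schema.json",
  "file_metadata": { "duration": 180.0 },
  "annotations": [],
  "sandbox": {}
}
```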
With the remote schema implementation, it should be possible (and easy) to promote all JAMS definitions to top-level objects, so that you can independently validate an `Annotation` or `FileMetadata` object without it belonging to a full JAMS file.
This phase will complete #86 and facilitate #40 by allowing partial storage.
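Concretely, validating a lone `Annotation` could then be as simple as referencing its promoted top-level definition (the URL and definition path below are hypothetical):

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$ref": "https://jams.github.io/schema/v0.3/schema.json#/definitions/Annotation"
}
```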
Revision phase 3: extending the Annotation class
As mentioned in #24 , the current annotation structure might be a bit too rigid for more general media objects. @justinsalamon and I discussed this offline, and arrived at the following proposal:
- Rename the `Annotation` definition to `IntervalAnnotation`, in which observations are `(time, duration, value, confidence)` tuples
- Add new annotation types:
  - `StaticAnnotation`: just `(value, confidence)`
  - `BoundingBoxAnnotation`: `(x, y, width, height, value, confidence)`
  - `TimeBoundingBoxAnnotation`: `(time, x, y, duration, width, height, value, confidence)`
  - possibly others: polygons, instantaneous samples, etc…
`Annotation` validation now becomes `and(oneOf([Interval, Static, BoundingBox, ...]), oneOf([namespaces]))`.
This provides maximal flexibility in combining different annotation contents (tags, etc.) with annotation extents (time intervals, bounding boxes, etc.). Including a `StaticAnnotation` type also provides a way to resolve #206.
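In json-schema terms, that combinator might look something like the following sketch (the definition and namespace paths are illustrative, not final):

```json
{
  "allOf": [
    {
      "oneOf": [
        { "$ref": "#/definitions/IntervalAnnotation" },
        { "$ref": "#/definitions/StaticAnnotation" },
        { "$ref": "#/definitions/BoundingBoxAnnotation" }
      ]
    },
    {
      "oneOf": [
        { "$ref": "#/namespaces/tag_open" },
        { "$ref": "#/namespaces/beat" }
      ]
    }
  ]
}
```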
Phase 3 completes the proposed changes to the schema.
Alongside the schema changes, we also want to generalize some aspects of the Python implementation. Notably, it would be good to extend the search function to also support annotation contents. This way, we could find and excerpt annotations by value (e.g. time intervals labeled as `guitar`, or bounding boxes containing `face`). This isn’t a huge change from what the search function already does, but it will take a bit more implementation work.
I’ve been taking a crack at this over the break. I’m most of the way there, though I’ve realized that to make this work we may have to alter the JAMS schema a little, resulting in some currently valid JAMS data becoming invalid under the latest version…
Currently we have a list of annotations, each of which contains a list of observations, and that list of observations can be either a sparse type or a dense type. I’m proposing that we change this to always be a list of observations (i.e., no sparse/dense distinction at the annotation level), where each element is either a single observation (the sparse case) or an observation containing lists of values (the dense case). This moves the density distinction of current dense JAMS data down one level, to the observation type, rather than it being a different Annotation type overall.
This way the Annotation type itself has all the non-data-dependent properties (e.g., `Curator`, `sandbox`, etc.), and it is only its `data` attribute that is defined by the observation type (both the `data` and `namespace` attributes will be defined by the namespace). This `data` attribute is always an array of observations; in the case of the DenseObservation types that currently exist out in the wild, it will be a single-element array whose one observation contains `value`, `confidence`, `time`, and `duration` arrays. This greatly simplifies the code and schema, but will change the schema for dense observations from something like:
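Roughly, the current dense layout stores parallel arrays directly on `data` (the namespace and values here are illustrative):

```json
{
  "namespace": "pitch_contour",
  "data": {
    "time": [0.00, 0.01, 0.02],
    "duration": [0.01, 0.01, 0.01],
    "value": [220.0, 221.5, 223.1],
    "confidence": [1.0, 1.0, 1.0]
  }
}
```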
to something like:
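That is, a single observation now carries the arrays, so `data` is always a list (same illustrative fields as above):

```json
{
  "namespace": "pitch_contour",
  "data": [
    {
      "time": [0.00, 0.01, 0.02],
      "duration": [0.01, 0.01, 0.01],
      "value": [220.0, 221.5, 223.1],
      "confidence": [1.0, 1.0, 1.0]
    }
  ]
}
```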
This has the added benefit that a single Annotation can hold multiple dense observations. E.g., in the case of pitch contours, multiple contours could begin and end according to a vocal activity detector; or, in an annotation application where the annotator draws contours over a waveform, each drawn contour could be sampled and represented as its own DenseObservation.
At phase 3 of this issue, we can then further include a dense sampled observation type, e.g.:
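One hypothetical shape for such a type (field names here are placeholders, not a proposal) would replace per-sample times with a start time and a sample rate:

```json
{
  "time": 0.0,
  "sample_rate": 100.0,
  "value": [220.0, 221.5, 223.1],
  "confidence": [1.0, 1.0, 1.0]
}
```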
Had a chat with @rabitt about some of this at ISMIR, and she pointed out that we currently have a bit of a blind spot when it comes to annotations of symbolic data. Concretely, objects like a score or a MIDI file may not have a fixed “duration” (in seconds), but may have similar extent specifications in terms of beats or ticks.
This seems soluble in the proposed framework by introducing extent types for symbolic data. We may need to wiggle a bit on the top-level schema (JAMS object) to make this work, but I think it would be worth doing in the long run.
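For instance, a tick-based interval extent might look like the following (entirely hypothetical naming, just to illustrate the idea):

```json
{
  "tick": 480,
  "tick_duration": 960,
  "value": "C4",
  "confidence": 1.0
}
```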