Crop/trim jams utility function
For scaper I need to crop/trim some audio files and their corresponding JAMS annotations. I started implementing a crop_jams method in scaper, but after chatting with @bmcfee it transpired that this might be useful to have as part of the core jams library. Regardless, there are some methodological questions best run through this crowd.
Question 1: is this something we want to implement as part of the core jams lib? (This is the second time I have needed to crop jams annotations for a project, so at least based on my own experience it would be very useful to have.)
Question 2: regardless of where this is implemented, I can see (at least) two different approaches to implementation:
- Edit the annotation “in place” by directly manipulating the annotation.data DataFrame.
- Create a new annotation, copy over all metadata, and selectively adjust/add events based on the desired crop range.
The former has the advantage of not having to worry about replicating the annotation perfectly, since the data frame (and the .duration object variable) is manipulated directly. The latter has the advantage of (potentially) being independent of the internal storage mechanism, though it requires care to ensure all annotation information is copied over. Also, if we wanted it to be completely decoupled from the dataframe, we might want/need to implement some form of get function or iterator to be able to access (and maybe even remove?) the annotation observations via an API.
This is a little time-critical for me, so thoughts/discussion are most welcome. Worst case scenario I’ll implement this in scaper first and port it into jams at a later point if we decide that’s the way we want to go.
👍 guess I better get started then… 😃
Documenting latest modifications to API (as of 95cdda4):
Here’s the low-down for trimming:
- Parameters: start_time, end_time, and strict=False.
- Annotation.trim() will change the start time/duration of an annotation:
  - The new start time is max(Annotation.time, start_time).
  - The new end time is min(Annotation.time + Annotation.duration, end_time).
- Annotation.trim() will remove all observations that lie outside of the trim range defined by the new start time and duration:
  - With strict=False (default), observations at the boundaries will be adjusted. I.e., if an observation starts before the new start time but ends after it, only the part that lies at/after the new start time is kept. The equivalent happens if the observation starts before but ends after the new end time.
  - With strict=True, observations at the boundaries are also discarded.
- Annotation.trim() is documented in Annotation.sandbox.trim.
- JAMS.trim() calls AnnotationArray.trim(), which calls Annotation.trim() on every annotation in the jams object.
- JAMS.trim() is documented in JAMS.sandbox.trim.
- JAMS.trim() does not affect JAMS.file_metadata.duration.
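To make the trimming rules concrete, here is a minimal sketch in plain Python. It does not use jams: SimpleAnnotation and this trim() function are hypothetical stand-ins that only mirror the rules listed above. Raising an error for a non-overlapping range and treating observations that merely touch a boundary as outside are choices made for this sketch, not statements about the library.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


# Hypothetical stand-in for an annotation; NOT the jams Annotation class.
@dataclass
class SimpleAnnotation:
    time: float       # annotation start time (seconds)
    duration: float   # annotation duration (seconds)
    # Each observation is (start_time, duration, value).
    observations: List[Tuple[float, float, str]] = field(default_factory=list)


def trim(ann, start_time, end_time, strict=False):
    """Return a trimmed copy of ann; the input annotation is left untouched."""
    new_start = max(ann.time, start_time)
    new_end = min(ann.time + ann.duration, end_time)
    if new_end <= new_start:
        # Assumption for this sketch: a non-overlapping range is an error.
        raise ValueError("trim range does not overlap the annotation")

    kept = []
    for obs_start, obs_dur, value in ann.observations:
        obs_end = obs_start + obs_dur
        # Entirely outside the trim range: always discarded.
        if obs_end <= new_start or obs_start >= new_end:
            continue
        crosses_boundary = obs_start < new_start or obs_end > new_end
        if crosses_boundary and strict:
            continue  # strict=True drops observations at the boundaries
        # strict=False clips boundary observations to the trim range.
        clipped_start = max(obs_start, new_start)
        clipped_end = min(obs_end, new_end)
        kept.append((clipped_start, clipped_end - clipped_start, value))

    return SimpleAnnotation(new_start, new_end - new_start, kept)


ann = SimpleAnnotation(10.0, 5.0, [
    (10.0, 1.0, "a"),   # entirely before the trim range
    (11.5, 1.0, "b"),   # crosses the new start time
    (12.5, 1.0, "c"),   # entirely inside the trim range
    (13.5, 1.0, "d"),   # crosses the new end time
])
loose = trim(ann, 12.0, 14.0)               # keeps b (clipped), c, d (clipped)
tight = trim(ann, 12.0, 14.0, strict=True)  # keeps only c
```

Note that the observation start times stay on the original timeline, which is the point of trimming as opposed to slicing.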
Here’s the low-down for slicing:
- The start time of the sliced annotation is set to max(0, trimmed_annotation.time - start_time), and the start times of all observations it contains are re-computed with respect to this new reference time. To give a concrete example, say an annotation spans time 10-15s. If we call ann.slice(12, 14), the new annotation will have a start time of 0 and duration 2. If instead we call ann.slice(5, 15), the sliced annotation will have a start time of 5 and duration 5.
- JAMS.slice() calls AnnotationArray.slice(), which calls Annotation.slice() on every annotation in the jams object.
- JAMS.slice() does affect JAMS.file_metadata.duration, which becomes end_time - start_time.
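The slicing rules can be sketched the same way. The function below is a hypothetical illustration in plain Python, not the jams implementation: it assumes slicing amounts to a non-strict trim followed by shifting every time by start_time, which is my reading of the description above. Observations are plain (start, duration) pairs.

```python
def slice_annotation(ann_time, ann_duration, observations, start_time, end_time):
    """Sketch of slicing: trim to [start_time, end_time], then re-reference times.

    Returns (new_time, new_duration, new_observations), where all times are
    relative to start_time, i.e. to the beginning of the cropped audio.
    """
    # Step 1: trim (non-strict), giving the range in original time.
    trimmed_time = max(ann_time, start_time)
    trimmed_end = min(ann_time + ann_duration, end_time)
    if trimmed_end <= trimmed_time:
        # Assumption for this sketch: a non-overlapping range is an error.
        raise ValueError("slice range does not overlap the annotation")

    # Step 2: the new annotation start time, relative to start_time.
    new_time = max(0.0, trimmed_time - start_time)

    new_observations = []
    for obs_start, obs_dur in observations:
        clipped_start = max(obs_start, trimmed_time)
        clipped_end = min(obs_start + obs_dur, trimmed_end)
        if clipped_end <= clipped_start:
            continue  # observation lies outside the sliced range
        # Shift the observation onto the new timeline.
        new_observations.append(
            (clipped_start - start_time, clipped_end - clipped_start)
        )

    return new_time, trimmed_end - trimmed_time, new_observations


# Annotation spanning 10-15 s with two observations (original time).
obs = [(10.0, 2.0), (13.0, 2.0)]
inner = slice_annotation(10.0, 5.0, obs, 12.0, 14.0)  # start time 0, duration 2
outer = slice_annotation(10.0, 5.0, obs, 5.0, 15.0)   # start time 5, duration 5
```

The two calls reproduce the ann.slice(12, 14) and ann.slice(5, 15) examples from the list above.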
In a nutshell, trimming is useful when one wishes to modify the time range spanned by an annotation, but keep the corresponding audio file unchanged (and hence all the start times remain unchanged). Slicing is useful when the corresponding audio file will also be trimmed (cropped in time), in which case the start time of the annotation and of all observations it contains need to be adjusted.
Why doesn’t slice() always set the start time of the annotation to 0? This could lead to undesirable behavior if you tried to slice an annotation by defining a time range that starts before the start time of the annotation (i.e. start_time < Annotation.time). To give a concrete example, say you have an audio file from 0-15s, and an annotation that spans 10-15s. If you called jam.slice(5, 15), the sliced jam file would have a duration of 10s corresponding to “original” time 5-15s. The annotation would have a duration of 5s which corresponds to “original” time 10-15s, but if its start time were reset to 0, it would be as if the annotation describes time 0-5s in the new jam, while it actually describes time 5-10s. That is, it should have a start time of 5s. That’s why the new start time of the annotation is given by max(0, trimmed_annotation.time - start_time).
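The arithmetic behind this argument is easy to check by hand. The snippet below is only an illustration in plain Python; sliced_start_time is a hypothetical helper name and nothing here calls jams. It maps the new start time back onto the original timeline to show which rule stays consistent with the audio.

```python
def sliced_start_time(ann_time, start_time):
    """New annotation start time after slicing, per max(0, trimmed time - start_time)."""
    trimmed_time = max(ann_time, start_time)  # start time after trimming
    return max(0.0, trimmed_time - start_time)


ann_time = 10.0    # annotation spans 10-15 s in the original file
start_time = 5.0   # the file is sliced to 5-15 s

new_time = sliced_start_time(ann_time, start_time)

# Map both candidate start times back to the original timeline.
original_from_rule = new_time + start_time  # where the rule places the annotation
original_from_zero = 0.0 + start_time       # where "always reset to 0" would place it

print(new_time)            # start time in the sliced file
print(original_from_rule)  # matches the true annotation start in original time
print(original_from_zero)  # off by the gap between start_time and ann_time
```

Only the first mapping lands on the annotation's true start in the original file; resetting to 0 shifts the annotation earlier by Annotation.time - start_time.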