FeatureSet - scope

I’m thinking about the FeatureSet, and I’m not sure what’s the scope of operations we’d like to support in lhotse. We will use lilcom to load/store the feature matrices, but what about feature extraction? Should we just use something precomputed e.g. with Kaldi, or also extract them on-the-fly at the FeatureSet API level? If the second is true, we’ll either need to use some other library (e.g. librosa) or delegate feature extraction to Kaldi by running it as a subprocess (unless there are some Python bindings available). I guess the same questions apply to data augmentation (we’ll get to that after having something initial working for features and having some example dataset represented in lhotse).

Of course, having the whole data augmentation + feature extraction pipeline as a part of lhotse would be more convenient in the long run. It’ll just take longer to get there. @danpovey @jtrmal WDYT?
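To make the two options concrete, here is a minimal sketch of a feature set that returns precomputed matrices when they exist and otherwise falls back to an on-the-fly extractor. Every name in it (`FeatureSetSketch`, `load`, `frame_energy`) is hypothetical and is not lhotse's API; the toy extractor only stands in for whatever would wrap librosa or a Kaldi subprocess.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical types for illustration only, not part of lhotse.
Matrix = List[List[float]]                   # frames x feature dimensions
Extractor = Callable[[List[float]], Matrix]  # audio samples -> feature matrix


@dataclass
class FeatureSetSketch:
    # Features computed ahead of time (e.g. with Kaldi), keyed by recording ID.
    precomputed: Dict[str, Matrix]
    # Optional on-the-fly extractor; it could wrap librosa or a Kaldi subprocess.
    extractor: Optional[Extractor] = None

    def load(self, recording_id: str, samples: Optional[List[float]] = None) -> Matrix:
        # Prefer stored features; extract only when nothing is stored.
        if recording_id in self.precomputed:
            return self.precomputed[recording_id]
        if self.extractor is None or samples is None:
            raise KeyError(f"no features for {recording_id!r} and no way to extract them")
        return self.extractor(samples)


def frame_energy(samples: List[float], frame_len: int = 4) -> Matrix:
    """Toy extractor: one value per non-overlapping frame (mean squared amplitude)."""
    starts = range(0, len(samples) - frame_len + 1, frame_len)
    return [[sum(x * x for x in samples[s:s + frame_len]) / frame_len] for s in starts]
```

Keeping the extractor behind a plain callable means the precomputed path can ship first and the extraction backend can be swapped in later without changing callers.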

Created 3 years ago · Comments: 10 (3 by maintainers)

Top GitHub Comments

danpovey commented, Jun 10, 2020

Makes sense I guess (although we’d have to make sure the defaults were stable when we do the release).

It might make sense to support writing the manifest files compressed, as they could get large and should be highly compressible.
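As a rough illustration of the compressed-manifest idea, the sketch below writes a JSON manifest through gzip using only the Python standard library. The function names and the JSON layout are assumptions made for the example, not lhotse's actual manifest format.

```python
import gzip
import json


def write_manifest(path, entries):
    """Serialize a list of dicts as gzip-compressed JSON (hypothetical layout)."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        json.dump(entries, f)


def read_manifest(path):
    """Load a manifest written by write_manifest."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)
```

Manifests repeat the same keys and similar paths in every entry, which is why they tend to compress well.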

entn-at commented, May 1, 2020

PyTorch audio also uses librosa https://github.com/pytorch/audio/blob/master/requirements.txt#L16

torchaudio only uses librosa for running compatibility tests; they wrote their own (compatible) feature extraction routines as PyTorch jit-able modules (including deltas and sliding CMN). They seem to have implemented support for two backends for reading audio files (sox and libsoundfile, the latter also works on Windows…) and are working on replacing sox effects with PyTorch versions (see https://github.com/pytorch/audio/issues/260 for a list of what they already implemented). I guess the point of that effort is to be able to use them on the fly during training.
