Per-channel energy normalization (PCEN)
Description
PCEN (Wang et al.) seems to be a good alternative to log-amplitude scaling for speech recognition systems; see also Battenberg et al. Since it’s fairly easy to implement in the static (non-trainable) form, do folks think it’s worth including in librosa?
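Here’s a quick reference implementation of Eq. 1 in the paper, using the default parameters listed in Section 2:
import scipy.signal

def pcen(E, alpha=0.98, delta=2, r=0.5, s=0.025, eps=1e-6):
    # First-order IIR smoother over the last (time) axis: M[t] = (1 - s) * M[t-1] + s * E[t]
    M = scipy.signal.lfilter([s], [1, s - 1], E)
    # Adaptive gain control, followed by root compression with offset delta
    smooth = (eps + M)**(-alpha)
    return (E * smooth + delta)**r - delta**r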
It could probably be made more numerically stable. It could also be vectorized to support channel-dependent parameters, which might make it more useful for things like CQT.
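As a rough sketch of both ideas, the version below evaluates the same expression through log1p/expm1 and lets alpha, delta and r be either scalars or one value per channel. The name pcen_stable and the helper _per_channel are hypothetical and not part of librosa. The sketch assumes E is non-negative with time on the last axis, that delta is strictly positive, and that s stays a single scalar, because scipy.signal.lfilter applies one set of filter coefficients to every channel.

```python
import numpy as np
import scipy.signal


def _per_channel(p):
    # Hypothetical helper: turn a 1-D per-channel parameter into a column
    # so it broadcasts over the time axis; scalars pass through unchanged.
    p = np.asarray(p, dtype=float)
    return p[:, np.newaxis] if p.ndim == 1 else p


def pcen_stable(E, alpha=0.98, delta=2.0, r=0.5, s=0.025, eps=1e-6):
    """PCEN sketch for E of shape (n_channels, n_frames) or (n_frames,)."""
    E = np.asarray(E, dtype=float)
    alpha, delta, r = _per_channel(alpha), _per_channel(delta), _per_channel(r)

    # Same first-order IIR smoother as the reference implementation.
    M = scipy.signal.lfilter([s], [1, s - 1], E, axis=-1)

    # (eps + M)**(-alpha), evaluated in the log domain.
    smooth = np.exp(-alpha * (np.log(eps) + np.log1p(M / eps)))

    # (E*smooth + delta)**r - delta**r
    #   == delta**r * expm1(r * log1p(E*smooth / delta)), valid for delta > 0.
    return delta**r * np.expm1(r * np.log1p(E * smooth / delta))
```

With scalar parameters this should agree with the reference pcen above up to floating-point error. Passing, for example, alpha=np.linspace(0.9, 0.98, n_channels) would apply a different gain exponent to each channel, which is the kind of use a CQT front end might call for.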
Issue Analytics
- State:
- Created 6 years ago
- Reactions:12
- Comments:24 (19 by maintainers)
Top GitHub Comments
As this is the first search result when looking for an implementation, it may be worth noting that the trainable form is also fairly easy to implement. Here’s a version of PCEN for Lasagne: https://gist.github.com/f0k/c837bcf0bfde189ca16eab63637839cb I’ve used Vincent’s formulation in there as well. Thanks for the discussion here!
I think I’m inclined to leave it parameterized by s so that it more closely resembles the paper. I’d actually like to rename the parameters to something interpretable, but provide the equation and reference to the paper in docs.
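To illustrate what a more interpretable parameter could look like, here is a minimal sketch that maps a time constant in seconds to the smoothing coefficient s. It uses the textbook one-pole relation (1 - s)**tau = exp(-1), where tau is the time constant in frames; this is not necessarily the conversion librosa ended up using. The function name and the sr and hop_length defaults are assumptions made for the example.

```python
import math


def smoothing_coefficient(time_constant, sr=22050, hop_length=512):
    """Hypothetical helper: time constant in seconds -> smoothing coefficient s."""
    # Express the time constant in frames (assumed frame rate: sr / hop_length).
    frames = time_constant * sr / hop_length
    # The smoother's impulse response s * (1 - s)**t decays by a factor of
    # 1/e after `frames` steps when (1 - s)**frames == exp(-1).
    return 1.0 - math.exp(-1.0 / frames)
```

The returned value can be passed as s to the reference implementation, so the equation from the paper stays unchanged while the user-facing parameter is given in seconds.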