Issue with expected dimensions on basic example
Hello,
I’m trying to use Kapre to do some basic audio classification as a starting point before I dive into deeper projects. Right now, I’m having an issue compiling and fitting the most basic of models.
I have 2 classes, 556 samples, and I’m using a sampling rate of 22050 Hz.
Per the “Using Mel-spectrogram” section in the README, I have my input data shaped like this:
>>> x.shape
(556, 1, 22050)
>>> y.shape
(556, 2)
I tried using the exact model from the README in the aforementioned section, except that I substituted the correct sampling rate (22050) and channel count (1) for my use case.
With those updates, it looks something like this:
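from keras.models import Sequential
from kapre.time_frequency import Melspectrogram
from kapre.augmentation import AdditiveNoise
from kapre.utils import Normalization2D

input_shape = (1, 22050)
sr = 22050
model = Sequential()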
# A mel-spectrogram layer
model.add(Melspectrogram(n_dft=512, n_hop=256, input_shape=input_shape,
                         padding='same', sr=sr, n_mels=128,
                         fmin=0.0, fmax=sr/2, power_melgram=1.0,
                         return_decibel_melgram=False, trainable_fb=False,
                         trainable_kernel=False,
                         name='trainable_stft'))
# Maybe some additive white noise.
model.add(AdditiveNoise(power=0.2))
# If you wanna normalise it per-frequency
model.add(Normalization2D(str_axis='freq')) # or 'channel', 'time', 'batch', 'data_sample'
# After this, it's just a usual keras workflow. For example..
# Add some layers, e.g., model.add(some convolution layers..)
# Compile the model
model.compile('adam', 'categorical_crossentropy') # if single-label classification
model.fit(x, y)
I get this stacktrace:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/anaconda3/envs/py3_6/lib/python3.7/site-packages/keras/engine/training.py", line 952, in fit
batch_size=batch_size)
File "/anaconda3/envs/py3_6/lib/python3.7/site-packages/keras/engine/training.py", line 789, in _standardize_user_data
exception_prefix='target')
File "/anaconda3/envs/py3_6/lib/python3.7/site-packages/keras/engine/training_utils.py", line 128, in standardize_input_data
'with shape ' + str(data_shape))
ValueError: Error when checking target: expected normalization2d_14 to have 4 dimensions, but got array with shape (556, 2)
OS: Mac 10.14.1
Python: 3.7.2 (though it was happening with 3.6 earlier today too)
Kapre: 0.1.3.1
keras: 2.2.4
tensorflow: 1.13.0rc2
Any ideas on what I’m missing?
Issue Analytics
- Created 5 years ago
- Comments: 5 (3 by maintainers)
Top GitHub Comments
Are you kidding me? You don’t need to damn figure it out because I did. If you can read the comment, and if you understand how neural networks work, you should have figured it out by yourself. Clearly, it was not the case, so you asked me, which is fine. But once you’re told the answer, you should be thankful instead of whining, right?
model.summary(). It will show you that the shape is (batch, ch, freq, time) or something similar, and that’s what the 4d in the error message says, because in your training the fed y is in a shape of (batch, 2).
I had no idea what you’ve tried, or what you knew and what you didn’t. I’ve tried my best to help you though. You could’ve told me that you’ve tried it instead of giving a response that I found difficult to interpret in a nice way. Thanks for clarifying.
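For readers who land here with the same error, the shape mismatch described in the model.summary() comment can be sketched without Keras or Kapre installed. The snippet below is only shape bookkeeping under stated assumptions: the helper names are hypothetical, the layout is assumed to be channels-first (batch, ch, freq, time) as in the comment above, and the frame count is assumed to be ceil(n_samples / n_hop) for 'same' padding. It is not Kapre's actual implementation.

```python
# Shape bookkeeping only; these helpers are hypothetical and do not call Kapre or Keras.

def n_frames_same_padding(n_samples, n_hop):
    """Assumed frame count for 'same' padding: ceil(n_samples / n_hop)."""
    return -(-n_samples // n_hop)


def melspectrogram_shape(batch, n_ch, n_samples, n_mels, n_hop):
    """Assumed channels-first output layout: (batch, ch, freq, time)."""
    return (batch, n_ch, n_mels, n_frames_same_padding(n_samples, n_hop))


def classifier_head_shape(feature_shape, n_classes):
    """Shape after collapsing everything except batch and mapping to n_classes."""
    return (feature_shape[0], n_classes)


x_shape = (556, 1, 22050)  # input shape from the question
y_shape = (556, 2)         # one-hot targets from the question

# The model in the question ends right after the spectrogram/normalisation layers,
# so its output is still 4-D while the targets are 2-D.
last_layer = melspectrogram_shape(x_shape[0], x_shape[1], x_shape[2],
                                  n_mels=128, n_hop=256)
print("model output dims:", len(last_layer), "target dims:", len(y_shape))

# Adding a classification head brings the output to (batch, n_classes).
fixed = classifier_head_shape(last_layer, n_classes=2)
print("matches targets:", fixed == y_shape)
```

In Keras terms, one common way to build such a head is to follow the spectrogram layers with some convolution layers, then Flatten or GlobalAveragePooling2D, then Dense(2, activation='softmax'). That layer choice is a suggestion consistent with the "Add some layers" comment in the README example, not something spelled out in this thread.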