Sometimes batches are created that do not have the same number of supervisions and inputs
```
Traceback (most recent call last):
File "train.py", line 1019, in <module>
main()
File "train.py", line 1012, in main
run(rank=0, world_size=1, args=args)
File "train.py", line 867, in run
scan_pessimistic_batches_for_oom(
File "train.py", line 977, in scan_pessimistic_batches_for_oom
loss, _ = compute_loss(
File "train.py", line 542, in compute_loss
simple_loss, pruned_loss = model(
File "/home/rudolf/miniconda3/envs/k2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/ssd1/team/tmp/rudolf/icefall/ch/model.py", line 117, in forward
assert x.size(0) == x_lens.size(0) == y.dim0
AssertionError
```
I did call `trim_to_supervisions()`. I'm trying to track this down.
As an aside, I'm surprised by what I find to be an unusual design of the data loading (not using the `batch_sampler` argument of `DataLoader`, the sampler rather than the dataset holding the data, collation happening in the dataset's `__getitem__`, etc.).
edit: I think it's because I forgot `--discard-overlapping`
edit2: still failing; some cuts now don't have supervisions
This is how I created the cuts manifest:

```python
from lhotse import CutSet, Fbank, LilcomChunkyWriter
from lhotse.kaldi import load_kaldi_data_dir

# Import the Kaldi data dir (wav.scp, segments, text, ...) as Lhotse manifests.
recording_set, supervision_set, _ = load_kaldi_data_dir(kdata, 16000, num_jobs=4)
cuts = CutSet.from_manifests(recordings=recording_set, supervisions=supervision_set)
# One cut per supervision; drop supervisions overlapping the cut boundaries.
cuts = cuts.trim_to_supervisions(keep_overlapping=False)
cuts = cuts.truncate(offset_type='start', max_duration=60.0, keep_excessive_supervisions=False)
cuts = cuts.compute_and_store_features(Fbank(), storage_path=outf + '-feats', num_jobs=6, storage_type=LilcomChunkyWriter)
cuts.to_file(outf + '.jsonl.gz')
```
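To see which cuts ended up without supervisions (edit2), a quick filter over the result helps (a sketch using the standard `CutSet.filter` API):

```python
# List cuts whose supervision list came out empty after trimming/truncation.
for cut in cuts.filter(lambda c: len(c.supervisions) == 0):
    print(cut.id, cut.duration)
```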
edit3: it has to do with the duration of the recording being shorter than the duration of a supervision, presumably because of using `sox bla.mp3 .. |` in wav.scp.

edit4: Okay, so for some reason the sox resampling makes the file slightly shorter than expected. This means the interval taken from the Kaldi `segments` file is too long, so the cut gets dropped later when trimming. I'm not sure what the right fix for this is; my quickfix is capping the supervision duration so it does not exceed the recording, and throwing an error if the difference is bigger than 0.1 s.
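Roughly, the quickfix looks like this (a sketch of what I described, not the exact code; `clamp_supervisions` is my own helper name, and `dataclasses.replace` works because `SupervisionSegment` is a dataclass):

```python
from dataclasses import replace

from lhotse import SupervisionSet

def clamp_supervisions(supervisions, recordings, tolerance=0.1):
    """Cap each supervision at its recording's actual duration."""
    # Actual (post-sox) duration of each recording, keyed by recording id.
    durations = {r.id: r.duration for r in recordings}
    fixed = []
    for sup in supervisions:
        overshoot = sup.end - durations[sup.recording_id]
        if overshoot > tolerance:
            # The segments entry is way off; better to fail than silently fix.
            raise ValueError(
                f"{sup.id} exceeds recording {sup.recording_id} by {overshoot:.3f}s"
            )
        if overshoot > 0:
            # Shrink the supervision so it ends at the recording boundary.
            sup = replace(sup, duration=durations[sup.recording_id] - sup.start)
        fixed.append(sup)
    return SupervisionSet.from_segments(fixed)
```

I apply this to `supervision_set` before calling `CutSet.from_manifests`.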
Comments
Regarding the design, some motivation is provided here: https://lhotse.readthedocs.io/en/latest/datasets.html#about-lhotses-datasets-and-samplers
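For reference, the pattern described there: the sampler iterates over cut metadata and yields mini-batches of cuts, and the dataset's `__getitem__` collates such a `CutSet` into tensors, which is why the `DataLoader` gets `batch_size=None`. A minimal sketch (class names are from `lhotse.dataset`; the sampler options here are illustrative):

```python
from torch.utils.data import DataLoader
from lhotse.dataset import DynamicBucketingSampler, K2SpeechRecognitionDataset

# The sampler yields mini-batches of cuts (metadata only); the dataset turns
# each CutSet into collated feature/supervision tensors.
dataset = K2SpeechRecognitionDataset()
sampler = DynamicBucketingSampler(cuts, max_duration=200.0, shuffle=True)
dloader = DataLoader(dataset, sampler=sampler, batch_size=None, num_workers=2)
```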
As for Kaldi imports with a segments file, Sox pipes, and MP3s: I've found in the past that the duration information is quite unreliable, and usually some manual, data-specific step is required to fix it. If you have any suggestions for how we can improve this experience, they are welcome.
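One option, assuming your installed Lhotse version ships the QA helpers (this is an assumption about the version, so check that `lhotse.qa` provides them): they automate exactly the kind of trimming described in edit4:

```python
# Assumption: these helpers exist in your installed Lhotse (module lhotse.qa).
from lhotse.qa import fix_manifests

# Drops supervisions whose recording is missing and trims supervisions that
# extend past the recording's actual duration, returning the fixed pair.
recording_set, supervision_set = fix_manifests(recording_set, supervision_set)
```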
I will try that out!