"BucketingSampler does not support working with lazy CutSet" when running icefall recipes
This commit https://github.com/lhotse-speech/lhotse/commit/0dceff169c9af0c70c8eda1266640a85409617e9 seems to break running icefall/egs/librispeech/ASR/*/train.py.
I now get a ValueError (“BucketingSampler does not support working with lazy CutSet”) when running: python3 ./pruned_transducer_stateless2/train.py --exp-dir=./pruned_transducer_stateless2/exp --world-size 1 --num-epochs 26 --full-libri 1 --max-duration 300
I am using the librispeech datasets which are prepared in icefall and I have not modified anything.
@pzelasko what's the best way forward, since I believe you added this raise condition? Thanks!
Issue #721, opened May 19, 2022 by McHughes288. Comments: 6 (1 by maintainers).
Top GitHub Comments
My understanding is that JSONL is JSON formatted so that there is exactly one record per line. There might be a finer definition, but I have so far survived with this 😃 y.
On Fri, May 20, 2022 at 10:14 AM John Hughes wrote:
Both are fine for me. Shall we replace all ".json.gz" in icefall with ".jsonl.gz"?
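To make the "one record per line" description of JSONL concrete, here is a minimal sketch using only the Python standard library. It is not lhotse's own serialization code: the helper names `write_jsonl_gz` and `read_jsonl_gz`, the file name, and the two example records are made up for illustration. The point it shows is that a gzipped JSONL file can be read one record at a time, which is what makes lazy reading possible.

```python
import gzip
import json
import os
import tempfile
from typing import Dict, Iterator, List


def write_jsonl_gz(records: List[Dict], path: str) -> None:
    """Write each record as one JSON document on its own line (hypothetical helper)."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def read_jsonl_gz(path: str) -> Iterator[Dict]:
    """Yield records one at a time without loading the whole file (hypothetical helper)."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            if line.strip():  # skip blank lines
                yield json.loads(line)


# Made-up records standing in for manifest entries.
records = [{"id": "cut-0", "duration": 3.2}, {"id": "cut-1", "duration": 5.0}]
path = os.path.join(tempfile.mkdtemp(), "cuts_example.jsonl.gz")
write_jsonl_gz(records, path)

first = next(read_jsonl_gz(path))   # only the first line is parsed here
loaded = list(read_jsonl_gz(path))  # consuming the generator reads everything
```

A plain ".json.gz" file holding one large JSON array generally has to be parsed in full before the first record is available, which is the practical difference between the two extensions being discussed.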