Dataset Viewer issue for indonesian-nlp/librivox-indonesia

Link

https://huggingface.co/datasets/indonesian-nlp/librivox-indonesia

Description

I created a new speech dataset, https://huggingface.co/datasets/indonesian-nlp/librivox-indonesia, but the dataset preview fails with the following error message:

Server error
Status code:   400
Exception:     TypeError
Message:       unsupported operand type(s) for +: 'NoneType' and 'str'

Please help, I am not sure what the problem here is. Thanks a lot.

Owner

Yes

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

1 reaction
cahya-wirawan commented, Sep 6, 2022

Hi @albertvillanova, I just added the streaming functionality and it worked on the first try 😃 Thanks a lot!

1 reaction
albertvillanova commented, Sep 6, 2022

Yes, the issue arises when streaming, which is what the viewer uses: your script does not support streaming. Supporting it in this case involves some subtleties, which we are explaining better in our docs in a work-in-progress pull request.

Just note that when streaming, local_extracted_archive is None, and this code line generates the error:

filepath = local_extracted_archive + "/librivox-indonesia/audio_transcription.csv"
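To make the failure concrete, here is a minimal sketch that reproduces the TypeError outside of the loading script and shows one possible guard. The helper names `resolve_csv_path` and `reproduce_error` are hypothetical and not part of the original script; only the variable name and the path string come from the line above.

```python
import os

# Path of the metadata file inside the archive (taken from the failing line).
CSV_RELATIVE_PATH = "librivox-indonesia/audio_transcription.csv"


def resolve_csv_path(local_extracted_archive):
    """Return the CSV path in both streaming and non-streaming mode.

    Hypothetical helper: when streaming, nothing is extracted to disk, so
    the argument is None and only the path inside the archive is meaningful.
    """
    if local_extracted_archive is None:
        return CSV_RELATIVE_PATH
    return os.path.join(local_extracted_archive, CSV_RELATIVE_PATH)


def reproduce_error():
    """Run the original concatenation with None and return the error text."""
    local_extracted_archive = None  # what the script receives when streaming
    try:
        local_extracted_archive + "/librivox-indonesia/audio_transcription.csv"
    except TypeError as err:
        return str(err)
    return None


print(reproduce_error())
print(resolve_csv_path(None))
```

Note that the guard only avoids the crash. In streaming mode there is no extracted file to open at that path, so the script still has to read the CSV from the archive itself.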

For a proper implementation, you could have a look at: https://huggingface.co/datasets/common_voice/blob/main/common_voice.py
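As a rough, standard-library-only illustration of the streaming-friendly idea: instead of building a path into an extracted folder, the script walks the archive and reads the metadata CSV when it comes across it. In a real loading script the (path, file object) pairs would come from the download manager (for example `dl_manager.iter_archive`); here `iter_archive_members` is a hypothetical stand-in, and the archive contents and CSV column names are made up for the demo.

```python
import csv
import io
import tarfile

CSV_PATH_IN_ARCHIVE = "librivox-indonesia/audio_transcription.csv"


def make_demo_archive():
    """Build a tiny in-memory tar archive with one metadata CSV (made-up data)."""
    csv_bytes = b"path,sentence\naudio/0001.mp3,halo dunia\n"
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        info = tarfile.TarInfo(name=CSV_PATH_IN_ARCHIVE)
        info.size = len(csv_bytes)
        tar.addfile(info, io.BytesIO(csv_bytes))
    buffer.seek(0)
    return buffer


def iter_archive_members(fileobj):
    """Yield (path, file object) pairs for each regular file in the archive.

    Hypothetical stand-in for the iterator a download manager would provide.
    """
    with tarfile.open(fileobj=fileobj, mode="r") as tar:
        for member in tar:
            if member.isfile():
                yield member.name, tar.extractfile(member)


def generate_examples(archive_iterator):
    """Yield (key, row) pairs from the metadata CSV without extracting to disk."""
    for path, f in archive_iterator:
        if path == CSV_PATH_IN_ARCHIVE:
            text = io.TextIOWrapper(f, encoding="utf-8", newline="")
            for key, row in enumerate(csv.DictReader(text)):
                yield key, dict(row)


for key, example in generate_examples(iter_archive_members(make_demo_archive())):
    print(key, example)
```

Because the files are read as the archive is iterated, the same generator works whether or not anything was extracted locally, which is why no `local_extracted_archive` prefix is needed to locate the CSV.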

You can test your script locally by passing streaming=True to load_dataset:

from datasets import load_dataset

ds = load_dataset("indonesian-nlp/librivox-indonesia", split="train", streaming=True)
item = next(iter(ds))
print(item)