Allow automatic creation of subtitles through AutoSub, served as WebVTT files
An important feature to support is captions. Captions are especially important for accessibility for viewers who are deaf or hard of hearing, and they also help non-native English speakers understand content better. YouTube already offers them, so people should not feel they are sacrificing features by using Odysee; we should aim for feature parity with YouTube, or even additional features.
An issue already exists for supporting captioning here: https://github.com/lbryio/lbry-desktop/issues/2325
This ticket, however, covers only automatically generated captions; the ability for people to upload their own captions during the upload process should be implemented as a separate ticket.
I tested out the AutoSub module, a CLI tool that integrates the open-source Mozilla DeepSpeech engine for speech-to-text and then, through some clever programming, aligns the resulting text with the proper timestamps. It actually works quite well out of the box.
https://github.com/abhirooptalasila/AutoSub
They say AutoSub can output WebVTT directly, which I wasn't able to get working on a first attempt. Regardless, the .srt and .vtt formats are very similar, so converting between the two is trivial, and there are many packages that can do it.
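To illustrate how trivial the conversion is, here is a minimal sketch (a hypothetical helper, not part of AutoSub or any package mentioned above): the formats differ mainly in the `WEBVTT` header line and the decimal separator in timestamps (SRT uses a comma before milliseconds, WebVTT a period).

```javascript
// Minimal SRT -> WebVTT conversion sketch (hypothetical helper function).
// SRT's numeric cue identifiers are also valid WebVTT cue identifiers,
// so they can be left in place.
function srtToVtt(srt) {
  const body = srt
    .replace(/\r\n/g, '\n')
    // SRT timestamps: 00:00:01,000 -> WebVTT timestamps: 00:00:01.000
    .replace(/(\d{2}:\d{2}:\d{2}),(\d{3})/g, '$1.$2');
  // WebVTT files must begin with a "WEBVTT" header line.
  return 'WEBVTT\n\n' + body.trim() + '\n';
}
```

A real converter would also need to handle SRT-specific styling tags, but for AutoSub's plain-text output the above covers the essential differences.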
Once the .vtt file is created, it is trivial to serve it via Video.js by extending this line:
ui/component/viewers/videoViewer/internal/videojs.jsx:220
Something along the lines of
tracks: [{src: 'https://servestatic.tv/mysub.vtt', kind:'captions', srclang: 'en', label: 'English'}]
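Fleshed out slightly, the options object passed to the Video.js player might look like the sketch below. This is a minimal illustration, not the actual Odysee frontend code; the `.vtt` URL is the placeholder from the line above, and `'player-id'` is a hypothetical element id.

```javascript
// Sketch: declaring a remote caption track in the Video.js options object
// (as would be built around ui/component/viewers/videoViewer/internal/videojs.jsx:220).
const playerOptions = {
  controls: true,
  tracks: [
    {
      src: 'https://servestatic.tv/mysub.vtt', // placeholder URL
      kind: 'captions',
      srclang: 'en',
      label: 'English',
    },
  ],
};
// Standard Video.js initialization call (commented out so this sketch
// runs without the video.js library loaded):
// const player = videojs('player-id', playerOptions);
```

Tracks declared this way show up automatically in the player's captions menu, so no extra UI work is needed on the frontend beyond supplying the URL.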
I implemented and tested this functionality and was quite impressed with how well AutoSub worked; it even correctly transcribed the word 'prophylactic'. The results were comparable to what you would expect from YouTube, so I would say AutoSub works well enough out of the box to ship.
AutoSub is built on top of Mozilla's DeepSpeech. I only ran it against a model trained on an English-speaking dataset, but models for other languages exist, so we would be able to use those as well, though I have not tested a non-English model myself. Since most of the content and viewers are English-speaking, this could, Pareto-style, cover perhaps 80% of content creators and users right out of the gate. It would also be a great way to begin supporting captioning, after which the ability for users to upload custom captions during the upload process could be added.
Issue Analytics
- State:
- Created 2 years ago
- Comments: 7 (4 by maintainers)
Top GitHub Comments
You have the option of using a paid third-party API, but if you don't want to, you can use the free version of Google Speech v2 to create the subtitles.
https://github.com/BingLingGroup/autosub#google-speech-v2
All I had to do was:
autosub -i file.mp4 -S en-US
Issue moved to OdyseeTeam/odysee-frontend #165 via ZenHub