Pre-trained models tracker

On each of the datasets provided, we must train a DeepSpeech model. The overall architecture is captured by this command:

python train.py  --rnn_type gru --hidden_size 800 --hidden_layers 5 --checkpoint --visdom --train_manifest /path/to/train_manifest.csv --val_manifest /path/to/val_manifest.csv --epochs 100 --num_workers $(nproc) --cuda

In the above command, replace the manifest paths with the correct paths for the dataset. A few notes:

  • No noise injection or other augmentations for the pre-trained models
  • Train till convergence (should get a nice smooth training curve hopefully!)
  • For smaller datasets, you may need to reduce the learning rate annealing by adding the flag --learning_anneal and setting it to a smaller value, like 1.01. For larger datasets, the default is fine (tested internally up to around 4.5k hours on the deepspeech.torch version)
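
To see why a smaller anneal value matters on small datasets, here is a minimal sketch of the schedule. It assumes the training script divides the learning rate by the anneal factor once per epoch, that the default factor is 1.1, and that the starting learning rate is 3e-4. None of these are stated in this issue, so check train.py before relying on them. The function name annealed_lr is made up for illustration.

```python
def annealed_lr(initial_lr: float, anneal: float, epoch: int) -> float:
    """Learning rate after `epoch` epochs, assuming lr is divided by
    `anneal` at the end of every epoch (an assumption, not confirmed here)."""
    if anneal <= 0:
        raise ValueError("anneal must be positive")
    return initial_lr / (anneal ** epoch)


if __name__ == "__main__":
    initial_lr = 3e-4  # assumed starting learning rate, not taken from the issue
    for epoch in (0, 10, 50, 100):
        default = annealed_lr(initial_lr, 1.1, epoch)   # assumed default factor
        gentle = annealed_lr(initial_lr, 1.01, epoch)   # value suggested in the note
        print(f"epoch {epoch:3d}: anneal=1.1 -> {default:.3e}, anneal=1.01 -> {gentle:.3e}")
```

Under these assumptions, 100 epochs with a factor of 1.1 shrink the learning rate by roughly four orders of magnitude, while 1.01 shrinks it by less than a factor of 3. That leaves a small dataset with a usable learning rate for the whole run.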

A release will be cut from the DeepSpeech package containing the models, and a reference to the latest release will be added to the README so the latest models are easy to find!

Progress tracker for datasets:

  • AN4
  • TEDLium
  • LibriSpeech

Let me know if you plan on running any of these, and I’ll update the ticket with details!

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 24 (12 by maintainers)

Top GitHub Comments

4 reactions
ryanleary commented, Jun 17, 2017

AN4 model is complete. Librispeech is still in progress. Below are the current evaluations:

Corpus    Test Set     Network    WER      CER
an4       an4-test     5x800gru   10.521   4.772
libri1k   libri-val    5x800gru   20.758   7.787
libri1k   libri-test   5x800gru   22.088   8.194
libri1k   test-clean   5x800gru   11.546   3.538
libri1k   test-other   5x800gru   31.813   12.483
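
For readers unfamiliar with the two metric columns, here is a minimal sketch of how WER and CER are commonly computed. It assumes both are percentages: the Levenshtein edit distance between reference and hypothesis, divided by the reference length (in words for WER, in characters for CER). The evaluation script behind the numbers above may normalize text or average across utterances differently. The function names are illustrative and not taken from the repository.

```python
def edit_distance(ref, hyp) -> int:
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate as a percentage of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate as a percentage of reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)
```

CER is usually much lower than WER for the same output, as in the table, because a single wrong character makes the whole word count as an error.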

2 reactions
ryanleary commented, Jun 12, 2017

Kicked off a 1000 hr libri training. Will know later tonight if convergence looks promising. Will probably take at least a few days to converge since I only have 2x Titan Xs for it.
