Training issue
Thanks for sharing the nice model implementation.
When I start training, the following warning appears. Do you get the same message?
I think it's a fairseq installation problem.
No module named 'lightconv_cuda'
Also, I can only train with a batch size of 5 on an RTX 3090 with 24 GB of memory. Could the problem above be the cause?
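One way to narrow this down is to check which of the relevant modules Python can actually locate in the training environment. The snippet below is only a minimal sketch using the standard library; the helper name `has_module` is hypothetical and is not part of fairseq or this repository.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module `name` can be located on the import path."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for broken or partially initialised packages;
        # treat those cases as "not available".
        return False


# 'lightconv_cuda' is the module named in the warning above.
for name in ("fairseq", "lightconv_cuda"):
    status = "found" if has_module(name) else "missing"
    print(f"{name}: {status}")
```

If `fairseq` is reported as found while `lightconv_cuda` is missing, the base package is installed and only the optional compiled extension is absent.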
Issue Analytics
- State:
- Created 2 years ago
- Comments:5 (1 by maintainers)
Top GitHub Comments
No, I couldn't solve the fairseq installation problem. It may require reinstalling CUDA or upgrading it to 11.0.
Instead, I use my own lightweight_conv module. Insert the code below in Parallel-Tacotron2/model/blocks and remove
from fairseq.modules import LightweightConv
in the same file.
Whether you do this or not, the program runs, but you can only train with very small batch sizes. The loss stays around 70, and the model doesn't seem to train properly.
Hi @LEECHOONGHO, thanks for your attention. Please refer to #5 for that. It should resolve your issue.