Clarification questions
Hi, I have a few questions regarding TransCoder’s training data and optimization settings.
- From the paper, it is clear that TransCoder is trained using Standalone functions during the DAE+BT training stage. But is TransCoder only trained using Standalone functions in the MLM stage too?
- During the MLM stage, only the encoder part of TransCoder is pre-trained, right?
- For the MLM pre-training, max_epoch and epoch_size are set to 100k. If I understand correctly, epoch_size refers to the number of instances used in each epoch. Is that correct? Also, for MLM pre-training, the following are set:
--validation_metrics _valid_mlm_ppl \
--stopping_criterion '_valid_mlm_ppl,10'
So I assume TransCoder pre-training is stopped based on the stopping_criterion. How many optimization steps were executed before the MLM pre-training was stopped?
- Unlike the MLM pre-training stage, no stopping_criterion is set for the DAE+BT training stage. The epoch_size was set to 50000 and max_epoch was set to 10000000. So when does the training stop? How many optimization steps were executed during this stage?
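To make the third and fourth questions concrete, here is a minimal sketch of patience-based early stopping. It assumes that a criterion of the form 'metric,N' means "stop after N consecutive epochs without improvement" and that a leading underscore marks a metric where lower is better. Both readings are assumptions about how the flag is interpreted, not something confirmed in this thread, and the function names are hypothetical.

```python
def parse_stopping_criterion(spec):
    """Split a 'metric,patience' string (assumed format).

    Assumption: a leading underscore means lower values are better.
    """
    name, patience = spec.split(",")
    lower_is_better = name.startswith("_")
    metric = name[1:] if lower_is_better else name
    return metric, lower_is_better, int(patience)


def epochs_until_stop(scores, spec, max_epoch):
    """Return the epoch at which training would stop.

    scores holds one validation score per epoch. With spec=None,
    only max_epoch (or running out of scores) ends the loop.
    """
    if spec is None:
        return min(len(scores), max_epoch)
    _, lower_is_better, patience = parse_stopping_criterion(spec)
    best = None
    epochs_without_improvement = 0
    for epoch, score in enumerate(scores[:max_epoch], start=1):
        if best is None:
            improved = True
        elif lower_is_better:
            improved = score < best
        else:
            improved = score > best
        if improved:
            best = score
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return min(len(scores), max_epoch)


# Hypothetical perplexity curve: improves for 3 epochs, then plateaus.
curve = [10.0, 9.0, 8.0] + [8.5] * 12
stop_epoch = epochs_until_stop(curve, "_valid_mlm_ppl,10", max_epoch=100000)
```

Under these assumptions, a run with no criterion is bounded only by max_epoch, which is why the DAE+BT question above asks when training actually ends.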
Issue Analytics
- State:
- Created 2 years ago
- Comments:5 (3 by maintainers)
Top GitHub Comments
It’s per GPU, so yes, the actual batch size is 32 * 32.
The epoch size is supposed to be the number of samples you train on. However, there is something a bit confusing in our code: we always increase the sentence counter by the batch_size parameter when training, but that is not always the actual batch size.
I’ll think about changing the behaviour of this parameter to have something more coherent in a way that minimizes the changes people will need to make to the parameter.
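The counter behaviour described above can be turned into rough arithmetic. This is a sketch under stated assumptions: the counter grows by the batch_size parameter on every step, an epoch ends once the counter reaches epoch_size, and "32 * 32" is read as a per-GPU batch of 32 on 32 GPUs. The function names are hypothetical and do not come from the TransCoder code.

```python
import math


def steps_per_epoch(epoch_size, batch_size_param):
    """Optimization steps per epoch if the counter grows by batch_size_param each step."""
    return math.ceil(epoch_size / batch_size_param)


def samples_seen_per_epoch(epoch_size, batch_size_param, n_gpus):
    """Samples processed across all GPUs in one epoch.

    Assumes each GPU really processes batch_size_param samples per step,
    which the comment above says is not always the case.
    """
    return steps_per_epoch(epoch_size, batch_size_param) * batch_size_param * n_gpus


# MLM setting from the question: epoch_size = 100k, assumed batch_size = 32.
mlm_steps = steps_per_epoch(100_000, 32)
mlm_samples = samples_seen_per_epoch(100_000, 32, n_gpus=32)

# DAE+BT setting from the question: epoch_size = 50000.
dae_bt_steps = steps_per_epoch(50_000, 32)
```

Under this reading, the number of samples actually seen per epoch is larger than epoch_size by a factor of the GPU count, which matches the "bit confusing" behaviour the maintainer mentions.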