Cannot use test and reproduce the result?
See original GitHub issue

@dukebw Hi, thanks for your code. I downloaded and ran it, but hit three main problems:
- It seems the code cannot reproduce the results in the paper. I ran it with the default run.sh, and the eval ppl stays around 80–100 through the end of training (150 epochs).
- There is no test function in the Trainer class. I added one based on the evaluation method (passing self.test_data as the argument), but the ppl is around 1500. Even when I pass self.train_data, self.eval_data, or self.valid_data, it is also around 1500 (see the sketch after this list).
- After training is done, when I call either test() or derive() with --load_path set, self.shared.load_state_dict in load_model() throws "KeyError: unexpected key batch_norm.weight in state_dict". I printed self.shared.state_dict().keys() alongside the keys loaded by torch.load from the checkpoint (see the key-comparison snippet below), and found that the checkpoint contains four batch-normalization parameters ("batch_norm.weight", "batch_norm.bias", "batch_norm.running_mean", "batch_norm.running_var") that the current model's state dict does not.
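For concreteness, the kind of evaluation loop I mean looks roughly like this (a minimal sketch under assumed data layout and model signature, not the repo's actual Trainer API):

```python
# Sketch of a perplexity evaluation loop (assumed shapes and names, not the
# repo's actual API): average the cross-entropy over a split, then report
# exp(mean loss) as perplexity.
import math
import torch

def evaluate_ppl(model, data, bptt, criterion, device="cpu"):
    """data: LongTensor of token ids with shape (num_steps, batch_size)."""
    model.eval()
    total_loss, total_tokens = 0.0, 0
    with torch.no_grad():
        for i in range(0, data.size(0) - 1, bptt):
            seq_len = min(bptt, data.size(0) - 1 - i)
            inputs = data[i:i + seq_len].to(device)
            targets = data[i + 1:i + 1 + seq_len].to(device)
            logits = model(inputs)  # assumed shape: (seq_len, batch, vocab)
            loss = criterion(logits.reshape(-1, logits.size(-1)),
                             targets.reshape(-1))
            total_loss += loss.item() * targets.numel()
            total_tokens += targets.numel()
    return math.exp(total_loss / total_tokens)
```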
It would be great if you could help check these.
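For reference, the key comparison from the third point can be reproduced along these lines (a minimal sketch; the checkpoint path and the `shared` variable are placeholders, not the repo's exact names):

```python
# Sketch of the state-dict key diff described above. "checkpoint.pth" and
# `shared` (the rebuilt shared model) are placeholders.
import torch

ckpt = torch.load("checkpoint.pth", map_location="cpu")
model_keys = set(shared.state_dict().keys())
ckpt_keys = set(ckpt.keys())

print("in checkpoint only:", sorted(ckpt_keys - model_keys))  # batch_norm.* keys
print("in model only:", sorted(model_keys - ckpt_keys))
```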
Issue Analytics
- Created: 6 years ago
- Comments: 7 (3 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Hi Ren Qian,
To address 1. and 2., it is necessary to re-train the model from scratch after discovering a good architecture. See Section 4.5 of this early revision of the ENAS paper: https://openreview.net/references/pdf?id=BykVS-WC-. This re-training is not implemented yet.
My results on the validation of the shared model are similar to yours, with the best validation perplexity I have seen being 82.
The third point you raised sounds like a genuine bug. I think it has to do with these lines in models/shared_rnn.py: https://github.com/carpedm20/ENAS-pytorch/blob/5736a24ef6d1967e4e9eefe2f6662a872a98586b/models/shared_rnn.py#L185-L188

Thank you for pointing out the issues.
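Until those lines are fixed, one possible workaround (an assumption, not a confirmed fix from the repo) is to filter the stale batch_norm.* entries out of the checkpoint before loading it:

```python
# Possible workaround (an assumption, not the repo's official fix): drop the
# batch_norm.* entries the current model no longer registers, then load.
# Intended to run inside load_model(), where `load_path` comes from --load_path.
import torch

state = torch.load(load_path, map_location="cpu")
state = {k: v for k, v in state.items() if not k.startswith("batch_norm")}
self.shared.load_state_dict(state)
```

On PyTorch versions that support it, `self.shared.load_state_dict(state, strict=False)` would similarly skip the mismatched keys without the manual filter.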
Also, re the last reply: it is not clear to me from the paper whether picking the best model from the samples and then retraining it happens every epoch or only once at the end of all the epochs.