best_model_dir being overwritten when training multiple models
Describe the bug
I'm running ClassificationModel() on multiple models to find the best-performing one, and I'm specifying best_model_dir for each of them as model_args["best_model_dir"] = os.path.join("outputs", model_name, "best_model"). However, I only see a single best_model_dir, in one of the directories (I'm specifying output_dir like model_args["output_dir"] = os.path.join("outputs", model_name)).
To Reproduce
Mentioned above.
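For concreteness, a minimal sketch of the setup described above, assuming the Simple Transformers ClassificationModel API. The candidate model list, the use of use_cuda=False, the evaluate_during_training flag, and the train_df / eval_df DataFrames are assumptions for illustration, not taken from the issue:

import os
from simpletransformers.classification import ClassificationModel

# Hypothetical list of (model_type, model_name) candidates to compare.
candidates = [("bert", "bert-base-cased"), ("roberta", "roberta-base")]

for model_type, model_name in candidates:
    model_args = {
        "output_dir": os.path.join("outputs", model_name),
        "best_model_dir": os.path.join("outputs", model_name, "best_model"),
        # best_model_dir is typically only populated when evaluation runs during training
        "evaluate_during_training": True,
        "overwrite_output_dir": True,
    }
    model = ClassificationModel(model_type, model_name, args=model_args, use_cuda=False)
    # train_df / eval_df are assumed to be pandas DataFrames defined elsewhere
    model.train_model(train_df, eval_df=eval_df)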
Expected behavior
I'm expecting each output_dir to contain its own best_model folder. Or is that incorrect?
Desktop (please complete the following information):
OS: Ubuntu 18.04, if it matters (Python 3 in an Anaconda environment, run from a Jupyter notebook)
Issue Analytics
- Created 3 years ago
- Comments: 8 (3 by maintainers)
Top GitHub Comments
Thanks, that works! I thought I did the correct thing, but it seems like each time I tried I left out a different parameter. Works now, thanks!
Noted, thanks! That makes sense, I’ll try that 😃
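The thread does not show the final working configuration, but the comments above suggest the fix was simply to set every directory-related argument per model instead of leaving some at their shared defaults. A hedged sketch of what that might look like; the helper name and the cache_dir / tensorboard_dir entries are assumptions, not confirmed by the issue:

import os

def make_model_args(model_name):
    """Build per-model args so no run writes into another model's directories.
    (Helper name and the cache_dir / tensorboard_dir entries are assumptions.)"""
    base = os.path.join("outputs", model_name)
    return {
        "output_dir": base,
        "best_model_dir": os.path.join(base, "best_model"),
        "cache_dir": os.path.join(base, "cache"),
        "tensorboard_dir": os.path.join(base, "runs"),
        "overwrite_output_dir": True,
        "evaluate_during_training": True,
    }

# Example usage:
# model_args = make_model_args("bert-base-cased")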