Crash when metric is NaN
Using the FastaiV2 callback, when the metric becomes NaN, the error is:
sqlalchemy.exc.IntegrityError: (sqlite3.IntegrityError) NOT NULL constraint failed: trial_intermediate_values.intermediate_value [SQL: INSERT INTO trial_intermediate_values (trial_id, step, intermediate_value) VALUES (?, ?, ?)] [parameters: (52, 20, nan)]
My workaround is (inside FastAIV2PruningCallback):
def after_epoch(self) -> None:
    super().after_epoch()
    # self.idx is set by TrackerCallback
    out = self.recorder.final_record[self.idx]
    if np.isnan(out):
        # Report the worst possible value for the study direction instead of NaN
        minimize = self.trial.study.direction == optuna.study.StudyDirection.MINIMIZE
        out = np.inf if minimize else -np.inf
    self.trial.report(out, step=self.epoch)
    if self.trial.should_prune():
        raise CancelFitException()
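The NaN substitution in the workaround above can be isolated into a small helper and checked without Optuna or fastai installed. This is only a sketch: the function name `worst_value_if_nan` and the `minimize` flag are hypothetical names chosen for illustration, not part of either library. In the callback, `minimize` would correspond to the comparison against `optuna.study.StudyDirection.MINIMIZE`.

```python
import math


def worst_value_if_nan(value: float, minimize: bool) -> float:
    """Return value unchanged, unless it is NaN.

    A NaN is replaced by the worst possible value for the optimization
    direction: +inf when minimizing, -inf when maximizing.
    (Hypothetical helper, not an Optuna or fastai API.)
    """
    if math.isnan(value):
        return math.inf if minimize else -math.inf
    return value


# Ordinary values pass through untouched.
normal = worst_value_if_nan(0.25, minimize=True)

# NaN becomes the worst value for the given direction.
nan_when_minimizing = worst_value_if_nan(float("nan"), minimize=True)
nan_when_maximizing = worst_value_if_nan(float("nan"), minimize=False)
```

Replacing NaN with an infinity keeps the reported value non-NaN, which is what the workaround relies on to avoid the NOT NULL failure, and it makes the diverged trial look as bad as possible to the pruner.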
Thank you for your clarification! Indeed, I can reproduce the error by using the following simpler code
This might be related to a storage issue in Optuna, so I will transfer this issue to the main repository.
@lsc64 thank you for letting us know about it! Indeed, this issue is already known. Let me close this issue.