
Errors thrown when using custom loss function (RMSLE)

See original GitHub issue

Hello! I’m super excited to dig into using pytorch_tabnet, but I’ve been banging my head against a wall for the past 2 nights on this issue, so I’m putting out a call for assistance.

I’ve got everything set up properly and confirmed that my data has no missing values and no values outside the defined dimensions.

I can train properly using the default (MSELoss) loss function, but for my particular problem I need to use either mean squared log error or, ideally, root mean squared log error.

I’ve defined a custom loss function as follows:

import torch
import torch.nn as nn

def rmsle_loss(y_pred, y_true):
    return torch.sqrt(nn.functional.mse_loss(torch.log(y_pred + 1), torch.log(y_true + 1)))

And I’m passing it to the model via the loss_fn=rmsle_loss parameter of .fit().

However, when I do this, I’m getting these dreaded errors.

Using CPU: index -1 is out of bounds for dimension 1 with size 22

Using GPU: CUDA error: device-side assert triggered

Both of these are being thrown at line 94 in sparsemax.py:

tau = input_cumsum.gather(dim, support_size - 1)

Note this ONLY happens when I’m using a custom loss function. I am able to train the model just fine using the default loss function, but since that’s not ideal for my domain, I really need to use the custom function. As I mentioned above, I’ve confirmed that there are no inf, NA, or out-of-bounds data in my training set.
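(For anyone hitting the same wall, the failure mode can be reproduced in isolation, independent of TabNet itself. This is a hedged illustration of what goes wrong, not code from the issue: torch.log returns nan for negative inputs and -inf at zero, so any prediction at or below -1 makes the loss non-finite, and the resulting gradients corrupt the weights before sparsemax ever runs.)

```python
import torch

# torch.log is only defined for positive inputs, so any prediction
# at or below -1 turns log(y_pred + 1) into nan (or -inf at exactly 0).
bad_pred = torch.tensor([-2.0, 0.5])
shifted = torch.log(bad_pred + 1)  # first entry is log(-1.0) -> nan

assert torch.isnan(shifted[0])      # non-finite loss follows
assert torch.isfinite(shifted[1])   # the in-range entry is fine
```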

Any thoughts? Help would be deeply appreciated!

Issue Analytics

  • State: open
  • Created: a year ago
  • Comments: 10

Top GitHub Comments

1 reaction
Optimox commented, Sep 26, 2022

Thanks for the detailed information.

I’m not sure why the model is making predictions where 1 + y_pred < 0.

The model should not predict negative values if the training data is strictly positive, and that will probably be the case once training has finished. However, at the start the weights are randomly initialized, so negative predictions are very likely. Even after a few epochs, if the model needs to reach values as high as 10K, it’s hard to guarantee that no input will produce a negative score. So as long as you don’t explicitly prevent the model from making negative predictions, it can always happen.
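(One way to make the custom loss robust to early-epoch negatives, following the advice above, is to floor the predictions before taking the log. This is a sketch, not code from the thread; the clamp floor of 0.0 and the switch to torch.log1p are my assumptions.)

```python
import torch
import torch.nn as nn

def rmsle_loss_safe(y_pred, y_true):
    # Clamp predictions to be non-negative so log1p never sees values
    # below -1; log1p(x) == log(1 + x) and is more stable near zero.
    y_pred = torch.clamp(y_pred, min=0.0)
    return torch.sqrt(nn.functional.mse_loss(torch.log1p(y_pred), torch.log1p(y_true)))

# Even with a wildly negative early-epoch prediction, the loss stays finite:
loss = rmsle_loss_safe(torch.tensor([-5.0, 2.0]), torch.tensor([1.0, 2.0]))
assert torch.isfinite(loss)
```

Note that the clamp zeroes the gradient for negative predictions; a soft floor such as nn.functional.softplus is an alternative if that matters for your training dynamics.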

Good luck with tuning your model!

1 reaction
noahlh commented, Sep 23, 2022

Oh wow you are legendary @Optimox! Thanks for uncovering that and I’m glad I was (indirectly) able to help fix a bug 😃

I’m retraining now and there’s still a slight discrepancy (see below), but it’s now within range and likely for the reasons you mentioned, so I think we’re all good. Many many thanks.

epoch 0  | loss: 2.37828 | train_rmsle: 1.7417000532150269| valid_rmsle: 1.7888699769973755|  0:00:22s
epoch 1  | loss: 1.3471  | train_rmsle: 2.0144999027252197| valid_rmsle: 2.071079969406128|  0:00:45s
epoch 2  | loss: 1.01037 | train_rmsle: 2.018090009689331| valid_rmsle: 2.063570022583008|  0:01:06s
epoch 3  | loss: 0.83754 | train_rmsle: 1.5472899675369263| valid_rmsle: 1.5472899675369263|  0:01:28s
epoch 4  | loss: 0.76075 | train_rmsle: 0.9113900065422058| valid_rmsle: 0.9303600192070007|  0:01:49s
epoch 5  | loss: 0.71234 | train_rmsle: 0.7181299924850464| valid_rmsle: 0.7953600287437439|  0:02:12s
epoch 6  | loss: 0.67979 | train_rmsle: 0.6658599972724915| valid_rmsle: 0.7813699841499329|  0:02:34s
epoch 7  | loss: 0.65395 | train_rmsle: 0.6251800060272217| valid_rmsle: 0.7234600186347961|  0:02:56s
epoch 8  | loss: 0.63447 | train_rmsle: 0.6097800135612488| valid_rmsle: 0.704200029373169|  0:03:18s
epoch 9  | loss: 0.62041 | train_rmsle: 0.5897899866104126| valid_rmsle: 0.7026200294494629|  0:03:39s
epoch 10 | loss: 0.60307 | train_rmsle: 0.5744100213050842| valid_rmsle: 0.6758300065994263|  0:04:01s
epoch 11 | loss: 0.59601 | train_rmsle: 0.5818799734115601| valid_rmsle: 0.6536700129508972|  0:04:23s
epoch 12 | loss: 0.58429 | train_rmsle: 0.560479998588562| valid_rmsle: 0.6636599898338318|  0:04:45s
epoch 13 | loss: 0.5752  | train_rmsle: 0.5513899922370911| valid_rmsle: 0.6779299974441528|  0:05:08s
epoch 14 | loss: 0.56832 | train_rmsle: 0.5371400117874146| valid_rmsle: 0.6313999891281128|  0:05:29s
epoch 15 | loss: 0.5622  | train_rmsle: 0.5362799763679504| valid_rmsle: 0.6614099740982056|  0:05:51s