custom sparse categorical loss
I want to write a custom sparse categorical loss function in numpy or pure TensorFlow. It should handle integer target labels and either logit or probability outputs. To this end, I have the following:
```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    y = np.exp(x - np.max(x, axis, keepdims=True))
    return y / np.sum(y, axis, keepdims=True)

def categorical_crossentropy(target, output, from_logits=False):
    # target: one-hot labels, shape (batch, n_classes)
    # output: logits or probabilities, shape (batch, n_classes)
    if from_logits:
        output = softmax(output)
    else:
        output = output / output.sum(axis=-1, keepdims=True)
    output = np.clip(output, 1e-7, 1 - 1e-7)
    return np.sum(target * -np.log(output), axis=-1)
```
With a one-hot target, this works:
```python
y_true = np.array([[0, 1, 0], [0, 0, 1]])
y_pred = np.array([[0.05, 0.95, 0], [0.1, 0.8, 0.1]])
categorical_crossentropy(y_true, y_pred)
# array([0.05129329, 2.30258509])
```
But it fails when the target is an integer label, which is what I want:
```python
y_true = np.array([1, 2])
y_pred = np.array([[0.05, 0.95, 0], [0.1, 0.8, 0.1]])
categorical_crossentropy(y_true, y_pred)
# ValueError: operands could not be broadcast together with shapes (2,) (2,3)
```
How can I achieve this, i.e. a loss function that takes integer targets and can compute with logits as well as probability outputs? I know there is a built-in function (`sparse_categorical_crossentropy`), but I would like to write it in plain numpy or pure TensorFlow as a custom loss function.
Here `p` is the predictions or output, and `y` is the labels or target. Starting from your own implementation above:
In the dense case, $\text{target}\in[0, 1]$, so all output values are important and might affect the loss. For the sparse case, however, only one item in `target` is 1.0, while the remaining ones are 0. This means that, for $n$ classes with $i$ being the true label, the sum $\sum_j \text{target}_j \times (-\log \text{output}_j)$ becomes:

$$0\times(-\log \text{output}_0) + 0\times(-\log \text{output}_1) + \cdots + 1\times(-\log \text{output}_i) + \cdots + 0\times(-\log \text{output}_{n-1}) = -\log \text{output}_i$$

So we don't need to add a bunch of 0s which would not affect the result. Instead, we just pick the $i$-th output for each sample in the batch:
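A minimal numpy sketch of that indexing trick (the function name `sparse_categorical_crossentropy` and its signature are my own choices here; it reuses the `softmax` defined above):

```python
import numpy as np

def sparse_categorical_crossentropy(target, output, from_logits=False):
    # target: integer class labels, shape (batch,)
    # output: logits or probabilities, shape (batch, n_classes)
    if from_logits:
        output = softmax(output)
    else:
        output = output / output.sum(axis=-1, keepdims=True)
    output = np.clip(output, 1e-7, 1 - 1e-7)
    # Fancy indexing: for each row i, select output[i, target[i]].
    return -np.log(output[np.arange(len(target)), target])

y_true = np.array([1, 2])
y_pred = np.array([[0.05, 0.95, 0], [0.1, 0.8, 0.1]])
sparse_categorical_crossentropy(y_true, y_pred)
# array([0.05129329, 2.30258509])  -- matches the one-hot result above
```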
In TensorFlow, we could accomplish the same with the `tf.gather` function.