custom sparse categorical loss
I want to write a custom sparse categorical loss function in numpy or pure TensorFlow. It should handle integer target labels and either logit or probability outputs. To this end, I have the following:
```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    y = np.exp(x - np.max(x, axis, keepdims=True))
    return y / np.sum(y, axis, keepdims=True)

def categorical_crossentropy(target, output, from_logits=False):
    # target: one-hot labels, shape (batch, n_classes)
    # output: logits or probabilities, shape (batch, n_classes)
    if from_logits:
        output = softmax(output)
    else:
        output = output / output.sum(axis=-1, keepdims=True)
    output = np.clip(output, 1e-7, 1 - 1e-7)
    return np.sum(target * -np.log(output), axis=-1)
```
With a one-hot target, this works:
```python
y_true = np.array([[0, 1, 0], [0, 0, 1]])
y_pred = np.array([[0.05, 0.95, 0], [0.1, 0.8, 0.1]])
categorical_crossentropy(y_true, y_pred)
# array([0.05129329, 2.30258509])
```
But it fails when the target is an integer label, which is what I want:
```python
y_true = np.array([1, 2])
y_pred = np.array([[0.05, 0.95, 0], [0.1, 0.8, 0.1]])
categorical_crossentropy(y_true, y_pred)
# ValueError: operands could not be broadcast together with shapes (2,) (2,3)
```
How can I achieve this, i.e. a loss function that takes integer targets and can compute with logits as well as probability outputs? I know there is a built-in function (`sparse_categorical_crossentropy`), but I would like to write it in plain numpy or pure TensorFlow as a custom loss function.
Here `p` is the predictions or output, and `y` is the labels or target. Starting from your own implementation above:
In the dense case, $\text{target}\in[0, 1]$, so all output values are important and might affect the loss. For the sparse case, however, only one item in `target` is 1.0, while the remaining ones are 0. This means that, for $n$ classes with $i$ being the true label, the sum $\sum_j \text{target}_j \times (-\log \text{output}_j)$ becomes:

$$0\times(-\log \text{output}_0) + 0\times(-\log \text{output}_1) + \cdots + 1\times(-\log \text{output}_i) + \cdots + 0\times(-\log \text{output}_{n-1}) = -\log \text{output}_i$$

So we don't need to add a bunch of 0s which would not affect the result. Instead, we just pick the $i$-th output for each sample in the batch:
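A minimal numpy sketch of that indexing trick (the function name `sparse_categorical_crossentropy` and its signature are my own choices here; it reuses the `softmax` defined above):

```python
import numpy as np

def sparse_categorical_crossentropy(target, output, from_logits=False):
    # target: integer class labels, shape (batch,)
    # output: logits or probabilities, shape (batch, n_classes)
    if from_logits:
        output = softmax(output)
    else:
        output = output / output.sum(axis=-1, keepdims=True)
    output = np.clip(output, 1e-7, 1 - 1e-7)
    # Fancy indexing: for each row i, select output[i, target[i]].
    return -np.log(output[np.arange(len(target)), target])

y_true = np.array([1, 2])
y_pred = np.array([[0.05, 0.95, 0], [0.1, 0.8, 0.1]])
sparse_categorical_crossentropy(y_true, y_pred)
# array([0.05129329, 2.30258509])  -- matches the one-hot result above
```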
In TensorFlow, we could accomplish the same with the `tf.gather` function.