possible error in user guide for log loss

Description

The user guide for log loss (section 3.3.2.11) says (in the second paragraph, just before the equation):

the log loss per sample is the negative log-likelihood of the classifier given the true label

I think that this is backwards; I think that it’s the negative log-likelihood of the true label given the probability predictions from the classifier. The equation in the subsequent line has the conditioning on the predictions. Also, the documentation for the function metrics.log_loss describes it as I wrote it.
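To make the reading proposed above concrete, here is a minimal sketch of the binary per-sample log loss as the negative log-probability that the predictions assign to the true label. It uses only the standard library; the helper names `per_sample_log_loss` and `mean_log_loss`, and the clipping constant `eps`, are assumptions for illustration and are not taken from scikit-learn.

```python
import math


def per_sample_log_loss(y_true, p_pred, eps=1e-15):
    """Negative log-probability assigned to the true binary label.

    y_true: 0 or 1; p_pred: predicted probability that the label is 1.
    """
    # Clip to avoid log(0); the eps value is an illustrative assumption.
    p = min(max(p_pred, eps), 1 - eps)
    # Only the term matching the true label contributes.
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))


def mean_log_loss(labels, probs):
    """Average of the per-sample losses over a dataset."""
    losses = [per_sample_log_loss(y, p) for y, p in zip(labels, probs)]
    return sum(losses) / len(losses)


labels = [1, 0, 1]
probs = [0.9, 0.2, 0.6]
print(mean_log_loss(labels, probs))
```

For a sample with true label 1 and predicted probability 0.9, the loss is -log(0.9): the quantity is conditioned on the prediction, which matches the conditioning in the user guide's equation.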

Versions

I’m looking at the documentation for sklearn 0.22.1.

Issue Analytics

State:
Created 4 years ago
Reactions: 1
Comments: 9 (7 by maintainers)

Top GitHub Comments

1 reaction

NicolasHug commented, Jan 7, 2020

“The likelihood function is indeed a function of the parameters of the model, but not given the data”

From the wikipedia link you provided:

it is viewed and used as a function of the parameters given the data sample.

Saying “given the data” is just a way of saying that the dataset is considered fixed, while what varies are the parameters of the model.
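One way to write down the distinction under discussion, as a sketch only: the symbols θ (model parameters), y (observed label), and p (predicted probability) are notation assumed here, not taken from the thread. The likelihood and the conditional probability of the data have the same value; they differ in which argument is held fixed. The second expression is the binary per-sample log loss, conditioned on the prediction.

```latex
\mathcal{L}(\theta \mid y) = \Pr(y \mid \theta),
\qquad
L_{\log}(y, p) = -\log \Pr(y \mid p)
             = -\bigl(y \log p + (1 - y)\log(1 - p)\bigr)
```

Read this way, "the likelihood of the classifier given the true label" and "the probability of the true label given the predictions" can refer to the same number, viewed as a function of different arguments.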

1 reaction

NicolasHug commented, Jan 7, 2020

The likelihood is a function of the parameters of the model given some data, so IMO the user-guide formulation is correct and we should update the docstring.