Output sequence tagger has wrong probabilities output

Describe the bug

In Ludwig 0.1.x and 0.2.x, when an output feature used a tagger (as in the https://ludwig-ai.github.io/ludwig-docs/examples/#natural-language-understanding example), all the slots were correctly tagged, and for each slot you were given an array of confidence values, one per possible slot tag. In 0.4.0 (and also in 0.3.x) you are given a single fixed-length array of confidences, of length max_sequence_length, with one value per slot.

I would expect multiple arrays of confidences, one for each output slot, as in previous versions. Or, if this is an intended change, I would expect the array length to match the number of output tags (in case the intent is to output only the maximum confidence value for each slot, which I don’t think is the case).

To Reproduce

1. Train a model as presented in https://ludwig-ai.github.io/ludwig-docs/examples/#natural-language-understanding
2. Run inference on an utterance
3. Observe the prediction field slots_predictions and note its length
4. Observe the prediction field slots_probabilities and note its length (always set to the maximum possible output length, with a strange default value for unused tags, and without the sub-arrays representing each slot's confidence)

Python version: 3.6 and 3.9
Ludwig version: 0.3.x, 0.4.0, and current git master

Please note: Ludwig 0.3.3 has a different slots_probabilities output format. In 0.4.0 strange "\n"s are present in the array (as if it were text being parsed), while in 0.3.3 it is correctly parsed as an array.

One more thing: slots_probability, which should be the overall confidence of all the slots being predicted, is often negative (which should never happen for a probability).

Issue Analytics

Created 2 years ago
Comments: 8 (4 by maintainers)

Top GitHub Comments

carlogrisetti commented, May 3, 2022

It took a little while to get around to this, but here I am 😃 This is the slots_probabilities array I get when parsing the utterance "it is a nice day":

[0.99996495, 0.9998147 , 0.99985003, 0.9999442 , 0.99750584,\n 0.38254768, 0.38254768, 0.38254768, 0.38254768, 0.38254768,\n 0.38254768, 0.38254768, 0.38254768, 0.38254768, 0.38254768,\n 0.38254768, 0.38254768, 0.38254768, 0.382547]

I expected an array with only 5 values, one for each slot, giving each slot's probability.
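As a possible stopgap for the padded array, the values can be cut down to the number of tokens in the utterance. The sketch below is only an illustration: the helper name trim_padded_probabilities is hypothetical (not part of Ludwig), it assumes the utterance is tokenized on whitespace, and it assumes the first values in the array line up with the tokens in order. The last (truncated) value is rounded out to match the other padding values.

```python
def trim_padded_probabilities(probabilities, utterance):
    """Keep only the leading probabilities that correspond to real tokens.

    Hypothetical helper, not a Ludwig API. Assumes whitespace tokenization
    and that padding values come after the per-token values.
    """
    n_tokens = len(utterance.split())
    return list(probabilities[:n_tokens])


# Values copied from the reported output: 5 real values followed by padding.
padded = [0.99996495, 0.9998147, 0.99985003, 0.9999442, 0.99750584]
padded += [0.38254768] * 14

trimmed = trim_padded_probabilities(padded, "it is a nice day")
print(len(padded), len(trimmed))
print(trimmed)
```

This does not explain why the padding is there or restore the per-tag sub-arrays of earlier versions; it only makes the array length match slots_predictions for this utterance.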

Also, this is the slots_probability for the same utterance: -0.0029234159737825394. I for sure did not expect a negative value here.
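One observation on the numbers quoted in this comment, offered as a hypothesis and not as a confirmed description of Ludwig's internals: the negative slots_probability agrees closely with the sum of the natural logarithms of the first five slots_probabilities values. That would suggest the field holds a log-probability, in which case exponentiating it gives a value between 0 and 1. The check below uses only the figures from this thread; the variable names are illustrative.

```python
import math

# First five values of slots_probabilities, one per token of "it is a nice day".
token_probabilities = [0.99996495, 0.9998147, 0.99985003, 0.9999442, 0.99750584]

# The slots_probability value reported for the same utterance.
reported_slots_probability = -0.0029234159737825394

# Sum of log-probabilities over the real (non-padding) tokens.
log_prob_sum = sum(math.log(p) for p in token_probabilities)

# If the reported value is a log-probability, exp() turns it back into a probability.
sequence_probability = math.exp(reported_slots_probability)

print("sum of log probabilities:", log_prob_sum)
print("reported slots_probability:", reported_slots_probability)
print("exp(reported value):", sequence_probability)
```

If this reading is right, the overall confidence for the utterance would be about 0.997, and the padding values do not contribute to it.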

I’ll look some more into this since it’s now a thing I need to quickly fix and use, and hopefully I’ll find the culprit 😃 Any help is much appreciated!

dalianaliu commented, Jul 28, 2022

Hi @carlogrisetti, has the update to 0.5 worked for this issue?