
TypeError with confusion matrix


Describe the bug

The training data is the Twitter airline sentiment dataset. My model definition YAML is:

input_features:
    -
        name: text
        type: text
output_features:
    -
        name: airline_sentiment
        type: category

After training, I used ludwig test to produce test_statistics.json. When I tried to visualize a confusion matrix of the output with ludwig visualize --visualization confusion_matrix -tes ./results_0/test_statistics.json, I got a TypeError:

Traceback (most recent call last):
  File "C:\Users\huuhi\Anaconda3\Scripts\ludwig-script.py", line 11, in <module>
    load_entry_point('ludwig==0.2.1', 'console_scripts', 'ludwig')()
  File "C:\Users\huuhi\Anaconda3\lib\site-packages\ludwig-0.2.1-py3.6.egg\ludwig\cli.py", line 108, in main
  File "C:\Users\huuhi\Anaconda3\lib\site-packages\ludwig-0.2.1-py3.6.egg\ludwig\cli.py", line 64, in __init__
  File "C:\Users\huuhi\Anaconda3\lib\site-packages\ludwig-0.2.1-py3.6.egg\ludwig\cli.py", line 94, in visualize
  File "C:\Users\huuhi\Anaconda3\lib\site-packages\ludwig-0.2.1-py3.6.egg\ludwig\visualize.py", line 3119, in cli
  File "C:\Users\huuhi\Anaconda3\lib\site-packages\ludwig-0.2.1-py3.6.egg\ludwig\visualize.py", line 650, in confusion_matrix_cli
  File "C:\Users\huuhi\Anaconda3\lib\site-packages\ludwig-0.2.1-py3.6.egg\ludwig\utils\data_utils.py", line 76, in load_json
TypeError: expected str, bytes or os.PathLike object, not NoneType
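
The last frame of the traceback shows load_json receiving None instead of a path, which is what one would expect if a file argument was never supplied on the command line. The following is a minimal sketch that reproduces the same kind of message. The load_json and try_load functions here are hypothetical stand-ins written for illustration, not Ludwig's actual code.

```python
import json


def load_json(path):
    # Hypothetical stand-in for a JSON-loading helper (not Ludwig's implementation).
    # open() rejects None because it is not a str, bytes, or os.PathLike object.
    with open(path) as f:
        return json.load(f)


def try_load(path):
    # Return the loaded data, or a string describing the TypeError that was raised.
    try:
        return load_json(path)
    except TypeError as exc:
        return f"TypeError: {exc}"


# Passing None mimics an optional command-line argument that was left unset.
message = try_load(None)
print(message)
```

If the real cause matches this sketch, supplying every file argument the visualization needs should make the error go away.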

Environment:

OS: Win 10
Python 3
Ludwig version 0.2

Issue Analytics

Created 4 years ago
Comments: 5 (1 by maintainers)

Top GitHub Comments

1 reaction

rthiruv-lab58 commented, Aug 5, 2019

I faced similar problems while training on a custom dataset as well. Two things worked for me:

I gave the full path to the file (test_statistics.json). (It should work fine without this step as well.)
I passed the ground truth metadata (the training JSON file), which the documentation describes as required. Documentation: https://uber.github.io/ludwig/user_guide/#confusion-matrix

Changing these two worked out for me. Let me know if it works for you too.

The command I used: ludwig visualize --visualization confusion_matrix --normalize --top_n_class 8 --test_statistics ./results_1/test_statistics.json --ground_truth_metadata train1.json
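
As a rough sketch of the two points above, the snippet below assembles that command as an argument list and checks that both input files exist before anything is run. The function name build_confusion_matrix_command is hypothetical. The flags are copied from the command in this comment, with the metadata flag spelled out in full as --ground_truth_metadata. The sketch only builds the list and does not invoke Ludwig.

```python
from pathlib import Path


def build_confusion_matrix_command(test_statistics, ground_truth_metadata, top_n_classes=8):
    # Hypothetical helper: validates the two input files, then returns the
    # argument list for the visualize command quoted in the comment above.
    inputs = {
        "test_statistics": Path(test_statistics),
        "ground_truth_metadata": Path(ground_truth_metadata),
    }
    missing = [name for name, path in inputs.items() if not path.is_file()]
    if missing:
        raise FileNotFoundError("Missing input file(s) for: " + ", ".join(missing))

    return [
        "ludwig", "visualize",
        "--visualization", "confusion_matrix",
        "--normalize",
        "--top_n_class", str(top_n_classes),
        # Absolute paths, following the "full path" suggestion above.
        "--test_statistics", str(inputs["test_statistics"].resolve()),
        "--ground_truth_metadata", str(inputs["ground_truth_metadata"].resolve()),
    ]
```

The returned list could then be passed to subprocess.run, or joined with spaces and pasted into a shell.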

0 reactions

w4nderlust commented, Aug 12, 2019

Just to clarify for people who may end up reading this issue: the --ground_truth_metadata parameter accepts the metadata JSON file that is created the first time a dataset is used. It has the same name as the dataset, with .json at the end. If saving the preprocessed data is skipped, the same information is present in results/experiment_run_0/model/train_set_metadata.json (the results and experiment run directory names may be different in your case).
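
A small sketch of that lookup, assuming the two locations described in this comment: a JSON file next to the dataset with the same base name, and train_set_metadata.json inside the model directory of a run. The function name find_ground_truth_metadata and the example paths are hypothetical, so adjust them to your own layout.

```python
from pathlib import Path


def find_ground_truth_metadata(dataset_path, model_dir):
    # Hypothetical helper: returns the first metadata file that exists, or None.
    candidates = [
        # Metadata saved next to the dataset, e.g. train1.csv -> train1.json.
        Path(dataset_path).with_suffix(".json"),
        # Copy kept with the trained model when preprocessed data is not saved.
        Path(model_dir) / "train_set_metadata.json",
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


# Example paths are placeholders; the run directory name may differ.
metadata = find_ground_truth_metadata("train1.csv", "results/experiment_run_0/model")
print(metadata)
```

A result of None means neither file was found, in which case the visualize command would have no metadata file to point at.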