Can we have some more documentation of the predict function and data format?

I’m new to machine learning and have managed to train Ludwig on an F1 data set:

team,surname,position,track,year
Mercedes,Rosberg,1,Albert Park Grand Prix Circuit,2014
McLaren,Magnussen,2,Albert Park Grand Prix Circuit,2014
McLaren,Button,3,Albert Park Grand Prix Circuit,2014
...

My model definition is as follows:

input_features:
 -
   name: team
   type: category
 -
   name: track
   type: category
 -
   name: surname
   type: category
 -
   name: year
   type: category
output_features:
 -
   name: position
   type: numerical
training:
 epochs: 10

When I run the predict function, I am trying to get a driver/position prediction for each track. What do I need to do to make this happen? I tried making a new data file without the position column, but it fails with a missing ‘position’ key, so that column is clearly needed even though it is the field I am trying to predict:

team,surname,track,year
Mercedes,Hamilton,Albert Park Grand Prix Circuit,2019
McLaren,Magnussen,Albert Park Grand Prix Circuit,2019
McLaren,Button,Albert Park Grand Prix Circuit,2019

Note that the data above is a truncated sample: the full set covers the last 5 years and only positions of 10th and above. I can expand this if it helps training!

When I add some positions to the prediction file, the prediction just returns a list of numerical values:

1.7585578
2.0917244
1.6508131

I’m assuming this is the position that it would expect for that driver/track/team combination, but I’m not sure.

Can someone explain this to me further? Do I need a model without the ‘position’ key in order to predict? Is there any way to tell Ludwig to use whole integers for positions, and perhaps to assign each position only once per track (per year)?

Or can I be pointed to somewhere I can learn this myself?

Top GitHub Comments

msaisumanth commented, Feb 14, 2019

For future reference, take a look at the user guide: https://uber.github.io/ludwig/user_guide/#predict

msaisumanth commented, Feb 14, 2019

There are a couple of issues here:

Numerical may not be the ideal data type for position. Because position is discrete, a category type would be more appropriate (for instance, a predicted position of 1.4 means nothing in this scenario).

For predict, the error occurs because Ludwig also tries to evaluate performance when it predicts, and evaluation needs the ground truth. If you just want to run predictions (and not worry about evaluation), try this:

ludwig predict --only_predictions --model_path PATH_TO_THE_MODEL --data_csv PATH_TO_DATA
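
On the question of whole-number positions that are assigned only once per track and year: one option is to post-process the numerical predictions outside Ludwig by ranking them within each race. The sketch below is plain Python, not a Ludwig feature. The helper name assign_positions is hypothetical, and it assumes the three values quoted in the question line up, in order, with the three rows of the prediction file.

```python
from collections import defaultdict


def assign_positions(rows, scores):
    """Rank rows within each (track, year) group by predicted score.

    rows   : list of dicts with at least 'track' and 'year' keys
    scores : list of floats, one predicted value per row, in the same order
    Returns a list of integer positions (1 = lowest predicted value),
    unique within each (track, year) group.
    """
    groups = defaultdict(list)
    for idx, (row, score) in enumerate(zip(rows, scores)):
        groups[(row["track"], row["year"])].append((score, idx))

    positions = [0] * len(rows)
    for members in groups.values():
        # Sort by score; equal scores fall back to the original row order.
        for rank, (_, idx) in enumerate(sorted(members), start=1):
            positions[idx] = rank
    return positions


# Rows from the prediction file in the question.
rows = [
    {"surname": "Hamilton", "track": "Albert Park Grand Prix Circuit", "year": 2019},
    {"surname": "Magnussen", "track": "Albert Park Grand Prix Circuit", "year": 2019},
    {"surname": "Button", "track": "Albert Park Grand Prix Circuit", "year": 2019},
]
# Assumption: these outputs correspond to the rows above, in order.
scores = [1.7585578, 2.0917244, 1.6508131]

positions = assign_positions(rows, scores)
print(positions)  # [2, 3, 1]
```

Ranking keeps the model's relative ordering while guaranteeing one driver per position in each race, which a per-row prediction cannot guarantee on its own.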
