Prediction of numerical features
My yaml file looks like:
training:
    epochs: 10
    learning_rate: 0.001
    batch_size: 128
    early_stop: 10
input_features:
    -
        name: lyrics
        type: text
        encoder: parallel_cnn
        level: word
output_features:
    -
        name: f1
        type: numerical
    -
        name: f2
        type: numerical
So I have two float features, f1 and f2 (and so MSE will be used by default as the loss), to be predicted from an input text, for which I'm using a parallel_cnn at word level.
After a few epochs I'm getting 0 accuracy:
╒════════════╤═══════════╤════════════╕
│ combined   │ loss      │ accuracy   │
╞════════════╪═══════════╪════════════╡
│ train      │ 1121.6427 │ 0.0000     │
├────────────┼───────────┼────────────┤
│ vali       │ 1140.5157 │ 0.0000     │
├────────────┼───────────┼────────────┤
│ test       │ 1136.8768 │ 0.0000     │
╘════════════╧═══════════╧════════════╛
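Worth noting: exact-match accuracy is arguably not an informative metric for continuous targets, since a float prediction will almost never equal the target exactly, while MSE stays meaningful. A minimal plain-Python sketch of the two metrics (hypothetical values, not Ludwig's internal metric code):

```python
def mse(y_true, y_pred):
    """Mean squared error: the default loss for numerical output features."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def exact_match_accuracy(y_true, y_pred):
    """Fraction of predictions equal to the target exactly -- for
    real-valued regression outputs this is essentially always 0.0."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical targets and predictions in [0, 1]:
y_true = [0.8, 0.3, 0.5]
y_pred = [0.75, 0.31, 0.49]
print(mse(y_true, y_pred))                   # small positive value
print(exact_match_accuracy(y_true, y_pred))  # 0.0
```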
I get the same result when using a different encoder for the input, such as:
input_features:
    -
        name: lyrics
        type: text
        encoder: rnn
        cell: lstm
        bidirectional: true
Is the yaml output_features definition correct for these float values?
Issue Analytics
- Created 5 years ago
- Comments: 7
Top GitHub Comments
Clipping and normalization were added to numerical features, so I consider this to be solved.
@loretoparisi let me try to understand your use case better. So those numbers that you want to output are between [0, 1], but they are not probabilities of a binary classifier, is that correct?
Depending on that, one solution could be to add a preprocessing parameter like normalize_01 that performs this normalization at the data level (so it would work for both numerical inputs and numerical outputs). There could also be a normalize_zscore and a normalize_minmax normalization strategy, so probably it would be better to have a normalize parameter that is None by default; you could then pass a string with the name of the normalization strategy (01, minmax, zscore) and it would adopt that strategy, reading it from a normalization strategy registry.

This would work at the data level, but there wouldn't be anything in the model to constrain it to produce a value in [0, 1]. For that purpose one could think about writing a decoder that clips values before outputting them, or some other strategy (for instance applying a sigmoid). Adding a decoder should be pretty easy; the only difficulty is that sequence features, for instance, already have a machinery with a registry of decoders selected by their name, while numerical features don't have that, because so far there has only been one decoder. Adding it would be simple, and probably I should do it for all the features anyway.
What do you think?
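The normalization-strategy registry described in the comment above could be sketched like this in plain Python (the names normalize, minmax, and zscore are taken from the discussion; this is an illustrative assumption, not Ludwig's actual implementation):

```python
# Sketch of the proposed normalization-strategy registry.

def zscore(values):
    """Standardize to zero mean and unit variance."""
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - mean) / std for v in values]

def minmax(values):
    """Rescale linearly into [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Registry keyed by strategy name, as described in the comment.
NORMALIZATION_REGISTRY = {"zscore": zscore, "minmax": minmax}

def normalize(values, strategy=None):
    """strategy=None (the default) leaves the data untouched."""
    if strategy is None:
        return list(values)
    return NORMALIZATION_REGISTRY[strategy](values)

print(normalize([1.0, 2.0, 3.0], "minmax"))  # [0.0, 0.5, 1.0]
```

This only transforms the data before training; as noted above, the model itself is not constrained to [0, 1] unless a clipping or sigmoid decoder is added.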