How to encode input order in a FF Neural Network
This is a question/request for help, not a bug.
What is wrong?
If I build and train a simple NeuralNetwork on input data that looks like this:
{a: 1, b: 1},
{a: 1, b: 1},
{b: 1, a: 1},
{a: 1, b: 1},
{b: 1, a: 1},
...
and output that looks like this:
{a:1}
{a:1}
{b:1}
{b:1}
{b:1}
...
It appears that the network doesn’t have a way of learning that the order of the inputs is important.
Is there any way to encode this? Essentially, this is a very simple game-outcome prediction test in which the first player has an advantage. Even if b beats a on average, a wins slightly more often when it moves first than it does overall (a loses at a higher rate as the second mover than as the first).
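To illustrate why key order carries no information, here is a minimal sketch in plain JavaScript. The `toVector` helper is hypothetical and only approximates the general idea of mapping object keys onto fixed input slots; it is not Brain.js's actual code.

```javascript
// Hypothetical helper (an assumption, not Brain.js internals): each key name
// owns one fixed slot, so the order in which keys were written is irrelevant.
function toVector(input, keys) {
  return keys.map((key) => (key in input ? input[key] : 0));
}

const keys = ["a", "b"]; // slot order is fixed once, from the key names

const aFirst = toVector({ a: 1, b: 1 }, keys);
const bFirst = toVector({ b: 1, a: 1 }, keys);

console.log(aFirst); // [ 1, 1 ]
console.log(bFirst); // [ 1, 1 ]
console.log(JSON.stringify(aFirst) === JSON.stringify(bFirst)); // true
```

Under this assumption both games produce the same numbers, so nothing in the input tells the network which player moved first.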
Note that I’ve tried to encode the inputs as complex objects such as:
{ home:{b:1}, away:{a:1} }
But I get NaN probabilities for all the output values.
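One plausible cause of the NaN values (an assumption, not confirmed in this thread) is that each input value is expected to be a number, and a nested object is not one. A possible workaround is to flatten the nested object into role-prefixed keys. The `flattenGame` helper and the `home_b` / `away_a` key names below are made up for this sketch.

```javascript
// Hypothetical helper: turns { home: { b: 1 }, away: { a: 1 } } into
// { home_b: 1, away_a: 1 } so that every value is a plain number.
function flattenGame(game) {
  const flat = {};
  for (const [role, teams] of Object.entries(game)) {
    for (const [team, value] of Object.entries(teams)) {
      flat[`${role}_${team}`] = value;
    }
  }
  return flat;
}

const nested = { home: { b: 1 }, away: { a: 1 } };

console.log(Number(nested.home)); // NaN: an object does not coerce to a number
console.log(flattenGame(nested)); // { home_b: 1, away_a: 1 }
```

With flat keys, "b at home" and "b away" become different inputs, so the role is tied to a specific team for each game.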
Where does it happen?
Just trying simple examples in the browser.
How important is this (1-5)?
2
Other Comments
I’m a bit of a newbie, so it could be I’m just using the wrong network type.
Thanks!
Issue Analytics
- State:
- Created 4 years ago
- Comments:6 (4 by maintainers)
Top GitHub Comments
Hi @mubaidr,
I believe I tried this, but if you consider that you could always feed in the home team first, then you get a third parameter that is the same value for every input. This means that there’s no effect of this third parameter. It’s been a couple of months since I looked at this and I’ve been working on other tech, so I’d have to refresh my memory a bit.
The reason I was originally trying a higher order object was because I thought I could then couple the home or away label to a specific team for that instance of a ‘game’ input.
One other possibility is to just have the inputs be the individual team’s performance per game with the home or away flag set. This, however, separates that team’s performance from the specific opponents and then you just end up predicting the team’s general performance if it’s home or away.
Thanks for the thought, though!
Thanks very much for the response.
For context, I have a bright 12yo who has gone through some probability classes, and I am a lead in product management at a large software company. I engage my son in tech problems to get him thinking. I was showing him the great ‘identify a number’ tutorial on Brain.js and we started talking about the Premier League. I told him a potential application would be to try to predict Premier League scores and outcomes.
We were able to find a series of data that looks like:
{home_team_name, away_team_name, home_goals, away_goals, home_corners, home_shots, home_shots_on_goal, away_corners, away_shots, away_shots_on_goal}
I broke this up into a list of inputs of the structure:
There are 20 teams so team names will repeat in both columns if the data contain repeated weeks of play. The first item in each object is always the home team, but otherwise, there’s no encoding of that information.
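One way to make the home/away role explicit is to give each team two input slots, one for playing at home and one for playing away. The sketch below is only an illustration: the team names are placeholders (the real data has 20 teams) and `encodeMatch` is a hypothetical helper.

```javascript
// Assumption: a fixed, known team list. Three placeholder names keep the
// example short; the real data set would list all 20 teams.
const teams = ["TeamA", "TeamB", "TeamC"];

// Hypothetical encoder: the first block of slots marks the home team,
// the second block marks the away team.
function encodeMatch(home, away) {
  const homeIndex = teams.indexOf(home);
  const awayIndex = teams.indexOf(away);
  if (homeIndex < 0 || awayIndex < 0) {
    throw new Error("unknown team name");
  }
  const vector = new Array(teams.length * 2).fill(0);
  vector[homeIndex] = 1;
  vector[teams.length + awayIndex] = 1;
  return vector;
}

console.log(encodeMatch("TeamA", "TeamB")); // [ 1, 0, 0, 0, 1, 0 ]
console.log(encodeMatch("TeamB", "TeamA")); // [ 0, 1, 0, 1, 0, 0 ]
```

Swapping the two teams now yields a different input vector, so a network trained on such vectors at least has the chance to learn a home advantage.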
I tried training a simple NeuralNetwork with the output looking like:
“#” indicates a normalized value between 0 and 1.
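The normalization itself is not shown in this comment. As one common option, min-max scaling maps a numeric column such as goals into the 0 to 1 range. The `minMaxScale` helper and the sample values are assumptions made for this sketch.

```javascript
// Hypothetical helper: min-max scaling of one numeric column into [0, 1].
function minMaxScale(values) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  if (max === min) {
    return values.map(() => 0); // constant column: there is no spread to scale
  }
  return values.map((v) => (v - min) / (max - min));
}

const homeGoals = [0, 1, 2, 4]; // made-up sample values, not real match data
console.log(minMaxScale(homeGoals)); // [ 0, 0.25, 0.5, 1 ]
```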
The results looked promising, but then I noticed that if I flipped the order of inputs when I ran the trained network, the logged output would not change, yet there is clearly a home field advantage. (Similarly, there’s a first mover advantage for equal opponents in chess, checkers, etc.)
This makes sense: I’m guessing the input ends up looking like this under the hood, with no explicit information in the order of the input values in the matrix:
I also thought that this might be an LSTM problem, but I couldn’t quite figure out how or why, or whether I would need to structure the data any differently. That’s why I tried building a more complex input object, but that wasn’t successful.
Thanks!
PS: I’m not trying to beat the bookies here, just trying to show a 12yo that we can arrive at plausible results.