How to encode input order in a FF Neural Network

This is a question/request for help, not a bug.

What is wrong?

If I build and train a simple NeuralNetwork on input data that looks like this:

{a: 1, b: 1},
{a: 1, b: 1},
{b: 1, a: 1},
{a: 1, b: 1},
{b: 1, a: 1},
...

and output that looks like this:

{a:1}
{a:1}
{b:1}
{b:1}
{b:1}
...

It appears that the network doesn’t have a way of learning that order of the inputs is important.
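For anyone hitting the same thing, here is a minimal sketch of why key order is invisible. It assumes object inputs end up mapped to one fixed array position per key; that is a guess at the general idea, not a description of Brain.js internals. `toVector` and `keys` are hypothetical names made up for this illustration.

```javascript
// Hypothetical helper: map an object to an array using one fixed slot per key.
// This mimics a key-to-index lookup in general; it is not Brain.js code.
function toVector(obj, keys) {
  return keys.map((key) => (key in obj ? obj[key] : 0));
}

const keys = ['a', 'b'];
const aFirst = toVector({ a: 1, b: 1 }, keys);
const bFirst = toVector({ b: 1, a: 1 }, keys);

// Both games collapse to the same array, [1, 1],
// so the order the keys were written in never reaches the network.
console.log(aFirst, bFirst);
```

Under that assumption, any information about who moved first has to live in the key names or in extra inputs, not in the order of the keys.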

Is there any way to encode this? Essentially, this is a very simple game outcome prediction test in which the first player has an advantage. Even if b beats a on average, a wins slightly more often as the first mover than it does overall (that is, a loses at a higher rate as the second mover than as the first).

Note that I’ve tried to encode the inputs as complex objects such as:

{ home:{b:1}, away:{a:1} }

But I get NaN probabilities for all the output values.
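One possible workaround, sketched under the assumption that the NaN values come from the network receiving nested objects where it expects plain numbers (not verified against the library): flatten the nested object so each role/team pair becomes its own key. `flattenRoles` is a hypothetical helper, not a Brain.js function.

```javascript
// Hypothetical helper: turn { home: { b: 1 }, away: { a: 1 } }
// into { home_b: 1, away_a: 1 } so that every value is a plain number.
function flattenRoles(game) {
  const flat = {};
  for (const [role, teams] of Object.entries(game)) {
    for (const [team, value] of Object.entries(teams)) {
      flat[`${role}_${team}`] = value;
    }
  }
  return flat;
}

const bAtHome = flattenRoles({ home: { b: 1 }, away: { a: 1 } });
const aAtHome = flattenRoles({ home: { a: 1 }, away: { b: 1 } });

console.log(bAtHome); // { home_b: 1, away_a: 1 }
console.log(aAtHome); // { home_a: 1, away_b: 1 }
```

With keys like these, "b at home" and "b away" are different inputs, so the two orderings of the same pair no longer look identical.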

Where does it happen?

Just trying simple examples in the browser.

How important is this (1-5)?

Other Comments

I’m a bit of a newbie, so it could be I’m just using the wrong network type.

Thanks!

Issue Analytics

Created 4 years ago
Comments: 6 (4 by maintainers)

Top GitHub Comments

cjandrews commented, May 1, 2020

Hi @mubaidr,

I believe I tried this, but if you consider that you could always feed in the home team first, then you get a third parameter that is the same value for every input. This means that there’s no effect of this third parameter. It’s been a couple of months since I looked at this and I’ve been working on other tech, so I’d have to refresh my memory a bit.

The reason I was originally trying a higher order object was because I thought I could then couple the home or away label to a specific team for that instance of a ‘game’ input.

One other possibility is to just have the inputs be the individual team’s performance per game with the home or away flag set. This, however, separates that team’s performance from the specific opponents and then you just end up predicting the team’s general performance if it’s home or away.
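To make that last option concrete, here is a rough sketch of splitting one game into two per-team rows with a home flag. The field names on `match` follow the data set described in this thread; `splitMatch`, `isHome`, and the sample score are made up for illustration.

```javascript
// Hypothetical helper: one game becomes two rows, one per team.
// isHome is 1 for the home side and 0 for the away side.
function splitMatch(match) {
  return [
    { team: match.home_team_name, isHome: 1, goals: match.home_goals },
    { team: match.away_team_name, isHome: 0, goals: match.away_goals },
  ];
}

// Illustrative values only, not real results.
const rows = splitMatch({
  home_team_name: 'a',
  away_team_name: 'b',
  home_goals: 2,
  away_goals: 1,
});

console.log(rows);
```

Note that neither row records who the opponent was, which is the drawback described above: the model can only learn each team's general home and away form.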

Thanks for the thought, though!

cjandrews commented, Nov 8, 2019

Thanks very much for the response.

For context, I have a bright 12yo who has gone through some probability classes, and I am a lead in product management at a large software company. I engage my son in tech problems to get him thinking. I was showing him the great ‘identify a number’ tutorial on Brain.js when we started talking about the Premier League, and I told him a potential application would be to try to predict Premier League scores and outcomes.

We were able to find a series of data that looks like:

{home_team_name, away_team_name, home_goals, away_goals, home_corners, home_shots, home_shots_on_goal, away_corners, away_shots, away_shots_on_goal}

I broke this up into a list of inputs of the structure:

[ 
   {home_team_1:1, away_team_1:1},
   {home_team_2:1, away_team_2:1},
   {home_team_3:1, away_team_3:1},
   ...
]

There are 20 teams so team names will repeat in both columns if the data contain repeated weeks of play. The first item in each object is always the home team, but otherwise, there’s no encoding of that information.

I tried training a simple NeuralNetwork with the output looking like:

[
    {home_goals_1:#, away_goals_1:#, home_corners_1:#, home_shots_1:#, home_shots_on_goal_1:#, away_corners_1:#, away_shots_1:#, away_shots_on_goal_1:#},
    {home_goals_1:#, away_goals_1:#, home_corners_1:#, home_shots_1:#, home_shots_on_goal_1:#, away_corners_1:#, away_shots_1:#, away_shots_on_goal_1:#},
    {home_goals_1:#, away_goals_1:#, home_corners_1:#, home_shots_1:#, home_shots_on_goal_1:#, away_corners_1:#, away_shots_1:#, away_shots_on_goal_1:#},
    ...
]

“#” indicates a normalized value between 0…1.
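The thread does not say how the values were normalized, so the following is only one plausible sketch: divide each raw count by an assumed maximum and clamp to the 0 to 1 range. `normalize`, `denormalize`, and the maximum values are hypothetical.

```javascript
// Hypothetical helper: scale a raw count into [0, 1] using an assumed maximum.
function normalize(value, max) {
  if (max <= 0) return 0;
  return Math.min(Math.max(value / max, 0), 1);
}

// Inverse mapping, to turn a network output back into a raw count.
function denormalize(value, max) {
  return value * max;
}

// Assumed cap of 10 goals per team; anything above is clamped to 1.
console.log(normalize(3, 10));  // 0.3
console.log(normalize(12, 10)); // 1
console.log(denormalize(normalize(4, 8), 8)); // 4
```

The same maximum has to be used for training and for reading predictions back, otherwise the denormalized numbers will not line up.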

The results looked promising, but then I noticed that if I flipped the order of inputs when I ran the trained network, the logged output would not change, yet there is clearly a home field advantage. (Similarly, there’s a first mover advantage for equal opponents in chess, checkers, etc.)

This makes sense: I’m guessing the input really ends up looking like this under the hood, with no explicit information carried by the order of the input values in the matrix:

[
 [1,1,0,0,0,...], // position zero could be home or away
 [1,1,0,0,0,...], // position zero could be home or away
 [0,0,1,1,0,...],
 [0,0,0,1,1,...],
 ...
]
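As a contrast to the matrix above, here is a sketch of an encoding where position does carry the home/away role: the first half of the array marks the home team and the second half marks the away team. `encodeMatch` and the three-team list are hypothetical and only illustrate the idea; with 20 teams the array would have 40 slots.

```javascript
// Hypothetical encoder: slots [0, n) flag the home team,
// slots [n, 2n) flag the away team, where n is the number of teams.
function encodeMatch(home, away, teams) {
  const homeIndex = teams.indexOf(home);
  const awayIndex = teams.indexOf(away);
  if (homeIndex < 0 || awayIndex < 0) {
    throw new Error('unknown team');
  }
  const vector = new Array(teams.length * 2).fill(0);
  vector[homeIndex] = 1;
  vector[teams.length + awayIndex] = 1;
  return vector;
}

const teams = ['a', 'b', 'c'];
console.log(encodeMatch('a', 'b', teams)); // [1, 0, 0, 0, 1, 0]
console.log(encodeMatch('b', 'a', teams)); // [0, 1, 0, 1, 0, 0]
```

Swapping home and away now changes the array, so a trained network at least has the chance to learn a home advantage from the data.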

I also thought that this might be an LSTM problem, but I couldn’t quite figure out how or why, or whether I would need to structure the data any differently. That’s why I tried building a more complex input object, but that wasn’t successful.

Thanks!

PS: I’m not trying to beat the bookies here, just trying to show a 12yo that we can arrive at plausible results.