
Getting started - basic network won't classify two distinct patterns

See original GitHub issue

Hello, I’m trying to get to a working starting point and build up from there, so I set up a basic net. I then created a simple problem: two test cases, each initialized with a different constant value. However, after 1000 training iterations on both patterns, forwarding the patterns back through the net yields exactly the same classification probabilities for both (increasing the number of iterations doesn’t help). Initially I had a single class, but then changed it to two, thinking that might somehow help; of course it didn’t. I’m sure I must be doing something very basic wrong. If you would be willing to take a look at my code (below), that would be much appreciated. Thanks!



using System;
using ConvNetSharp.Core;
using ConvNetSharp.Core.Layers.Double;
using ConvNetSharp.Core.Training.Double;
using ConvNetSharp.Volume;
using ConvNetSharp.Volume.Double;

namespace ClusterNN
{
    internal class Program
    {
        private static void Main()
        {
            var net = new Net();

            net.AddLayer(new InputLayer(7, 9, 6));

            // Convolution layer #1: 2x2 kernel, 6 filters
            net.AddLayer(new ConvLayer(2, 2, 6) { Stride = 1 });
            // Result after applying the filter is (6 x 8) x 6 filters = 288 elements

            // ReLU (rectified linear unit) non-linearity
            net.AddLayer(new ReluLayer());

            // Convolution layer #2: 2x2 kernel, 6 filters
            net.AddLayer(new ConvLayer(2, 2, 6) { Stride = 1 });
            // Result after applying the filter is (5 x 7) x 6 filters = 210 elements

            // ReLU non-linearity
            net.AddLayer(new ReluLayer());

            // Fully connected layer #1
            net.AddLayer(new FullyConnLayer(100));
            // Fully connected layer #2
            net.AddLayer(new FullyConnLayer(2));

            // Linear classifier on top of the previous hidden layer
            net.AddLayer(new SoftmaxLayer(2));

            // Construct a trainer for the network
            var trainer = new SgdTrainer(net) { LearningRate = 0.001, Momentum = 0.0, BatchSize = 1 };

            var evenInput = BuilderInstance.Volume.SameAs(new Shape(7, 9, 6));
            var oddInput = BuilderInstance.Volume.SameAs(new Shape(7, 9, 6));

            MakeEvenOddVolumes(evenInput, oddInput);

            // One-hot label for the "even" pattern
            var label0_2 = BuilderInstance.Volume.SameAs(new Shape(1, 1, 2));
            label0_2.Set(0, 0, 0, 0.0);
            label0_2.Set(0, 0, 1, 1.0);

            // One-hot label for the "odd" pattern
            var label1_2 = BuilderInstance.Volume.SameAs(new Shape(1, 1, 2));
            label1_2.Set(0, 0, 0, 1.0);
            label1_2.Set(0, 0, 1, 0.0);

            for (int i = 0; i < 1000; i++)
            {
                Console.WriteLine("iteration: " + i);

                trainer.Train(evenInput, label0_2);
                trainer.Train(oddInput, label1_2);
            }

            // Forward each data point through the trained network
            var vscore = net.Forward(evenInput);
            Console.WriteLine("probability that x is even: " + vscore.Get(0, 0, 0) + " " + vscore.Get(0, 0, 1));

            var vscore2 = net.Forward(oddInput);
            Console.WriteLine("probability that x is odd: " + vscore2.Get(0, 0, 0) + " " + vscore2.Get(0, 0, 1));
        }

        private static void MakeEvenOddVolumes(Volume<double> evenVolume, Volume<double> oddVolume)
        {
            // Fill each volume with a different constant value
            for (int i = 0; i < 7; i++)
                for (int j = 0; j < 9; j++)
                    for (int k = 0; k < 6; k++)
                    {
                        evenVolume.Set(i, j, k, 100.0);
                        oddVolume.Set(i, j, k, 11.0);
                    }
        }
    }
}
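One thing worth noting about the code above: the two inputs are filled with large raw constants (100.0 and 11.0), and the maintainer's reply below mentions that the gradients of the even sample were much bigger than those of the odd sample. A common mitigation is to scale inputs into a small range before training. This is a hypothetical variant of `MakeEvenOddVolumes`, not part of the original code, and the normalization constant is an illustrative choice:

```csharp
// Hypothetical variant: same two constant patterns, but scaled into [0, 1]
// so early-layer activations and gradients stay moderate.
private static void MakeEvenOddVolumesScaled(Volume<double> evenVolume, Volume<double> oddVolume)
{
    const double scale = 100.0; // illustrative normalization constant

    for (int i = 0; i < 7; i++)
        for (int j = 0; j < 9; j++)
            for (int k = 0; k < 6; k++)
            {
                evenVolume.Set(i, j, k, 100.0 / scale); // 1.0
                oddVolume.Set(i, j, k, 11.0 / scale);   // 0.11
            }
}
```

The same idea applies to any real data fed into the net: keeping inputs roughly in [0, 1] (or zero-mean, unit-variance) tends to make SGD with a small learning rate much better behaved.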
Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

gschmidt958 commented, Apr 30, 2020

Thank you so much for your awesome support. I think I’ve reached a point where I can make real progress now, and I’m really looking forward to that!

cbovar commented, Apr 30, 2020

  1. It should work with Relu / LeakyRelu as well. Initially I noticed that the gradients of the even sample were much bigger than those of the odd sample, so I thought of using a tanh or sigmoid layer to prevent large values flowing into the network.

  2. If you want to produce a single output value, you should use a RegressionLayer as the last layer rather than a SoftmaxLayer:

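The code snippet that originally followed this comment did not survive extraction. A minimal sketch of what such a single-output head could look like, assuming ConvNetSharp's `RegressionLayer` (the layer sizes and target values here are illustrative, not from the original reply); it would replace the final `FullyConnLayer(2)` + `SoftmaxLayer(2)` in the program above:

```csharp
// Single-value output head: one fully connected unit followed by a
// regression (L2) loss instead of a two-class softmax.
net.AddLayer(new FullyConnLayer(1));
net.AddLayer(new RegressionLayer());

// Targets then become 1x1x1 volumes rather than one-hot 1x1x2 labels,
// e.g. 0.0 for the "even" pattern and 1.0 for the "odd" pattern:
var target = BuilderInstance.Volume.SameAs(new Shape(1, 1, 1));
target.Set(0, 0, 0, 1.0);
trainer.Train(oddInput, target);
```

With a regression head, `net.Forward(input).Get(0, 0, 0)` returns the raw predicted value rather than a probability, so thresholding (e.g. at 0.5) is up to the caller.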


