Getting started - basic network won't classify two distinct patterns

Hello, I’m trying to get to a working starting point and build up from there, so I set up a basic net and a simple problem: two test patterns, each filled with a different constant value. However, after 1000 iterations of training on both patterns, forwarding them back through the net gives exactly the same classification probabilities for both (increasing the number of iterations doesn’t help). Initially I had a single class, then changed it to two, thinking that might help, but it didn’t. I’m sure I must be doing something fundamentally wrong. If you would be willing to take a look at my code (below), that would be much appreciated. Thanks!



using System;
using ConvNetSharp.Core;
using ConvNetSharp.Core.Layers.Double;
using ConvNetSharp.Core.Training.Double;
using ConvNetSharp.Volume;
using ConvNetSharp.Volume.Double;

namespace ClusterNN
{
    internal class Program
    {
        private static void Main()
        {
            var net = new Net<double>();

            net.AddLayer(new InputLayer(7, 9, 6));

            // Convolution layer #1 2x2
            net.AddLayer(new ConvLayer(2, 2, 6) { Stride = 1 });
            // Result after applying filter is (6 x 8) x 6_filters = 288 elements

            // declare a ReLU (rectified linear unit non-linearity)
            net.AddLayer(new ReluLayer());

            // Convolution layer #2 2x2
            net.AddLayer(new ConvLayer(2, 2, 6) { Stride = 1 });
            // Result after applying filter is (5 x 7) x 6_filters = 210 elements

            // declare a ReLU (rectified linear unit non-linearity)
            net.AddLayer(new ReluLayer());

            // Fully connected layer #1
            net.AddLayer(new FullyConnLayer(100));
            // Fully connected layer #2
            net.AddLayer(new FullyConnLayer(2));

            // declare the linear classifier on top of the previous hidden layer
            net.AddLayer(new SoftmaxLayer(2));

            // Construct a trainer for our network
            var trainer = new SgdTrainer(net) { LearningRate = 0.001, Momentum = 0.0, BatchSize = 1 };

            var evenInput = BuilderInstance.Volume.SameAs(new Shape(7, 9, 6));
            var oddInput = BuilderInstance.Volume.SameAs(new Shape(7, 9, 6));

            MakeEvenOddVolumes(evenInput, oddInput);

            // One-hot label for the "even" pattern: class 1
            var label0_2 = BuilderInstance.Volume.SameAs(new Shape(1, 1, 2));
            label0_2.Set(0, 0, 0, 0.0);
            label0_2.Set(0, 0, 1, 1.0);

            // One-hot label for the "odd" pattern: class 0
            var label1_2 = BuilderInstance.Volume.SameAs(new Shape(1, 1, 2));
            label1_2.Set(0, 0, 0, 1.0);
            label1_2.Set(0, 0, 1, 0.0);

            for (int i = 0; i < 1000; i++)
            {
                Console.WriteLine("iteration: " + i);

                trainer.Train(evenInput, label0_2);
                trainer.Train(oddInput, label1_2);
            }

            // forward a data point through the network
            var vscore = net.Forward(evenInput);
            Console.WriteLine("probability that x is even: " + vscore.Get(0, 0, 0) + " " + vscore.Get(0, 0, 1));

            var vscore2 = net.Forward(oddInput);
            Console.WriteLine("probability that x is odd: " + vscore2.Get(0, 0, 0) + " " + vscore2.Get(0, 0, 1));
        }

        // Fills each volume with its own constant value
        private static void MakeEvenOddVolumes(Volume<double> evenVolume, Volume<double> oddVolume)
        {
            int counter = 0;
            for (int i = 0; i < 7; i++)
            {
                for (int j = 0; j < 9; j++)
                {
                    for (int k = 0; k < 6; k++)
                    {
                        evenVolume.Set(i, j, k, 100.0);
                        oddVolume.Set(i, j, k, 11.0);
                    }
                }
            }
        }
    }
}

gschmidt958 commented, Apr 30, 2020

Thank you so much for your awesome support. I think I’ve reached a point where I can make real progress, and I’m really looking forward to that!

cbovar commented, Apr 30, 2020
  1. It should work with Relu / LeakyRelu as well. Initially I noticed that the gradients for the even sample were much bigger than for the odd sample, so I thought of using a tanh or sigmoid layer to prevent big values flowing into the network.

  2. If you want to produce a single output value, then you should use a RegressionLayer as the last layer rather than SoftmaxLayer.
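
The code that originally accompanied point 2 is not preserved here. What follows is only a minimal sketch of how a single-output net might be wired up, not the snippet from the reply. It assumes that `RegressionLayer` in `ConvNetSharp.Core.Layers.Double` has a parameterless constructor, that `BuilderInstance.Volume.From(double[], Shape)` is available, and that the net is declared as `Net<double>`. The class name `RegressionSketch`, the shortened layer stack, and the target value `1.0` are hypothetical choices made for illustration.

```csharp
using System;
using ConvNetSharp.Core;
using ConvNetSharp.Core.Layers.Double;
using ConvNetSharp.Core.Training.Double;
using ConvNetSharp.Volume;
using ConvNetSharp.Volume.Double;

// Hypothetical sketch: same input shape as the question, but a single regression output.
internal static class RegressionSketch
{
    private static void Main()
    {
        var net = new Net<double>();
        net.AddLayer(new InputLayer(7, 9, 6));
        net.AddLayer(new ConvLayer(2, 2, 6) { Stride = 1 });
        net.AddLayer(new ReluLayer());

        // One neuron in the final fully connected layer gives one output value.
        net.AddLayer(new FullyConnLayer(1));

        // Assumption: parameterless RegressionLayer replaces SoftmaxLayer(2) as the last layer.
        net.AddLayer(new RegressionLayer());

        var trainer = new SgdTrainer(net) { LearningRate = 0.001, Momentum = 0.0, BatchSize = 1 };

        // Input volume left at its default contents; fill it as in MakeEvenOddVolumes.
        var input = BuilderInstance.Volume.SameAs(new Shape(7, 9, 6));

        // The target is a single number, not a one-hot vector (1.0 is an arbitrary example).
        var target = BuilderInstance.Volume.From(new[] { 1.0 }, new Shape(1, 1, 1, 1));

        trainer.Train(input, target);

        var output = net.Forward(input);
        Console.WriteLine("predicted value: " + output.Get(0, 0, 0));
    }
}
```

With this layout the label volume holds the desired output value itself, so the two patterns from the question could be given targets such as 0.0 and 1.0 instead of two one-hot vectors.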


