Getting started - basic network won't classify two distinct patterns


Hello, I’m trying to get to a working starting point and build up from there, so I set up a basic net. I then created a simple problem: two test cases, each initialized with a different constant value. However, after 1000 iterations of training on both patterns, when I forward the patterns back through the net I get exactly the same classification probabilities for both (increasing the number of iterations doesn’t help). Initially I had a single class, but then changed it to two, thinking that might help; it didn’t. I’m sure I must be doing something fundamentally wrong. If you would be willing to take a look at my code (below), that would be much appreciated. Thanks!

Output: [screenshot not included]

using System;
using ConvNetSharp.Core;
using ConvNetSharp.Core.Layers.Double;
using ConvNetSharp.Core.Training.Double;
using ConvNetSharp.Volume;
using ConvNetSharp.Volume.Double;

namespace ClusterNN
{
internal class Program
{
    private static void Main()
    {
        var net = new Net<double>();

        net.AddLayer(new InputLayer(7, 9, 6));

        // Convolution layer #1 2x2
        net.AddLayer(new ConvLayer(2, 2, 6) { Stride = 1 });
        // Result after applying filter is (6 x 8) x 6_filters = 288 elements

        // declare a ReLU (rectified linear unit non-linearity)
        net.AddLayer(new ReluLayer());

        // Convolution layer #2 2x2
        net.AddLayer(new ConvLayer(2, 2, 6) { Stride = 1 });
        // Result after applying filter is (5 x 7) x 6_filters = 210 elements

        // declare a ReLU (rectified linear unit non-linearity)
        net.AddLayer(new ReluLayer());

        // Fully connected layer #1.
        net.AddLayer(new FullyConnLayer(100));
        
        // Fully connected layer #2
        net.AddLayer(new FullyConnLayer(2));

        // declare the linear classifier on top of the previous hidden layer
        net.AddLayer(new SoftmaxLayer(2));

        // Construct a trainer for our network
        var trainer = new SgdTrainer(net) { LearningRate = 0.001, Momentum = 0.0, BatchSize = 1 };

        var evenInput = BuilderInstance.Volume.SameAs(new Shape(7, 9, 6));
        var oddInput = BuilderInstance.Volume.SameAs(new Shape(7, 9, 6));

        MakeEvenOddVolumes(evenInput, oddInput);

        var label0_2 = BuilderInstance.Volume.SameAs(new Shape(1, 1, 2));
        label0_2.Set(0, 0, 0, 0.0);
        label0_2.Set(0, 0, 1, 1.0);

        var label1_2 = BuilderInstance.Volume.SameAs(new Shape(1, 1, 2));
        label1_2.Set(0, 0, 0, 1.0);
        label1_2.Set(0, 0, 1, 0.0);

        for (int i = 0; i < 1000; i++)
        {
            Console.WriteLine("iteration: " + i);

            trainer.Train(evenInput, label0_2);
            trainer.Train(oddInput, label1_2);
        }

        // forward a data point through the network
        var vscore = net.Forward(evenInput);
        Console.WriteLine("probability that x is even: " + vscore.Get(0, 0, 0) + " " + vscore.Get(0, 0, 1));

        var vscore2 = net.Forward(oddInput);
        Console.WriteLine("probability that x is odd: " + vscore2.Get(0, 0, 0) + " " +  vscore2.Get(0, 0, 1));

    }

    private static void MakeEvenOddVolumes(Volume<double> evenVolume, Volume<double> oddVolume)
    {
        int counter = 0;
        for (int i = 0; i < 7; i++)
        {
            for (int j = 0; j < 9; j++)
            {
                for (int k = 0; k < 6; k++)
                {
                    evenVolume.Set(i, j, k, 100.0);
                    oddVolume.Set(i, j, k, 11);
                    counter++;
                }
            }
        }
    }
}
}
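
The element counts in the comments of the code above can be checked with the usual no-padding convolution size formula, output = (input - filter) / stride + 1. What follows is a minimal sketch in plain C#, assuming the two ConvLayer calls apply no padding (an assumption; the post never sets a padding value). `OutputSize` is a hypothetical helper written for illustration, not a ConvNetSharp API.

```csharp
using System;

internal static class ConvSizeSketch
{
    // Output length along one axis for a convolution with no padding.
    // Hypothetical helper for illustration; not part of ConvNetSharp.
    private static int OutputSize(int input, int filter, int stride)
    {
        return (input - filter) / stride + 1;
    }

    private static void Main()
    {
        // Input shape from the post: 7 x 9 with 6 channels.
        int width = 7;
        int height = 9;
        const int filters = 6; // each ConvLayer in the post uses 6 filters

        // Two 2x2 convolutions with stride 1, as in the post.
        for (int layer = 1; layer <= 2; layer++)
        {
            width = OutputSize(width, 2, 1);
            height = OutputSize(height, 2, 1);
            int elements = width * height * filters;
            Console.WriteLine($"conv #{layer}: {width} x {height} x {filters} = {elements} elements");
        }
    }
}
```

Under that assumption the sketch reports 6 x 8 x 6 = 288 elements after the first convolution and 5 x 7 x 6 = 210 after the second, which agrees with the comments in the post.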

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

gschmidt958 commented, Apr 30, 2020

Thank you so much for your awesome support. I think I’ve reached a point where I can make real progress, and I’m really looking forward to that!

cbovar commented, Apr 30, 2020
  1. It should work with Relu / LeakyRelu as well. Initially I noticed that the gradients of the even sample were much bigger than those of the odd sample, so I thought of using a tanh or sigmoid layer to prevent big values from flowing into the network.

  2. If you want to produce a single output value, then you should use a RegressionLayer as the last layer rather than a SoftmaxLayer.
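
As background for point 2: a softmax output always sums to 1, so a softmax over a single class returns 1 for every input and cannot tell two patterns apart. That is why a single-valued output needs a different final layer. What follows is a minimal sketch in plain C#, independent of ConvNetSharp. `Softmax` is a hypothetical helper written for illustration and is not the library's implementation.

```csharp
using System;
using System.Linq;

internal static class SoftmaxSketch
{
    // Numerically stable softmax: subtract the largest logit before exponentiating.
    // Hypothetical helper for illustration; not ConvNetSharp's SoftmaxLayer.
    private static double[] Softmax(double[] logits)
    {
        double max = logits.Max();
        double[] exps = logits.Select(z => Math.Exp(z - max)).ToArray();
        double sum = exps.Sum();
        return exps.Select(e => e / sum).ToArray();
    }

    private static void Main()
    {
        // One class: the single output is 1 whatever the logit is,
        // because the value is divided by itself.
        foreach (double logit in new[] { -5.0, 0.0, 42.0 })
        {
            double[] single = Softmax(new[] { logit });
            Console.WriteLine($"one class, logit {logit}: {single[0]}");
        }

        // Two classes: the outputs depend on the logits but still sum to 1.
        double[] pair = Softmax(new[] { 2.0, 0.0 });
        Console.WriteLine($"two classes: {pair[0]} + {pair[1]} = {pair[0] + pair[1]}");
    }
}
```

Each of the three one-class lines prints 1, which is consistent with the original single-class setup giving identical results for both patterns.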
