MLPClassifier drops accuracy when number of features is equal to number of nodes in the hidden layer
See original GitHub issue

Odd behaviour when the number of features is equal to the number of nodes in a single hidden layer. This may be a mathematical property of NNs or a bug; I could not confirm online, please advise.
from sklearn.datasets import load_breast_cancer
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_validate, KFold

# Load the breast-cancer data once (569 samples, 30 features).
data = load_breast_cancer()
X, y = data.data, data.target

def create_CV_sets(X, y, n_splits=5):
    skf = KFold(n_splits, shuffle=True, random_state=0)
    X_trains, y_trains, X_tests, y_tests = [], [], [], []
    for train_index, test_index in skf.split(X, y):
        X_trains.append(X[train_index, :])
        X_tests.append(X[test_index, :])
        y_trains.append(y[train_index])
        y_tests.append(y[test_index])
    return X_trains, y_trains, X_tests, y_tests, skf

X_trains, y_trains, X_vals, y_vals, cv = create_CV_sets(X=X, y=y, n_splits=5)

hidden_layer_sizes = [5, 10, 20, 28, 29, 30, 31, 32, 40, 50, 80, 100]
score_test_scores = {}
score_fit_times = {}
mlp_hls = []
print("Full set")
for hls in hidden_layer_sizes:
    mlp_hls.append(MLPClassifier(random_state=0, activation="identity",
                                 hidden_layer_sizes=(hls,),
                                 learning_rate="adaptive"))
    k = len(mlp_hls) - 1
    scores = cross_validate(mlp_hls[k], X, y, cv=cv,
                            return_train_score=True, scoring="accuracy")
    score_test_scores[k] = scores["test_score"].mean()
    score_fit_times[k] = scores["fit_time"].mean()
    print("Hidden Layer Sizes", hls,
          "\t Score: {:.3f}".format(score_test_scores[k]),
          "\tFit Time: {:.3f}".format(score_fit_times[k]))
Issue Analytics
- State: closed
- Created 2 years ago
- Comments: 7 (4 by maintainers)
Top GitHub Comments
We did some debugging with @jbsilva in the #DataUmbrella sprint, and found that this behaviour seems to occur just by chance with that particular random seed (the random_state parameter). After trying some other seed values, we saw that the fall in accuracy does not appear to be tied to the hidden layer size being equal to the number of features (there are always some accuracy dips, but at random sizes). Trying other activation functions and datasets also yields results where the relation no longer holds.

Closing because https://github.com/scikit-learn/scikit-learn/issues/20083#issuecomment-869041031 shows that the performance was related to the random seed.