Use of sparse matrices
Hi, nice to see someone continue development on a wrapper like this after the people at tensorflow decided to discontinue development on their wrapper.
I have run into an issue with the use of sparse matrices.
In the API documentation it is mentioned that the fit and predict functions from the KerasClassifier wrapper should work with array-like, sparse matrix and dataframe. However, when I use a sparse matrix, I get the following exception:
TypeError: A sparse matrix was passed, but dense data is required. Use X.toarray() to convert to a dense numpy array.
I used the quickstart guide to get a simple reproducible example. I simply converted the ndarrays in the example into scipy.sparse coo_matrix objects:
import numpy as np
from sklearn.datasets import make_classification
from tensorflow import keras
from scipy.sparse import coo_matrix
from scikeras.wrappers import KerasClassifier
X, y = make_classification(1000, 20, n_informative=10, random_state=0)
X = X.astype(np.float32)
y = y.astype(np.int64)
X = coo_matrix(X)
y = coo_matrix(y)
def get_model(hidden_layer_dim, meta):
    # note that meta is a special argument that will be
    # handed a dict containing input metadata
    n_features_in_ = meta["n_features_in_"]
    X_shape_ = meta["X_shape_"]
    n_classes_ = meta["n_classes_"]
    model = keras.models.Sequential()
    model.add(keras.layers.Dense(n_features_in_, input_shape=X_shape_[1:]))
    model.add(keras.layers.Activation("relu"))
    model.add(keras.layers.Dense(hidden_layer_dim))
    model.add(keras.layers.Activation("relu"))
    model.add(keras.layers.Dense(n_classes_))
    model.add(keras.layers.Activation("softmax"))
    return model
clf = KerasClassifier(
    get_model,
    loss="sparse_categorical_crossentropy",
    hidden_layer_dim=100,
)
clf.fit(X, y)
y_proba = clf.predict_proba(X)
A potential reason for the issue is that the inputs are validated via sklearn.utils.check_X_y, whose accept_sparse parameter defaults to False.
Setting this parameter to True might solve the issue (I will go and test that soon). I am running this on python=3.7.10, scikit-learn=0.24.2 and tensorflow=2.5.0.
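To illustrate the suspected cause in isolation, here is a minimal sketch that calls check_X_y directly, outside of scikeras. The small random arrays are made up for the illustration, and y is kept dense here (unlike in the reproduction above). Whether scikeras passes its inputs through exactly this code path is an assumption based on the error message.

```python
import numpy as np
from scipy.sparse import coo_matrix, issparse
from sklearn.utils.validation import check_X_y

# Hypothetical toy data, only for demonstrating the validation behaviour.
rng = np.random.default_rng(0)
X_dense = rng.random((20, 5)).astype(np.float32)
y = rng.integers(0, 2, size=20)
X_sparse = coo_matrix(X_dense)

# With the default accept_sparse=False, sparse input is rejected.
try:
    check_X_y(X_sparse, y)
    rejected = False
except TypeError as exc:
    rejected = True
    print("Rejected:", exc)

# With accept_sparse=True, the sparse matrix passes validation
# and is returned without being converted to a dense array.
X_checked, y_checked = check_X_y(X_sparse, y, accept_sparse=True)
print("Still sparse:", issparse(X_checked), "shape:", X_checked.shape)
```

If this is indeed the cause, the fix would need to be made where the wrapper calls the validation function; passing the flag from user code is not possible.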
Issue Analytics
- Created 2 years ago
- Reactions:1
- Comments:11 (5 by maintainers)
Awesome that’s about as real world useful as it gets. I think I’ll move forward with #240 tomorrow
The main way I know is that casting .todense() made my container crash, while passing the Sparse matrix didn’t.
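For a rough sense of why densifying can exhaust a container's memory, the sketch below compares the storage of a sparse matrix with the size its dense equivalent would need. The shape and density are hypothetical; the actual data from this comment is not known.

```python
import numpy as np
from scipy import sparse

# Hypothetical dimensions: 10,000 samples, 1,000 features, 0.1% non-zero.
X_sparse = sparse.random(
    10_000, 1_000, density=0.001, format="csr",
    dtype=np.float32, random_state=0,
)

# CSR stores only the non-zero values plus two index arrays.
sparse_bytes = (
    X_sparse.data.nbytes + X_sparse.indices.nbytes + X_sparse.indptr.nbytes
)

# A dense array stores every entry, zero or not. This is computed from the
# shape instead of calling .todense(), so nothing large is allocated here.
n_rows, n_cols = X_sparse.shape
dense_bytes = n_rows * n_cols * X_sparse.dtype.itemsize

print(f"sparse: {sparse_bytes / 1e6:.2f} MB, dense: {dense_bytes / 1e6:.2f} MB")
```

The gap grows with the number of features and shrinks as the density increases, so the benefit depends heavily on the data.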