TabNet for MultiLabel classification
Describe the bug
Hi, I am trying to use this implementation of TabNet to build a model with 20k features and 88 target classes, all one-hot encoded.
What is the current behavior?
When I pass this data to model.fit, I get the following error:
TypeError Traceback (most recent call last)
<ipython-input-219-976344abca0e> in <module>
----> 1 clf.fit(X_train[x_cols].values, y_train.values, X_valid[x_cols], y_valid.values)
~/anaconda3/envs/tabnet/lib/python3.6/site-packages/pytorch_tabnet/tab_model.py in fit(self, X_train, y_train, X_valid, y_valid, loss_fn, weights, max_epochs, patience, batch_size, virtual_batch_size, num_workers, drop_last)
168 self.update_fit_params(X_train, y_train, X_valid, y_valid, loss_fn,
169 weights, max_epochs, patience, batch_size,
--> 170 virtual_batch_size, num_workers, drop_last)
171
172 train_dataloader, valid_dataloader = self.construct_loaders(X_train,
~/anaconda3/envs/tabnet/lib/python3.6/site-packages/pytorch_tabnet/tab_model.py in update_fit_params(self, X_train, y_train, X_valid, y_valid, loss_fn, weights, max_epochs, patience, batch_size, virtual_batch_size, num_workers, drop_last)
572 self.input_dim = X_train.shape[1]
573
--> 574 output_dim, train_labels = self.infer_output_dim(y_train, y_valid)
575 self.output_dim = output_dim
576 self.classes_ = train_labels
~/anaconda3/envs/tabnet/lib/python3.6/site-packages/pytorch_tabnet/tab_model.py in infer_output_dim(self, y_train, y_valid)
508 Sorted list of initial classes
509 """
--> 510 train_labels = unique_labels(y_train)
511 output_dim = len(train_labels)
512
~/anaconda3/envs/tabnet/lib/python3.6/site-packages/pytorch_tabnet/multiclass_utils.py in unique_labels(*ys)
124 raise ValueError("Unknown label type: %s" % repr(ys))
125
--> 126 ys_labels = set(chain.from_iterable(_unique_labels(y) for y in ys))
127
128 # Check that we don't mix string type with number type
TypeError: 'NoneType' object is not iterable
If the current behavior is a bug, please provide the steps to reproduce.
I am using a simple dataset, as mentioned above, with 20k features and 88 targets, all one-hot encoded.
clf.fit(X_train[x_cols].values, y_train.values, X_valid[x_cols].values, y_valid.values)
where
and
I traced the error to multiclass_utils, where it is raised, but I cannot work out what I am doing wrong. Could you please help me understand whether the input format is wrong or whether I am making some other mistake?
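Since the fit call above passes a 2-D 0/1 target matrix, one thing worth checking before fitting is whether every row contains exactly one 1 (a one-hot encoding of a single multiclass label) or whether rows can carry several 1s (a genuine multilabel target). The sketch below is only an illustration in plain Python: describe_targets and onehot_to_class_index are hypothetical helper names, not part of pytorch_tabnet, and the toy lists stand in for y_train.values. It rests on the assumption that a strictly one-hot matrix can be collapsed into a single 1-D label column.

```python
def describe_targets(y_rows):
    """Classify a 0/1 target matrix as 'multiclass-onehot' or 'multilabel'.

    y_rows: iterable of rows, each a sequence of 0/1 values
    (a list of lists or a 2-D NumPy array both work).
    Hypothetical helper, not part of pytorch_tabnet.
    """
    row_sums = [sum(int(v) for v in row) for row in y_rows]
    if all(s == 1 for s in row_sums):
        return "multiclass-onehot"
    return "multilabel"


def onehot_to_class_index(y_rows):
    """Collapse strictly one-hot rows into a 1-D list of class indices."""
    labels = []
    for row in y_rows:
        values = [int(v) for v in row]
        if sum(values) != 1:
            raise ValueError("row is not strictly one-hot: %r" % (values,))
        labels.append(values.index(1))
    return labels


# Toy data standing in for y_train.values (3 columns instead of 88).
y_onehot = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
y_multilabel = [[1, 1, 0], [0, 0, 0], [0, 1, 1]]

kind_onehot = describe_targets(y_onehot)      # every row sums to 1
kind_multi = describe_targets(y_multilabel)   # some rows have 0 or 2 ones
labels = onehot_to_class_index(y_onehot)      # one class index per row
```

If the check reports a strictly one-hot matrix, the collapsed 1-D labels describe the same information in a single target column. If it reports a multilabel matrix, the columns cannot be merged this way and each one has to be treated as its own target.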
Also, it works when I use just one target out of the 88.
Expected behavior
Issue Analytics
- State:
- Created 3 years ago
- Comments:8
Top GitHub Comments
@aishwarya-agrawal the new TabNetMultiTaskClassification is out (in the develop branch). Check it out!
Awesome!