Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Training Stuck at 0%

See original GitHub issue

I am trying to fit a model on my own image dataset: 1,000 images with 5 labels (image dimensions: 128x128 pixels).

I have no idea what is wrong with the output; the progress bar shows only 0% from start to finish.

# autokeras 0.x; in 0.3.x, load_image_dataset lives in image_supervised
from autokeras.image.image_supervised import load_image_dataset
from sklearn.model_selection import train_test_split
import autokeras as ak

train_path = './productV1.1/train_images/'
train_labels = './productV1.1/product.csv'

# Load the images listed in the CSV into NumPy arrays.
X, y = load_image_dataset(csv_file_path=train_labels,
                          images_path=train_path)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

seconds = 12 * 60 * 60  # search budget in seconds (original value not shown)

model = ak.ImageClassifier(verbose=True)
model.fit(X_train, y_train, time_limit=seconds)
model.final_fit(X_train, y_train, X_test, y_test, retrain=True)
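
Before suspecting the search itself, it can help to confirm the dataset actually loaded; a minimal sanity check, assuming X and y are the NumPy arrays returned by load_image_dataset above:

# Hypothetical check (not in the original report): verify shapes and labels.
import numpy as np

print(X.shape, y.shape)                  # expect about (1000, 128, 128, 3) and (1000,)
print(np.unique(y, return_counts=True))  # expect 5 labels, each reasonably represented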

————

Output:

Preprocessing the images.
Preprocessing finished.

Initializing search.
Initialization finished.

+----------------------------------------------+
|               Training model 0               |
+----------------------------------------------+
Using TensorFlow backend.

Epoch-1, Current Metric - 0:   0%|          | 0/1 [00:00<?, ? batch/s]
Epoch-1, Current Metric - 0: 10 batch [00:00, 40.43 batch/s]

Epoch-1, Current Metric - 0:   0%|          | 0/1 [00:00<?, ? batch/s]
Epoch-1, Current Metric - 0: 10 batch [00:00, 68.41 batch/s]

Epoch-2, Current Metric - 0.1111111111111111:   0%|          | 0/1 [00:00<?, ? batch/s]
Epoch-2, Current Metric - 0.1111111111111111: 10 batch [00:00, 56.75 batch/s]

Epoch-2, Current Metric - 0.1111111111111111:   0%|          | 0/1 [00:00<?, ? batch/s]
Epoch-2, Current Metric - 0.1111111111111111: 10 batch [00:00, 49.93 batch/s]

…

+----------------------------------------------+
|              Training model 13               |
+----------------------------------------------+

Epoch-1, Current Metric - 0:   0%|          | 0/1 [00:00<?, ? batch/s]
Epoch-1, Current Metric - 0: 10 batch [00:00, 19.50 batch/s]

Epoch-1, Current Metric - 0:   0%|          | 0/1 [00:00<?, ? batch/s]
Epoch-1, Current Metric - 0: 10 batch [00:00, 35.31 batch/s]

Epoch-2, Current Metric - 0.1111111111111111:   0%|          | 0/1 [00:00<?, ? batch/s]
Epoch-2, Current Metric - 0.1111111111111111: 10 batch [00:00, 32.93 batch/s]

Epoch-2, Current Metric - 0.1111111111111111:   0%|          | 0/1 [00:00<?, ? batch/s]
Epoch-2, Current Metric - 0.1111111111111111: 10 batch [00:00, 24.65 batch/s]

Epoch-3, Current Metric - 0.1111111111111111:   0%|          | 0/1 [00:00<?, ? batch/s]
Epoch-3, Current Metric - 0.1111111111111111: 10 batch [00:00, 36.40 batch/s]

Epoch-3, Current Metric - 0.1111111111111111:   0%|          | 0/1 [00:00<?, ? batch/s]
Epoch-3, Current Metric - 0.1111111111111111: 10 batch [00:00, 26.43 batch/s]
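
Worth noting about these logs: the bar is created for a total of 1 batch while 10 batches actually run, and tqdm stops rendering a percentage once its count passes the total, so the display can sit at 0% even while batches are being processed. A minimal sketch reproducing just that display artifact (an assumption about the cause, not a confirmed diagnosis):

# Reproduce the progress-bar display seen above with plain tqdm.
from tqdm import tqdm

bar = tqdm(total=1, unit='batch', desc='Epoch-1, Current Metric - 0')
for _ in range(10):
    bar.update(1)  # once the count exceeds the total, tqdm drops the percentage
bar.close()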

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Reactions: 3
  • Comments: 10 (1 by maintainers)

Top GitHub Comments

2 reactions
kuba-machacek commented, Apr 3, 2019

I’m getting the same issue. It seems to be stuck completely, as the time_limit param is not working in this case.
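
One hedged workaround sketch (not from this thread): if time_limit is being ignored, a hard wall-clock budget can be enforced from outside by running the search in a separate process and terminating it. The names model, X_train, and y_train are from the snippet in the issue above; the two-hour budget is an arbitrary example, and a fork-based start method is assumed so the child process sees those variables.

import multiprocessing

def run_search():
    model.fit(X_train, y_train, time_limit=2 * 60 * 60)

if __name__ == '__main__':
    p = multiprocessing.Process(target=run_search)
    p.start()
    p.join(timeout=2 * 60 * 60 + 300)  # small grace period past the requested limit
    if p.is_alive():
        p.terminate()  # kill the stuck search instead of waiting forever
        p.join()

Since the searcher prints "Saving model." after each candidate (see the log further down), models found before the cutoff should already be on disk; only the model being trained at termination is lost.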

2 reactions
DGaffney commented, Feb 15, 2019

Adding my voice to this - I am also running into this issue on 0.3.7, with Python 3.6.7 on Ubuntu 18.04.1 on a fresh Amazon box, only autokeras + dependencies installed, running the text classifier as follows:

from autokeras import TextClassifier

import csv

# Read texts (column 0) and integer labels (column 1) from the CSV.
rows = []
labels = []
with open('labeled_data.csv', 'r') as f:
    reader = csv.reader(f)
    for row in reader:
        rows.append(row[0])
        labels.append(int(row[1]))

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(rows, labels, test_size=0.33, random_state=42)
clf = TextClassifier(verbose=True)
clf.fit(x=X_train, y=y_train, time_limit=12 * 60 * 60)  # 12-hour search budget
clf.final_fit(X_train, y_train, X_test, y_test, retrain=True)
y_out = clf.evaluate(X_test, y_test)  # accuracy on the held-out split

It gets stuck during fit like so:

Saving model.
+--------------------------------------------------------------------------+
|        Model ID        |          Loss          |      Metric Value      |
+--------------------------------------------------------------------------+
|           31           |   2.545453941822052    |   0.6641509433962264   |
+--------------------------------------------------------------------------+


+----------------------------------------------+
|              Training model 32               |
+----------------------------------------------+

No loss decrease after 5 epochs.


Saving model.
+--------------------------------------------------------------------------+
|        Model ID        |          Loss          |      Metric Value      |
+--------------------------------------------------------------------------+
|           32           |   5.955252933502197    |  0.43018867924528303   |
+--------------------------------------------------------------------------+


+----------------------------------------------+
|              Training model 33               |
+----------------------------------------------+
Epoch-1, Current Metric - 0:   0%|                                        | 0/5 [00:00<?, ? batch/s]

Happy to provide the CSV I’m using off-list.
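
For anyone reproducing this, one way to tell a hard hang apart from slow-but-progressing training is to periodically dump the interpreter's thread stacks while fit() runs. A sketch using the standard-library faulthandler, reusing the names from the snippet above:

import faulthandler
import sys

# Dump every thread's stack to stderr every 60 s until cancelled; if the same
# frame shows up dump after dump, the search is genuinely wedged there.
faulthandler.dump_traceback_later(60, repeat=True, file=sys.stderr)
clf.fit(x=X_train, y=y_train, time_limit=12 * 60 * 60)
faulthandler.cancel_dump_traceback_later()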

Read more comments on GitHub >

Top Results From Across the Web

Training stuck at 0% after few epochs while training with DDP
I recently updated to pytorch_lightning 1.1.7 and noticed that after a few epochs of training, the training % is stuck at 0% and...
Read more >
Training hangs at Epoch 0 / 0% on TPU - PyTorch Lightning
Hi, I am very new to PyTorch-Lightning and to Deep Learning as well! I am converting a PyTorch project into Lightning.
Read more >
PyTorch Lightning trainer.fit stuck at epoch 0 - Stack Overflow
I was trying to make a multi-input model using PyTorch and PyTorch Lightning, but I can't figure out why the trainer is stuck...
Read more >
Training stuck for hours in custom vision - Microsoft Q&A
We could be past our training limit - but how do i see that? If we are past that, shouldn't we just get...
Read more >
Distributed training got stuck every few seconds
Hi, everyone When I train my model with DDP, I observe that my training ... There seems always one GPU got stuck whose...
Read more >
