model not learning anything

See original GitHub issue

been training images on my model with masks for earrings and not earrings.

dice_loss = sm.losses.DiceLoss()
focal_loss = sm.losses.BinaryFocalLoss() 
total_loss = dice_loss + (1 * focal_loss)
 
opt = keras.optimizers.Adam(lr=0.001, beta_1=0.9, beta_2=0.999)
 
metrics = [sm.metrics.IOUScore(threshold=0.5), sm.metrics.FScore(threshold=0.5)]

model = sm.PSPNet(BACKBONE, encoder_weights = 'imagenet', classes = 1,
encoder_freeze=True, activation='sigmoid', downsample_factor=16, input_shape=(960,960,3),
psp_conv_filters=1024, psp_pooling_type='avg')

backbone is inceptionresnetv2 and i’ve set masks and images to 0…1 range.

the model trains but doesn’t learn anything as its output is a matrix full of 0’s.

it generated a mask when the input size was 380,380 but does not in this case.

could you point out any error i might be making? thanks in advance

PS: i could attach the colab file that i am using if you want to refer to something else.

Issue Analytics

Created 4 years ago
Comments: 11 (2 by maintainers)

Top GitHub Comments

JordanMakesMaps commented, Oct 24, 2019

Okay @jayayy, I have some input that may or may not be useful but I think it would be worth trying out.

Starting at line 60:

for i in range(l): 
  imgg = cv2.imread(path + '/'+lis[i])
  gray = cv2.cvtColor(imgg, cv2.COLOR_BGR2GRAY)
  gray = cv2.resize(gray,(960,960))
  mask[i] = np.expand_dims(gray,axis=2)
  #mask[i] = gray.reshape(960,960,1)
  if i%50==0:
      print(i)

Here you’re reading a three-channel mask (OpenCV loads it in BGR order), converting it to grayscale, then resizing it and adding a channel axis so that the shape is correct for what you’re trying to do. I think the issue is that the mask needs to be binary (every pixel exactly 0 or 1), not grayscale. Read this for info on that (specifically, read “Representing the Task”).
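A minimal sketch of the binarization step, assuming the masks are 8-bit grayscale images in which the earring pixels are bright and the background is dark. The helper name binarize_mask and the threshold of 127 are illustrative choices, not taken from the original notebook.

```python
import numpy as np

def binarize_mask(gray, threshold=127):
    """Map an 8-bit grayscale mask to {0, 1}.

    Hypothetical helper: the name and the default threshold are assumptions.
    """
    return (gray > threshold).astype(np.uint8)

# Toy 2x3 "grayscale" mask. Intermediate values such as 30 or 200 can
# appear after resizing or lossy compression of an originally binary mask.
gray = np.array([[0, 30, 200],
                 [255, 128, 127]], dtype=np.uint8)

binary = binarize_mask(gray)
print(binary.tolist())  # [[0, 0, 1], [1, 1, 0]]
```

When resizing masks with cv2.resize, passing interpolation=cv2.INTER_NEAREST avoids creating in-between gray values in the first place.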

Once your masks are in a binary format, you have to one-hot-encode them. You can do this with your own script or use keras.utils.to_categorical().

As a sanity check, the final shape of the masks that you pass into training should be (batch_size, height, width, 2). The 2 comes from the fact that you’re doing binary segmentation: even though you only have one class that you’re looking at, you need to make it clear in your training data that there are really 2 classes, the one that you’re interested in and the background class. Note that this assumes the model is built with classes=2; with classes=1 and a sigmoid activation, as in your snippet, the model outputs a single channel and the masks would have shape (batch_size, height, width, 1) instead.

Indexing the last dimension of the one-hot-encoded mask should give a (960, 960) array. Index 0 should have ones where the background is and zeros where the class of interest is. Index 1 should be the opposite: ones where the class of interest is and zeros where the background is (see image below).

[Image: one-hot-encoded example]
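The channel layout described above can be sketched with plain NumPy, assuming the mask already holds integer labels 0 (background) and 1 (class of interest). The helper name one_hot_binary is hypothetical; keras.utils.to_categorical(mask, num_classes=2) should produce the same layout.

```python
import numpy as np

def one_hot_binary(mask):
    """Turn an (H, W) mask of 0/1 labels into an (H, W, 2) one-hot array.

    Channel 0 marks the background, channel 1 marks the class of interest.
    Hypothetical helper, not part of the original notebook.
    """
    # Row k of the 2x2 identity matrix is the one-hot vector for label k.
    return np.eye(2, dtype=np.float32)[mask]

mask = np.array([[0, 1],
                 [1, 0]], dtype=np.uint8)
encoded = one_hot_binary(mask)

print(encoded.shape)             # (2, 2, 2)
print(encoded[..., 0].tolist())  # background channel: [[1.0, 0.0], [0.0, 1.0]]
print(encoded[..., 1].tolist())  # class channel:      [[0.0, 1.0], [1.0, 0.0]]
```

Stacking such masks along a new first axis gives the (batch_size, height, width, 2) shape mentioned in the sanity check.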

I would also recommend using the preprocess_input() function provided by @qubvel. Each backbone’s preprocess_input() is different, but each one preprocesses the input image to mimic how the original backbone was trained. So, changing gears and looking at the code starting at line 119:

# data_gen_args = dict(featurewise_center=True,          # consider removing
#                      featurewise_std_normalization=True,     # consider removing
#                      rotation_range=90,
#                      width_shift_range=0.1,
#                      height_shift_range=0.1,
#                      zoom_range=0.2)
# image_datagen = ImageDataGenerator(**data_gen_args)
# mask_datagen = ImageDataGenerator(**data_gen_args)

You have featurewise_center and featurewise_std_normalization, which alter the pixel values (other alternatives are rescale and zca_whitening) in an attempt to make it easier for the model to learn. The thing is, those two need to be fit on the dataset (or a representative sample of the dataset) to be useful. Because you do not fit them, right now I don’t think that those lines of code are doing anything for you. If you choose not to remove those two lines of code and instead find a way to make them work properly (see this), DEFINITELY create a different data_gen_args for your mask_datagen, because featurewise_center and featurewise_std_normalization would mess up your mask values, resulting in bad predictions.
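To illustrate what "fitting" means here, the following is a rough NumPy sketch of feature-wise centering and scaling, not the Keras implementation itself. The random arrays are stand-ins for a real dataset. In Keras the statistics would normally come from calling fit() on the image generator.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in data: 8 RGB images in the 0..1 range and their binary masks.
images = rng.uniform(0.0, 1.0, size=(8, 4, 4, 3)).astype(np.float32)
masks = rng.integers(0, 2, size=(8, 4, 4, 1)).astype(np.float32)

# "Fit": per-channel statistics computed over the whole dataset.
mean = images.mean(axis=(0, 1, 2), keepdims=True)
std = images.std(axis=(0, 1, 2), keepdims=True)

# Apply the statistics to the images only. The masks are left untouched
# so that their values stay exactly 0 or 1.
images_norm = (images - mean) / (std + 1e-6)

print(images_norm.mean(axis=(0, 1, 2)))  # close to 0 for every channel
print(np.unique(masks).tolist())         # [0.0, 1.0]
```

Running the same normalization on the masks would shift their values away from 0 and 1, which is why the mask generator needs its own arguments.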

The very last thing is the predictions:

path = 'C:\\Users\\Deepak\\Desktop\\ready\\test\\img'
lis = os.listdir(path)
imager = cv2.resize(cv2.imread(path +'/'+ lis[9]),(960,960))
imager = imager.reshape(1,960,960,3)
raw = model.predict(x=imager)
raw = raw.reshape(960,960)

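You can change that to the following (this assumes the model outputs two channels, one per class):

path = 'C:\\Users\\Deepak\\Desktop\\ready\\test\\img'
lis = os.listdir(path)
imager = cv2.resize(cv2.imread(path +'/'+ lis[9]),(960,960))
imager = imager.reshape(1,960,960,3)
raw = model.predict(x=imager).squeeze() # Drop the batch axis: shape becomes (height, width, classes)
pred = np.argmax(raw, axis=-1) # Index of the highest-scoring class per pixel (the last axis holds the classes)
cv2.imshow('prediction', (pred * 255).astype(np.uint8)) # imshow needs a window name and a uint8 image
cv2.waitKey(0) # Keep the window open until a key is pressed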
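A small sketch of how the decoding differs between a two-channel output and a single-channel sigmoid output (the latter is what classes=1 produces). The helper decode_prediction and the 0.5 threshold are assumptions for illustration, and the arrays are hand-made stand-ins for model.predict(...).squeeze().

```python
import numpy as np

def decode_prediction(raw, threshold=0.5):
    """Return an (H, W) uint8 label map from a squeezed model output.

    Hypothetical helper:
      (H, W, C) with C > 1 -> argmax over the channel (last) axis
      (H, W)               -> threshold the single sigmoid channel
    """
    if raw.ndim == 3:
        return np.argmax(raw, axis=-1).astype(np.uint8)
    return (raw > threshold).astype(np.uint8)

# Stand-in outputs for a 1x2 image.
two_channel = np.array([[[0.9, 0.1], [0.2, 0.8]]])  # shape (1, 2, 2)
one_channel = np.array([[0.3, 0.7]])                # shape (1, 2)

print(decode_prediction(two_channel).tolist())  # [[0, 1]]
print(decode_prediction(one_channel).tolist())  # [[0, 1]]
```

Multiplying the label map by 255 before displaying it makes the foreground visible, since values of 0 and 1 both look black in an 8-bit image.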

Hope this helps 👍

Just realized you commented out all of the ImageDataGenerator stuff 😂, either way still useful to know.

qubvel commented, Oct 23, 2019

Try to learn something with encoder_freeze=False