
[BUG] [tf.keras] BLSTM NER overfitting while 0.2.1 works just fine

See original GitHub issue

Check List

Thanks for considering opening an issue. Before you submit your issue, please confirm these boxes are checked.

Environment

  • OS [e.g. Mac OS, Linux]: Colab

Issue Description

I have tried the 0.2.1 version and the tf.keras version on the Chinese NER task, and found that the tf.keras version performs very badly. With 0.2.1 the validation loss decreases during training, but with tf.keras only the training loss decreases.

0.2.1 performance

Epoch 1/200
41/41 [==============================] - 159s 4s/step - loss: 0.2313 - acc: 0.9385 - val_loss: 0.0699 - val_acc: 0.9772
Epoch 2/200
41/41 [==============================] - 277s 7s/step - loss: 0.0563 - acc: 0.9823 - val_loss: 0.0356 - val_acc: 0.9892
Epoch 3/200
41/41 [==============================] - 309s 8s/step - loss: 0.0361 - acc: 0.9887 - val_loss: 0.0243 - val_acc: 0.9928
Epoch 4/200
41/41 [==============================] - 242s 6s/step - loss: 0.0297 - acc: 0.9905 - val_loss: 0.0228 - val_acc: 0.9927
Epoch 5/200
41/41 [==============================] - 328s 8s/step - loss: 0.0252 - acc: 0.9920 - val_loss: 0.0196 - val_acc: 0.9938
Epoch 6/200
 4/41 [=>............................] - ETA: 4:37 - loss: 0.0234 - acc: 0.9926

tf.keras performance

Epoch 1/200
Epoch 1/200
5/5 [==============================] - 5s 1s/step - loss: 2.3491 - acc: 0.9712
42/42 [==============================] - 115s 3s/step - loss: 2.9824 - acc: 0.9171 - val_loss: 2.3491 - val_acc: 0.9712
Epoch 2/200
5/5 [==============================] - 4s 768ms/step - loss: 2.9726 - acc: 0.9822
42/42 [==============================] - 107s 3s/step - loss: 0.1563 - acc: 0.9952 - val_loss: 2.9726 - val_acc: 0.9822
Epoch 3/200
5/5 [==============================] - 4s 773ms/step - loss: 3.0985 - acc: 0.9833
42/42 [==============================] - 107s 3s/step - loss: 0.0482 - acc: 0.9994 - val_loss: 3.0985 - val_acc: 0.9833
Epoch 4/200
5/5 [==============================] - 4s 771ms/step - loss: 3.2479 - acc: 0.9833
42/42 [==============================] - 107s 3s/step - loss: 0.0247 - acc: 0.9997 - val_loss: 3.2479 - val_acc: 0.9833
Epoch 5/200
5/5 [==============================] - 4s 766ms/step - loss: 3.3612 - acc: 0.9839
42/42 [==============================] - 107s 3s/step - loss: 0.0156 - acc: 0.9998 - val_loss: 3.3612 - val_acc: 0.9839
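The two logs show the classic overfitting signature: in the tf.keras run the training loss keeps falling while the validation loss climbs every epoch, whereas in 0.2.1 both fall together. A minimal, framework-free sketch of that check on a Keras-style history dict (the function name and `patience` parameter are illustrative, not from any library):

```python
def is_overfitting(history, patience=3):
    """Return True if val_loss rose for the last `patience` epochs
    while the training loss kept falling."""
    loss = history["loss"]
    val_loss = history["val_loss"]
    if len(loss) <= patience:
        return False
    recent = range(len(loss) - patience, len(loss))
    train_falling = all(loss[i] < loss[i - 1] for i in recent)
    val_rising = all(val_loss[i] > val_loss[i - 1] for i in recent)
    return train_falling and val_rising

# Epoch-level losses copied from the logs above.
tfkeras = {
    "loss": [2.9824, 0.1563, 0.0482, 0.0247, 0.0156],
    "val_loss": [2.3491, 2.9726, 3.0985, 3.2479, 3.3612],
}
v021 = {
    "loss": [0.2313, 0.0563, 0.0361, 0.0297, 0.0252],
    "val_loss": [0.0699, 0.0356, 0.0243, 0.0228, 0.0196],
}
print(is_overfitting(tfkeras))  # True
print(is_overfitting(v021))     # False
```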

Reproduce

Here is a Colab notebook to reproduce this issue

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 39 (25 by maintainers)

Top GitHub Comments

2 reactions
BrikerMan commented, Jun 11, 2019

Good news, guys: fixed. After 10 epochs with batch size 64, here is the result.

import kashgari
from kashgari.embeddings import BERTEmbedding
# BLSTMModel's import path may differ between kashgari versions
from kashgari.tasks.labeling import BLSTMModel

embedding = BERTEmbedding('/input0/BERT/chinese_L-12_H-768_A-12',
                          task=kashgari.LABELING,
                          sequence_length=100,
                          layer_nums=4)
model = BLSTMModel(embedding)
model.fit(train_x,
          train_y,
          valid_x,
          valid_y,
          batch_size=64,
          epochs=10)
model.evaluate(test_x, test_y, batch_size=512)

           precision    recall  f1-score   support

      LOC     0.9265    0.9370    0.9317      3431
      ORG     0.8364    0.8808    0.8580      2147
      PER     0.9644    0.9644    0.9644      1797

micro avg     0.9084    0.9273    0.9177      7375
macro avg     0.9095    0.9273    0.9182      7375
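The "macro avg" row above appears to be support-weighted (each class's metric weighted by its support, then divided by the total): recomputing that way reproduces all three reported averages. A quick check, using only the per-class rows from the report (the `weighted` helper is illustrative):

```python
# Per-class rows from the report above: (precision, recall, f1, support)
rows = {
    "LOC": (0.9265, 0.9370, 0.9317, 3431),
    "ORG": (0.8364, 0.8808, 0.8580, 2147),
    "PER": (0.9644, 0.9644, 0.9644, 1797),
}

total = sum(vals[3] for vals in rows.values())  # 7375

def weighted(idx):
    """Support-weighted average of one metric column (0=p, 1=r, 2=f1)."""
    return sum(vals[idx] * vals[3] for vals in rows.values()) / total

print(round(weighted(0), 4))  # 0.9095 -- matches reported avg precision
print(round(weighted(1), 4))  # 0.9273 -- matches reported avg recall
print(round(weighted(2), 4))  # 0.9182 -- matches reported avg f1
```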

