Keras callback creating .profile-empty file blocks loading data
Repro steps:
- Create a virtualenv with `tf-nightly-2.0-preview==2.0.0.dev20190402` and open two terminals in this environment.
- In one terminal, run the following simple Python script (but continue to the next step while this script is still running):
    from __future__ import absolute_import
    from __future__ import division
    from __future__ import print_function

    import tensorflow as tf

    DATASET = tf.keras.datasets.mnist
    INPUT_SHAPE = (28, 28)
    OUTPUT_CLASSES = 10


    def model_fn():
        model = tf.keras.models.Sequential([
            tf.keras.layers.Input(INPUT_SHAPE),
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(128, activation="relu"),
            tf.keras.layers.BatchNormalization(),
            tf.keras.layers.Dense(256, activation="relu"),
            tf.keras.layers.Dropout(0.2),
            tf.keras.layers.Dense(OUTPUT_CLASSES, activation="softmax"),
        ])
        model.compile(
            loss="sparse_categorical_crossentropy",
            optimizer="adagrad",
            metrics=["accuracy"],
        )
        return model


    def main():
        model = model_fn()
        ((x_train, y_train), (x_test, y_test)) = DATASET.load_data()
        model.fit(
            x=x_train,
            y=y_train,
            validation_data=(x_test, y_test),
            callbacks=[tf.keras.callbacks.TensorBoard()],
            epochs=5,
        )


    if __name__ == "__main__":
        main()
- Wait for (say) epoch 2/5 to finish training. Then, in the other terminal, launch `tensorboard --logdir ./logs`.
- Open TensorBoard and observe that both training and validation runs appear with two epochs' worth of data.
- As training continues, refresh TensorBoard and/or reload the page. Observe that validation data continues to appear, but training data has stalled: even well after training has completed, the plot is incomplete.
- Kill the TensorBoard process and restart it. Note that the data appears as desired.

The same problem occurs in `tf-nightly` (non-`2.0-preview`), but manifests differently: because there is only one run (named `.`) instead of separate `train`/`validation`, all data stops being displayed after the epoch in which TensorBoard is opened.

Note, as a special case of this, that if TensorBoard is running before training starts, then `train` data may not appear at all.
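
To check whether a log directory contains the file named in the issue title, a small helper along these lines may be useful. It is only a sketch: the function name `find_profile_empty_files` is hypothetical, and matching on the `.profile-empty` suffix is an assumption based on the title, since the issue text does not give the full file name.

```python
from pathlib import Path


def find_profile_empty_files(logdir):
    """Return sorted paths under `logdir` whose names end in '.profile-empty'.

    Assumption: the file mentioned in the issue title can be recognized
    by this suffix; the exact naming scheme is not stated in the issue.
    """
    root = Path(logdir)
    # rglob searches `logdir` and all of its subdirectories (e.g. train/, validation/).
    return sorted(p for p in root.rglob("*.profile-empty") if p.is_file())


if __name__ == "__main__":
    for path in find_profile_empty_files("./logs"):
        print(path)
```

Running it against `./logs` while training is in progress lists any matching files, which can help confirm whether the run that stalls is the one containing such a file.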
Top GitHub Comments
Just to note explicitly: setting `profile_batch=0` in the Keras callback options is a workaround that disables profiling entirely.

FYI: The problem is still here, in 1cf0898dd of TF v2.0.0. Workaround above works.
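
For reference, the workaround might be applied to the repro script roughly as follows. This is a minimal sketch, not the maintainers' fix: the helper names `tensorboard_kwargs` and `make_tensorboard_callback` are hypothetical, and `./logs` is taken from the `--logdir` used in the repro steps.

```python
def tensorboard_kwargs(log_dir="./logs"):
    """Keyword arguments for tf.keras.callbacks.TensorBoard with profiling off."""
    return {
        "log_dir": log_dir,
        # profile_batch=0 disables profiling, per the workaround in the comment above.
        "profile_batch": 0,
    }


def make_tensorboard_callback(log_dir="./logs"):
    """Build the TensorBoard callback with profiling disabled."""
    # Imported lazily so tensorboard_kwargs() can be inspected without TensorFlow.
    import tensorflow as tf

    return tf.keras.callbacks.TensorBoard(**tensorboard_kwargs(log_dir))
```

In the repro script, this would replace `callbacks=[tf.keras.callbacks.TensorBoard()]` with `callbacks=[make_tensorboard_callback()]`. Note that this turns profiling off altogether, so it is not suitable when profile data is actually needed.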