DP doesn't work in TF2.0rc0

See original GitHub issue

It seems that the TF DP library doesn’t work properly with TF 2.0rc0. I ran mnist_dpsgd_tutorial_keras.py with and without DP. As expected, with TF v1.14 the loss decreases in both cases (reaching quite low values, though less so with DP), with a runtime of ~6 s per epoch without DP and ~700 s with it. With TF v2.0rc0 and DP, however, the loss stagnates (except for the first epoch, where it is very high) and the runtime is similar to that without DP (!). The number of microbatches has no effect on either quality or performance.

TF DP lib v0.0.1. DP settings: noise_multiplier = 1.1, l2_norm_clip = 1.0, batch_size = 250, microbatches = 250.
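For context, the per-microbatch clipping and noising step that DP-SGD performs with settings like these can be sketched in plain NumPy. This is a hypothetical helper for illustration, not the tensorflow_privacy API: each microbatch gradient is clipped to l2_norm_clip, the clipped gradients are averaged, and Gaussian noise scaled by noise_multiplier * l2_norm_clip is added.

```python
import numpy as np

def dp_average_gradient(microbatch_grads, l2_norm_clip=1.0,
                        noise_multiplier=1.1, rng=None):
    """Sketch of one DP-SGD gradient step (hypothetical helper)."""
    rng = rng or np.random.default_rng(0)
    clipped = []
    for g in microbatch_grads:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose L2 norm exceeds the clip threshold.
        clipped.append(g / max(1.0, norm / l2_norm_clip))
    mean = np.mean(clipped, axis=0)
    # Gaussian noise calibrated to the clip norm, averaged over microbatches.
    noise = rng.normal(0.0, noise_multiplier * l2_norm_clip / len(clipped),
                       size=mean.shape)
    return mean + noise

grads = [np.array([3.0, 4.0]), np.array([0.1, 0.2])]  # norms 5.0 and ~0.22
noisy_mean = dp_average_gradient(grads)
# The first gradient is clipped from norm 5.0 down to 1.0 before averaging.
```

If this clip-and-noise step is skipped, training degrades into plain SGD on unprotected gradients, which is why the runtime difference between the DP and non-DP runs below is a useful sanity check.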

TF1.14 WITHOUT DP

Train on 60000 samples, validate on 10000 samples
Epoch 1/5
60000/60000 [==============================] - 5s 80us/sample - loss: 0.5633 - acc: 0.8220 - val_loss: 0.1077 - val_acc: 0.9655
Epoch 2/5
60000/60000 [==============================] - 5s 76us/sample - loss: 0.1152 - acc: 0.9640 - val_loss: 0.0810 - val_acc: 0.9750
Epoch 3/5
60000/60000 [==============================] - 5s 79us/sample - loss: 0.0758 - acc: 0.9765 - val_loss: 0.0519 - val_acc: 0.9832
Epoch 4/5
60000/60000 [==============================] - 5s 79us/sample - loss: 0.0580 - acc: 0.9827 - val_loss: 0.0441 - val_acc: 0.9858
Epoch 5/5
60000/60000 [==============================] - 5s 81us/sample - loss: 0.0488 - acc: 0.9853 - val_loss: 0.0374 - val_acc: 0.9881
Trained with vanilla non-private SGD optimizer

TF1.14 WITH DP

Train on 60000 samples, validate on 10000 samples
Epoch 1/5
60000/60000 [==============================] - 723s 12ms/sample - loss: 1.5364 - acc: 0.5419 - val_loss: 0.8097 - val_acc: 0.7255
Epoch 2/5
60000/60000 [==============================] - 621s 10ms/sample - loss: 0.6963 - acc: 0.7780 - val_loss: 0.5639 - val_acc: 0.8271
Epoch 3/5
60000/60000 [==============================] - 613s 10ms/sample - loss: 0.5671 - acc: 0.8376 - val_loss: 0.4849 - val_acc: 0.8686
Epoch 4/5
60000/60000 [==============================] - 664s 11ms/sample - loss: 0.5170 - acc: 0.8648 - val_loss: 0.4468 - val_acc: 0.8886
Epoch 5/5
60000/60000 [==============================] - 706s 12ms/sample - loss: 0.4694 - acc: 0.8850 - val_loss: 0.4182 - val_acc: 0.9004
For delta=1e-5, the current epsilon is: 1.22

TF2.0rc0 WITHOUT DP

Train on 60000 samples, validate on 10000 samples
Epoch 1/5
60000/60000 [==============================] - 4s 73us/sample - loss: 0.5679 - accuracy: 0.8147 - val_loss: 0.1167 - val_accuracy: 0.9640
Epoch 2/5
60000/60000 [==============================] - 4s 69us/sample - loss: 0.1001 - accuracy: 0.9693 - val_loss: 0.0715 - val_accuracy: 0.9778
Epoch 3/5
60000/60000 [==============================] - 4s 69us/sample - loss: 0.0701 - accuracy: 0.9781 - val_loss: 0.0541 - val_accuracy: 0.9823
Epoch 4/5
60000/60000 [==============================] - 4s 71us/sample - loss: 0.0589 - accuracy: 0.9818 - val_loss: 0.0473 - val_accuracy: 0.9854
Epoch 5/5
60000/60000 [==============================] - 5s 75us/sample - loss: 0.0497 - accuracy: 0.9848 - val_loss: 0.0416 - val_accuracy: 0.9868
Trained with vanilla non-private SGD optimizer

TF2.0rc0 WITH DP

Train on 60000 samples, validate on 10000 samples
Epoch 1/5
60000/60000 [==============================] - 4s 74us/sample - loss: 1364.3170 - accuracy: 0.1005 - val_loss: 6.0033 - val_accuracy: 0.0980
Epoch 2/5
60000/60000 [==============================] - 4s 66us/sample - loss: 6.6457 - accuracy: 0.0985 - val_loss: 6.3509 - val_accuracy: 0.0980
Epoch 3/5
60000/60000 [==============================] - 4s 69us/sample - loss: 6.5087 - accuracy: 0.1015 - val_loss: 6.7987 - val_accuracy: 0.1032
Epoch 4/5
60000/60000 [==============================] - 4s 72us/sample - loss: 6.6955 - accuracy: 0.1020 - val_loss: 6.5389 - val_accuracy: 0.1028
Epoch 5/5
60000/60000 [==============================] - 4s 74us/sample - loss: 6.6641 - accuracy: 0.1013 - val_loss: 5.5773 - val_accuracy: 0.0980
For delta=1e-5, the current epsilon is: 1.22

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 8

Top GitHub Comments

7 reactions
selimyoussry commented, Dec 2, 2019

I confirm: it’s December 2nd, and it still doesn’t work with TF 2.0.0. There has been this commit https://github.com/tensorflow/privacy/commit/d69879d36011ab82dd15f3780bf7bb1ed7de83ca, but it doesn’t seem to have fixed the training issue with TF 2.0.0; accuracy is still stuck at ~0.1.

TF1.15 works as expected though.

Any updates on when we can expect this repository to be compatible with TensorFlow 2.0.0? Thank you.

2 reactions
psitronic commented, Sep 24, 2019

The reason is this change from the TF 2.0 release notes:

Unification of tf.train.Optimizers and tf.keras.Optimizers. Use tf.keras.Optimizers for TF2.0. compute_gradients is removed as public API, and use GradientTape to compute gradients.
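A toy sketch in plain Python (these are not the real TF classes) of the failure mode this change implies: the DP optimizer injected clipping and noising by overriding compute_gradients, but the TF 2.0 Keras training loop computes gradients itself with a GradientTape and only calls apply_gradients, so the DP override is never invoked.

```python
class BaseOptimizer:
    """Stand-in for a TF 1.x-style optimizer (toy, not the real API)."""
    def compute_gradients(self, grad):
        return grad

    def apply_gradients(self, grad):
        return grad  # stand-in for the actual weight update

class DPOptimizer(BaseOptimizer):
    def compute_gradients(self, grad):
        # DP hook: clip the gradient (noise omitted for brevity).
        return min(grad, 1.0)

def tf1_train_step(opt, raw_grad):
    # TF 1.x Keras asked the optimizer for gradients, so the DP override ran.
    return opt.apply_gradients(opt.compute_gradients(raw_grad))

def tf2_train_step(opt, raw_grad):
    # TF 2.0 Keras computes gradients itself (GradientTape) and only calls
    # apply_gradients -- the DP override is silently bypassed.
    return opt.apply_gradients(raw_grad)

opt = DPOptimizer()
print(tf1_train_step(opt, 5.0))  # 1.0 -- clipped
print(tf2_train_step(opt, 5.0))  # 5.0 -- unclipped, DP silently skipped
```

This would explain both symptoms in the report above: without clipping, the first-epoch loss explodes, and without the per-microbatch work, the DP run is as fast as the non-DP one.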