Saved model behaves differently on different machines
After studying #439, #2228, #2743, #6737 and the new FAQ about reproducibility, I was able to get consistent, reproducible results on my development machines using Theano. If I run my code twice, I get exactly the same results.
The problem is that the results are reproducible only on the same machine. In other words, if I
- Train a model on machine A
- Evaluate the model using predict
- Save the model (using save_model, or model_to_json and save_weights)
- Transfer the model to machine B and load it
- Evaluate the model again on machine B using predict

then the results of the two predict calls are different. Using CPU or GPU makes no difference: after I copy the model file(s) from one machine to another, the performance of predict changes dramatically.
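To quantify how large the discrepancy actually is (rather than eyeballing the output vectors), a small helper along these lines can compare the predict outputs dumped on each machine. This is a sketch; compare_predictions is a hypothetical name, not a Keras API:

```python
import numpy as np

def compare_predictions(a, b, atol=1e-5):
    """Return the max absolute difference between two prediction
    arrays and whether they agree within the given tolerance."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    max_diff = float(np.max(np.abs(a - b)))
    return max_diff, bool(np.allclose(a, b, atol=atol))
```

On each machine one could np.save the output of model.predict(x) and compare the two files: a max difference around 1e-6 suggests ordinary floating-point noise, while a large gap points at genuinely different weights, preprocessing, or backend behavior.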
The only differences between the two machines are the hardware (I use my laptop’s 980M and a workstation with a Titan X Pascal) and the NVIDIA driver version, which is slightly older on the workstation. Both computers run Ubuntu 16.04 LTS and CUDA 8 with cuDNN. All libraries are on the same version on both machines, and the Python version is the same as well (3.6.1).
Is this behavior intended? I expect that running a pre-trained model with the same architecture and weights on two different machines yields the same results, but this doesn’t seem to be the case.
On a side note, a suggestion: the FAQ about reproducibility should explicitly state that the development version of Theano is needed to get reproducible results.
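For reference, with Theano the cuDNN convolution backward passes are a common source of nondeterminism. A configuration along these lines forces deterministic algorithms (flag names per the Theano config documentation; worth double-checking against your Theano version, and evaluate.py stands in for your own script):

```shell
# 'deterministic=more' trades speed for reproducibility; the dnn.conv
# flags pin the cuDNN backward algorithms to deterministic variants.
THEANO_FLAGS='deterministic=more,dnn.conv.algo_bwd_filter=deterministic,dnn.conv.algo_bwd_data=deterministic' python evaluate.py
```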
Issue Analytics
- Created 6 years ago
- Reactions: 12
- Comments: 29
@basaldella Have you fixed this issue? I seem to have the same problem. I retrained a model by fine-tuning InceptionV3 on my own images on a GPU machine. After training, the accuracy reached 91%, which I am happy with. During training the improved model was saved with callbacks, so I can load the best retrained model with model.load_model(model_path), and I tested it with one image. The prediction results are always the same and correct (because I know which class this image belongs to). The result looks like this: [[ 0.00197385 0.01141251 0.02262068 0.9121536 0.00810914 0.01657074 0.00370198 0.00617629 0.00972648 0.00531203 0.00224261]]
Now, when I copy the retrained model (an HDF5 file) to my laptop, load it again, and test it with the same image, I get a totally different result: [[ 0.00373867 0.22160383 0.10066977 0.35440436 0.02839879 0.17799987 0.01744748 0.02645957 0.0299265 0.03026218 0.00908909]]
The Python environments are the same on the two machines, with Keras 2.0.8. The results are always the same on the same machine. The weights are the same after I load the model file. …I checked many things.
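One way to make the "weights are the same" check machine-independent is to hash the weight arrays on both sides and compare the digests. A minimal sketch, assuming the standard Keras model.get_weights() accessor (weights_fingerprint itself is a made-up helper):

```python
import hashlib
import numpy as np

def weights_fingerprint(weights):
    """SHA-256 over all weight arrays, cast to contiguous float64 bytes,
    so the hex digest can be compared between machines."""
    h = hashlib.sha256()
    for w in weights:
        h.update(np.ascontiguousarray(w, dtype=np.float64).tobytes())
    return h.hexdigest()

# On each machine: print(weights_fingerprint(model.get_weights()))
# Identical digests mean the loaded weights match exactly; differing
# predictions then point at preprocessing or backend differences instead.
```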
Why are the results different on the two machines? Does anybody know about this?
@basaldella Yes, turns out my issue was more along the lines of #4875, and was inconsistent between different Python sessions, not just different machines.
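For the between-sessions case, seeding every Python-level RNG source at the very start of each session usually helps. A sketch (this covers CPU-side randomness only; backend RNGs and GPU kernels must be handled separately, and seed_everything is a hypothetical helper name):

```python
import os
import random
import numpy as np

def seed_everything(seed=42):
    """Seed the Python and NumPy RNGs at session start.
    Note: PYTHONHASHSEED must be set before the interpreter starts to
    affect the current process; setting it here only covers subprocesses.
    Theano/TensorFlow RNGs need to be seeded through the backend itself."""
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    np.random.seed(seed)
```

Calling seed_everything with the same seed at the top of two sessions makes the CPU-side random draws in each session identical.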