[Question] After updating to 0.5.1, multi-GPU support has not been working

Hi! Thank you for the great lib!

After updating from 0.2.4 to 0.5.1, only one of my two GPUs is used. The code is almost the same:

hyper_parameters = BLSTMModel.get_default_hyper_parameters()
hyper_parameters['layer_bi_lstm']['units'] = 1024

model = BLSTMModel(embedding, hyper_parameters=hyper_parameters)
model.build_model(X_train, y_train)
model.build_multi_gpu_model(gpus=2, x_train=X_train, y_train=y_train, x_validate=X_valid, y_validate=y_valid)
model.fit(X_train, y_train, epochs=15, batch_size=512, x_validate=X_valid, y_validate=y_valid, callbacks=[tf_board_callback, checkpoint_callback])

nvtop and nvidia-smi both show that only one GPU is working. With the previous version, both of my GPUs were used.
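For anyone who wants to check this from a script instead of watching nvtop, here is a minimal sketch that parses the per-GPU utilization reported by nvidia-smi. The helper names and the 10% "busy" threshold are assumptions made for illustration and are not part of Kashgari. The query uses nvidia-smi's `--query-gpu` and `--format` options and needs an NVIDIA driver on the machine.

```python
import subprocess


def parse_gpu_utilization(csv_text):
    """Parse 'index, utilization' CSV rows into {gpu_index: percent}."""
    usage = {}
    for line in csv_text.strip().splitlines():
        index, percent = (field.strip() for field in line.split(","))
        # Some devices report a non-numeric value such as "[N/A]"; skip those rows.
        if not percent.isdigit():
            continue
        usage[int(index)] = int(percent)
    return usage


def busy_gpus(usage, threshold=10):
    """Return sorted GPU indices at or above the threshold (hypothetical cutoff)."""
    return sorted(i for i, pct in usage.items() if pct >= threshold)


def query_gpu_utilization():
    """Ask nvidia-smi for per-GPU utilization; requires an NVIDIA driver."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=index,utilization.gpu",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_utilization(result.stdout)
```

Calling `busy_gpus(query_gpu_utilization())` in a second terminal while `model.fit(...)` is running should list both GPU indices if multi-GPU training is really active. A single reading can be misleading, so it is worth sampling a few times.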

jeshuren commented, Jul 23, 2019

I am also facing the same issue. Would be great if someone could help.

UPDATE:

Try without calling the build_model() method. It works!

model = BLSTMModel(embedding, hyper_parameters=hyper_parameters)
model.build_multi_gpu_model(gpus=2, x_train=X_train, y_train=y_train, x_validate=X_valid, y_validate=y_valid)
model.fit(X_train, y_train, epochs=15, batch_size=512, x_validate=X_valid, y_validate=y_valid, callbacks=[tf_board_callback, checkpoint_callback])

sld commented, Jul 24, 2019

Thanks! This https://github.com/BrikerMan/Kashgari/issues/170#issuecomment-513875786 worked!