How to adjust learning rate as model trains
Hi all –
Sorry for what might be an obvious question for some -
How do you go about adjusting the learning rate as the number of elapsed steps grows? I assume it's done by modifying something in the train.py file - I'm just not exactly sure what, as I have zero experience with TensorFlow.
It looks to me like the actual execution of each training step happens within the for loop that begins on line 295, with the `sess.run(...)` call.
I'm assuming I would add something that checks which step the for loop is on, or, alternatively, what the current loss is, and then make a new `optim` (as seen on lines 256-260)? Then pass that into `sess.run`…
So, for instance (pseudocode): `if step % 1000 == 0: learning_rate *= 0.5`, or `if step % 50 == 0: learning_rate = f(loss)`, where `f(loss)` is a function that scales down the learning rate as the loss gets lower and lower (sketched more concretely below).
Is this correct?
Let me know. Thanks.
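In plain Python, the two schedules I have in mind would look roughly like this (`f` here is just one made-up choice of scaling function):

```python
def step_decay(learning_rate, step):
    # Halve the learning rate every 1000 steps.
    if step > 0 and step % 1000 == 0:
        learning_rate *= 0.5
    return learning_rate

def loss_scaled(initial_rate, loss):
    # Made-up f(loss): shrink the rate as the loss approaches zero.
    return initial_rate * min(1.0, loss)
```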
Top GitHub Comments
@cjonesy67 You would need to edit the optimizer setup on line 256 as well as the main loop at the bottom of `train.py`, so that the current learning rate is fed into the `sess.run` call.
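As a minimal sketch of that change (the dummy model, variable names, and choice of Adam here are assumptions, since the real graph lives in `train.py`):

```python
import tensorflow as tf  # TF 1.x API, matching the original train.py

# Dummy model so the sketch runs on its own; in train.py the real
# graph and loss already exist around line 256.
x = tf.Variable(5.0)
loss = tf.square(x)

# The learning rate becomes a placeholder instead of a hard-coded constant.
lr = tf.placeholder(tf.float32, shape=[], name='learning_rate')
optim = tf.train.AdamOptimizer(learning_rate=lr).minimize(loss)

num_steps = 5000
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    learning_rate = 1e-3
    for step in range(num_steps):
        # Halve the rate every 1000 steps, as proposed above.
        if step > 0 and step % 1000 == 0:
            learning_rate *= 0.5
        _, loss_val = sess.run([optim, loss], feed_dict={lr: learning_rate})
```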
@delta-6400 Having a placeholder for the learning rate is easy and works. However, there are built-in functions that implement different decay strategies: `tf.train.exponential_decay()`, `tf.train.inverse_time_decay()`, `tf.train.piecewise_constant()`, … I think they are TensorFlow-native and track progress through a `global_step` Variable.
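For example, a minimal sketch with `tf.train.exponential_decay` (the dummy model and the hyperparameter values are illustrative, not taken from train.py):

```python
import tensorflow as tf  # TF 1.x API

x = tf.Variable(5.0)
loss = tf.square(x)

# global_step is incremented by minimize(); the schedule reads it.
global_step = tf.Variable(0, trainable=False, name='global_step')
lr = tf.train.exponential_decay(
    learning_rate=1e-3,       # initial rate (illustrative value)
    global_step=global_step,
    decay_steps=1000,         # decay every 1000 steps
    decay_rate=0.5,           # multiply by 0.5 each time
    staircase=True)           # discrete drops instead of smooth decay

optim = tf.train.AdamOptimizer(lr).minimize(loss, global_step=global_step)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(5000):
        _, current_lr = sess.run([optim, lr])
```

The advantage over a hand-rolled schedule is that the decayed rate is part of the graph itself, so it survives checkpointing and needs no feed_dict bookkeeping in the training loop.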