
Unable to get the loss below 0.7 even when testing multiple learning rates.

See original GitHub issue

I have tried many different learning rates and optimizers, but I have not once seen the min loss drop below 0.69.

If I use learning_rate = 1e-2:

iter:    20, loss min|avg|max: 0.713|2.607|60.013, batch-p@3: 4.43%, ETA: 6:01:18 (0.87s/it)
iter:    40, loss min|avg|max: 0.696|1.204|21.239, batch-p@3: 5.99%, ETA: 5:58:53 (0.86s/it)
iter:    60, loss min|avg|max: 0.696|1.643|25.543, batch-p@3: 4.69%, ETA: 5:36:32 (0.81s/it)
iter:    80, loss min|avg|max: 0.695|1.679|42.339, batch-p@3: 7.03%, ETA: 5:58:01 (0.86s/it)
iter:   100, loss min|avg|max: 0.694|1.806|47.572, batch-p@3: 6.51%, ETA: 6:08:57 (0.89s/it)
iter:   120, loss min|avg|max: 0.695|1.200|21.791, batch-p@3: 4.43%, ETA: 6:14:15 (0.90s/it)
iter:   140, loss min|avg|max: 0.694|2.744|87.940, batch-p@3: 5.47%, ETA: 6:21:29 (0.92s/it)

If I use learning_rate = 1e-6:

iter:    20, loss min|avg|max: 0.741|14.827|440.151, batch-p@3: 1.04%, ETA: 6:23:26 (0.92s/it)
iter:    40, loss min|avg|max: 0.712|9.662|146.125, batch-p@3: 2.86%, ETA: 6:03:24 (0.87s/it)
iter:    60, loss min|avg|max: 0.697|3.944|100.707, batch-p@3: 4.17%, ETA: 6:10:44 (0.89s/it)
iter:    80, loss min|avg|max: 0.695|2.408|75.002, batch-p@3: 2.86%, ETA: 5:44:48 (0.83s/it)
iter:   100, loss min|avg|max: 0.694|2.272|67.504, batch-p@3: 2.86%, ETA: 6:03:45 (0.88s/it)
iter:   120, loss min|avg|max: 0.694|1.091|17.292, batch-p@3: 2.86%, ETA: 5:42:45 (0.83s/it)
iter:   140, loss min|avg|max: 0.693|1.069|15.975, batch-p@3: 5.73%, ETA: 5:46:48 (0.84s/it)
...
iter:   900, loss min|avg|max: 0.693|0.694| 0.709, batch-p@3: 2.08%, ETA: 5:15:00 (0.78s/it)
iter:   920, loss min|avg|max: 0.693|0.693| 0.701, batch-p@3: 2.34%, ETA: 5:39:12 (0.85s/it)
iter:   940, loss min|avg|max: 0.693|0.694| 0.704, batch-p@3: 5.99%, ETA: 5:46:12 (0.86s/it)
iter:   960, loss min|avg|max: 0.693|0.693| 0.705, batch-p@3: 2.86%, ETA: 5:24:59 (0.81s/it)
iter:   980, loss min|avg|max: 0.693|0.693| 0.700, batch-p@3: 3.65%, ETA: 5:39:47 (0.85s/it)
iter:  1000, loss min|avg|max: 0.693|0.693| 0.698, batch-p@3: 3.39%, ETA: 5:27:59 (0.82s/it)
iter:  1020, loss min|avg|max: 0.693|0.693| 0.700, batch-p@3: 6.51%, ETA: 5:36:38 (0.84s/it)
iter:  1040, loss min|avg|max: 0.693|0.694| 0.699, batch-p@3: 2.86%, ETA: 5:22:05 (0.81s/it)
...
iter:  1640, loss min|avg|max: 0.693|0.693| 0.694, batch-p@3: 2.60%, ETA: 5:09:58 (0.80s/it)
iter:  1660, loss min|avg|max: 0.693|0.693| 0.694, batch-p@3: 2.08%, ETA: 5:48:27 (0.90s/it)
iter:  1680, loss min|avg|max: 0.693|0.693| 0.694, batch-p@3: 4.43%, ETA: 5:23:23 (0.83s/it)
iter:  1700, loss min|avg|max: 0.693|0.693| 0.694, batch-p@3: 6.51%, ETA: 5:25:04 (0.84s/it)
iter:  1720, loss min|avg|max: 0.693|0.693| 0.694, batch-p@3: 3.12%, ETA: 5:39:08 (0.87s/it)
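
A side note on the batch-p@3 column: assuming the batch_p = 18, batch_k = 2 configuration quoted later in this thread, the recurring ~2.86% is exactly chance level, i.e. the embedding is doing no better than random. A quick back-of-the-envelope check (the variable names are illustrative, not from the repo):

batch_p, batch_k = 18, 2
# Each anchor has (batch_k - 1) positives among the other
# (batch_p * batch_k - 1) embeddings in the batch, so a random
# embedding is expected to score p@k at exactly this ratio.
chance = (batch_k - 1) / (batch_p * batch_k - 1)
print(f"chance-level batch-p@3: {chance:.2%}")  # -> 2.86%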

What does this effectively mean? “Nonzero triplets never decreases” - I am not quite sure what that means.
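
For context, the 0.693 floor is not arbitrary: it is ln(2), the value of the soft-margin triplet loss ln(1 + exp(d_ap - d_an)) when the anchor-positive and anchor-negative distances are equal, i.e. when the embedding has collapsed and no longer separates positives from negatives. A minimal sketch, assuming the margin = 'soft' formulation discussed in this thread:

import numpy as np

def soft_margin_triplet_loss(d_ap, d_an):
    # Soft-margin triplet loss: softplus(d_ap - d_an) = ln(1 + e^(d_ap - d_an)).
    return np.log1p(np.exp(d_ap - d_an))

# Collapsed embedding: every distance is the same, so d_ap == d_an
# and the loss bottoms out at ln(2) ~= 0.693 for every triplet.
print(soft_margin_triplet_loss(1.0, 1.0))  # 0.6931...

The “nonzero triplets never decreases” remark presumably refers to the batch_all variant, which counts how many triplets in the batch still have nonzero loss; if that count never drops, no triplet is ever being solved, which is consistent with the collapsed state above.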


I am using the VGG dataset with a file structure like this:

class_a/file.jpg
class_b/file.jpg
class_c/file.jpg
...

I build pids, fids = [], [] like this:

import glob
import os

# One pid (class id) and one fid (file path) per image.
classes = [d for d in os.listdir(DATA_DIR) if os.path.isdir(os.path.join(DATA_DIR, d))]
for c in classes:
    for file in glob.glob(os.path.join(DATA_DIR, c, "*.jpg")):
        pids.append(c)
        fids.append(file)

where DATA_DIR is the directory of the vgg dataset.
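
Given the “check your dataset” advice below, one cheap sanity check worth running on these lists (a hypothetical snippet, not part of the repo): batch-hard sampling draws batch_p identities with batch_k images each, so every class must contain at least batch_k images, and an empty or tiny class folder will quietly break the batches:

from collections import Counter

counts = Counter(pids)  # images per class, built from the loop above
batch_k = 2  # must match the batch_k used for training
too_small = {pid: n for pid, n in counts.items() if n < batch_k}
assert not too_small, f"classes with fewer than {batch_k} images: {too_small}"
print(f"{len(counts)} classes, {len(pids)} images total")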

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Comments: 57 (57 by maintainers)

Top GitHub Comments

1 reaction
maxisme commented on Apr 28, 2018

I have learnt two important things in the last week: check your dataset about 5,000,000 times, or write tests 😉, and don’t let the iteration count influence your assumptions at all. I used the following parameters and, as a last effort, let it run overnight:

tech = 'batch_hard'  # one of: 'batch_hard', 'batch_sample', 'batch_all'
arch = 'resnet_v1_50'
batch_k = 2
batch_p = 18
learning_rate = 3e-5
epsilon = 1e-8
optimizer_name = 'Adam'  # one of: 'Adam', 'MO', 'RMS'
train_iterations = 50000
decay_start_iteration = 20000
checkpoint_frequency = 1000
net_input_size = (256, 256)
embedding_dim = 128
margin = 'soft'
metric = 'euclidean'  # or 'sqeuclidean'
output_model = config.feature_model_dir + "tmp"
out_dir = output_model + "/save/"
log_every = 5

By iteration 25,000 pretty much every mean loss is below 0.7, averaging 0.312, and the average min loss is < 0.05, but, slightly worryingly, the max loss has not changed at all (still in the 1-4 range). Looking forward to the implementation!

1 reaction
maxisme commented on Apr 26, 2018

As in one of the toy ones, or a small and easy dataset? haha!
