Support a balanced image dataset generator for TripletLoss function

I have been advised by @fchollet to open the proposed solution in keras-cv instead of keras.

The problem when training a neural network with TripletLoss is that, as the number of classes increases, it becomes highly unlikely that every batch contains positive pairs. A batch without positive pairs causes a division by zero, which yields a NaN loss and stops any further training.
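To put a rough number on this, here is a minimal sketch that computes the chance that a randomly drawn batch contains no positive pair at all. It assumes perfectly balanced classes and labels drawn independently and uniformly, which is a simplification and not something stated in the issue. The function name `prob_no_positive_pair` is hypothetical.

```python
from math import prod


def prob_no_positive_pair(num_classes: int, batch_size: int) -> float:
    """Probability that every label in a random batch is distinct.

    Assumption (not from the issue): classes are balanced and each label
    in the batch is drawn independently and uniformly at random.
    """
    if batch_size > num_classes:
        # Pigeonhole: at least one class must appear twice.
        return 0.0
    # Birthday-problem product: the i-th sample must avoid the i classes
    # already present in the batch.
    return prod((num_classes - i) / num_classes for i in range(batch_size))


if __name__ == "__main__":
    for classes in (10, 100, 1000, 10000):
        p = prob_no_positive_pair(classes, batch_size=32)
        print(f"{classes:>6} classes, batch of 32: P(no positive pair) = {p:.3f}")
```

Under these assumptions, with 1000 classes and a batch size of 32, roughly six batches in ten would contain no positive pair, so hitting a degenerate batch early in training is very likely.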

The proposed data generator makes sure that every batch includes a user-defined number of classes and a user-defined number of samples per class. It follows most of the normal image_dataset_from_directory API.

Please see this PR in Keras Core: https://github.com/keras-team/keras/pull/15990
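The batching scheme described above can be sketched in plain Python. This is not the code from the PR and not a Keras API: the function `balanced_batches` and all of its parameter names are hypothetical, and it only yields dataset indices rather than loading images. It is meant to illustrate the idea of fixing the number of classes and samples per class in each batch.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, Iterator, List, Sequence


def balanced_batches(
    labels: Sequence[Hashable],
    classes_per_batch: int,
    samples_per_class: int,
    num_batches: int,
    seed: int = 0,
) -> Iterator[List[int]]:
    """Yield lists of dataset indices with a fixed class/sample layout.

    Hypothetical helper: each batch holds `classes_per_batch` distinct
    classes and exactly `samples_per_class` indices from each of them.
    """
    rng = random.Random(seed)

    # Group dataset indices by their class label.
    by_class: Dict[Hashable, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    # Only classes with enough samples can be drawn without repetition.
    eligible = [c for c, idxs in by_class.items() if len(idxs) >= samples_per_class]
    if len(eligible) < classes_per_batch:
        raise ValueError("not enough classes with the requested samples per class")

    for _ in range(num_batches):
        batch: List[int] = []
        for cls in rng.sample(eligible, classes_per_batch):
            batch.extend(rng.sample(by_class[cls], samples_per_class))
        rng.shuffle(batch)  # avoid grouping same-class samples together
        yield batch


if __name__ == "__main__":
    toy_labels = [i % 5 for i in range(50)]  # 5 classes, 10 samples each
    for batch in balanced_batches(toy_labels, classes_per_batch=3,
                                  samples_per_class=4, num_batches=2):
        print(sorted(toy_labels[i] for i in batch))
```

With `samples_per_class` set to 2 or more, every anchor in a batch has at least one positive, which is the property the issue asks for. The resulting index lists could then be used to load and stack the corresponding images.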

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 10 (2 by maintainers)

Top GitHub Comments

1 reaction
ma7555 commented, Feb 7, 2022

An example of the full TripletLoss workflow: Colab.

I guess I will close for now. If you feel this API is needed please reopen. Thanks!

1 reaction
LukeWood commented, Feb 3, 2022

There is a project called TensorFlow Similarity, which specifically provides an API for metric learning and self-supervised learning.

They have a module with different data samplers, designed to sample classes properly for these types of losses. Specifically they have the MultiShotMemorySampler which could be used to solve your problem. They have a Hello World notebook where they use this sampler.

These samplers subclass tf.keras.utils.Sequence, so they can be used in any Keras workflow.

@LukeWood How will KerasCV position itself in relation to TF Similarity regarding a SSL API?

Yeah, TF Similarity is happy to use KerasCV APIs as we create them. They’re going to adopt our CutMix and RandAugment once I finish them.

Top Results From Across the Web

TripletSemiHardLoss Returns NaN With Large Number of ...
Triplet loss needs triplets to work (anchor image, positive image, negative image). You need to make sure to always get enough triplets when...

How To Train Your Siamese Neural Network
This mirrors the triplet loss function equation shown previously. Step 6: Defining the data generator. In order to pass our triplets to the ......

How to make Dataset for triplet loss - tensorflow - Stack Overflow
I tried to do this with tf.data.Dataset.from_generator and with tf.data.experimental.choose_from_datasets but with no success. Do you have an ...

TripletGAN: Training Generative Model with Triplet Loss
Theoretical proof based on IPM (Integral probability metric) demonstrates that such setting will help generator converge to the given distribution theoretically ...

(PDF) TripletGAN: Training Generative Model with Triplet Loss
Theoretical proof based on IPM (Integral probability metric) demonstrates that such setting will help the generator converge to the given ...
