AutoKeras + Deep Metric Learning (e.g. triplet loss)
Feature Description
Hi,
In TensorFlow Addons there are many Deep Metric Learning (DML) losses implemented (such as triplet loss (tutorial), contrastive loss, and lifted structured loss) that are used to learn an embedding space as output. A classic example is FaceNet, which uses triplet loss to learn an embedding space for clustering. DML losses are also useful for few-shot learning (learning from very few examples).
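For context, training an embedding model directly with one of these TFA losses in plain Keras looks roughly like the sketch below; the architecture and embedding size are placeholders for illustration, not part of the proposal:

import tensorflow as tf
import tensorflow_addons as tfa

# Minimal model that outputs L2-normalized embeddings.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(32, 32, 3)),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(64),  # embedding_size = 64, arbitrary choice
    tf.keras.layers.Lambda(lambda x: tf.math.l2_normalize(x, axis=1)),
])
# TripletSemiHardLoss mines semi-hard triplets within each batch,
# so labels are passed as integer class ids rather than one-hot vectors.
model.compile(optimizer="adam", loss=tfa.losses.TripletSemiHardLoss())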
Code Example
import autokeras as ak
from tensorflow.keras.datasets import cifar10

# Prepare the dataset.
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
# Initialize the ImageEmbedder.
embedder = ak.ImageEmbedder(max_trials=3)
# Search for the best model.
embedder.fit(x_train, y_train, epochs=5)
# Get the embeddings for the test set.
test_embeddings = embedder.predict(x_test)
Here is the notebook that shows a quick implementation of ImageEmbedder and EmbeddingHead, which is very straightforward with a few small changes to ClassificationHead: instead of adding an output dense layer of num_classes size, we add a fully connected layer of embedding_size followed by a normalization layer in build(). Similarly, we can have ak.TextEmbedder and ak.StructuredDataEmbedder using the same EmbeddingHead.
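As a rough illustration of that change (this is a generic sketch, not the notebook's actual code), the tail of such an EmbeddingHead would reduce to something like:

import tensorflow as tf

def embedding_head(inputs, embedding_size=128):
    # Hypothetical sketch: replace the num_classes softmax output of a
    # ClassificationHead with an embedding_size dense layer plus L2 normalization.
    x = tf.keras.layers.Dense(embedding_size)(inputs)
    outputs = tf.keras.layers.Lambda(
        lambda t: tf.math.l2_normalize(t, axis=1))(x)
    return outputs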
Reason
This feature will be helpful for anyone who wants to build a DML model with AutoKeras. Since there are many different types of losses that all try to learn the embedding space, we can also expose the loss type as a hyperparameter to help users find the best DML model for their data/research.
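For example, with a KerasTuner-style HyperParameters object the loss choice could be exposed roughly as below (a sketch only; the hyperparameter name is illustrative, and the candidates are restricted to TFA losses that take integer labels and embeddings):

import tensorflow_addons as tfa

def build_dml_loss(hp):
    # hp is a keras_tuner HyperParameters object; the choice name is hypothetical.
    loss_name = hp.Choice(
        "dml_loss", ["triplet_semi_hard", "triplet_hard", "lifted_struct"])
    losses = {
        "triplet_semi_hard": tfa.losses.TripletSemiHardLoss(),
        "triplet_hard": tfa.losses.TripletHardLoss(),
        "lifted_struct": tfa.losses.LiftedStructLoss(),
    }
    return losses[loss_name]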
Solution
I originally thought I could just use ak.ImageClassifier with triplet_loss from TensorFlow Addons to train a DML model with AutoKeras, but it turns out that this doesn't work well.
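Concretely, the attempt looked roughly like this (a sketch; it assumes ImageClassifier's loss argument accepts a TFA loss instance):

import autokeras as ak
import tensorflow_addons as tfa

# Attempted workaround: pass a DML loss to the classification task,
# while the rest of the ClassificationHead is left unchanged.
clf = ak.ImageClassifier(
    max_trials=3,
    loss=tfa.losses.TripletSemiHardLoss(),
)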
This notebook shows a quick implementation of EmbeddingHead and ImageEmbedder. I am very happy to contribute!
Top GitHub Comments
Also, @sidphbot, about how many classes does your dataset have? If you are not generating triplets from the input (and are instead relying on the default sampling), you may want a larger batch size (don't forget to adjust your learning rate accordingly if needed) so that each batch contains at least 2 samples from the same class (one for the anchor and another for the positive); most of the DML losses implemented in TFA assume that. But if you can't increase your batch size, you may want to do something similar to https://stackoverflow.com/questions/55484923/how-to-make-dataset-for-triplet-loss for your input.
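One way to make in-batch positives much more likely is to sample from per-class datasets, roughly as in the sketch below (the helper name and batch_size are illustrative):

import tensorflow as tf

def class_balanced_dataset(x, y, num_classes, batch_size=64):
    # Build one shuffled, repeating dataset per class, then sample from
    # them uniformly so batches very likely contain several samples per class.
    y = y.reshape(-1)
    per_class = [
        tf.data.Dataset.from_tensor_slices((x[y == c], y[y == c]))
        .shuffle(1024)
        .repeat()
        for c in range(num_classes)
    ]
    mixed = tf.data.experimental.sample_from_datasets(per_class)
    # The result is an infinite dataset, so set steps_per_epoch when fitting.
    return mixed.batch(batch_size)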
Thanks for asking that, I have been meaning to write about it (and also get opinions) but did not get the time. The labels being taken as input may seem ambiguous, but they are only used at the very end for computing the loss and are not passed through any learnable layers; you can verify this inside the ArcFace layer class implementation.
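To illustrate the point, here is a generic ArcFace-style sketch (not the linked keras-arcface code): the labels only select where the angular margin is applied, while every learnable weight acts on the embeddings.

import tensorflow as tf

class ArcFaceLayer(tf.keras.layers.Layer):
    # Hypothetical ArcFace-style margin layer for illustration only.
    def __init__(self, num_classes, scale=30.0, margin=0.5, **kwargs):
        super().__init__(**kwargs)
        self.num_classes = num_classes
        self.scale = scale
        self.margin = margin

    def build(self, input_shape):
        embedding_shape, _ = input_shape
        # The only trainable weights: class-center vectors for the embeddings.
        self.w = self.add_weight(
            name="class_centers",
            shape=(embedding_shape[-1], self.num_classes),
            initializer="glorot_uniform",
            trainable=True,
        )

    def call(self, inputs):
        embeddings, labels = inputs
        # cos(theta) between normalized embeddings and class centers.
        x = tf.math.l2_normalize(embeddings, axis=1)
        w = tf.math.l2_normalize(self.w, axis=0)
        cos_theta = tf.matmul(x, w)
        theta = tf.acos(tf.clip_by_value(cos_theta, -1.0 + 1e-7, 1.0 - 1e-7))
        # Labels are used only here, to pick where the margin is added.
        one_hot = tf.one_hot(
            tf.reshape(tf.cast(labels, tf.int32), [-1]), depth=self.num_classes)
        logits = tf.where(one_hot > 0, tf.cos(theta + self.margin), cos_theta)
        return tf.nn.softmax(self.scale * logits)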
If you use the extracted encoder for a separate validation of image retrieval, using a kNN or an index like NearPy (ANN), you can verify test accuracy of about 85% top-1 and 88% top-5 on unseen query images (the evaluated hyper-model has some differences, so scores may vary slightly).
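Such a retrieval check could look roughly like this with scikit-learn (a sketch; a query counts as a top-k hit if any of its k nearest gallery embeddings shares its label):

import numpy as np
from sklearn.neighbors import NearestNeighbors

def top_k_retrieval_accuracy(gallery_emb, gallery_labels,
                             query_emb, query_labels, k=5):
    # Index the gallery embeddings and look up the k nearest neighbors
    # of each query; count a hit if any neighbor shares the query label.
    nn = NearestNeighbors(n_neighbors=k).fit(gallery_emb)
    _, idx = nn.kneighbors(query_emb)
    hits = (gallery_labels[idx] == query_labels[:, None]).any(axis=1)
    return hits.mean()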
It might be possible to separate the logic for loss computation, but I have left it as in the keras-arcface implementation linked on my page; I will look into it. I will also shortly upload the validation example for the feature encoder.
Do let me know your thoughts on this.