Hyperparameter settings of MAML and test code
For reproducing MAML, the documentation says: "Note that the original MAML paper trains with 5 fast adaptation steps, but tests with 10 steps. This implementation only provides the training code."
So does that mean the test code in the files maml_miniimagenet.py and maml_omniglot.py is not correct? I found that it uses the same function (fast_adapt) as the training code.
Also, could you please provide the specific hyperparameter settings used for the MAML reproduction table on MiniImageNet and Omniglot? I used the default settings and did not change the test code, but the results are not good. The documentation says: "Only the fast-adaptation learning rate needs a bit of tuning, and good values usually lie in a 0.5-2x range of the original value."
So do we only need to fine-tune fast_lr within a 0.5x-2x range of the original value and use the test code as it currently is in maml_miniimagenet.py and maml_omniglot.py? And does "the original value" refer to fast_lr=0.5 in the code, or to the fast-adaptation learning rate used in the original MAML paper?
I am sorry for these naive questions; I have tried many times and could not get the results in the table, which has taken too much time.
Thank you so much!
Issue Analytics
- State:
- Created 3 years ago
- Comments: 12 (7 by maintainers)
Top GitHub Comments
What I meant is that if you modify the hyper-parameter values in the relevant places in the code, you can make it behave however you want.
For evaluation, instead of using meta_batch_size as the number of tasks to evaluate on, you can simply change it to whatever value you want, e.g. 1024. The same goes for the function fast_adapt: it works just fine for meta-testing as well, but you need to change some of its arguments. For example, the argument adaptation_steps can be changed from 5 to 10 when calling the function during evaluation. The step size (or learning rate) can also easily be changed during meta-testing by writing maml.lr = 0.5 after training and before evaluating; see the sketch below.
For 5-way 5-shot I actually managed to get 61.6%, 63.7% and 64.8% across 3 different seeds (actual seed values were 1, 2 and 3) with the exact same hyper-parameters! I ran the experiments for 10,000 or 20,000 iterations. I am not sure why you are getting worse results…
I ran all the experiments on a GTX 1080 Ti; however, I ran multiple experiments in parallel on the same GPU, so there is some overhead. If I were to run one experiment at a time, I would expect slightly better times than the ones below. This is also why the times might not be very consistent.
By the way I am not a developer of this project, just a big fan and user 😃
Thanks for starting an interesting conversation @Hugo101 and for great answers @Kostis-S-Z.
I think there are two scenarios we’re discussing.
Is there an additional case I am missing? If you need help with the task transforms, I'd be happy to provide some information. (Also, making custom task transforms would be a very useful tutorial.)
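For reference, here is a minimal sketch of a standard task-transform pipeline for building meta-test tasks, assuming the NWays/KShots/LoadData/RemapLabels/ConsecutiveLabels transforms used in the bundled examples; the dataset root and num_tasks value are placeholders, not recommended settings.

```python
# Hedged sketch: building meta-test tasks with learn2learn task transforms.
# The dataset root and num_tasks are placeholders; adjust to your setup.
import learn2learn as l2l
from learn2learn.data.transforms import (
    NWays, KShots, LoadData, RemapLabels, ConsecutiveLabels)

ways, shots = 5, 5
test_dataset = l2l.vision.datasets.MiniImagenet(root='~/data', mode='test')
test_dataset = l2l.data.MetaDataset(test_dataset)
test_transforms = [
    NWays(test_dataset, ways),            # sample `ways` classes per task
    KShots(test_dataset, 2 * shots),      # shots for adaptation + evaluation
    LoadData(test_dataset),               # load the actual images
    RemapLabels(test_dataset),            # remap class labels to 0..ways-1
    ConsecutiveLabels(test_dataset),      # keep samples of each class contiguous
]
test_tasks = l2l.data.TaskDataset(test_dataset,
                                  task_transforms=test_transforms,
                                  num_tasks=1024)
batch = test_tasks.sample()               # one (data, labels) meta-test task
```

A custom task transform would then slot into the same test_transforms list, which is presumably what a tutorial on the topic would walk through.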