Issue with class-incremental SI and LwF
Hi!
We are currently trying to use Avalanche in our research, as it looks like an amazing library providing a lot of ready-to-use tools. However, we encountered some issues that stopped us from moving further.
Our goal is to work with class-incremental scenarios. We experimented on MNIST, building the benchmark in two different ways: using the SplitMNIST benchmark and building a benchmark with the nc_benchmark generator.
```python
from torchvision.datasets import MNIST
from torchvision.transforms import ToTensor
from avalanche.benchmarks.classic import SplitMNIST
from avalanche.benchmarks.generators import nc_benchmark

# ToTensor as a stand-in; the actual transforms are in the attached project.
train_transform, test_transform = ToTensor(), ToTensor()

scenario_toggle = 'nc_MNIST'  # 'nc_MNIST' (nc_benchmark) or 'splitMNIST' (SplitMNIST)
task_labels = False
if scenario_toggle == 'splitMNIST':
    scenario = SplitMNIST(n_experiences=5, return_task_id=task_labels,
                          fixed_class_order=list(range(10)))
elif scenario_toggle == 'nc_MNIST':
    train = MNIST(root='data', download=True, train=True, transform=train_transform)
    test = MNIST(root='data', download=True, train=False, transform=test_transform)
    scenario = nc_benchmark(
        train, test,  # fixed: the test set should be the second argument
        n_experiences=5, shuffle=False, seed=1234,
        task_labels=task_labels, fixed_class_order=list(range(10))
    )
```
We tried two strategies, LwF and SI, with both values of scenario_toggle (splitMNIST and nc_MNIST); a sketch of our setup follows below. However, the evaluation results in both cases suggest that only the last experience is remembered and recognized: all other experiences have an accuracy of 0.00, which is unexpected and suggests that something is wrong.
Sample results:
eval_exp,training_exp,eval_accuracy,eval_loss,forgetting
0,0,1.0000,0.0000,0
1,0,0.0000,13.4988,0
2,0,0.0000,12.4454,0
3,0,0.0000,16.2600,0
4,0,0.0000,16.5519,0
0,1,0.0000,17.6948,1.0000
1,1,0.9998,0.0011,0
2,1,0.0000,13.9904,0
3,1,0.0000,14.3507,0
4,1,0.0000,15.7323,0
0,2,0.0000,14.9688,1.0000
1,2,0.0000,23.8310,0.9998
2,2,1.0000,0.0000,0
3,2,0.0000,13.4626,0
4,2,0.0000,14.5919,0
0,3,0.0000,21.7956,1.0000
1,3,0.0000,26.1762,0.9998
2,3,0.0000,33.1996,1.0000
3,3,1.0000,0.0001,0
4,3,0.0000,21.8687,0
0,4,0.0000,18.1376,1.0000
1,4,0.0000,14.5067,0.9998
2,4,0.0000,20.7260,1.0000
3,4,0.0000,24.5977,1.0000
4,4,0.9990,0.0035,0
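(For completeness: the per-experience metrics above were collected with an evaluation plugin along these lines; this is a sketch, and our exact logger setup may differ.)

```python
from avalanche.evaluation.metrics import (accuracy_metrics, loss_metrics,
                                          forgetting_metrics)
from avalanche.logging import InteractiveLogger
from avalanche.training.plugins import EvaluationPlugin

# Per-experience accuracy, loss and forgetting, as in the CSV above.
eval_plugin = EvaluationPlugin(
    accuracy_metrics(experience=True),
    loss_metrics(experience=True),
    forgetting_metrics(experience=True),
    loggers=[InteractiveLogger()],
)
# Passed to the strategy via evaluator=eval_plugin.
```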
Similar behavior can be observed for all combinations (splitMNIST and nc_MNIST, combined with LwF and EWC) when task_labels = False. When we change task_labels to True, the results start to make sense, with accuracies between 0.6 and 1 for all previously learned experiences.
We are not sure whether the problem lies in our approach, in our code, or whether there is some bug affecting our results. Therefore, we have a few questions:
- Is our approach valid? Is setting task_labels to False equivalent to creating a class-incremental benchmark, and does task_labels = True produce a task-incremental scenario?
- Is there any reason why the results look like this? Is it an issue with how we use the benchmarks?
We would appreciate any suggestions, as we have already spent some time with Avalanche and would love to leverage all the tools it provides.
I am attaching the minimal test project we prepared: avalanche-test-project.zip
Top GitHub Comments
(From @AntonioCarta) To create a class-incremental benchmark, you should set task_labels=False in both SplitMNIST and nc_benchmark. Task labels will always be 0 for each experience and targets will be in the range 0-9. Hope this helps 😄
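A quick way to verify this is to inspect the experiences directly (a minimal sketch using standard Avalanche experience attributes):

```python
for experience in scenario.train_stream:
    print(f"Experience {experience.current_experience}: "
          f"task label = {experience.task_label}, "
          f"classes = {experience.classes_in_this_experience}")
# With task_labels=False this prints task label 0 for every experience;
# with task_labels=True each experience gets its own label (0..4).
```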
Follow-up comment on @AntonioCarta's answer: here is a paper showing that SI and LwF almost completely fail in class-incremental scenarios on Split MNIST: https://arxiv.org/pdf/1904.07734.pdf
By changing your architecture to the one used in the CL baselines repository, you may get a small increase in the average accuracy (better than complete forgetting) for LwF.
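If I remember correctly, that repository uses a plain MLP with two hidden layers of 400 ReLU units for Split MNIST; a sketch of such a network (the class name is mine):

```python
import torch.nn as nn

class BaselineMLP(nn.Module):
    """Hypothetical re-creation of the two-hidden-layer (400-unit) MLP."""

    def __init__(self, input_size=28 * 28, hidden_size=400, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(input_size, hidden_size), nn.ReLU(),
            nn.Linear(hidden_size, hidden_size), nn.ReLU(),
            nn.Linear(hidden_size, num_classes),
        )

    def forward(self, x):
        return self.net(x)
```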