Possible Replay bug
Hi all,
I'm working on implementing Class-Balanced and Experience-Balanced Replay (see #479). I want to give the user the option of a per-experience memory capacity that is either fixed a priori or adaptive to the number of experiences seen so far.
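For context, here is a minimal sketch of the two policies I have in mind (the function name and signature are hypothetical, not existing Avalanche API):

```python
from typing import Optional

def capacity_per_experience(mem_size: int, n_seen_exps: int,
                            fixed_capacity: Optional[int] = None) -> int:
    """Memory slots granted to each experience's buffer.

    With `fixed_capacity` set, every experience gets that many slots
    (the fixed policy); otherwise the total memory is divided evenly
    over all experiences seen so far (the adaptive policy).
    """
    if fixed_capacity is not None:
        return fixed_capacity
    return mem_size // n_seen_exps
```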
However, when looking at the Replay code I find it hard to follow:
```python
single_task_mem_size = min(self.mem_size, len(curr_data))
h = single_task_mem_size // (strategy.training_exp_counter + 1)
remaining_example = single_task_mem_size % (
    strategy.training_exp_counter + 1)
# We recover it using the random_split method and getting rid of the
# second split.
rm_add, _ = random_split(
    curr_data, [h, len(curr_data) - h]
)
```
In the first line:
https://github.com/ContinualAI/avalanche/blob/9cf3c53d83ffc1b3dfe4400d43356ebcd901cfc9/avalanche/training/plugins/replay.py#L64
What if len(curr_data) is smaller than self.mem_size? It seems to me that h is no longer correct in that case. Even though the current experience may contain fewer samples than the full memory size, it should still be entitled to a mem_size / n_observed_exps share of the memory. So shouldn't h become self.mem_size // (strategy.training_exp_counter + 1)?
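To make the problem concrete, here is a small numeric sketch (the values are made up purely for illustration):

```python
# Hypothetical values, chosen only to illustrate the suspected bug.
mem_size = 1000          # total replay memory capacity
curr_data_len = 100      # current experience is smaller than the memory
n_seen_exps = 2          # strategy.training_exp_counter + 1

# Current code: min() shrinks the per-experience quota first.
single_task_mem_size = min(mem_size, curr_data_len)  # 100
h_current = single_task_mem_size // n_seen_exps      # 50

# Proposed: divide the full memory first, then cap by the data size.
capacity_per_exp = mem_size // n_seen_exps           # 500
h_proposed = min(capacity_per_exp, curr_data_len)    # 100

print(h_current, h_proposed)  # 50 vs. 100 samples stored
```

With these numbers, the current code stores only 50 samples from the experience even though the memory has room for all 100 of them.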
When taking the random_split, however, we should still take the min operation into account, since we cannot draw more samples than the current experience contains:
```python
h2 = min(capacity_per_exp, len(curr_data))
rm_add, _ = random_split(
    curr_data, [h2, len(curr_data) - h2]
)
```
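Putting both pieces together, the fixed excerpt might look like the sketch below (the capacity_per_exp name and the surrounding plugin context are my own assumptions, not actual Avalanche code):

```python
from torch.utils.data import random_split

# Divide the full memory budget evenly across all experiences seen so
# far, then cap by how much data the current experience actually has.
capacity_per_exp = self.mem_size // (strategy.training_exp_counter + 1)
h2 = min(capacity_per_exp, len(curr_data))

# Keep h2 random samples from the current experience, discard the rest.
rm_add, _ = random_split(
    curr_data, [h2, len(curr_data) - h2]
)
```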
Am I missing something here or is this a bug?
Top GitHub Comments
A single PR is ok.
Thank you @Mattdl, I’ll try this immediately!