Improve customizability of Bop
See original GitHub issue
Currently when using the Bop optimizer, it will update the weights on all quantized layers: https://github.com/larq/larq/blob/b32830b4cca4a69ac1daf1d176239971112620e4/larq/optimizers_v2.py#L83-L84
This can be problematic if one wants to use a quantized layer with no kernel_quantizer, or with a quantizer of higher precision.
It would be good to have finer-grained control over which layers are trained with Bop and which are trained at other precisions.
One possible implementation would be to add a fake lq.quantizers.bop function that doesn’t change the forward pass, but marks the kernel so that Bop handles its weight updates. This could be achieved by using a specific name scope or by adding an attribute that Bop recognises.
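For illustration, a minimal sketch of the marker idea. The name bop_marker and the _trained_by_bop attribute are inventions for this sketch, not part of the larq API:

```python
import tensorflow as tf


def bop_marker(kernel):
    # Identity in the forward pass: the weights are not quantized here.
    # The only effect is tagging the variable so that the optimizer can later
    # recognise that Bop should handle its updates.
    if isinstance(kernel, tf.Variable):
        kernel._trained_by_bop = True  # assumed marker attribute
    return kernel
```

A quantized layer could then be built with kernel_quantizer=bop_marker, and Bop would look for the marker instead of parsing variable names.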
Another possibility would be to explicitly pass a list of layers or variables to Bop.
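The explicit-list variant might look roughly like the following; the bop_variables argument is hypothetical and does not exist on Bop, and the constructor call is only illustrative:

```python
import larq as lq
import tensorflow as tf

# Hypothetical API: an explicit allow-list of variables that Bop should
# update, with everything else handled by the wrapped real-valued optimizer.
optimizer = lq.optimizers.Bop(
    tf.keras.optimizers.Adam(0.01),  # real-valued optimizer for the other weights
    bop_variables=[l.kernel for l in model.layers if hasattr(l, "kernel_quantizer")],
)
```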
Can you think of a better way to handle this?
- Created 4 years ago
- Comments: 8 (8 by maintainers)
Top GitHub Comments
Reopening for now because we haven’t addressed the is_binary() part yet.
@lgeiger and I chatted about this earlier today. Outcome of our discussion:
We have the following three choices:
1. Current setup: Bop has an optimizer parameter to which we pass a real-valued optimizer. Bop takes care of whether it or the real-valued optimizer trains a given variable.
2. model.compile can take more than one optimizer. Forking the fit functions would then take care of which optimizer trains which weights.
3. Creating an OptimizerGroup with the binary and real optimizers as attributes, which subclasses tf.keras.optimizers.Optimizer and which we pass as the optimizer to model.compile. This then takes care of which optimizer trains which weights.
We decided to go for option 3, for several reasons. One is separation of concerns: the OptimizerGroup will be able to take care of the complexities of selecting which optimizer trains which weights, so that each optimizer just has to do “optimizer stuff.” It will also be able to take care of things like the following, which Bop currently needs to do:
```python
def __getattr__(self, name):
    if name == "lr":
```
Another reason is that, for now, option 2 looks like too much work: model.fit does a lot of different things, and updating and maintaining all of that to work with multiple optimizers would be a significant effort.
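A rough sketch of what such an OptimizerGroup could look like, written against TF 2.x’s tf.keras.optimizers.Optimizer base class and assuming the binary variables carry a marker attribute; this is illustrative only, not larq’s actual implementation:

```python
import tensorflow as tf


class OptimizerGroup(tf.keras.optimizers.Optimizer):
    """Illustrative sketch: routes each variable to the binary or the
    real-valued optimizer based on a marker attribute on the variable."""

    def __init__(self, binary_optimizer, fp_optimizer, name="OptimizerGroup", **kwargs):
        super().__init__(name, **kwargs)
        self.binary_optimizer = binary_optimizer
        self.fp_optimizer = fp_optimizer

    def apply_gradients(self, grads_and_vars, **kwargs):
        binary_pairs, fp_pairs = [], []
        for grad, var in grads_and_vars:
            # Variables tagged as binary go to Bop; everything else goes to
            # the real-valued optimizer.
            if getattr(var, "_is_binary", False):
                binary_pairs.append((grad, var))
            else:
                fp_pairs.append((grad, var))
        ops = []
        if binary_pairs:
            ops.append(self.binary_optimizer.apply_gradients(binary_pairs))
        if fp_pairs:
            ops.append(self.fp_optimizer.apply_gradients(fp_pairs))
        return tf.group(*ops)

    # Attribute forwarding (e.g. exposing `lr` from the real-valued optimizer,
    # as Bop's __getattr__ does today) would also live in this class.
```

One would then pass a single OptimizerGroup(bop, adam)-style object to model.compile, so Keras only ever sees one optimizer.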
To make is_binary more robust, we’re going to try setting an _is_binary (or similar) attribute on the variables of the layers that need to be optimized by Bop. We’ll replace calls to this:
return "/kernel" in var.name and "quant_" in var.name
with just a hasattr(var, "is_binary") check. Setting this attribute explicitly should be more robust than matching against the generated names of layers.
I’ll begin with implementing the OptimizerGroup and then move on to the is_binary part.