Variable and Parameter specification details
This issue is to discuss the behavior of Variable and Parameter.
1. Lazy initialization of Parameter
The constructor of Parameter (__init__) receives initializer and shape arguments. When initializer is an array, it initializes the internal _data with that array even if shape is unknown (None).
It can be initialized lazily once the shape is specified.
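The two initialization paths described above can be sketched with a toy class. This is a minimal illustration only, not Chainer's implementation: LazyParam, zeros, and the use of nested lists in place of arrays are all assumptions made for the example.

```python
class LazyParam:
    """Toy stand-in for a parameter; NOT Chainer's Parameter class."""

    def __init__(self, initializer, shape=None):
        self.data = None
        self.initializer = initializer
        if isinstance(initializer, list):
            # A concrete array (here: a nested list) carries its own
            # shape, so the data can be set immediately.
            self.data = [row[:] for row in initializer]
        elif shape is not None:
            # Callable initializer with a known shape: initialize now.
            self.initialize(shape)

    def initialize(self, shape):
        # Deferred path: call the initializer once the shape is known.
        self.data = self.initializer(shape)


def zeros(shape):
    """Hypothetical initializer: a rows x cols nested list of zeros."""
    rows, cols = shape
    return [[0.0] * cols for _ in range(rows)]


eager = LazyParam([[1.0, 2.0]])   # array initializer, shape unknown
lazy = LazyParam(zeros)           # callable initializer, shape unknown
was_uninitialized = lazy.data is None
lazy.initialize((2, 3))           # shape becomes known later
```

In this sketch the array case never needs a shape argument, while the callable case stays empty until initialize is called.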
2. Initialization of initialized Parameter
The current implementation allows us to initialize a Parameter even if shape is already specified in the constructor. In this case, initialize configures _grad as None because _grad_initializer is None.
Some tests verify that grad is None in this setting, but this does not seem to be optimal behavior.
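The effect can be reproduced with a toy model. The class below is a hypothetical sketch, not Chainer's Parameter: ToyParam, its size argument, and the list-based data are assumptions chosen to show how a missing gradient initializer leads to the gradient being reset.

```python
class ToyParam:
    """Toy model; NOT Chainer's Parameter class."""

    def __init__(self, size=None, grad_initializer=None):
        self.data = None
        self.grad = None
        self.grad_initializer = grad_initializer
        if size is not None:
            self.initialize(size)

    def initialize(self, size):
        self.data = [0.0] * size
        # With no gradient initializer, the gradient is reset to None.
        if self.grad_initializer is None:
            self.grad = None
        else:
            self.grad = self.grad_initializer(size)


p = ToyParam(size=3)         # already initialized by the constructor
p.grad = [0.1, 0.2, 0.3]     # a gradient is stored later
p.initialize(3)              # re-initialization is still allowed
grad_after_reinit = p.grad   # the earlier gradient is gone
```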
3. Handling of Variable(None)
At the moment, Variable(None).to_chainerx().xp (and get_array_module(None)) returns numpy because Variable treats numpy as the default device.
This behavior makes it somewhat difficult to keep the internal attributes consistent, because they can be configured for chainerx.ndarray while xp is numpy (equivalent to self._data[0] is None).
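A toy model of this mismatch is shown below. It is only a sketch under stated assumptions: ToyVariable and its device label are hypothetical, and module names are represented as plain strings rather than real modules.

```python
class ToyVariable:
    """Toy model of the mismatch; NOT Chainer's Variable class."""

    def __init__(self, data=None):
        self.data = data
        self.device = "numpy"      # hypothetical device label

    def to_chainerx(self):
        # Internal bookkeeping switches to the ChainerX flavour...
        self.device = "chainerx"
        return self

    @property
    def xp(self):
        # ...but with no array to inspect, the module lookup falls
        # back to the default.
        if self.data is None:
            return "numpy"
        return self.device


v = ToyVariable(None).to_chainerx()
mismatch = (v.device, v.xp)   # internal state and xp disagree
```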
4. Reassigning array coming from another device (#6058)
At the moment, the setter of array accepts an array located on any device. This can be a problem, for example, if array receives a chainerx.ndarray while it holds a cupy.ndarray, because Variable handles its internal attributes differently depending on whether _data[0] is a ChainerX array or not.
There are several options to deal with this issue:
- Raise an error when it receives an array from a different device
- Fix to handle the case when the device is changed
The second option has been implemented in #6072, but it is unclear how to handle grad when reassigning array in this situation (can we overwrite grad in the ChainerX array with the existing variable’s grad?).
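For comparison, the first option (raising an error) could look roughly like the following. This is a hypothetical sketch, not the code from #6072 or from Chainer: ToyArray, ToyVariable, and the string device labels are assumptions for illustration.

```python
class ToyArray:
    """Hypothetical array that only records which device it lives on."""

    def __init__(self, values, device):
        self.values = list(values)
        self.device = device


class ToyVariable:
    """Toy model; NOT Chainer's Variable class."""

    def __init__(self, array):
        self._array = array

    @property
    def array(self):
        return self._array

    @array.setter
    def array(self, new_array):
        # Option 1: refuse arrays that live on a different device.
        if new_array.device != self._array.device:
            raise ValueError(
                "expected an array on %r, got %r"
                % (self._array.device, new_array.device))
        self._array = new_array


v = ToyVariable(ToyArray([1.0], "cupy"))
v.array = ToyArray([2.0], "cupy")          # same device: accepted
try:
    v.array = ToyArray([3.0], "chainerx")  # different device: rejected
    rejected = False
except ValueError:
    rejected = True
```

Rejecting the assignment sidesteps the question of what to do with an existing grad, at the cost of forcing callers to move the variable explicitly first.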
5. copydata implementation (#4074)
Variable.copydata() cannot be used on a plain Variable because its implementation uses the initialize method of Parameter. It should be fixed or moved to the Parameter class.
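The shape of the problem can be reproduced with two toy classes. The sketch below is an assumption-laden illustration, not Chainer's code: ToyVariable and ToyParameter are hypothetical, and lists stand in for arrays.

```python
class ToyVariable:
    """Toy base class; NOT Chainer's Variable class."""

    def __init__(self, data=None):
        self.data = data

    def copydata(self, other):
        if self.data is None:
            # Relies on a method that only the subclass defines.
            self.initialize(len(other.data))
        self.data[:] = other.data


class ToyParameter(ToyVariable):
    """Toy subclass that supplies the missing method."""

    def initialize(self, size):
        self.data = [0.0] * size


src = ToyVariable([1.0, 2.0])

param = ToyParameter()
param.copydata(src)               # works: initialize exists here

try:
    ToyVariable().copydata(src)   # fails: the base class has no initialize
    base_failed = False
except AttributeError:
    base_failed = True
```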
6. requires_grad semantics not well documented/defined (#5454)
One point was that requires_grad is used mostly to optimize performance by not creating graphs when they are unnecessary. However, the semantics could be clarified: exactly when requires_grad can be expected to be True or False, and who is responsible for setting it.
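One possible reading of "not creating graphs when unnecessary" is sketched below. This is a toy model under assumed semantics, not Chainer's actual rules: ToyNode, add, and the propagation rule (an output requires a gradient if any input does) are assumptions made for the example.

```python
class ToyNode:
    """Toy graph node; NOT Chainer's Variable class."""

    def __init__(self, value, requires_grad=True, creator=None):
        self.value = value
        self.requires_grad = requires_grad
        self.creator = creator   # record of how the value was produced


def add(a, b):
    # Record the operation only if some input needs a gradient.
    needs_graph = a.requires_grad or b.requires_grad
    creator = ("add", a, b) if needs_graph else None
    return ToyNode(a.value + b.value, needs_graph, creator)


x = ToyNode(1.0, requires_grad=True)
c = ToyNode(2.0, requires_grad=False)
d = ToyNode(3.0, requires_grad=False)

tracked = add(x, c)     # graph is recorded because x needs a gradient
untracked = add(c, d)   # no graph: neither input needs a gradient
```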
Top GitHub Comments
@okapies Do you think requires_grad should be covered in this issue as well? The following issue addresses the fact that it is not very well documented (despite being a public interface): https://github.com/chainer/chainer/issues/5454
This issue is closed as announced. Feel free to re-open it if needed.