ChainerX needs to support more routines
Update
We’ve introduced an Op registration and dispatch mechanism. This practically means that `Device` methods will be replaced by `Op` implementations, e.g. `Device::Arange` -> `ArangeOp : public Op`. The following description will soon be updated to take this into account.
```cpp
class FuncOp : public Op {
public:
    static const char* name() { return "Func"; }

    // Call is overridden per device. It does not create a graph but merely
    // performs device computations, such as calling a kernel in the case of CUDA.
    // Call can have any signature.
    virtual void Call(..., const Array& out) = 0;

    // Another possible definition would be:
    // virtual Array Call(..., const nonstd::optional<Array>& out) = 0;
};

class CudaFuncOp : public FuncOp {
    // Override Call.
};

CHAINERX_REGISTER_OP_CUDA(FuncOp, CudaFuncOp);  // Allows backend.CallOp<FuncOp>(...);

Array Func(...) {  // A routine called `Func`.
    Array out = ...;
    {
        NoBackpropModeScope scope{};
        device.backend().CallOp<FuncOp>(..., out);
    }
    // Create graph.
    return out;
}
```
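The registration and dispatch pattern above can be mimicked in plain Python to see how it works. This is an illustrative analog only, not ChainerX code: the registry and the `register_op`/`call_op` helpers are invented names for this sketch.

```python
# Hypothetical analog of the Op registration/dispatch mechanism described
# above. Not ChainerX's actual implementation; names are invented.

_OP_REGISTRY = {}  # (backend_name, op_name) -> implementation


def register_op(backend_name, op_name, impl):
    """Register an op implementation for a backend
    (analog of CHAINERX_REGISTER_OP_CUDA)."""
    _OP_REGISTRY[(backend_name, op_name)] = impl


def call_op(backend_name, op_name, *args):
    """Look up and invoke the registered implementation
    (analog of backend.CallOp<FuncOp>(...))."""
    return _OP_REGISTRY[(backend_name, op_name)](*args)


# Each backend registers its own implementation of the same "Func" op.
register_op("native", "Func", lambda x: [v * 2 for v in x])
register_op("cuda", "Func", lambda x: [v * 2 for v in x])  # would launch a kernel

print(call_op("native", "Func", [1, 2, 3]))  # [2, 4, 6]
```

The point of the indirection is that the routine layer only names the op; which concrete implementation runs is decided by the backend at call time.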
The current repository of backpropable operations, or “routines”, in ChainerX is still limited. We’d like to open this as a “contribution-welcome”-labeled issue so that any contributor can introduce new routines and take part in the early development of ChainerX.
References
- ChainerX routines spreadsheet including statuses and priorities.
- Available device methods (low-level data manipulating operations)
Implementing routines
What kinds of routines are missing?
Routines that need to be implemented probably fall into one of the following two categories.
- NumPy-compatible routines: `{numpy,chainerx}` functions and `{chainerx,numpy}.ndarray` methods.
- Deep learning routines such as convolutions, pooling, RNN-type routines, etc.
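For the first category, the NumPy function itself is the reference for the expected semantics. For instance, someone implementing a `chainerx.ceil` routine would match the behavior of `numpy.ceil`; a quick NumPy-only sketch of those semantics (the ChainerX side is omitted here):

```python
import numpy as np

# NumPy is the behavioral reference for NumPy-compatible ChainerX routines:
# element-wise application, floating-point result, shape preserved.
x = np.array([-1.5, -0.5, 0.5, 1.5])
y = np.ceil(x)
```

Comparing against the NumPy result like this is also roughly how the Python-side tests for such routines are written.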
Please make sure that it’s not already implemented by checking the list of available routines.
How do you start implementing routines?
- Make sure you can build ChainerX. Instructions here.
- If you are unsure which routine to start working on, refer to this list or create an issue suggesting or asking for one. Some routines require a device implementation (for each backend, i.e. native and CUDA), while for others it may be sufficient to use existing device methods. The latter are easier to work on, but it may not be obvious at first which routines that applies to unless you know the implementation details and which device methods are already available (see the list above).
- Implement the routine.
  - Check whether the routine is temporarily made available via the NumPy/CuPy fallback mechanism. If it is, delete the fallback.
  - Declare the routine interface.
  - Define the forward pass using device methods. If device methods are missing, implement them.
  - Define the backward pass using `chainerx::BackwardBuilder`.
  - Declare the routine as a `chainerx::Array` method if appropriate.
  - Write Python bindings and tests using the test utilities.
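As a concrete example of the forward/backward split a routine has to define, here is the math a `Sigmoid` routine implements, checked numerically with NumPy. This is a standalone sketch of the gradient definition, not the `chainerx::BackwardBuilder` API.

```python
import numpy as np


def sigmoid_forward(x):
    # Forward pass: y = 1 / (1 + exp(-x)).
    return 1.0 / (1.0 + np.exp(-x))


def sigmoid_backward(y, gout):
    # Backward pass reuses the forward output: dy/dx = y * (1 - y).
    return gout * y * (1.0 - y)


x = np.array([-1.0, 0.0, 2.0])
y = sigmoid_forward(x)
gx = sigmoid_backward(y, np.ones_like(x))

# Sanity-check the analytic gradient with central differences.
eps = 1e-6
gx_num = (sigmoid_forward(x + eps) - sigmoid_forward(x - eps)) / (2 * eps)
assert np.allclose(gx, gx_num, atol=1e-6)
```

Note how the backward pass is written in terms of the forward output `y` rather than the input; when defining the backward with `BackwardBuilder`, retaining the right arrays for the backward computation is part of the work.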
Getting familiar with the ChainerX code base
Here are some starting points.
To get familiar with the C++ code base:

Array (with autograd)
- `chainerx::Array`: The interface to arrays.
- `chainerx::ArrayBody`: The actual array implementation.

Routines
- `chainerx/routines`: Defines “routines”, i.e. forward/backward operations on the `Array`, such as taking the sum or applying a convolution.
- `chainerx::BackwardBuilder`: Extends the computation graph and is used by routines.
- `chainerx::Device`: A device interface with operations on arrays. The device interface is currently implemented by `chainerx::native::NativeDevice` and `chainerx::cuda::CudaDevice`. A routine delegates the actual computation to these devices. Note that these operations only operate on the raw data and should not involve any graph operations (this might change).
- `chainerx/native`: Contains native implementations, including `chainerx::native::NativeDevice`.
- `chainerx/cuda`: Contains CUDA implementations, including `chainerx::cuda::CudaDevice`.

Graph
- `chainerx::ArrayNode`: A node representing an array in the computational graph. It is owned by a `chainerx::ArrayBody`.
- `chainerx::OpNode`: A node representing an operation in the computational graph.

Other
- `chainerx::Context`: Manages the runtime state. A context has backends, which have devices.
- Unit tests are written next to the source files being tested, e.g. `chainerx/routines/logic.h` is tested by `chainerx/routines/logic_test.cc`. You can take a look at the routine tests to see how arrays are used.
- Python bindings are created with pybind11.
- ChainerX C++ MNIST example.
Please note that the descriptions above may change as ChainerX is being developed.
Coding style
Please refer to https://github.com/chainer/chainer/issues/5778
Ongoing / Status
NOT UPDATED (since there are more PRs than expected and it’s difficult to maintain the status here)
- Minimum https://github.com/chainer/chainer/pull/6477 (Missing Python bindings and tests. Open for contribution.)
- Array-Array Minimum https://github.com/chainer/chainer/pull/6541
- Array-Array Maximum https://github.com/chainer/chainer/pull/6570
- Sigmoid https://github.com/chainer/chainer/pull/6472
- Square https://github.com/chainer/chainer/pull/6486
- Dot for ndim > 2 https://github.com/chainer/chainer/pull/6476
- Power https://github.com/chainer/chainer/pull/6496
- SquaredDifference https://github.com/chainer/chainer/pull/6501
- Pad https://github.com/chainer/chainer/pull/6597/
- Sin, Cos https://github.com/chainer/chainer/pull/6601
- ArgMin ~https://github.com/chainer/chainer/pull/6650~ https://github.com/chainer/chainer/pull/6740
- Meshgrid https://github.com/chainer/chainer/pull/6668
- Ceil https://github.com/chainer/chainer/pull/6705
- Floor https://github.com/chainer/chainer/pull/6707
- Tan, ArcSin, ArcCos, ArcTan https://github.com/chainer/chainer/pull/6703
- Min/AMin https://github.com/chainer/chainer/pull/6752
Issue Analytics
- State:
- Created 5 years ago
- Reactions: 4
- Comments: 41 (35 by maintainers)
Top GitHub Comments
Thanks for showing interest. Please take a look at the top description since it includes a link to a spreadsheet of routines and their statuses. You might want to check the source code before starting on one to really make sure that it hasn’t been implemented though.
Just another heads up, but for each ChainerX routine that has a corresponding Chainer function (`chainer.functions.*`), we should also override `forward_chainerx` in that Chainer function. This should improve the performance of the Chainer function when used with ChainerX, since the fallback is avoided.
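The idea behind such an override is a dispatch step: if a native ChainerX path exists for the inputs, take it; otherwise fall back to the generic implementation. The toy Python model below only illustrates that control flow; it is not Chainer's actual `FunctionNode` interface, and all names in it are invented.

```python
class ToyFunction:
    """Toy model of a function that prefers a native (ChainerX-like) path
    and otherwise falls back. Invented names; not Chainer's FunctionNode API."""

    def forward_native(self, x):
        # Fast path: would stay entirely inside ChainerX.
        return ("native", x * 2)

    def forward_fallback(self, x):
        # Slow path: would convert to NumPy/CuPy, compute, and convert back.
        return ("fallback", x * 2)

    def apply(self, x, has_native_impl):
        # Dispatch: same result either way, but the native path skips the
        # conversion overhead that makes the fallback slow.
        if has_native_impl:
            return self.forward_native(x)
        return self.forward_fallback(x)


f = ToyFunction()
print(f.apply(3, has_native_impl=True))   # ('native', 6)
print(f.apply(3, has_native_impl=False))  # ('fallback', 6)
```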