Simple way to add Basic Ops in ChainerX
While working on the PR to add sin and cos functionality, I found that most of the code was the same except for a very small part.
Also, going through PR #6590, it seems that a lot of the same code needs to be added or modified in various places.
So I tried to do it with a simple macro based on the Tanh implementation:
// Can be improved !!??
#define UnaryOp_Cuda(func, func_def) \
    namespace { \
    template <typename T> \
    struct func##Impl { \
        using CudaType = cuda_internal::DataType<T>; \
        __device__ func_def \
    }; \
    } \
    \
    void CudaDevice::func(const Array& x, const Array& out) { \
        CheckDevicesCompatible(x, out); \
        CudaSetDeviceScope scope{index()}; \
        const Array& x_cast = x.dtype() == out.dtype() ? x : x.AsType(out.dtype()); \
        VisitFloatingPointDtype(out.dtype(), [&](auto pt) { \
            using T = typename decltype(pt)::type; \
            Elementwise<const T, T>(func##Impl<T>{}, x_cast, out); \
        }); \
    }
which allows defining a new operator simply by
UnaryOp_Cuda(Sqrt, void operator()(int64_t /*i*/, CudaType x, CudaType& out) { out = cuda::Sqrt(x); })
At a glance, it can also be extended to many standard binary operators.
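For instance, a binary counterpart could look roughly like this (an untested sketch only; the macro name BinaryOp_Cuda is hypothetical, and it assumes the same Elementwise / VisitFloatingPointDtype helpers while ignoring input dtype casting):

// Hypothetical BinaryOp_Cuda macro, mirroring UnaryOp_Cuda above (untested sketch).
#define BinaryOp_Cuda(func, func_def) \
    namespace { \
    template <typename T> \
    struct func##Impl { \
        using CudaType = cuda_internal::DataType<T>; \
        __device__ func_def \
    }; \
    } \
    \
    void CudaDevice::func(const Array& x1, const Array& x2, const Array& out) { \
        CheckDevicesCompatible(x1, x2, out); \
        CudaSetDeviceScope scope{index()}; \
        VisitFloatingPointDtype(out.dtype(), [&](auto pt) { \
            using T = typename decltype(pt)::type; \
            Elementwise<const T, const T, T>(func##Impl<T>{}, x1, x2, out); \
        }); \
    }

The operator body would then simply take two inputs, e.g. void operator()(int64_t /*i*/, CudaType x1, CudaType x2, CudaType& out).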
I have also tried this with a similar macro for the native backend and converted the analogous unary operators, and it built successfully. All changes can be found here
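For reference, the native counterpart looks roughly like this (a sketch; the name UnaryOp_Native and the chainerx::Sqrt call in the example are illustrative and may differ from the actual changes linked above):

// Hypothetical UnaryOp_Native macro for the native backend (sketch).
#define UnaryOp_Native(func, func_def) \
    void NativeDevice::func(const Array& x, const Array& out) { \
        CheckDevicesCompatible(x, out); \
        const Array& x_cast = x.dtype() == out.dtype() ? x : x.AsType(out.dtype()); \
        VisitFloatingPointDtype(out.dtype(), [&](auto pt) { \
            using T = typename decltype(pt)::type; \
            struct Impl { \
                func_def \
            }; \
            Elementwise<const T, T>(Impl{}, x_cast, out); \
        }); \
    }

// e.g.
UnaryOp_Native(Sqrt, void operator()(int64_t /*i*/, T x, T& out) { out = chainerx::Sqrt(x); })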
Also, it seems important to have a simple code generation script to facilitate these simple cases. For example:
- name: Sqrt
  operator_type: Unary
  transform: out = Sqrt(in)
- name: Cube
  operator_type: Unary
  transform: out = (in * in * in)
The above YAML example could easily be parsed to generate the header file and implementation of standard operations (unary, binary) for the CUDA and native backends, or any other backend. It would also make bulk changes easy.
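For instance, for the Sqrt entry above the script could simply emit the corresponding macro invocations for each backend (illustrative output only, using the UnaryOp_Cuda macro from this issue and the hypothetical UnaryOp_Native sketch above), together with the matching declarations in the device headers:

// Generated for the CUDA backend:
UnaryOp_Cuda(Sqrt, void operator()(int64_t /*i*/, CudaType x, CudaType& out) { out = cuda::Sqrt(x); })
// Generated for the native backend:
UnaryOp_Native(Sqrt, void operator()(int64_t /*i*/, T x, T& out) { out = chainerx::Sqrt(x); })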
Would love to know your views on this. Thank you.
Top GitHub Comments
Oh, @hvy @niboshi, thanks for taking an interest. Will soon hit you with a PR.
A drop-by comment: how about ELTWISE instead of ELMWISE? The former seems to be more common (Google search hits). I have seen it in the source code of TF, CUDA, and Caffe at least.