[RFC][Relay][HalideIR] Automatically generate the AST
I have begun to experiment with writing a new library called astgen
to replace the large quantity of boilerplate the AST requires today, and to let us evolve the node system and its APIs more flexibly.
The first version of this tool will take a Python file like this:
import astgen
import tvm

class Expr:
    pass

@astgen.astgen
class Constant(Expr):
    """
    \\brief Constant tensor, backed by an NDArray on the cpu(0) device.
    \\note Scalar constants are represented by rank-0 constant tensors,
    enabling uniform constant folding over scalars and tensors.
    """
    """The data of the tensor."""
    data: tvm.ndarray.NDArray

astgen.generate_all("expr.h", "tvm::relay")
and produce this C++ file:
namespace tvm {
namespace relay {

/*!
 * \brief Constant tensor, backed by an NDArray on the cpu(0) device.
 * \note Scalar constants are represented by rank-0 constant tensors,
 * enabling uniform constant folding over scalars and tensors.
 *
 */
class Constant;

/*!
 * \brief Constant container.
 *
 */
class ConstantNode : public ExprNode {
 public:
  void VisitAttrs(tvm::AttrVisitor* v) final {
    v->Visit("data", &data);
  }

  TVM_DLL static Constant make(runtime::NDArray data);

  static constexpr const char* _type_key = "relay.Constant";
  TVM_DECLARE_NODE_TYPE_INFO(ConstantNode, ExprNode);
};

RELAY_DEFINE_NODE_REF(Constant, ConstantNode, Expr);

}  // relay
}  // tvm
This complements Tianqi's recent proposal to evolve the low-level IR (see #3474).
Specifically, by not hand-writing all AST code, we should be able to change the representation flexibly without extensive refactors, and make unifying the IRs of TVM less effort as time goes on.
A secondary goal of mine is to allow any language with a C-ABI-compatible FFI to construct and manipulate TVM ASTs.
Supporting this would let users build tools in their language of choice without changing how we develop the core of TVM.
Furthermore, this will improve Python interop, as we will no longer have to deal with hidden C++ fields as we do today.
Unfortunately, we have relied heavily on C++ objects and C++ data types such as std::string,
and resolving this is essential to providing an FFI-friendly AST.
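As an illustration of what "FFI friendly" could mean in practice, the sketch below uses Python's `ctypes` to describe a node layout built only from C-compatible types, with a pointer-plus-length pair standing in for `std::string`. The struct names `FFIString` and `VarNode`, their fields, and the helper functions are hypothetical; nothing here reflects an existing or planned TVM layout.

```python
import ctypes


class FFIString(ctypes.Structure):
    """Hypothetical C-ABI string: a byte pointer plus a length, no std::string."""
    _fields_ = [("data", ctypes.c_char_p), ("size", ctypes.c_size_t)]


class VarNode(ctypes.Structure):
    """Hypothetical plain-C layout for a node that carries a name."""
    _fields_ = [("type_index", ctypes.c_uint32), ("name_hint", FFIString)]


def make_string(text):
    """Encode text as UTF-8 and wrap it in the C-compatible string struct."""
    raw = text.encode("utf-8")
    # ctypes keeps a reference to `raw`, so the pointer stays valid
    # for as long as the returned struct is alive.
    return FFIString(raw, len(raw))


def make_var(type_index, name):
    """Build a node using only C-compatible fields."""
    return VarNode(type_index, make_string(name))


def read_name(node):
    """Read the name back using the explicit length."""
    return node.name_hint.data[: node.name_hint.size].decode("utf-8")


var = make_var(1, "x")
print(read_name(var), var.name_hint.size)
```

Because every field has a fixed C layout, any language whose FFI can describe a C struct could read or build such a node without linking against the C++ standard library.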
I hope the community can help come up with a design for Relay's AST using a code-generation-based approach.
My goal is first to replace today's AST with little to no change, and then to evolve it incrementally over time.
I will follow up with more details on my proposed solutions over the next few days.
See this branch for more details: https://github.com/jroesch/tvm/tree/astgen.
Top GitHub Comments
cc @jermainewang @kazimuth @junrushao1994 @icemelon9 @ajtulloch @yzhliu @merrymercy who might be interested in this. Some initial thoughts:
tvm.schema.expr.py -> include/IR/expr.h
- Alternatively, allow declaration within each file.

Yeah, I strongly agree with the point that we need to decouple schema reading from generation.
This is somewhat like LLVM's TableGen, which keeps repetitive, regular code in a centralized description file to minimize the changes needed to add new IR nodes.
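A rough sketch of the decoupling discussed in these comments: the schema is first read into plain data records, and each generation backend then consumes only those records. The `DESCRIPTION` dictionary, the `Field` and `NodeSchema` records, and both emitters are hypothetical names chosen for illustration, not part of astgen or TableGen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str
    type_name: str


@dataclass(frozen=True)
class NodeSchema:
    name: str
    base: str
    fields: tuple


# Step 1: schema reading. A hypothetical centralized description of the nodes.
DESCRIPTION = {
    "Constant": {"base": "Expr", "fields": {"data": "runtime::NDArray"}},
}


def read_schemas(description):
    """Turn the raw description into backend-neutral NodeSchema records."""
    return [
        NodeSchema(name, spec["base"],
                   tuple(Field(f, t) for f, t in spec["fields"].items()))
        for name, spec in description.items()
    ]


# Step 2: generation. Each backend sees NodeSchema only, never the raw input.
def emit_cpp_make(schema):
    """Emit the C++ constructor declaration for one node."""
    args = ", ".join(f"{f.type_name} {f.name}" for f in schema.fields)
    return f"TVM_DLL static {schema.name} make({args});"


def emit_python_class(schema):
    """Emit a Python wrapper class with one attribute per field."""
    params = ", ".join(["self"] + [f.name for f in schema.fields])
    body = "\n".join(f"        self.{f.name} = {f.name}" for f in schema.fields)
    body = body or "        pass"
    return (f"class {schema.name}({schema.base}):\n"
            f"    def __init__({params}):\n{body}")


schemas = read_schemas(DESCRIPTION)
print(emit_cpp_make(schemas[0]))
print(emit_python_class(schemas[0]))
```

With this split, adding a new IR node means editing only the description, and adding a new target language means writing one more emitter over `NodeSchema`.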