[RFC][PASS][RUNTIME] Enable Slice on LHS during concat
This was mentioned in https://github.com/dmlc/tvm/issues/2122#issuecomment-439581161; I also asked about it here a while back.
Basically I’d like to implement a compiler pass that replaces concat operations with simple slicing operations, where:
out[0:1] = concat(f1(), f2())
is replaced with
out[0] = f1()
out[1] = f2()
This can provide large improvements in some cases, e.g. DenseNet, which goes from having O(n^2) memory copies to having O(n) memory copies. It also reduces memory requirements in low-memory deployment scenarios. It can be implemented by partially specializing compute kernels over data size.
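As a rough illustration of the rewrite above (not TVM code), here is a minimal pure-Python sketch. The names `f1_into`, `f2_into`, and `dense_block_copies` are hypothetical, and the copy count is a simplified model which assumes that layer i of a dense block re-concatenates the i feature maps that precede it.

```python
# Baseline: each producer returns its own buffer, and concat copies both.
def f1():
    return bytes([1, 1, 1, 1])

def f2():
    return bytes([2, 2, 2, 2])

out_concat = f1() + f2()  # allocates a new buffer and copies both inputs

# Rewritten form: producers write directly into slices of one output buffer.
# f1_into / f2_into are hypothetical "destination-passing" variants of f1 / f2.
def f1_into(dst):
    for i in range(len(dst)):
        dst[i] = 1

def f2_into(dst):
    for i in range(len(dst)):
        dst[i] = 2

out_sliced = bytearray(8)
view = memoryview(out_sliced)  # slicing a memoryview yields a view, not a copy
f1_into(view[0:4])
f2_into(view[4:8])

assert bytes(out_sliced) == out_concat

def dense_block_copies(n_layers, use_slices):
    """Count feature-map copies in a simplified DenseNet-style block.

    Assumed model: with concat, layer i copies the i feature maps that
    precede it; with slices, each layer writes its output exactly once.
    """
    if use_slices:
        return n_layers                    # O(n)
    return sum(range(1, n_layers + 1))     # n(n+1)/2, i.e. O(n^2)
```

Under this model a 4-layer block goes from `dense_block_copies(4, use_slices=False)` copies down to `dense_block_copies(4, use_slices=True)` writes, and the gap grows quadratically with depth.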
I’d like to implement this pass (partly just to get more used to working inside TVM), but I’m not exactly sure where in the stack this should go. It’s somewhat backend-dependent, and AFAICT Relay can’t really represent slicing in this way (although I might be wrong). Maybe a pass can be added that annotates data-movement operations, and backends can consume those annotations?
A place to start might be to implement this in the graph runtime. If someone could provide some guidance, that would be helpful (@tqchen?)
cc @areusch
Top GitHub Comments
What I mean is that the runtime needs to be aware of the memory layout and provide out[slice] = f(inputs). Another possible “obstacle” is that TVM’s compute kernels require the buffer to be somewhat aligned, so we need to generate a special kernel for
out[slice] = f(inputs)
with a known offset (so that we still benefit from good alignment). This is necessary for OpenCL. I have not yet thought deeply about how we can generate the annotations; this is something that is worth some thought and discussion.
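To make the alignment concern concrete, here is a small sketch of how a statically known slice offset could decide which kernel variant to use. This is only an illustration: `slice_byte_offset` and `pick_kernel` are hypothetical helpers, not TVM APIs, and the 64-byte alignment is an assumed value.

```python
ASSUMED_ALIGNMENT = 64  # bytes; an assumption for illustration only

def slice_byte_offset(start_index, elems_per_row, elem_bytes):
    """Byte offset of out[start_index:] in a contiguous row-major buffer."""
    return start_index * elems_per_row * elem_bytes

def pick_kernel(offset, alignment=ASSUMED_ALIGNMENT):
    """Choose a (hypothetical) kernel variant from a compile-time offset."""
    return "aligned" if offset % alignment == 0 else "unaligned"

# float32 rows of 16 elements: the second slice starts at byte 64.
offset_ok = slice_byte_offset(1, 16, 4)
# float32 rows of 10 elements: the second slice starts at byte 40.
offset_bad = slice_byte_offset(1, 10, 4)

kernel_ok = pick_kernel(offset_ok)
kernel_bad = pick_kernel(offset_bad)
```

Because the offset is known at compile time, the choice between the two variants can be made once per slice rather than checked at run time.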