Slow "common subexpression elimination" pass with large ROM design.
My design has lots of constants, and compile time doesn’t scale well with design size. Generating the FIRRTL is quick, but the FIRRTL compiler is very slow during the “common subexpression elimination” pass relative to all the other passes. I assume this is because it’s trying to optimize a large ROM. Maybe this behavior is expected, but it would be nice to have a faster version, whether through better handling of this case or an option to turn off this pass.
The FIRRTL files and “log-level info” outputs for 4 design sizes are in /nscratch/stevo/firrtl-test.
This can also be replicated by compiling the FFT yourself. Clone the FFT repo (follow the setup instructions in the README), and change the design size (here) to 256, 512, 1024, or 2048. Compile with:
make verilog CONFIG=CustomStandaloneFFTConfig
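For readers unfamiliar with the pass, here is a minimal Python sketch of hash-based common subexpression elimination over a toy list of named expressions. It is not FIRRTL's implementation, and it does not claim to show the cause of the slowdown reported here; the function name, the tuple encoding of expressions, and the node names are all hypothetical. It only illustrates that when each expression is looked up in a hash map, the pass does one lookup per node, so a ROM made of many distinct constants should cost roughly linear time even though nothing gets eliminated.

```python
def eliminate_common_subexpressions(nodes):
    """Toy CSE over a list of (name, expr) pairs in definition order.

    expr is a hashable tuple such as ("add", "a", "b") or ("lit", 5).
    Returns (kept, replacements): the nodes that survive, and a map from
    each eliminated name to the earlier name that computes the same value.
    This encoding is an assumption made for illustration only.
    """
    canonical = {}     # expr -> first name that computed it
    replacements = {}  # eliminated name -> surviving name
    kept = []
    for name, expr in nodes:
        op, *args = expr
        # Rewrite operands that refer to already-eliminated nodes, so that
        # duplicates built on top of duplicates are also detected.
        expr = (op, *[replacements.get(a, a) for a in args])
        if expr in canonical:
            # Same expression seen before: reuse the earlier node.
            replacements[name] = canonical[expr]
        else:
            canonical[expr] = name
            kept.append((name, expr))
    return kept, replacements


# Small example: n2 duplicates n1, and n4 duplicates n3 once n2 is rewritten.
example = [
    ("n1", ("add", "a", "b")),
    ("n2", ("add", "a", "b")),
    ("n3", ("mul", "n2", "c")),
    ("n4", ("mul", "n1", "c")),
]
kept, replacements = eliminate_common_subexpressions(example)

# ROM-like case: 2048 distinct literals, so nothing can be eliminated.
rom = [(f"rom_{i}", ("lit", i)) for i in range(2048)]
rom_kept, rom_replacements = eliminate_common_subexpressions(rom)
```

In the small example, `kept` holds only `n1` and `n3`, and `replacements` maps `n2` to `n1` and `n4` to `n3`. In the ROM-like case every entry is kept, at the cost of one dictionary lookup each.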
Issue Analytics
- Created 7 years ago
- Comments: 16 (16 by maintainers)
Top GitHub Comments
Update: Yes, this improves performance noticeably. CSE went from 586140 ms to 1638 ms for the 2048-point FFT! That is, from about 10 minutes to less than 2 seconds.
Thanks, I get a little overzealous in prioritizing one-liners over efficient code, haha.