WebGLRenderer: Use faster program cache key computation.
See original GitHub issueI’ve recently did a profile of my project, haven’t done that in 4-5 months, since then three.js version has been upgraded several times. Currently on 0.132.2
So, profile shows up that getProgramCacheKey
takes up close to 20% of the entire time:
For a bit more context, this is in the scope of the entire game, with particle simulation, IK, sound and a bunch of other stuff going on. Not just pure rendering. Here’s a screenshot of what’s being rendered:
Here’s the WebGLRenderer.info
:
If you have a look at the cache key computation code, there’s a lot wrong there. It’s building an array of strings, that’s a lot of allocation, and then there’s array.join()
which results in another string being created. All of that, just to throw the result away in the end (since no new materials are created during profiling period).
I propose offering an override for the cache keys, or building keys in a way that avoids much of the allocation effort. Also, doing equality check on material against what’s already cached can be done without materializing this “cache key” at all. I understand that it’s “simple” to do what’s currently being done, and I understand that doing what I propose is less easy, but we’re talking about 20% of the CPU time being wasted on an application that’s hardly pure rendering.
As an alternative - perhaps offer an override for cache key computation, something like “customComputeMaterialCacheKey”, that would be dirty, but it would allow users like me to work around the issue.
Issue Analytics
- State:
- Created 2 years ago
- Reactions:5
- Comments:30 (19 by maintainers)
Top GitHub Comments
I’m little worried about the added complexity of multiple cache keys, but one option might be to keep a cheaper “fastProgramCacheKey” to map a single material to its program variants. Since there are not many possible combinations there, it could just be a cheaper bit mask instead of a string. For example:
@gonnavis I’ll have a look at your example to see what’s going on between version r129 vs r136.