WEBGL_PACK_DEPTHWISECONV=true seems to cause significant first inference performance drop
System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow.js): Yes. Production code at https://github.com/wingman-jr-addon/wingman_jr/pull/135, minimal reproduction at https://github.com/wingman-jr-addon/wingman_jr/pull/136
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Win 10 Home 21H1
- Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: N/A, but laptop specs:
ideapad FLEX5-1570
Processor Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz 2.90 GHz
Installed RAM 16.0 GB (15.9 GB usable)
System type 64-bit operating system, x64-based processor
Pen and touch Pen and touch support with 10 touch points
- TensorFlow.js installed from (npm or script link):
  https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@3.4.0/dist/tf.min.js
  https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-backend-wasm@3.4.0/dist/tf-backend-wasm.js
  https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-backend-wasm@3.4.0/dist/tfjs-backend-wasm.wasm
  https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-backend-wasm@3.4.0/dist/tfjs-backend-wasm-simd.wasm
  https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-backend-wasm@3.4.0/dist/tfjs-backend-wasm-threaded-simd.wasm
- TensorFlow.js version (use command below): 3.4.0
- Browser version: Firefox 90.0 64 bit
- Tensorflow.js Converter Version: Unknown, but probably 2.7.0
Current behavior - Upgrading from 3.3.0 to 3.4.0 caused a major performance drop in load + first inference time: 3.3.0 sees times of about 8.8s, while 3.4.0 sees times of about 14.4s. It pains me to report a bug related to WEBGL_PACK as so much work has gone into this feature, but … it appears that setting WEBGL_PACK_DEPTHWISECONV=false on 3.4.0 restores the performance found in 3.3.0. The regression with default flags has been found to exist in at least 3.6.0 and 3.8.0 as well. (This was found while bisecting an upgrade from 2.7.0 to 3.8.0 to get the new shader compilation performance improvements started in #5205.)
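A minimal sketch of the workaround described above, assuming the WebGL backend is part of the loaded bundle (as with the tf.min.js script listed under System information) so that the flag is registered. `disableDepthwisePacking` is a hypothetical helper name, and passing `tf` in as a parameter is only a convenience for this sketch; `tf.env().set` and `tf.env().getBool` are the TF.js environment calls it relies on.

```javascript
// Hypothetical helper (not part of TF.js): turn off packed depthwise conv
// in the WebGL backend and report the value the environment now holds.
// `tf` is the TensorFlow.js namespace, e.g. the global from tf.min.js.
function disableDepthwisePacking(tf) {
  // set() throws if the flag has not been registered by the loaded bundle.
  tf.env().set('WEBGL_PACK_DEPTHWISECONV', false);
  return tf.env().getBool('WEBGL_PACK_DEPTHWISECONV');
}
```

This should be called once before the model's first `predict`, so that the first inference already runs with the flag disabled.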
Expected behavior - 3.4.0 with the default flag value WEBGL_PACK_DEPTHWISECONV=true offers similar or better performance than 3.3.0.
Minimal reproduction: https://github.com/wingman-jr-addon/wingman_jr/pull/136 Note this is a Firefox plugin, but TF.js is loaded via a content tab rather than in the background context, so it should behave much like a normal browsing context.
Attached is output from Firefox’s about:support, which includes more detailed graphics issues that may be relevant to the matter at hand. FF90_about_support.txt
Issue Analytics
- Created 2 years ago
- Comments:17 (6 by maintainers)
Top GitHub Comments
Closing the loop after testing with today's code in the main branch:
Warmup is now about 2x faster, with no material difference regardless of whether WEBGL_PACK_DEPTHWISECONV is enabled or disabled, so that issue is resolved.
Do note that enabling WEBGL_USE_SHAPES_UNIFORMS performs much better (2x faster warmup) regardless of packing (actually, packing improvements make it even faster)!
Configurations compared: webgl default; webgl with uniforms enabled.
@pyu10055 please consider enabling uniforms as default
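For anyone wanting to reproduce this kind of comparison, here is a rough sketch of timing a warmup inference under a given set of flags. It is not the script used for the numbers above. `timeWarmup` is a hypothetical helper, `model` is assumed to return a single tensor from `predict`, and the TF.js build is assumed to register the flags being set (older versions may not know WEBGL_USE_SHAPES_UNIFORMS).

```javascript
// Hypothetical helper: apply WebGL flags, then time the first inference.
// `tf` is the TensorFlow.js namespace; `model` and `input` come from the caller.
async function timeWarmup(tf, model, input, flags = {}) {
  for (const [name, value] of Object.entries(flags)) {
    // set() throws if the flag is not registered in the loaded build.
    tf.env().set(name, value);
  }
  const start = performance.now();
  const output = model.predict(input); // assumed to be a single tensor
  await output.data(); // wait until the result has been read back
  const elapsedMs = performance.now() - start;
  output.dispose();
  return elapsedMs;
}
```

A call might look like `await timeWarmup(tf, model, input, {WEBGL_USE_SHAPES_UNIFORMS: true})`. Each configuration is best measured in a fresh page load, since shaders compiled during an earlier run in the same page can make later runs look faster than a true first inference.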
@ahmedsabie @qjia7 @rthadur @pyu10055 Any updates on this? As you can see, WEBGL_PACK_DEPTHWISECONV=True (which is default value) has a massive negative performance impact - and it’s gotten far worse in newer versions of TFJS.
This is a major regression and it has had very few updates.
And yes, using WEBGL_USE_SHAPES_UNIFORMS is much better, but - a) it’s not a solution, it’s an alternative, b) it’s not widely implemented, c) almost nobody knows about it.