Mismatch in packed depthwise conv 2d results on Mali GPU
System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow.js): Yes, test case shared below
- Mobile device: Pixel 6 Pro (reproduces on every Android device with a Mali GPU that I tried)
- TensorFlow.js installed from (npm or script link): 3.19.0
- Browser version: Chrome 103.0.5060.53
Describe the current behavior
Packed depthwise conv2d produces incorrect results on Mali GPUs when WEBGL_MAX_TEXTURE_SIZE
is left at its default value (4096 on most modern Android devices). In one of our networks, we end up creating a 3672x1 texture for the weights, which produces incorrect outputs (presumably some error in sampling the texture, but that is just a guess). Setting the max texture size lower than 3672 fixes the issue.
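In case it is useful, the workaround amounts to clamping that flag before the affected model runs. A minimal sketch, assuming it executes before the conv in question (any value below 3672 fixes it in our case; 2048 is just an example):

// Workaround sketch: clamp the texture size so the 3672x1 weight texture is never created.
tf.ENV.set('WEBGL_MAX_TEXTURE_SIZE', 2048);
// Presumably turning off the packed depthwise path would also sidestep the bug,
// but I have not verified that on every device:
// tf.ENV.set('WEBGL_PACK_DEPTHWISECONV', false);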
I have attached sample code below to reproduce the issue (it uses the same filter dimensions as the layer that caused the inaccuracy in our original network). The code does the following:
- First, we run the packed depthwise conv with a max texture size of 4096 (the default on every browser I tried; the value is hardcoded here for consistent results).
- Next, we re-run the same node with a max texture size of 2048.
- Finally, we set the backend to cpu to get the reference output.
- With size 2048, the outputs match the reference, but with the default size, the outputs do not match.
Note: Based on my tests, the mismatch occurs only on Android devices with Mali GPUs. iOS, Chrome on macOS, and Android devices with Adreno GPUs all produce the correct result with the default texture size of 4096.
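In case it helps with scoping a fix or a fallback, below is a purely illustrative (hypothetical, not part of the repro) check for a Mali renderer using the standard WEBGL_debug_renderer_info extension:

// Hypothetical helper: returns true if the unmasked WebGL renderer string mentions Mali.
function isMaliGpu() {
  const gl = document.createElement('canvas').getContext('webgl');
  if (!gl) return false;
  const ext = gl.getExtension('WEBGL_debug_renderer_info');
  // On some browsers the extension is unavailable; treat that as "unknown".
  if (!ext) return false;
  const renderer = gl.getParameter(ext.UNMASKED_RENDERER_WEBGL);
  return /mali/i.test(renderer);
}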
Standalone code to reproduce the issue
// Force the packed depthwise conv2d path.
tf.ENV.set('WEBGL_PACK_DEPTHWISECONV', true)

// Random weights and input; same filter dims as the problematic layer in our network.
let w = Array.from({length: 3 * 3 * 816}, () => Math.random())
let x = Array.from({length: 12 * 10 * 816}, () => Math.random())

let inputs = {
  filter: tf.tensor(w, [3, 3, 816, 1]),
  x: tf.tensor(x, [1, 12, 10, 816]),
  strides: 1,
  pad: [[0, 0], [1, 1], [1, 1], [0, 0]],
  dataFormat: "channelsLast",
  dilations: 1,
  activation: 'relu'
};

// Run on WebGL with the default max texture size (4096, hardcoded for consistent results).
tf.setBackend('webgl')
tf.ENV.set('WEBGL_MAX_TEXTURE_SIZE', 4096)
let out_4096 = tf.fused.depthwiseConv2d(inputs);

// Re-run the same node with the max texture size lowered to 2048.
tf.ENV.set('WEBGL_MAX_TEXTURE_SIZE', 2048)
inputs.x = tf.tensor(x, [1, 12, 10, 816])
inputs.filter = tf.tensor(w, [3, 3, 816, 1])
let out_2048 = tf.fused.depthwiseConv2d(inputs);

// Compute the reference output on the CPU backend.
tf.setBackend('cpu')
inputs.x = tf.tensor(x, [1, 12, 10, 816])
inputs.filter = tf.tensor(w, [3, 3, 816, 1])
let out_reference = tf.fused.depthwiseConv2d(inputs);

// True if any element differs by more than 1e-2.
const doTensorsDiffer = function(t0, t1) {
  return tf.any(tf.greater(tf.abs(tf.sub(t0, t1)), tf.scalar(1e-2))).dataSync()[0];
}
console.log("Default and 2048 differ? " + doTensorsDiffer(out_4096, out_2048));
console.log("Reference and 2048 differ? " + doTensorsDiffer(out_reference, out_2048));
console.log("Reference and 4096 differ? " + doTensorsDiffer(out_reference, out_4096));
Top GitHub Comments
Thanks for the fix @Linchenn!