running some models using webgl is 10x slower than using nodejs
Model executes in NodeJS using the tensorflow backend in ~100ms, but takes above 1 sec in the browser using the WebGL backend. That is a 10x difference in performance between NodeJS and WebGL.

I've tried to figure out why the model is executing an order of magnitude slower than expected using the WebGL backend, but the profiler info for it makes no sense:
```javascript
const t0 = performance.now();
const res = await tf.profile(() => model.executeAsync(input));
const t1 = performance.now();
const wallTime = t1 - t0; // total elapsed time for the inference call
// sum of per-kernel GPU times reported by the profiler
const kernelTime = res.kernels.reduce((a, b) => a + b.kernelTimeMs, 0);
```
- wallTime is 900-1200ms
- kernelTime is only ~20ms
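To see where that ~20ms of kernel time goes, the `tf.profile()` result can be summarized by sorting its kernel list. This is a hypothetical helper (the function name and `topK` parameter are mine, not from the issue); it only assumes the documented shape of the profile result, `{ kernels: [{ name, kernelTimeMs }, ...] }`:

```javascript
// Summarize the slowest kernels from a tf.profile() result.
// profileInfo: { kernels: [{ name, kernelTimeMs }, ...] }
function slowestKernels(profileInfo, topK = 10) {
  return [...profileInfo.kernels]                      // copy, don't mutate
    .sort((a, b) => b.kernelTimeMs - a.kernelTimeMs)   // slowest first
    .slice(0, topK)
    .map((k) => `${k.name}: ${k.kernelTimeMs.toFixed(2)}ms`);
}
```

Even with this breakdown, the kernel times sum to a small fraction of the wall time, which is exactly the mystery described above.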
I'm re-running inference on the same input twice and looking at the second run to allow for the warmup time of the WebGL backend (shader compile, etc.).
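The run-twice measurement described above can be sketched as a small helper. This is my own illustration, not code from the issue; `timeAfterWarmup` and its options are hypothetical names, and `performance.now()` is assumed available (it is in browsers and in Node >= 16):

```javascript
// Time an async function, discarding the first `warmup` runs so that
// one-time costs (WebGL shader compilation, weight upload) are not counted.
async function timeAfterWarmup(fn, { warmup = 1, runs = 5 } = {}) {
  for (let i = 0; i < warmup; i++) await fn(); // untimed warmup runs
  const times = [];
  for (let i = 0; i < runs; i++) {
    const t0 = performance.now();
    await fn();
    times.push(performance.now() - t0);
  }
  return times.reduce((a, b) => a + b, 0) / times.length; // mean ms per run
}
```

Usage would be `await timeAfterWarmup(() => model.executeAsync(input))`, which matches the "skip the first run" methodology described in the issue.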
I even tried tf.enableDebugMode() and I still don't see anything that gets even close to the overall wall time. I have no idea where the time is spent.
Model in question: https://github.com/vladmandic/nanodet
Environment: TFJS 3.3.0 on Ubuntu 20.10 and Chrome 89
Issue Analytics
- Created: 2 years ago
- Comments: 16 (8 by maintainers)
Top GitHub Comments
Inference time is simple to measure:
- in the browser: (timing screenshot)
- in NodeJS: (timing screenshot)

And I'm only measuring subsequent runs, skipping the first run so WebGL has time to warm up (compile & upload shaders).
For most models, WebGL is fast and it's the best option (since node-gpu doesn't work due to an obsolete CUDA dependency from TF1). However, for some models (I've provided examples), WebGL is 10x slower than NodeJS, but running profile() shows nothing useful.

@OlivierMns I never stated I saw the problem with EfficientNet; I spoke about EfficientDet. But in general, the issue in my case was clipping due to quantization: try an unquantized model first. Also, EfficientNet is notoriously slow to warm up; I suggest enabling WebGL uniforms, which speeds up warmup 2-4x (no difference on actual inference).
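Enabling WebGL uniforms, as suggested above, is a flag toggle. A minimal sketch, assuming a TFJS version recent enough to expose the `WEBGL_USE_SHAPES_UNIFORMS` flag (it passes tensor shapes as shader uniforms instead of compiling them into shader source, so fewer shader variants need compiling during warmup):

```javascript
// Set the flag before initializing the WebGL backend.
tf.env().set('WEBGL_USE_SHAPES_UNIFORMS', true);
await tf.setBackend('webgl');
await tf.ready();
// Subsequent warmup (first inference) should be noticeably faster;
// steady-state inference time is unchanged.
```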