model inference causes unresponsiveness and even system crash when using `webgl` backend
while running perfectly fine with `tfjs-node`
using webgl in browser
- gpu memory usage rises to over 4GB, although the model is not that heavy at all
- inference time is measured at below 40 ms, but the actual wall time between frames is closer to 3,000 ms
- during that time the browser is completely unresponsive (not just the active tab), and overall system responsiveness is reduced
- after just several frames it results in either a browser crash or a webgl error logged in the console
- it has even resulted in a full system crash: a BSOD with stop code VIDEO_SCHEDULER_INTERNAL_ERROR
- yes, this is client-side code that can crash the whole system; it doesn't get much worse than that
- it's almost as if some webgl call is causing something really bad to happen between the browser and the gpu drivers
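for context, a minimal sketch of the browser-side pattern described above (the model URL and the [-1, 1] input normalization are assumptions, not taken from the actual repro code); the key detail is that `model.execute()` returns as soon as the kernels are queued, so the real stall only shows up when the result is downloaded:

```ts
import * as tf from '@tensorflow/tfjs';

const modelUrl = '/model/model.json'; // hypothetical path, substitute your own copy

async function run(): Promise<void> {
  await tf.setBackend('webgl');
  await tf.ready();
  const model = await tf.loadGraphModel(modelUrl);
  const video = document.getElementById('video') as HTMLVideoElement;

  const frame = async (): Promise<void> => {
    const input = tf.tidy(() =>
      tf.browser.fromPixels(video)    // grab the current video frame
        .resizeBilinear([720, 720])   // model expects 720x720 input
        .toFloat()
        .div(127.5).sub(1)            // assumed [-1, 1] normalization
        .expandDims(0));
    const t0 = performance.now();
    const output = model.execute(input) as tf.Tensor; // returns once kernels are queued
    const t1 = performance.now();
    await output.data();              // waiting on the GPU is where the stall appears
    const t2 = performance.now();
    console.log(`execute ${(t1 - t0).toFixed(0)} ms, gpu sync ${(t2 - t1).toFixed(0)} ms`);
    tf.dispose([input, output]);
    requestAnimationFrame(() => void frame());
  };
  requestAnimationFrame(() => void frame());
}

void run();
```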
using tfjs-node-gpu in node
- works without any problems
- low memory usage and inference time below 30 ms
- even a round-trip workflow (a browser client talking over websockets to a server that does the processing) achieves a pretty good frame rate with no issues overall
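the node-side equivalent is straightforward; a minimal sketch, assuming the same hypothetical model path and normalization as in the browser sketch above:

```ts
import * as fs from 'fs';
import * as tf from '@tensorflow/tfjs-node-gpu';

const modelUrl = 'file://model/model.json'; // hypothetical path
const imagePath = 'input.jpg';              // hypothetical test image

async function main(): Promise<void> {
  const model = await tf.loadGraphModel(modelUrl);
  const input = tf.tidy(() =>
    (tf.node.decodeImage(fs.readFileSync(imagePath), 3) as tf.Tensor3D)
      .resizeBilinear([720, 720])
      .toFloat()
      .div(127.5).sub(1)                    // same assumed [-1, 1] normalization
      .expandDims(0));
  const t0 = performance.now();
  const output = model.execute(input) as tf.Tensor;
  await output.data();                      // wait for the result
  console.log(`inference ${(performance.now() - t0).toFixed(0)} ms`);
  tf.dispose([input, output]);
}

void main();
```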
model
the model itself is a simple tfjs graph model with 8.8MB of weights
it takes a 720x720 image as input and produces a 720x720 image as output
it was converted from a tf saved model; the original is at https://systemerrorwang.github.io/White-box-Cartoonization/
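a quick way to sanity-check the converted model is to load it and print its declared input and output signatures (the path is again hypothetical):

```ts
import * as tf from '@tensorflow/tfjs';

async function inspect(): Promise<void> {
  const model = await tf.loadGraphModel('/model/model.json'); // hypothetical path
  for (const t of model.inputs) console.log('input:', t.name, t.shape);
  for (const t of model.outputs) console.log('output:', t.name, t.shape);
}

void inspect();
```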
reproduction
full reproduction is available at https://github.com/vladmandic/anime
- the browser code that causes problems is https://github.com/vladmandic/anime/blob/main/src/anime-clientside.ts
- the nodejs entry point is https://github.com/vladmandic/anime/blob/main/src/node.ts (which works just fine)
environment
- tensorflow/tfjs 3.19.0
- chrome 103.0.1264.71
- windows 11 build 22621.436
- nvidia drivers 516.59
Top GitHub Comments
@pyu10055 confirmed!
actually, it's not 10x, it's closer to 20x on my system, plus no crashes
great job with #6639
setInterval would cause a crash because overlapping inference requests pile up, but why would setTimeout cause any problems?
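a sketch of the difference, with `inferOnce` standing in for the real per-frame work (hypothetical helper, not from the repro code): `setInterval` keeps firing on schedule regardless of whether the previous async inference has finished, so slow frames overlap and accumulate, while a self-rescheduling `setTimeout` queues the next frame only after the current one completes:

```ts
// stand-in for the real per-frame work (model.execute + result download)
async function inferOnce(): Promise<void> { /* ... */ }

// problematic: fires every 33 ms even while earlier inferences are still
// running, so requests overlap whenever a frame takes longer than the interval
setInterval(() => void inferOnce(), 33);

// safer: the next frame is scheduled only after the current one finishes,
// so at most one inference is in flight at a time
async function loop(): Promise<void> {
  await inferOnce();
  setTimeout(() => void loop(), 33);
}
void loop();
```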