
model inference causes unresponsiveness and even system crash when using `webgl` backend

See original GitHub issue

model inference causes unresponsiveness and even a system crash when using the webgl backend, while running perfectly fine using tfjs-node

using webgl in browser

gpu memory usage rises to >4GB, although the model is not that heavy at all

inference time is measured as below 40 ms, but actual wall time between frames is closer to 3,000 ms!

during that time the browser is completely unresponsive (not just the active tab),
and overall system responsiveness is reduced

and after just a few frames it results in either a browser crash or a webgl error logged in the console

it even resulted in a system crash: a BSOD with stop code VIDEO_SCHEDULER_INTERNAL_ERROR!

yes, this is client-side code that can cause a system crash - it doesn’t get much worse than that

it’s almost as if some webgl call is causing something really bad to happen between the browser and the gpu drivers

using tfjs-node-gpu in node

works without any problems:
low memory usage and inference time below 30 ms

even a round-trip workflow (browser client talking via websockets to a server that does the processing) achieves a pretty good frame rate with no overall issues

model

the model itself is a simple tfjs graph model with 8.8 MB of weights
it takes a 720x720 image as input and produces a 720x720 image as output
converted from a tf saved model; the original is at https://systemerrorwang.github.io/White-box-Cartoonization/

reproduction

full reproduction using https://github.com/vladmandic/anime

environment

  • tensorflow 3.19.0
  • chrome 103.0.1264.71
  • windows 11 build 22621.436
  • nvidia drivers 516.59

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 17 (8 by maintainers)

Top GitHub Comments

2 reactions
vladmandic commented, Jul 28, 2022

@pyu10055 confirmed!

actually, it’s not 10x, it’s closer to 20x on my system, plus no crashes
great job with #6639

1 reaction
vladmandic commented, Jul 29, 2022

setInterval would cause a crash because overlapping inference requests pile up, but why would setTimeout cause any problems?


Top Results From Across the Web

[tfjs-backend-wasm] - chrome tab becomes unresponsive ...
I suspect the unresponsiveness is due to the WASM computation taking too long. Have to tried measuring the time of a single inference...
Site using WebGL rendering crashes… - Apple Developer
Site using WebGL rendering crashes in Safari browser with iOS 14.2 Beta. The website (Babylon. js playground) can not open in the latest...
WebGL animation slows down TensorflowJS model inference
I'm running a model in TensorflowJS, in a web worker, using the WebGL backend. The model is queried once every 64ms and does...
Fast client-side ML with TensorFlow.js, by Ann Yuan (Google)
This table shows how our WebGL, WebAssembly, and plain JS backends compare when it comes to inference on MobileNet, a medium-sized model ......
Firefox 4 Beta 9 Fixes - Mozilla
Star button flashes distractingly when switching tabs, loading pages, going back or forward, etc. 620684, Possible attempt to use invalid mGLContext. 622733, A ......
