Error: Invalid TF_Status: 8, Message: OOM when allocating tensor
See original GitHub issueMy GPU is the NVIDIA GTX 970, and if it isn’t clear by the issue, I am using tfjs-node-gpu
TensorFlow.js version
tfjs:“1.5.2” tfjs-converter:“1.5.2” tfjs-core:“1.5.2” tfjs-data:“1.5.2” tfjs-layers:“1.5.2” tfjs-node:“1.5.2”
Browser version
ares:“1.15.0” brotli:“1.0.7” chrome:“78.0.3904.130” electron:“7.1.7” http_parser:“2.8.0” icu:“64.2” llhttp:“1.1.4” modules:“75” napi:“4” nghttp2:“1.39.2” node:“12.8.1” openssl:“1.1.0”
Describe the problem or feature request
I get the following stacktrace after a while of using face-api.js’s detectAllFaces
method which utilizes tensorflow.js:
(node:18420) UnhandledPromiseRejectionWarning: Error: Invalid TF_Status: 8
Message: OOM when allocating tensor with shape[1,3688,3688,3] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
at Object.<anonymous> (<anonymous>)
at NodeJSKernelBackend.executeSingleOutput (C:\Users\Infinity\Documents\Code\NodeJS\face-rec\node_modules\@tensorflow\tfjs-node-gpu\dist\nodejs_kernel_backend.js:193:43)
at NodeJSKernelBackend.concat (C:\Users\Infinity\Documents\Code\NodeJS\face-rec\node_modules\@tensorflow\tfjs-node-gpu\dist\nodejs_kernel_backend.js:405:21)
at C:\Users\Infinity\Documents\Code\NodeJS\face-rec\node_modules\face-api.js\node_modules\@tensorflow\tfjs-core\dist\ops\concat_split.js:184:78
at C:\Users\Infinity\Documents\Code\NodeJS\face-rec\node_modules\@tensorflow\tfjs-node-gpu\node_modules\@tensorflow\tfjs-core\dist\engine.js:528:55
at C:\Users\Infinity\Documents\Code\NodeJS\face-rec\node_modules\@tensorflow\tfjs-node-gpu\node_modules\@tensorflow\tfjs-core\dist\engine.js:388:22
at Engine.scopedRun (C:\Users\Infinity\Documents\Code\NodeJS\face-rec\node_modules\@tensorflow\tfjs-node-gpu\node_modules\@tensorflow\tfjs-core\dist\engine.js:398:23)
at Engine.tidy (C:\Users\Infinity\Documents\Code\NodeJS\face-rec\node_modules\@tensorflow\tfjs-node-gpu\node_modules\@tensorflow\tfjs-core\dist\engine.js:387:21)
at kernelFunc (C:\Users\Infinity\Documents\Code\NodeJS\face-rec\node_modules\@tensorflow\tfjs-node-gpu\node_modules\@tensorflow\tfjs-core\dist\engine.js:528:29)
at C:\Users\Infinity\Documents\Code\NodeJS\face-rec\node_modules\@tensorflow\tfjs-node-gpu\node_modules\@tensorflow\tfjs-core\dist\engine.js:539:27
(node:18420) UnhandledPromiseRejectionWarning: Error: Invalid TF_Status: 8
Message: OOM when allocating tensor with shape[1,3688,3688,3] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
at Object.<anonymous> (<anonymous>)
at NodeJSKernelBackend.executeSingleOutput (C:\Users\Infinity\Documents\Code\NodeJS\face-rec\node_modules\@tensorflow\tfjs-node-gpu\dist\nodejs_kernel_backend.js:193:43)
at NodeJSKernelBackend.concat (C:\Users\Infinity\Documents\Code\NodeJS\face-rec\node_modules\@tensorflow\tfjs-node-gpu\dist\nodejs_kernel_backend.js:405:21)
at C:\Users\Infinity\Documents\Code\NodeJS\face-rec\node_modules\face-api.js\node_modules\@tensorflow\tfjs-core\dist\ops\concat_split.js:184:78
at C:\Users\Infinity\Documents\Code\NodeJS\face-rec\node_modules\@tensorflow\tfjs-node-gpu\node_modules\@tensorflow\tfjs-core\dist\engine.js:528:55
at C:\Users\Infinity\Documents\Code\NodeJS\face-rec\node_modules\@tensorflow\tfjs-node-gpu\node_modules\@tensorflow\tfjs-core\dist\engine.js:388:22
at Engine.scopedRun (C:\Users\Infinity\Documents\Code\NodeJS\face-rec\node_modules\@tensorflow\tfjs-node-gpu\node_modules\@tensorflow\tfjs-core\dist\engine.js:398:23)
at Engine.tidy (C:\Users\Infinity\Documents\Code\NodeJS\face-rec\node_modules\@tensorflow\tfjs-node-gpu\node_modules\@tensorflow\tfjs-core\dist\engine.js:387:21)
at kernelFunc (C:\Users\Infinity\Documents\Code\NodeJS\face-rec\node_modules\@tensorflow\tfjs-node-gpu\node_modules\@tensorflow\tfjs-core\dist\engine.js:528:29)
at C:\Users\Infinity\Documents\Code\NodeJS\face-rec\node_modules\@tensorflow\tfjs-node-gpu\node_modules\@tensorflow\tfjs-core\dist\engine.js:539:27
I’ve tried looking around for a possible solution, but in all honesty, I am unaware how to implement the solutions offered. The solutions were involving processing the images in batch, but I am using an API that is built upon tfjs, so I don’t even know if I could implement the solution, let alone how to. That being said, I read elsewhere that setting the TF_FORCE_GPU_ALLOW_GROWTH
environmental variable to true
could potentially fix the solution; however, the only issue that solved in the past was “Invalid TF_Status: 2. Message: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize”.
–
FWIW: Invalid TF_Status: 8, Message: OOM when allocating tensor with shape
occurrs on the 78th image. Additionally, I noticed once that it went ~5 images past that on, so around the 83rd image or so, on one random run.
Code to reproduce the bug / link to feature request
const scrapped = [];
const faceapiOptions = new faceapi.SsdMobilenetv1Options({ minConfidence: 0.9 });
// file = json object {name, id} | allImagesInFolder = array of files (type object {name, id})
for (const file of allImagesInFolder) {
// Get the image as a Buffer from Google Drive, and decode it using tensorflow
const image = tf.node.decodeImage(await googleUtils.getImageAsBuffer(file.id));
console.log(`Detecting faces in ${file.id} of size ${image.size}`);
// Detect faces within the image and the landmarks for each face
const detections = await faceapi.detectAllFaces(image, faceapiOptions).withFaceLandmarks(); // ERROR HERE
if (detections.length == 1) {
scrapped.push(detections[0]);
}
}
For what it’s worth, I created a simple benchmark for myself in order to test the time it took to detect faces using face-api.js. The following is a snippet that runs just fine and doesn’t result in Error: Invalid TF_Status: 8
. Why is it that the following snippet executes with no errors, but the snippet above does?
const testingImg = tf.node.decodeImage(await googleUtils.getImageAsBuffer('1wBKoDbjr8qGfVrzyhzFkiZ1i-O4FLhU3'));
const detection = await faceapi.detectAllFaces(testingImg).withFaceLandmarks();
console.log(detection);
const withoutPromise = [];
const withPromise = [];
for (let i = 0; i < 1000; i++) {
withoutPromise.push(faceapi.detectAllFaces(img));
withPromise.push(faceapi.detectAllFaces(img).withFaceLandmarks());
}
const start = new moment();
Promise.all(withoutPromise).then(() => {
console.log(`Time to detect all faces w/o landmarks: ${moment.duration(new moment().diff(start)).asMilliseconds()} ms`);
});
Promise.all(withPromise).then(() => {
console.log(`Time to detect all faces w/ landmarks: ${moment.duration(new moment().diff(start)).asMilliseconds()} ms`);
});
Issue Analytics
- State:
- Created 4 years ago
- Comments:7 (2 by maintainers)
Top GitHub Comments
@Infinitay try adding
image.dispose()
after callingfaceapi.detectAllFaces
. It might free up intermediate tensors faster.I would most likely interpret that graph as saying your program doesn’t saturate the GPU. If face-api takes images in batches you might be able to pass in more images at a time and thus utilize more of your GPU (you would need to investigate the face-api api to find out if this is doable).