Zombie Process problem.

Hello,

Recently we talked about this problem in the issues #1823 and #1791.

Environment:

Use Case:

We are using puppeteer on AWS Lambda. We take a screenshot of a given HTML template, upload it to S3, and use this image for future requests. It handles over 100 million requests each month. That’s why every process should be atomic and immutable. (AWS Lambda has disk and process limits.)

Example Code:

const puppeteer = require('puppeteer');

// Runs inside an async function (for example, the Lambda handler).
const browser = await puppeteer.launch({
  args: ['--disable-gpu', '--no-sandbox', '--single-process',
         '--disable-web-security', '--disable-dev-profile']
});
const page = await browser.newPage();
await page.goto('https://s3bucket.com/markup/a.html');
// Capture the rendered page as a JPEG buffer.
const response = await page.screenshot({ type: 'jpeg', quality: 95 });
await browser.close();

Problem

When we ran the example code, we got a disk error from AWS Lambda.

Example /tmp folder:

2018-01-12T14:55:38.553Z    a6ef3454-f7a8-11e7-be0f-17f405d5a180    start stdout: total 226084
drwx------ 3 sbx_user1067 479 4096 Jan 12 14:55 .
drwxr-xr-x 21 root root 4096 Jan 12 10:53 ..
-rw------- 1 sbx_user1067 479 15126528 Jan 12 14:33 core.headless-chromi.129
-rw------- 1 sbx_user1067 479 15126528 Jan 12 14:15 core.headless-chromi.131
-rw------- 1 sbx_user1067 479 15126528 Jan 12 14:49 core.headless-chromi.135
-rw------- 1 sbx_user1067 479 15126528 Jan 12 14:52 core.headless-chromi.137
-rw------- 1 sbx_user1067 479 15126528 Jan 12 14:50 core.headless-chromi.138
-rw------- 1 sbx_user1067 479 15126528 Jan 12 14:51 core.headless-chromi.14
-rw------- 1 sbx_user1067 479 15126528 Jan 12 14:49 core.headless-chromi.15
-rw------- 1 sbx_user1067 479 15126528 Jan 12 14:36 core.headless-chromi.169
-rw------- 1 sbx_user1067 479 15126528 Jan 12 14:15 core.headless-chromi.174
-rw------- 1 sbx_user1067 479 15126528 Jan 12 14:52 core.headless-chromi.178
-rw------- 1 sbx_user1067 479 15126528 Jan 12 14:50 core.headless-chromi.180
drwx------ 3 sbx_user1067 479 4096 Jan 12 14:14 .pki

When we investigated these files, we found that they are core dumps. We removed these files after the process completed.
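The cleanup step described above could look roughly like the following. This is a minimal sketch, assuming the dumps live in /tmp and always start with the prefix `core.` as in the listing; `removeCoreDumps` is a hypothetical helper name, not something from the original report.

```javascript
const fs = require('fs');
const path = require('path');

// Delete regular files whose names start with "core." in `dir`
// and return the names that were removed.
// Assumption: the "core." prefix matches the listing above.
function removeCoreDumps(dir = '/tmp') {
  const removed = [];
  for (const name of fs.readdirSync(dir)) {
    if (!name.startsWith('core.')) continue;
    const file = path.join(dir, name);
    if (!fs.statSync(file).isFile()) continue; // skip directories
    fs.unlinkSync(file);
    removed.push(name);
  }
  return removed;
}
```

Calling `removeCoreDumps()` at the end of each invocation frees the disk space, but it does not address why Chrome is crashing in the first place.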

When we monitored the process list, we saw that the number of zombie Chrome processes kept growing. We can’t kill them. AWS Lambda has a maximum process limit (1,024 processes), so we eventually hit it.

483 1 3.3 1.6 1226196 65408 ? Ssl 22:07 0:05 /var/lang/bin/node --max-old-space-size=870 --max-semi-space-size=54 --max-executable-size=109 --expose-gc /var/runtime/node_modules/awslambda/index.js
483 22 0.0 0.0 0 0 ? Z 22:07 0:00 [headless-chromi] <defunct>
483 73 0.0 0.0 0 0 ? Z 22:07 0:00 [headless-chromi] <defunct>
483 119 0.0 0.0 0 0 ? Z 22:07 0:00 [headless-chromi] <defunct>
483 166 0.0 0.0 0 0 ? Z 22:07 0:00 [headless-chromi] <defunct>
483 214 0.0 0.0 0 0 ? Z 22:07 0:00 [headless-chromi] <defunct>
483 262 0.0 0.0 0 0 ? Z 22:07 0:00 [headless-chromi] <defunct>
483 307 0.0 0.0 0 0 ? Z 22:07 0:00 [headless-chromi] <defunct>
483 353 0.0 0.0 0 0 ? Z 22:07 0:00 [headless-chromi] <defunct>
483 1915 0.0 0.0 0 0 ? Z 22:09 0:00 [sh] <defunct>
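To track how many defunct entries pile up between invocations, output like the listing above can be parsed programmatically. A minimal sketch, assuming `ps aux`-style columns (USER, PID, %CPU, %MEM, VSZ, RSS, TTY, STAT, START, TIME, COMMAND); `findZombies` is a hypothetical helper name.

```javascript
// Return the PID and command of every line whose STAT column starts with "Z".
// Assumption: the input uses `ps aux` column order, so STAT is the 8th field.
function findZombies(psOutput) {
  return psOutput
    .split('\n')
    .map((line) => line.trim().split(/\s+/))
    .filter((cols) => cols.length >= 11 && cols[7].startsWith('Z'))
    .map((cols) => ({ pid: Number(cols[1]), command: cols.slice(10).join(' ') }));
}

// Usage on a live system:
//   const { execSync } = require('child_process');
//   console.log(findZombies(execSync('ps aux').toString()).length);
```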

We couldn’t use dumb-init on Lambda because Lambda already has an init system.

How did we fix it? (very hacky method)

We used browser.disconnect() instead of browser.close() and managed the Chrome processes manually, killing them ourselves.

Example Code:

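const child_process = require('child_process');

// Assumption: the PID is taken from puppeteer's browser.process();
// the original snippet did not show where it was set.
const pid = browser.process().pid;

browser.on('disconnected', () => {
    console.log('sleeping 100ms'); // sleep to eliminate race condition
    setTimeout(() => {
        console.log(`Browser Disconnected... Process Id: ${pid}`);
        child_process.exec(`kill -9 ${pid}`, (error, stdout, stderr) => {
            if (error) {
                console.log(`Process Kill Error: ${error}`);
                return;
            }
            console.log(`Process Kill Success. stdout: ${stdout} stderr:${stderr}`);
        });
    }, 100);
});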

At first we didn’t use the 100 ms delay. We killed the process immediately after the browser disconnected, and we got the following error:

Error: read ECONNRESET at exports._errnoException (util.js:1018:11) at TCP.onread (net.js:568:26)

I think it looks like a puppeteer process management problem. When we used this method, we didn’t receive any puppeteer related errors. How can we fix it?
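One possible variation on the workaround is to send the signal with Node's built-in `process.kill()` instead of spawning a shell through `child_process.exec`. This is a sketch under stated assumptions, not a confirmed fix for the zombie problem: `killIfAlive` is a hypothetical helper, and the PID is assumed to come from puppeteer's `browser.process().pid`.

```javascript
// Send SIGKILL to a PID without going through a shell.
// Returns true if the signal was sent, false if no such process exists.
function killIfAlive(pid) {
  try {
    process.kill(pid, 'SIGKILL');
    return true;
  } catch (err) {
    if (err.code === 'ESRCH') return false; // process is already gone
    throw err; // e.g. EPERM: not allowed to signal this PID
  }
}

// Usage with puppeteer (assumed wiring, not part of the original report):
//   const pid = browser.process().pid;
//   browser.on('disconnected', () => setTimeout(() => killIfAlive(pid), 100));
//   browser.disconnect();
```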

Thanks.

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Reactions: 92
  • Comments: 51 (3 by maintainers)

Top GitHub Comments

23 reactions
marius080 commented, Jun 30, 2020

I’ve overcome these issues by adding the flags for chrome headless:

const chromeFlags = [
    '--headless',
    '--no-sandbox',
    "--disable-gpu",
    "--single-process",
    "--no-zygote"
]

I think the child processes are orphaned when the parent is killed and that leads to the zombies. With this, I only get one process and it works pretty well
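The comment does not show how the flags are handed to puppeteer. A minimal sketch, assuming they are passed through the `args` option of `puppeteer.launch()`; `buildLaunchOptions` is a hypothetical helper that also lets callers add their own flags without duplicates.

```javascript
// Flags from the comment above.
const chromeFlags = [
  '--headless',
  '--no-sandbox',
  '--disable-gpu',
  '--single-process',
  '--no-zygote',
];

// Merge the base flags with any extra ones, dropping duplicates.
function buildLaunchOptions(extraArgs = []) {
  return { args: [...new Set([...chromeFlags, ...extraArgs])] };
}

// Usage (inside an async function, with puppeteer installed):
//   const puppeteer = require('puppeteer');
//   const browser = await puppeteer.launch(buildLaunchOptions(['--disable-web-security']));
```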

17 reactions
leobudima commented, Mar 11, 2018

@bahattincinic - thanks, I’ve tried your method of disconnecting + killing the process, and while it does kill the “main” process returned by puppeteer.launch(), each run seems to leave another defunct zombie with a PID that is different than the killed one…

What’s worse, when I run ps aux right after puppeteer.launch(), aside from the “main” process, there is already one that’s defunct, right away, before running code or trying to kill anything.

I’ve tried sending a kill -15, hoping that will allow the main process to clean up its children, but -15 or -9 doesn’t make any difference, so I’m still stuck with an ever-growing list of zombies and rising memory…

Do you have any advice on how you managed to keep it clean of those as well (if you had a similar experience)? I’m also running on Lambda, same args used, puppeteer 1.1.1. Thanks!
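A zombie has already exited, so sending it a signal (-15 or -9) has no effect. Its process-table entry is released only when its parent collects the exit status with wait(). To see which parent is holding the zombies, the parent PID can be read from /proc. The sketch below assumes a Linux /proc filesystem is readable from inside the function; `parseProcStat` and `listZombiesWithParents` are hypothetical helper names.

```javascript
const fs = require('fs');

// Parse the contents of /proc/<pid>/stat: "pid (comm) state ppid ...".
// comm may contain spaces or parentheses, so split at the last ')'.
function parseProcStat(text) {
  const open = text.indexOf('(');
  const close = text.lastIndexOf(')');
  const rest = text.slice(close + 2).split(' ');
  return {
    pid: Number(text.slice(0, open).trim()),
    comm: text.slice(open + 1, close),
    state: rest[0],
    ppid: Number(rest[1]),
  };
}

// List zombies (state "Z") together with the PID of the parent
// that has not reaped them yet.
function listZombiesWithParents() {
  return fs.readdirSync('/proc')
    .filter((name) => /^\d+$/.test(name))
    .flatMap((name) => {
      try {
        return [parseProcStat(fs.readFileSync(`/proc/${name}/stat`, 'utf8'))];
      } catch (err) {
        return []; // process exited while we were scanning
      }
    })
    .filter((p) => p.state === 'Z');
}
```

If the reported `ppid` is the Node process itself, the zombies are children that Node has not waited on; if it is PID 1, they were orphaned and re-parented to the container's init process.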
