Archive Method: Chrome timing out for many sites when running in Docker

Describe the bug

When running ArchiveBox in the Docker container, Chrome frequently fails with timeout errors such as the one shown under "Screenshots or log output" below.

Steps to reproduce

Steps to reproduce the behavior:

  1. Follow the wiki instructions to run the docker container
  2. Wait for errors

Screenshots or log output

Failed: TimeoutExpired Command 'google-chrome-unstable' timed out after 60 seconds
        Run to see full output:
            cd /data/archive/1552194240.610;
            google-chrome-unstable --headless --no-sandbox --user-data-dir=/chrome --dump-dom --timeout=60000 https://my-url-here.com
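The `TimeoutExpired` in this message is the exception Python's `subprocess` module raises when a child process outlives its `timeout`. The sketch below shows how such a failure line can arise. It is not ArchiveBox's actual code: the helper name `run_with_timeout` and the message formatting are assumptions for illustration, and the demo uses a sleeping Python process as a stand-in for Chrome so it runs anywhere.

```python
import subprocess
import sys

# The command from the log above, kept as a list so no shell quoting is needed.
CHROME_CMD = [
    "google-chrome-unstable", "--headless", "--no-sandbox",
    "--user-data-dir=/chrome", "--dump-dom", "--timeout=60000",
    "https://my-url-here.com",
]


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (status, detail).

    status is "ok", "failed", or "timeout". Hypothetical helper,
    not part of ArchiveBox.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout_s,  # subprocess kills the child when this elapses
        )
    except subprocess.TimeoutExpired as err:
        return "timeout", (
            f"Command {err.cmd[0]!r} timed out after {err.timeout} seconds"
        )
    status = "ok" if result.returncode == 0 else "failed"
    return status, result.stderr.decode(errors="replace")


if __name__ == "__main__":
    # Stand-in for a hung Chrome: a child process that sleeps past the limit.
    slow_cmd = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(slow_cmd, 0.5))
```

Passing `CHROME_CMD` with `timeout_s=60` inside the container would exercise the same path as the failure reported here.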

Software versions

  • OS: Host OS is Ubuntu 18.04
  • ArchiveBox version: 4a7f1d5
  • Docker version: 18.09.2
  • Chrome version: Google Chrome 74.0.3724.8 dev

More Info

I wanted to see more about the errors I was encountering in Chrome, so I started a terminal in the docker container and tried it out. I reduced the timeout because 60000 seemed high. This command still blocks for a very long time though.

pptruser@ffc8e33b6840:/data/archive/1552194240.936$ google-chrome-unstable --headless --no-sandbox --user-data-dir=/chrome --print-to-pdf --hide-scrollbars --timeout=60 https://grosinger.net
Fontconfig warning: "/etc/fonts/fonts.conf", line 100: unknown element "blank"
[0310/055813.147638:ERROR:command_buffer_proxy_impl.cc(125)] ContextResult::kTransientFailure: Failed to send GpuChannelMsg_CreateCommandBuffer.
[0310/055813.173440:INFO:headless_shell.cc(308)] Timeout.
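One thing worth checking here is the unit of `--timeout`. The original command pairs `--timeout=60000` with a 60 second limit, which suggests the flag is in milliseconds. If that assumption holds, `--timeout=60` asks Chrome to stop after 60 ms, which would fit the `Timeout.` line being logged almost immediately after the preceding error. The sketch below measures the gap between the two timestamped log lines and builds the flag from a value in seconds. The helper names are made up for illustration.

```python
import re

# Matches Chrome log prefixes such as "[0310/055813.147638:ERROR:...".
# Groups: date (MMDD), hours, minutes, seconds with fraction, severity.
LOG_PREFIX = re.compile(r"^\[(\d{4})/(\d{2})(\d{2})(\d{2}\.\d+):(\w+):")


def log_seconds(line):
    """Return the time of day in seconds from a Chrome log line, or None."""
    match = LOG_PREFIX.match(line)
    if match is None:
        return None  # e.g. the Fontconfig warning, which has no timestamp
    hours = int(match.group(2))
    minutes = int(match.group(3))
    seconds = float(match.group(4))
    return hours * 3600 + minutes * 60 + seconds


def timeout_flag(seconds):
    """Build a --timeout flag from seconds, ASSUMING the flag takes ms."""
    if seconds <= 0:
        raise ValueError("timeout must be positive")
    return f"--timeout={round(seconds * 1000)}"


error_line = (
    "[0310/055813.147638:ERROR:command_buffer_proxy_impl.cc(125)] "
    "ContextResult::kTransientFailure: "
    "Failed to send GpuChannelMsg_CreateCommandBuffer."
)
timeout_line = "[0310/055813.173440:INFO:headless_shell.cc(308)] Timeout."

gap = log_seconds(timeout_line) - log_seconds(error_line)
print(f"{gap:.3f} s between the error and the Timeout line")
print(timeout_flag(10))
```

Under the same assumption, a 10 second limit would be written as `--timeout=10000` rather than `--timeout=10`.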

Looks like maybe the lack of GPU in the container is causing an issue. Let’s disable that.

pptruser@ffc8e33b6840:/data/archive/1552194240.936$ google-chrome-unstable --headless --no-sandbox --user-data-dir=/chrome --print-to-pdf --hide-scrollbars --disable-gpu --timeout=60 https://grosinger.net
[0310/055837.659065:WARNING:discardable_shared_memory_manager.cc(188)] Less than 64MB of free space in temporary directory for shared memory files: 63
Fontconfig warning: "/etc/fonts/fonts.conf", line 100: unknown element "blank"
[0310/055837.767788:INFO:headless_shell.cc(308)] Timeout.
[0310/055839.691556:ERROR:service_worker_storage.cc(2196)] Failed to delete the database: Database IO error

Not sure how to get past this issue though. Any suggestions?
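The `discardable_shared_memory_manager` warning above may be worth following up on: Docker containers get a 64 MB `/dev/shm` by default unless `--shm-size` is passed to `docker run`, and Chrome has a `--disable-dev-shm-usage` flag that makes it use a regular temporary directory instead. The sketch below checks the free space and suggests the flag when it is low. It assumes that `/dev/shm` is the directory the warning refers to, and the helper names are hypothetical. Whether this resolves the timeouts in this issue is not confirmed by the thread.

```python
import shutil

SHM_DIR = "/dev/shm"  # assumption: the directory the warning refers to
MIN_FREE_MB = 64      # threshold named in the Chrome warning


def free_mb(path):
    """Return the free space at path in whole megabytes."""
    return shutil.disk_usage(path).free // (1024 * 1024)


def chrome_shm_flags(path=SHM_DIR, min_free_mb=MIN_FREE_MB):
    """Return extra Chrome flags if shared memory space looks too small.

    Hypothetical helper: an empty list means no extra flag seems needed.
    """
    try:
        if free_mb(path) >= min_free_mb:
            return []
    except OSError:
        # The path is missing or unreadable; fall through to the safe flag.
        pass
    return ["--disable-dev-shm-usage"]


if __name__ == "__main__":
    print(chrome_shm_flags())
```

The alternative is to leave Chrome's flags alone and give the container more shared memory with `docker run --shm-size=...`.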

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Reactions: 1
  • Comments: 8 (5 by maintainers)

Top GitHub Comments

tgrosinger commented, Mar 12, 2019

Will do. I’ll close this issue and reopen if I continue having trouble. Thanks!

pirate commented, Jul 24, 2020

Now that we’re a handful of major versions ahead with Chrome, please give this a shot on the latest django branch. If you still see any timeouts, comment back here and I’ll reopen the ticket.

git checkout django
git pull
docker build . -t archivebox
docker run -v $PWD/output:/data archivebox init
docker run -v $PWD/output:/data archivebox add 'https://example.com'
