FR: Use custom cookies for PDF and screenshot generation

Type

  • General Question or Discussion
  • Propose a brand new feature
  • Request modification of existing behavior or design

What is the problem that your feature request solves

I am archiving some websites that show an 18+ age-confirmation page (mildly adult content) before the content can be viewed. I exported a cookies.txt file and linked it in my configuration, and the Local Archive passed the confirmation successfully. However, the other archive types (e.g. HTML, PDF and screenshot) do not use the cookie file from my configuration, so all they captured was the 18+ confirmation screen.

Describe the ideal specific solution you’d want, and whether it fits into any broader scope of changes

FYI this is the web page I tried to archive (and other posts under this sub forum). I hope the archiving process can use the custom cookie file I provided in the configuration, so that the PDF, HTML and screenshot archives render correctly.

What hacks or alternative solutions have you tried to solve the problem?

I’ve looked into the index.json file for each archive and found that only wget is given my cookie file. It would be nice to include a cookies flag in the headless Chrome/Chromium command as well.

How badly do you want this new feature?

  • It’s an urgent deal-breaker, I can’t live without it
  • It’s important to add it in the near-mid term future
  • It would be nice to have eventually

  • I’m willing to contribute to development / fixing this issue
  • I like ArchiveBox so far / would recommend it to a friend

P.S. I don’t have coding experience, so please excuse my limited IT knowledge.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 11 (6 by maintainers)

Top GitHub Comments

1 reaction
pirate commented, Aug 13, 2019

The trick is that the Chrome data dir used for archiving needs to come from a Chrome instance that’s logged into the site. Open chromium-browser (or whichever binary is the same Chrome instance you’re using for archiving), log into the site, then run archivebox. If you’re archiving on a remote server, you’ll need to rsync your Chrome data dir to the server.

You can find it at one of these paths depending on what OS you’re on and what Chrome version you’re using:

            # if using Chromium
            '~/.config/chromium',                                  # linux
            '~/Library/Application Support/Chromium',              # mac
            '~/AppData/Local/Chromium/User Data',                  # windows

            # if using normal Google Chrome
            '~/.config/chrome',                                    # linux
            '~/.config/google-chrome',                             # linux
            '~/Library/Application Support/Google/Chrome',         # mac
            '~/AppData/Local/Google/Chrome/User Data',             # windows
            '~/.config/google-chrome-stable',                      # linux

            # if using beta/canary Chrome
            '~/.config/google-chrome-beta',                        # linux
            '~/Library/Application Support/Google/Chrome Canary',  # mac
            '~/AppData/Local/Google/Chrome SxS/User Data',         # windows
            '~/.config/google-chrome-unstable',                    # linux
            '~/.config/google-chrome-dev',                         # linux
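
As a rough illustration of how such a list can be used, here is a minimal Python sketch that picks the first candidate directory that exists and builds a headless screenshot command pointing at it. The function names are hypothetical and this is not ArchiveBox's own code. It assumes a usable data dir contains a `Default` profile folder, and it only relies on the standard Chrome flags `--headless`, `--user-data-dir` and `--screenshot`.

```python
from pathlib import Path
from typing import Iterable, List, Optional

# A subset of the candidate locations listed above; extend as needed.
CANDIDATE_DATA_DIRS = [
    "~/.config/chromium",
    "~/Library/Application Support/Chromium",
    "~/AppData/Local/Chromium/User Data",
    "~/.config/google-chrome",
    "~/Library/Application Support/Google/Chrome",
    "~/AppData/Local/Google/Chrome/User Data",
]


def find_chrome_data_dir(
    candidates: Iterable[str] = CANDIDATE_DATA_DIRS,
) -> Optional[Path]:
    """Return the first candidate that exists and holds a Default profile.

    Assumption: a usable data dir contains a "Default" profile folder.
    """
    for raw in candidates:
        path = Path(raw).expanduser()
        if (path / "Default").is_dir():
            return path
    return None


def headless_screenshot_cmd(
    binary: str, data_dir: Path, url: str, out_file: str
) -> List[str]:
    """Build (but do not run) a headless Chrome screenshot command.

    Passing --user-data-dir makes Chrome reuse the cookies and logins
    stored in that profile.
    """
    return [
        binary,
        "--headless",
        f"--user-data-dir={data_dir}",
        f"--screenshot={out_file}",
        url,
    ]
```

The command list can be handed to `subprocess.run` once you have confirmed the binary name on your system (`chromium-browser` is only an example).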
0 reactions
pirate commented, Mar 23, 2022

Are you using the same Chromium version inside and outside Docker to generate that profile? It must be exactly the same version, architecture, release type, etc. for it to work. @terxw You can try setting CHROME_HEADLESS=False and checking the GUI that pops up to make sure it’s using it correctly.
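
One way to sanity-check the version requirement is to run the browser binary with `--version` both inside and outside Docker and compare the results. The Python sketch below uses hypothetical helper names and assumes the output contains a four-part version number such as `99.0.4844.51`. It compares the version number only, so architecture and release type still need to be checked by hand.

```python
import re
from typing import Optional


def parse_chrome_version(version_output: str) -> Optional[str]:
    """Extract the first four-part version number from `--version` output."""
    match = re.search(r"\d+(?:\.\d+){3}", version_output)
    return match.group(0) if match else None


def same_chrome_version(host_output: str, container_output: str) -> bool:
    """True only if both outputs contain the same four-part version."""
    host = parse_chrome_version(host_output)
    container = parse_chrome_version(container_output)
    return host is not None and host == container
```

If the two outputs disagree, regenerate the profile with the same browser build that runs inside the container before trying again.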
