Cannot return blob or arrayBuffer from page.evaluate

I am not sure if this is an issue with Puppeteer but I looked around and could not find a proper answer to my question. So I am asking it here.

  • Puppeteer version: 1.3.0
  • Platform / OS version: Windows 10
  • Node.js version: 8.9.4

I’m trying to download a PDF by making a POST request (using fetch). Here is the code:

const puppeteer = require('puppeteer');
const fse = require('fs-extra');

(async () => {
	const browser = await puppeteer.launch({ headless: false, devtools: true });
	const page = await browser.newPage();

	await page.goto('http://jeffe.cs.illinois.edu/');
	// This is just an example. Actual PDFs can't be accessed with a URL or opened in the PDF viewer.
	let url = 'http://jeffe.cs.illinois.edu/teaching/algorithms/book/!!-preface.pdf';

	console.log("Initiating download...");
	const pdfData = await getPdf();
	console.log("PDF data: ", pdfData);
	fse.writeFileSync('./file.pdf', pdfData);
	async function getPdf() {
		return page.evaluate(url => {
			return window.fetch(url)
				.then(response => response.blob())
				.then(data => {
					debugger;  // Here data is Blob(204711) {size: 204711, type: "application/pdf"}
					return data;
				})
				.catch(err => {
					debugger;
				});
		}, url);
	}
})();

The pdfData in this case is {}. Even if I use response.arrayBuffer() instead of response.blob(), pdfData is still {}. To work around this, I convert the blob to a binary string before returning it from page.evaluate, as follows:

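	async function getPdf() {
		return page.evaluate(url => {
			// The executor must accept reject as well, otherwise the error paths throw a ReferenceError.
			return new Promise(async (resolve, reject) => {
				try {
					const reader = new FileReader();
					const response = await window.fetch(url);
					const data = await response.blob();
					// Attach the handlers before starting the read.
					reader.onload = () => resolve(reader.result);
					reader.onerror = () => reject('Error occurred while reading binary string');
					reader.readAsBinaryString(data);
				} catch (err) {
					// Surface fetch failures instead of leaving the promise pending.
					reject(err);
				}
			});
		}, url);
	}
	const pdfString = await getPdf();
	// Each character of the binary string holds one byte, so decode it as 'binary' (latin1).
	const pdfData = Buffer.from(pdfString, 'binary');
	fse.writeFileSync('./file.pdf', pdfData);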
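
A possible variation on the workaround above is to send the bytes across as base64 instead of a binary string. This is only a sketch: fetchAsBase64 and base64ToBuffer are hypothetical helper names that do not appear in the original issue, and the page-side function assumes the browser's fetch and FileReader APIs are available.

```javascript
// Node side: decode the base64 string returned by page.evaluate back into bytes.
// (Hypothetical helper name, not part of Puppeteer.)
function base64ToBuffer(base64) {
  return Buffer.from(base64, 'base64');
}

// Page side: intended to be passed to page.evaluate(fetchAsBase64, url).
// It relies on browser-only APIs (fetch, FileReader), so it is not called in Node here.
async function fetchAsBase64(url) {
  const response = await fetch(url);
  const blob = await response.blob();
  const dataUrl = await new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => resolve(reader.result);
    reader.onerror = () => reject(reader.error);
    reader.readAsDataURL(blob);
  });
  // A data URL looks like "data:application/pdf;base64,AAAA..."; keep only the payload.
  return dataUrl.slice(dataUrl.indexOf(',') + 1);
}

// Hypothetical usage inside the script above:
//   const base64 = await page.evaluate(fetchAsBase64, url);
//   fse.writeFileSync('./file.pdf', base64ToBuffer(base64));
```

The returned value is a plain string, so it should cross from the page to Node the same way the binary string does; base64 simply avoids depending on readAsBinaryString.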

Although I have a workaround, I don’t understand why a Blob can’t be returned from Chromium to the Node context. Is this an issue with Blob’s serializability, or does it have something to do with how the Chrome DevTools protocol handles Blobs?

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Reactions: 24
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

aslushnikov commented, Jan 10, 2019 (4 reactions)

Is this an issue with Blob’s serializability or it has something to do with Chrome dev tools protocol’s handling of Blobs?

@sbmthakur Exactly; this happens because the DevTools protocol can reliably transfer only a limited set of types over the wire, and blobs and array buffers are not among them. Your solution is the way to go.
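
A rough way to see the effect in plain Node, under the assumption (made here for illustration, not stated in this thread) that by-value transfer behaves much like JSON serialization: an ArrayBuffer serializes to an empty object, while a plain array of byte values survives the round trip.

```javascript
// An ArrayBuffer has no enumerable own properties, so JSON sees an empty object.
const bytes = new Uint8Array([37, 80, 68, 70]); // the characters "%PDF"
const asJson = JSON.stringify(bytes.buffer);     // "{}"

// Plain arrays of numbers (and strings) survive a JSON round trip,
// which is why converting binary data first gets it across intact.
const asArray = JSON.parse(JSON.stringify(Array.from(bytes)));
const restored = Buffer.from(asArray);

console.log(asJson);
console.log(restored.toString('latin1'));
```

This matches the {} seen in the question and suggests why converting to a string (or an array of numbers) before returning works.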

conma commented, Aug 25, 2020 (2 reactions)

Returning a file from page.evaluate() in Puppeteer:

const response = await page.evaluate(() => new Promise((resolve, reject) => {
  // Build an HTML document that Word can open as a .doc file.
  // Inner attribute quotes are single quotes so the string literals stay valid.
  const html = "<html xmlns:o='urn:schemas-microsoft-com:office:office'"
    + " xmlns:w='urn:schemas-microsoft-com:office:word' xmlns='http://www.w3.org/TR/REC-html40'>"
    + "<head><meta charset='utf-8'><title>Export HTML To Doc</title></head><style></style><div id='content'>xxxxxxxxxxxxxxxxxxxxxx</div><body></body></html>";

  // The leading BOM marks the content as Unicode text.
  const blob = new Blob(['\ufeff', html], {
    type: 'application/msword'
  });

  // Read the blob as a binary string so it can be returned to Node.
  const reader = new FileReader();
  reader.onload = () => resolve(reader.result);
  reader.onerror = () => reject('Error occurred while reading binary string');
  reader.readAsBinaryString(blob);
}));

// Back in Node: turn the binary string into a Buffer.
const file = Buffer.from(response, 'binary');
return file;