PDF Files cause `fromStream` to never finish


Using file-type 16.5.3, I’m seeing an issue where an awaited call to `fromStream` with a stream of a PDF file never resolves.

e.g.:

const fileType = await FileType.fromStream(stream);

I haven’t yet been able to test the 17.x release series due to other blocking dependencies.

Removing the `await` from line 592 of node_modules\file-type\core.js makes the function return properly, which suggests this may be related to https://github.com/Borewit/strtok3/issues/551

i.e. this “works”:

if (checkString('%PDF')) {
	await tokenizer.ignore(1350);
	const maxBufferSize = 10 * 1024 * 1024;
	const buffer = Buffer.alloc(Math.min(maxBufferSize, tokenizer.fileInfo.size));
	tokenizer.readBuffer(buffer, {mayBeLess: true});

and this doesn’t:

if (checkString('%PDF')) {
	await tokenizer.ignore(1350);
	const maxBufferSize = 10 * 1024 * 1024;
	const buffer = Buffer.alloc(Math.min(maxBufferSize, tokenizer.fileInfo.size));
	await tokenizer.readBuffer(buffer, {mayBeLess: true});
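Independent of the root cause, a caller may want to avoid waiting forever on a promise that never settles. The following is a minimal sketch of a timeout guard built on `Promise.race`. The helper name `withTimeout`, its parameters, and the 5000 ms value in the usage comment are assumptions made for illustration and are not part of file-type.

```javascript
// Hypothetical helper (not part of file-type): reject if `promise` has not
// settled within `ms` milliseconds.
function withTimeout(promise, ms, label = 'operation') {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms
    );
  });
  // Whichever promise settles first wins. Always clear the timer afterwards
  // so a pending timeout cannot keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Usage sketch, assuming file-type v16 and an existing readable `stream`:
// const fileType = await withTimeout(FileType.fromStream(stream), 5000, 'fromStream');
```

This only stops the caller from waiting. It does not cancel the underlying read, so the stream should still be destroyed or otherwise cleaned up when the timeout fires.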

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 9

Top GitHub Comments

crossan007 commented, Feb 25, 2022 (1 reaction)

@Borewit thanks for the tips.

I think we can close this issue

Borewit commented, Feb 25, 2022 (1 reaction)

`fileTypeStream()` (called `stream()` in v16) is probably the closest thing to a workaround for your issue. It comes at a price, though: it uses a limited sample size, which prevents reading too much of the stream, at the cost of not being able to determine some file types.
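To illustrate the trade-off of a limited sample size, here is a minimal sketch that reads only the first few bytes of a stream and checks for the `%PDF` signature mentioned in the issue. This is not how file-type implements `stream()` or `fileTypeStream()`; `readSample` and `looksLikePdf` are hypothetical helpers, and the sample size used below is an arbitrary assumption.

```javascript
const { Readable } = require('node:stream');

// Hypothetical helper: collect at most `sampleSize` bytes from the start of
// a readable stream, then stop reading.
async function readSample(stream, sampleSize) {
  const chunks = [];
  let length = 0;
  for await (const chunk of stream) {
    const buf = Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk);
    chunks.push(buf);
    length += buf.length;
    // Leaving the loop early destroys the stream, so no more data is read.
    if (length >= sampleSize) break;
  }
  return Buffer.concat(chunks).subarray(0, sampleSize);
}

// Hypothetical check: PDF files begin with the ASCII signature "%PDF".
function looksLikePdf(sample) {
  return sample.subarray(0, 4).toString('latin1') === '%PDF';
}

// Usage sketch with an in-memory stream standing in for a real file.
async function demo() {
  const stream = Readable.from([Buffer.from('%PDF-1.7\n'), Buffer.from('more bytes')]);
  const sample = await readSample(stream, 8);
  return looksLikePdf(sample);
}
```

A bounded sample keeps the read short, but formats whose distinguishing bytes sit beyond the sample cannot be identified. Unlike the library functions, this sketch consumes the stream rather than handing back a readable copy.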
