Processing through pdf-lib causes xref/corrupt message

Hello

Thanks once again for supporting such a fantastic library. I’ve always had trouble with the below PDF.

https://tinyurl.com/y9cwewbz

It appears that whenever I process it through pdf-lib, it opens in every PDF Reader except Adobe Reader. I think it’s quite important it works in Adobe Reader.

When I run qpdf on it, I get the below error:

WARNING: /Users/me/Downloads/corrupt.pdf: file is damaged
WARNING: /Users/me/Downloads/corrupt.pdf (offset 6711353): xref not found
WARNING: /Users/me/Downloads/corrupt.pdf: Attempting to reconstruct cross-reference table

My code is below, in case relevant.

Thanks a lot

// Get the bundle
const bundle = req.project.bundles[req.bundleKey];
// Get list of all entries
const entries = bundle.sections.reduce(
  (acc, section) => [...acc, ...section.entries],
  []
);

// Initiate PDFDocument object
let mergedPdf = await PDFDocument.create();
const font = await mergedPdf.embedFont(StandardFonts.HelveticaBold);
const paginationFont = await mergedPdf.embedFont(
  StandardFonts.CourierBold
);

// First, put the cover page on
/*  const cover = await readFileAsync("./tmp/cover.pdf");
const coverPdf = await PDFDocument.load(cover);
const coverPage = await mergedPdf.copyPages(coverPdf, [0]);


coverPage[0].drawText(req.project.name, {
  x: 50,
  y: 75,
  font,
  size: 16,
  color: rgb(0.26, 0.25, 0.24),
});

// Put the name of the bundle
coverPage[0].drawText(req.project.bundles[req.bundleKey].name, {
  x: 50,
  y: 50,
  font,
  size: 12,
  color: rgb(0.41, 0.4, 0.37),
});

mergedPdf.addPage(coverPage[0]);*/

// Get dividers for future usage
//  const divider = await readFileAsync("./tmp/divider.pdf");

// Set page count at 1
let pageNumber = 1;

// Array of PDFDocument objects for all files to be merged
for (const key in entries) {
  const entry = entries[key];
  if (!entry.uploaded) return false;
  const filename = `./tmp/${entry.uploaded.replace("/", "")}`;

  // Download file to the tmp folder
  await downloadFile(filename, entry.uploaded);

  // First, add the divider
  /*    const dividerPdf = await PDFDocument.load(divider);
  const dividerPage = await mergedPdf.copyPages(dividerPdf, [0]);
  const dividerNo = parseInt(parseInt(key) + 1).toString();
  dividerPage[0].drawText(dividerNo, {
    x: 560,
    y: 710,
    font,
    size: 14,
    color: rgb(0.22, 0.22, 0.22),
  });
  dividerPage[0].drawText(parseFilename(entry.name, key), {
    x: 50,
    y: 710,
    font,
    size: 14,
    maxWidth: 350,
    color: rgb(0.22, 0.22, 0.22),
  });
  mergedPdf.addPage(dividerPage[0]);*/

  // Now load into a PDFDocument object
  const file = await readFileAsync(filename);
  const entryPdf = await PDFDocument.load(file);

  const copiedPages = await mergedPdf.copyPages(
    entryPdf,
    entryPdf.getPageIndices()
  );

  // Now add each page into the merged PDF
  copiedPages.forEach((page) => {
    const width = page.getWidth();
    const bottomRight = width - 20;
    // Draw a background circle for the number
    page.drawCircle({
      x: bottomRight,
      y: 20,
      size: 15,
      color: rgb(1, 1, 1),
    });
    // Put the number on the page
    page.drawText(pageNumber.toString(), {
      x:
        pageNumber < 10
          ? bottomRight - 3
          : pageNumber < 100
          ? bottomRight - 7
          : bottomRight - 11,
      y: 16,
      font: paginationFont,
      size: 12,
      color: rgb(0, 0, 0),
    });
    // Increment the page number
    pageNumber = pageNumber + 1;

    // And add the page in
    mergedPdf.addPage(page);
  });

  // Now delete them
  await unlinkAsync(filename);
}

// Finalise the pdf file
const finalPdf = await mergedPdf.save();

const uploadPdf = Buffer.from(finalPdf);

// Now write to the file sync
//await writeFileAsync("./tmp/merged.pdf", finalPdf);

// Set filename for the bundle
const bundleFilenameLong = `${bundle._id.toString()}.pdf`;
const bundleFilenameShort = `${req.project.name} - ${bundle.name}`;

// Now upload to s3
const upload = await new Promise((resolve, reject) => {
  s3.putObject(
    {
      Bucket: keys.awsBucket,
      Key: bundleFilenameLong,
      Body: uploadPdf,
    },
    (err, url) => {
      if (err) reject(err);
      else resolve(url);
    }
  );
});
const url = await getFile(bundleFilenameLong, bundleFilenameShort);

return res.status(200).send(url);

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 2
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
Hopding commented, May 25, 2020

@jackwshepherd Version 1.6.1 is now published. It introduces a capNumbers option that resolves this issue. You can use it like so:

const pdfDoc = await PDFDocument.load(bytes, { capNumbers: true });

The full release notes are available here.

You can install this new version with npm:

npm install pdf-lib@1.6.1

It’s also available on unpkg and jsDelivr.
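
Applied to the merge loop in the question, the option would go on every PDFDocument.load call for a source file. The following is only a rough sketch of that idea, not code from the thread: mergePdfs is a hypothetical helper name, and PDFDocument is assumed to be the class exported by pdf-lib 1.6.1 or later, passed in as a parameter so the helper has no imports of its own.

```javascript
// Hypothetical helper: merge several PDFs, capping oversized numbers on load.
// `PDFDocument` is assumed to be pdf-lib's class (v1.6.1+), supplied by the caller.
// `pdfBuffers` is an array of the raw bytes of each source PDF.
async function mergePdfs(PDFDocument, pdfBuffers) {
  const merged = await PDFDocument.create();

  for (const bytes of pdfBuffers) {
    // capNumbers: true is the load option described in the comment above.
    const source = await PDFDocument.load(bytes, { capNumbers: true });
    const pages = await merged.copyPages(source, source.getPageIndices());
    pages.forEach((page) => merged.addPage(page));
  }

  // Resolves to the serialized bytes of the merged document.
  return merged.save();
}
```

A caller would use it as `const finalPdf = await mergePdfs(PDFDocument, [fileA, fileB]);`, where fileA and fileB are placeholder names for buffers read from disk.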

1 reaction
Hopding commented, May 24, 2020

I just published 1.6.1-rc1. It contains a fix for this issue (see #458). The problem seems to be that the PDF you are copying from contains a number that is too large to fit into a 64 bit integer. Presumably, this causes Acrobat to throw a fit and not render the document. Let me know how this version works for you!
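
To make the idea concrete, here is a small sketch of what capping a parsed number could look like. It is an illustration only and not pdf-lib's source: parseCappedNumber is a hypothetical name, and using Number.MAX_SAFE_INTEGER as the ceiling is an assumption made for the example.

```javascript
// Illustrative only: not taken from pdf-lib. The function name and the
// choice of Number.MAX_SAFE_INTEGER as the ceiling are assumptions.
function parseCappedNumber(rawText, capNumbers) {
  const value = Number(rawText);

  // Reject text that does not describe a finite number at all.
  if (!Number.isFinite(value)) {
    throw new Error(`Could not parse a finite number from "${rawText}"`);
  }

  // When capping is enabled, clamp oversized values so they are not
  // written back out in a form some readers may refuse to open.
  if (capNumbers && value > Number.MAX_SAFE_INTEGER) {
    return Number.MAX_SAFE_INTEGER;
  }

  return value;
}
```

With capping disabled the oversized value passes through unchanged, which corresponds to the situation described in the original report.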

You can install this new version with npm:

npm install pdf-lib@1.6.1-rc1

It’s also available on unpkg and jsDelivr.
