`customTextRenderer` occasionally inserts HTML content into the <br> tags generated by PDF.js

Before you start - checklist

  • I followed instructions in documentation written for my React-PDF version
  • I have checked if this bug is not already reported
  • I have checked if an issue is not listed in Known issues
  • If I have a problem with PDF rendering, I checked if my PDF renders properly in PDF.js demo

Description

PDF.js inserts a DOM node containing the associated text content if present:

https://github.com/mozilla/pdf.js/blob/c7d6ab2f7123c5a65155c55aa19d9d9abd8c2ff2/src/display/text_layer.js#L368

It then inserts a <br> element if hasEOL is also true for the same associated text content:

https://github.com/mozilla/pdf.js/blob/c7d6ab2f7123c5a65155c55aa19d9d9abd8c2ff2/src/display/text_layer.js#L371

react-pdf iterates over each text content item and assumes that the DOM node at the same index is the only node associated with that item. As a result, the index drifts by one every time it hits a text content item that has both a non-empty str and hasEOL: true.
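To make the drift concrete, here is a minimal sketch in plain JavaScript. It does not use PDF.js or react-pdf: `renderNodes` and `pairByIndex` are hypothetical helpers that only model the behavior described above (a text node when str is present, plus a <br> when hasEOL is true), and the sample items are made up.

```javascript
// Simplified model of the text layer described above (an assumption, not PDF.js code):
// an item with text yields a <span>, and an item with hasEOL also yields a <br>.
function renderNodes(items) {
  const nodes = [];
  for (const item of items) {
    if (item.str) nodes.push({ tag: 'span', text: item.str });
    if (item.hasEOL) nodes.push({ tag: 'br', text: '' });
  }
  return nodes;
}

// Naive pairing: assume that item i was rendered as node i.
function pairByIndex(items, nodes) {
  return items.map((item, i) => ({ str: item.str, node: nodes[i] }));
}

// Made-up text content: the first item has both text and an end-of-line flag.
const items = [
  { str: 'first line', hasEOL: true },
  { str: 'second line', hasEOL: false },
  { str: 'third line', hasEOL: false },
];

const nodes = renderNodes(items);
const pairs = pairByIndex(items, nodes);

// Print which node each item ends up paired with.
for (const { str, node } of pairs) {
  console.log(JSON.stringify(str), '->', node.tag, JSON.stringify(node.text));
}
```

In this model the first item produces two nodes, so the second item is paired with the <br> and the third item is paired with the span that holds the second item's text.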

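I implemented a disgusting hack that looks like this:

// Pad the item list with a null after every item that has both text and an
// end-of-line flag, so each array index lines up with one rendered DOM node.
const textContentItems = [...textContent.items].flatMap((textContentItem) =>
  textContentItem.hasEOL && textContentItem.str
    ? [textContentItem, null]
    : [textContentItem]
);

textContentItems.forEach(function (item, itemIndex) {
  // A null slot corresponds to a <br>, so there is nothing to render for it.
  if (!item) {
    return;
  }
  ...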

Lmk what you think! Love the ease-of-use of this component.
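For anyone who wants to try the padding idea in isolation, the following is a self-contained sketch of the same workaround. `padForLineBreaks`, `sampleItems`, and `sampleNodes` are hypothetical names introduced here for illustration; the node list is a plain array of strings standing in for the text layer's children, not real DOM nodes.

```javascript
// Insert a null placeholder after every item that has both text and hasEOL,
// so that array indices line up with the rendered nodes again.
function padForLineBreaks(items) {
  return items.flatMap((item) =>
    item.hasEOL && item.str ? [item, null] : [item]
  );
}

// Made-up stand-ins for textContent.items and the text layer's child nodes.
const sampleItems = [
  { str: 'first line', hasEOL: true },
  { str: 'second line', hasEOL: false },
];
const sampleNodes = ['span:first line', 'br', 'span:second line'];

// Pair each real item with the node at the same (padded) index.
const matched = [];
padForLineBreaks(sampleItems).forEach((item, itemIndex) => {
  if (!item) {
    return; // placeholder slot, which lines up with a <br>
  }
  matched.push([item.str, sampleNodes[itemIndex]]);
});

console.log(matched);
```

With the placeholder in place, each item with text is paired with its own span. This sketch only covers the case described in this issue, where every item renders exactly one node unless it has both str and hasEOL.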

Steps to reproduce

All the PDF examples I have are confidential but I can try to create one if requested.

  1. Load a PDF containing multi-line statements such that PDF.js parses at least one line with str present and hasEOL: true.
  2. Pass customTextRenderer={({ str }) => str} to the Page component.
  3. Select all text on the page. The selection doesn’t match the rendered text.
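To find a non-confidential PDF that meets the condition in step 1, one option is to inspect the text content PDF.js returns for a page. The sketch below assumes an object shaped like the result of `page.getTextContent()` (an `items` array whose text entries carry `str` and `hasEOL`); `findDriftTriggers` and `sampleTextContent` are hypothetical names, and the sample data is made up.

```javascript
// Return the items that have both non-empty text and an end-of-line flag.
// According to this issue, each such item renders a text node plus a <br>.
function findDriftTriggers(textContent) {
  return textContent.items.filter(
    (item) => Boolean(item.str) && item.hasEOL === true
  );
}

// Made-up sample shaped like a PDF.js text content object.
// With PDF.js you would obtain the real one via: await page.getTextContent()
const sampleTextContent = {
  items: [
    { str: 'Total due', hasEOL: true },    // text and EOL: triggers the drift
    { str: '', hasEOL: true },             // EOL only
    { str: '42.00', hasEOL: false },       // text only
    { type: 'beginMarkedContent' },        // marked-content entry without str
  ],
};

const triggers = findDriftTriggers(sampleTextContent);
console.log(`${triggers.length} item(s) would render both text and a <br>`);
```

If the helper returns at least one item for a page, that page is a candidate for reproducing the mismatch.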

Expected behavior

Text selection should match.

Actual behavior

Text selection doesn’t match.

Additional information

No response

Environment

  • Browser (if applicable): Chrome
  • React-PDF version: 6.1.0
  • React version: 16.8
  • Webpack version (if applicable):

Issue Analytics

  • State: closed
  • Created 10 months ago
  • Reactions: 6
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

2 reactions
wojtekmaj commented, Nov 20, 2022

Thanks guys for all the info - this really helped me out! v6.1.1 released.

0 reactions
etripier commented, Nov 17, 2022

@wojtekmaj The issue is specifically that some tokens containing text and a line break will render both a <span> and a <br>, meaning that the number of rendered elements no longer matches 1-1 with the input. You can see the conditions that lead to this result here.
