Text of the PDF is messing up

Recently, PDF rendering started producing a messed-up text layer: the text is duplicated as a grey overlay on top of the page. I have no idea how to fix it, because this functionality is poorly documented. I'm using pdfjsViewer.PDFPageView, and it now shows this behavior. My code is as follows:

 getPdf() {

    var pdfDocument;

    if ( this._state !== 'inDOM' ) return false;

    pdfjsLib.disableRange = true;
    pdfjsLib.disableStream = true;

    let self = this;
    pdfDocument = pdfjsLib.getDocument(this.src);
    pdfDocument.promise.then(function(pdf) {
      self.set( 'pdfDocument', pdf );
      self.set( 'maxNumPages',  pdf.numPages );
      self.set( 'prevBtnDisabled', true );
      self.set( 'documentRendered', true );

      self.setViewportWidth();
      self.renderPdf();
    });

    return pdfDocument;
  },

  renderPdf() {

    var pdf = this.pdfDocument,
        maxNumPages,
        pagePromise;

    if ( !pdf ) return false;

    maxNumPages  = this.maxNumPages;

    pagePromise = this.getAndRenderPage( pdf, 1 );

    Array.apply( null, new Array( maxNumPages - 1 ) ).forEach( ( value, index ) => {

      pagePromise = pagePromise.then( () => this.getAndRenderPage( pdf, index + 2 ) );
    } );
  },

  getAndRenderPage( pdf, index ) {

    return pdf.getPage( index ).then( page => this.renderPage( page, index ) );
  },
  

  renderPage( pdfPage, pageNum ) {

    var parentWidth       = this.$().parent().width(),
        pageViewportScale = ( parentWidth >= this.get( 'breakpoints.mobile' ) ) ? 1.5 : 1.3,
        viewport          = pdfPage.getViewport( { scale: parentWidth / pdfPage.getViewport( { scale: pageViewportScale } ).width } ),
        container         = this.$().find( '.pdf_viewer--container' )[ 0 ],
        pdfPageView;

    pdfPageView = new pdfjsViewer.PDFPageView( {
      container: container,
      id: pageNum,
      scale: viewport.scale,
      defaultViewport: viewport,
      textLayerFactory: new pdfjsViewer.DefaultTextLayerFactory()
    } );

    // Keep track of the page views and the scale they were created with
    var pages = this.get( 'pages' );
    pages.push( pdfPageView );
    this.set( 'pages', pages );
    this.set( 'scale', viewport.scale );

    // Associate the actual page with the view, then draw it
    pdfPageView.setPdfPage( pdfPage );

    return pdfPageView.draw();
  },

I have seen other people ask this question here, but the maintainers close those issues without giving a proper answer. Please don't close this kind of issue, which users are actually facing.

Configuration:

  • Web browser and its version: Chrome 78.0.3904.108
  • Operating system and its version: Mac OS 10.15
  • PDF.js version: 2.3.200
  • Is a browser extension: No

Steps to reproduce the problem:

  1. Render each page with pdfjsViewer.PDFPageView and textLayerFactory: new pdfjsViewer.DefaultTextLayerFactory(), as in the renderPage() method shown above.
  2. Open the PDF document.
  3. Grey text appears on top of the main text (the same text is shown twice on the page).

What is the expected behavior? The PDF document should display correctly. (It does when I remove textLayerFactory: new pdfjsViewer.DefaultTextLayerFactory().)
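
Since the page renders correctly once textLayerFactory is removed, one way to isolate the problem is to make the text layer optional while debugging. The following is a minimal sketch, not part of the original report: buildPageViewOptions is a hypothetical helper (it is not a PDF.js function), and the option names are simply the ones already passed to PDFPageView in the renderPage() method above.

```javascript
// Hypothetical helper (not part of PDF.js): builds the options object that
// the reported code passes to `new pdfjsViewer.PDFPageView(...)`.
function buildPageViewOptions(container, pageNum, viewport, textLayerFactory) {
  const options = {
    container: container,
    id: pageNum,
    scale: viewport.scale,
    defaultViewport: viewport,
  };
  // Only request a text layer when a factory is supplied, so the layer can
  // be switched off to confirm that it is the source of the duplicate text.
  if (textLayerFactory) {
    options.textLayerFactory = textLayerFactory;
  }
  return options;
}
```

Passing null as the last argument reproduces the "works without textLayerFactory" case; passing new pdfjsViewer.DefaultTextLayerFactory() reproduces the reported setup.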

What went wrong? The text is duplicated in grey on top of the page (screenshot attached to the original issue).

Link to a viewer (if hosted on a site other than mozilla.github.io/pdf.js or as Firefox/Chrome extension): None, Private Env

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions:2
  • Comments:6 (1 by maintainers)

Top GitHub Comments

1 reaction
Snuffleupagus commented, Dec 3, 2019

Please keep in mind that this is, first and foremost, an open source bug tracker and not a general support forum; hence everyone opening an issue is required to provide actionable information.

Given that the “pageviewer” example linked to above does work, that would suggest an error in your code (e.g. that you didn’t include the pdf_viewer.css file in your HTML code); hence why https://github.com/mozilla/pdf.js/issues/11379#issuecomment-561066363 specifically asked you to provide a runnable example here.
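
The missing pdf_viewer.css mentioned above is a plausible cause: the text layer is built from positioned text elements that the viewer stylesheet is expected to hide, so without it they can show up as a second, visible copy of the text. A quick way to check is to look for the stylesheet among the ones the page has loaded. This is a rough sketch under stated assumptions: hasViewerStylesheet is a hypothetical helper, and it assumes the file is loaded under its original name (a bundled or renamed stylesheet would not be detected).

```javascript
// Hypothetical helper: returns true if any stylesheet URL in `hrefs`
// points at a file named pdf_viewer.css (query strings and hashes ignored).
function hasViewerStylesheet(hrefs) {
  return hrefs.some((href) => {
    // Inline <style> sheets report a null href; skip them.
    if (typeof href !== "string") return false;
    const path = href.split(/[?#]/)[0];
    return path.split("/").pop() === "pdf_viewer.css";
  });
}

// Browser usage (assumption: the stylesheet is loaded through a <link> tag):
// const hrefs = Array.from(document.styleSheets, (sheet) => sheet.href);
// if (!hasViewerStylesheet(hrefs)) console.warn("pdf_viewer.css not loaded");
```

If the check fails, adding the pdf_viewer.css that ships alongside the viewer components to the page is the first thing to try before digging further into the rendering code.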

1 reaction
Snuffleupagus commented, Dec 3, 2019

The only suggestion here would be to refer to the “pageviewer” example in https://github.com/mozilla/pdf.js/tree/master/examples/components

However, this issue is currently missing all of the required information necessary for it to be valid, and as-is it will be closed as INCOMPLETE.

First of all, you need to provide all of the details requested in https://github.com/mozilla/pdf.js/blob/master/.github/ISSUE_TEMPLATE.md; and please also see https://github.com/mozilla/pdf.js/blob/master/.github/CONTRIBUTING.md (emphasis mine):

If you are developing a custom solution, first check the examples at https://github.com/mozilla/pdf.js#learning and search existing issues. If this does not help, please prepare a short well-documented example that demonstrates the problem and make it accessible online on your website, JS Bin, GitHub, etc. before opening a new issue or contacting us on the IRC channel – keep in mind that just code snippets won’t help us troubleshoot the problem.
