All text missing from node.js rasterization
Attach (recommended) or link to PDF file here: 1120.pdf
Configuration:
- Operating system and its version: ubuntu 18, node v10.13.0
- PDF.js version: HEAD, e9661edda74963dbbba411529fe48f8e8cd01c5c
- Is a browser extension: no
Steps to reproduce the problem:
- Clean build, then mv build/dist node_modules/pdfjs-dist
- Move the sample 1120.pdf to compressed.tracemonkey-pldi-09.pdf
- cd examples/node/pdf2png/; node pdf2png.js
What is the expected behavior? An accurate rasterization of the first page of the source document.
What went wrong? None of the text in the document is rendered.
stderr contains a bunch of warnings like:
Warning: getPathGenerator - ignoring character: "Error: Requesting object that isn't resolved yet Helvetica_path_$.".
Note that the document contains a single embedded font, and it’s not Helvetica, so the message doesn’t really make sense to me.
Are there things that need to be done in a node context to ensure reliably correct rasterization that aren’t included in the example code?
Issue Analytics
- State:
- Created 5 years ago
- Reactions:2
- Comments:8 (4 by maintainers)
Top GitHub Comments
If I add disableFontFace: false to the getDocument call, the rasterization output is as expected.
It seems that disableFontFace: true is the default on node, perhaps given the workaround advice provided in e.g. #7929 and elsewhere?
It seems like that “compatibility” setting is counterproductive in this case. It seems at least one other person determined this: https://github.com/mozilla/pdf.js/issues/4244#issuecomment-463865708
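For reference, the workaround described in this comment might look roughly like the sketch below. It is not the actual pdf2png.js example code: the helper names buildLoadParams and loadFirstPage are hypothetical, and it assumes the built dist has been moved to node_modules/pdfjs-dist as in the reproduction steps, so that require("pdfjs-dist") resolves (newer releases may need a different entry point).

```javascript
const fs = require("fs");

// Hypothetical helper: builds the parameter object passed to getDocument.
// Per the comment above, disableFontFace appears to default to true on node,
// so it is set to false explicitly here.
function buildLoadParams(data) {
  return {
    data: data,
    disableFontFace: false,
  };
}

// Hypothetical helper: loads the document and resolves with its first page.
// pdfjs-dist is required lazily so this file can be loaded without it installed.
async function loadFirstPage(pdfPath) {
  const pdfjsLib = require("pdfjs-dist"); // assumption: dist moved as in the steps
  const data = new Uint8Array(fs.readFileSync(pdfPath));
  const doc = await pdfjsLib.getDocument(buildLoadParams(data)).promise;
  return doc.getPage(1);
}
```

The returned page would then be rendered to a canvas the same way the existing example does; only the getDocument parameters change.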
Adding to https://github.com/mozilla/pdf.js/issues/10623#issuecomment-470199768, I could possibly provide a basic patch which allows loading of external font files[1] such that they are available to the font code on the worker-thread.
However, please note I’d only consider doing so in return for reasonable (monetary) compensation.
For anyone using PDF.js in e.g. a commercial setting, and thus willing to pay for a patch, please feel free to contact me directly to discuss this further (email address is in my GitHub profile).
[1] Obviously I wouldn’t be able, for e.g. copyright reasons, to provide the actual font files themselves; however I’ve used these ones when taking the screen-shot in https://github.com/mozilla/pdf.js/issues/10623#issuecomment-470154820.