Merging some pdfs results in ExternalDocument returning error
Hi,
I am using your library to merge multiple PDFs into one after uploading various documents from a web application. While this works without problems most of the time, certain documents fail at the ExternalDocument stage, without a specific error being returned. The process is the following:
- documents are uploaded as data URIs from a web application;
- if documents are images, they are converted to PDF before being uploaded;
- once uploaded, the PDFs are stored as blobs (base64) in a database;
- when all documents have been successfully uploaded, they are merged into a single PDF that is stored on disk.
The merge process extracts each PDF from the database blob, turns it back into a data URI, and converts it to a buffer. That buffer is passed to ExternalDocument, which turns it into a PDF that pdfjs recognizes, and the result is added to the merged PDF.
async function _mergeFiles(files, fileName) {
    var pdf = requireNode('pdfjs');
    var fs = requireNode('fs');
    var toBuffer = requireNode('data-uri-to-buffer');
    try {
        var doc = new pdf.Document();
        for (var i = 0; i < files.length; i++) {
            var file = files[i];
            var src,
                ext;
            src = file.dataUri.toBuffer();
            var dataUri = 'data:application/pdf;base64,' + src.toString('base64');
            src = toBuffer(dataUri);
            ext = new pdf.ExternalDocument(src);
            doc.setTemplate(ext);
            doc.addPagesOf(ext);
        }
        var writeStream = doc.pipe(fs.createWriteStream(fileName));
        await doc.end();
        var writeStreamClosedPromise = new Promise((resolve, reject) => {
            try {
                writeStream.on('close', () => resolve())
            } catch (e) {
                reject({file: file.name, sequence: file.sequence, reason: e});
            }
        })
        src = null;
        ext = null;
        doc = null;
        dataUri = null;
        return writeStreamClosedPromise;
    } catch (e) {
        reject({file: file.name, sequence: file.sequence, reason: e});
    }
}
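A note on the end of this function: the 'close' listener is attached only after `await doc.end()`, and the try/catch around `writeStream.on(...)` cannot catch asynchronous stream errors, so a failed write may never reject. Below is a minimal sketch of one way to wait for the file, assuming `doc` is a pdfjs Document with `pipe()` and `end()` behaving as used above. `whenClosed` and `writeDocument` are hypothetical names, not part of pdfjs.

```javascript
const fs = require('fs');

// Hypothetical helper: settle once the stream has closed, or reject on the
// first stream error. Both listeners are attached up front.
function whenClosed(stream) {
  return new Promise((resolve, reject) => {
    stream.once('error', reject);
    stream.once('close', resolve);
  });
}

// Hypothetical helper: `doc` is assumed to expose pipe() and a
// promise-returning end(), as the Document in the post above does.
async function writeDocument(doc, fileName) {
  const out = fs.createWriteStream(fileName);
  const closed = whenClosed(out); // listen before any data flows
  doc.pipe(out);
  // Wait for both the document and the file; either failure rejects.
  await Promise.all([doc.end(), closed]);
}
```

With the listeners attached before piping, a write error (for example a missing target directory) rejects with the underlying error instead of leaving the promise pending.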
This process works fine most of the time, but some PDFs won’t pass the ExternalDocument stage:
- ACTU 04-20.pdf
- CCF_000001.pdf
- LONGY_JULIE_Complément dossier_LONGY Julie.pdf
- LONGY_JULIE_DAMA_LONGY Julie.pdf
- LONGY_JULIE_detail dossier_longy julie.pdf
The files above are examples of PDFs that won’t pass the ExternalDocument step.
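To find out which inputs fail and why, each file can be parsed on its own and the original exception kept. Note that in `_mergeFiles` the outer catch calls `reject`, which is not defined in that scope unless it is declared elsewhere, so the real parser error may be replaced by a ReferenceError. The sketch below rests on assumptions: `parseEach` is a hypothetical helper, `file.buffer` is an assumed field holding the decoded PDF bytes, and `parse` is injected so that, with pdfjs, it could be `(buffer) => new pdf.ExternalDocument(buffer)`.

```javascript
// Hypothetical helper: parse every file independently and record failures
// together with the file's name, sequence, and the original error.
// `parse` is any function that takes a Buffer and throws on invalid input.
function parseEach(files, parse) {
  const parsed = [];
  const failed = [];
  for (const file of files) {
    try {
      // `file.buffer` is an assumed field, not taken from the original code.
      parsed.push({ file, ext: parse(file.buffer) });
    } catch (e) {
      failed.push({ file: file.name, sequence: file.sequence, reason: e });
    }
  }
  return { parsed, failed };
}
```

Logging `failed` would then show the parser's own message and stack for each rejected file, which is probably the most useful detail to attach to a report like this one.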
I am using the latest version of pdfjs: v2.3.7.
Thanks for your help.
Issue Analytics
- Created 3 years ago
- Comments:9 (3 by maintainers)

Hello!
I don’t know if it will help, but I needed to merge PDFs too, and this library saved me (so thanks a lot @rkusa, very nice job!).
Some of my PDF files are generated with puppeteer and others are stored in AWS. I work with “Buffers” only, and it seems that’s what you need too, @rernens. Here’s my method:
So far so good, I like when it’s simple. I hope it can help somehow.
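As a rough illustration of the Buffers-only idea applied to the base64 blobs from the original post (this is not the commenter's code, and all helper names are hypothetical), the stored base64 can be decoded directly with Node's `Buffer`, without rebuilding a data URI first. The sketch also includes a cheap header check, assuming a valid PDF carries the `%PDF-` marker near the start of the file.

```javascript
// Hypothetical helpers using only Node's built-in Buffer.

// Decode a base64 string (as stored in the database) straight to a Buffer.
function base64ToBuffer(base64) {
  return Buffer.from(base64, 'base64');
}

// Extract the bytes of a base64 data URI. Only base64 payloads are handled.
function dataUriToBuffer(dataUri) {
  const comma = dataUri.indexOf(',');
  if (!dataUri.startsWith('data:') || comma === -1) {
    throw new Error('not a data URI');
  }
  if (!dataUri.slice(5, comma).endsWith(';base64')) {
    throw new Error('only base64 data URIs are supported in this sketch');
  }
  return base64ToBuffer(dataUri.slice(comma + 1));
}

// Cheap sanity check: look for the "%PDF-" marker in the first 1024 bytes.
function looksLikePdf(buffer) {
  return buffer.subarray(0, 1024).includes('%PDF-');
}
```

A header check like this only rules out payloads that are not PDFs at all (for example an HTML error page or an empty upload). It does not guarantee that a parser will accept the file.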
@rkusa Hi Markus. Even if this was not the main use case of pdfjs, so far it has proven to be the most lightweight and most reliable library for merging PDFs. I tried many libraries and yours remains unmatched so far, even if some parsing issues remain. Thanks for that.