Extract from a buffer

I am using Node.js and downloading .doc files using superagent. This gives me a buffer object that I would like to parse and extract text from. However, word-extractor only seems to support files.

How do I extract the text from a .doc in memory, not in a file?

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Reactions: 1
  • Comments: 10 (5 by maintainers)

Top GitHub Comments

3 reactions
morungos commented, Jan 15, 2018

That’s mainly an issue with the underlying OLE implementation, which is very much wired to use files. All the logic that depends on fs is local to OleCompoundDoc, so one solution would be to build an alternative implementation of that class that is backed by a buffer rather than a file. Or, perhaps better, to refactor the file system access into a separate set of methods that could be overridden more easily.

It’s a nice and important addition. If I can get the time for this, I will.
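The refactoring idea can be illustrated with a small sketch. It is not the library's code: FileSource, BufferSource, readChunk and hasOleSignature are hypothetical names invented for this example, and the only format detail relied on is the 8-byte signature that OLE compound files start with. The point is that the parsing logic calls a single read method, so a buffer-backed source can stand in for the file-backed one.

```javascript
const fs = require('fs');

// Hypothetical file-backed source: the only place that touches fs.
class FileSource {
  constructor(filename) {
    this.fd = fs.openSync(filename, 'r');
  }
  // Copies up to `length` bytes from `position` into `target`; returns bytes read.
  read(target, offset, length, position) {
    return fs.readSync(this.fd, target, offset, length, position);
  }
  close() {
    fs.closeSync(this.fd);
  }
}

// Hypothetical buffer-backed source exposing the same read() contract.
class BufferSource {
  constructor(buffer) {
    this.buffer = buffer;
  }
  read(target, offset, length, position) {
    const end = Math.min(position + length, this.buffer.length);
    if (position >= end) return 0; // reading past the end yields nothing
    return this.buffer.copy(target, offset, position, end);
  }
  close() {} // nothing to release for in-memory data
}

// Parsing code depends only on read(), not on where the bytes live.
function readChunk(source, position, length) {
  const chunk = Buffer.alloc(length);
  const bytesRead = source.read(chunk, 0, length, position);
  return chunk.subarray(0, bytesRead);
}

// OLE compound files begin with this fixed 8-byte signature.
const OLE_SIGNATURE = Buffer.from([0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1]);

function hasOleSignature(source) {
  return readChunk(source, 0, OLE_SIGNATURE.length).equals(OLE_SIGNATURE);
}
```

With this shape, hasOleSignature(new BufferSource(buf)) and hasOleSignature(new FileSource('path/to/file.doc')) run the same parsing code against memory or disk.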

2 reactions
olsonpm commented, Oct 31, 2018

I implemented buffer support at gmr-fms/node-word-extractor, if you are willing to switch to the npm package @gmr-fms/word-extractor. I didn’t want to work with CoffeeScript, hence the JS source and slightly modified API.

const fs = require('fs')
const extract = require('@gmr-fms/word-extractor')

const buf = fs.readFileSync('path/to/file.doc')

extract.fromBuffer(buf).then(doc => {
  // do stuff with doc here
})

I really appreciate this library though, the code was very clean and easy to follow.
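For anyone who would rather keep a file-only extractor, a common workaround is to write the buffer to a temporary file, pass that path to the extractor, and clean up afterwards. The sketch below is an assumption-laden illustration, not part of either package: withTempFile is a hypothetical helper name, and the callback stands in for whatever file-based call you use.

```javascript
const fsp = require('fs').promises;
const os = require('os');
const path = require('path');

// Hypothetical helper: writes `buffer` to a private temp directory, hands the
// file path to `callback`, and removes the directory whether or not it throws.
async function withTempFile(buffer, extension, callback) {
  const dir = await fsp.mkdtemp(path.join(os.tmpdir(), 'doc-'));
  const filename = path.join(dir, `input${extension}`);
  try {
    await fsp.writeFile(filename, buffer);
    return await callback(filename);
  } finally {
    // Requires a Node.js version that provides fs.promises.rm.
    await fsp.rm(dir, { recursive: true, force: true });
  }
}
```

Assuming the file-based API of word-extractor (an extractor instance whose extract method takes a path and resolves to a document), usage might look like withTempFile(buf, '.doc', file => extractor.extract(file)). The cost is one extra disk round trip per document, which may matter under heavy load.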
