Extract from a buffer
I am using Node.js and downloading `.doc` files using `superagent`. This gives me a buffer object that I would like to parse and extract text from. However, `word-extractor` only seems to support files.

How do I extract the text from a `.doc` in memory, not in a file?
Issue Analytics
- State:
- Created 6 years ago
- Reactions:1
- Comments:10 (5 by maintainers)
That's mainly an issue of the underlying OLE implementation, which is very much wired to use files. All the logic that depends on `fs` is local to `OleCompoundDoc`, so one solution would be to build an alternative implementation of that class that is backed by a buffer rather than a file. Or, perhaps better, to refactor the file system access into a separate set of methods that could be overridden more easily.

It's a nice and important addition. If I can get the time for this, I will.
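To make the refactoring idea concrete, here is a minimal sketch of moving the reads behind a small interface so that a file or a buffer can back the same parsing code. The names `FileSource`, `BufferSource`, and `readSignature` are hypothetical and are not part of `word-extractor` or `OleCompoundDoc`. The sketch uses synchronous reads for brevity, which may not match how the library reads files.

```javascript
const fs = require('fs');

// Hypothetical file-backed source: all fs access lives in this one class.
class FileSource {
  constructor(filename) {
    this.fd = fs.openSync(filename, 'r');
  }

  // Read up to `length` bytes starting at byte offset `position`.
  read(position, length) {
    const target = Buffer.alloc(length);
    const bytesRead = fs.readSync(this.fd, target, 0, length, position);
    return target.subarray(0, bytesRead);
  }

  close() {
    fs.closeSync(this.fd);
  }
}

// Hypothetical buffer-backed source with the same interface, no fs needed.
class BufferSource {
  constructor(buffer) {
    this.buffer = buffer;
  }

  // subarray clamps to the buffer's end, like a short read from a file.
  read(position, length) {
    return this.buffer.subarray(position, position + length);
  }

  close() {}
}

// Parsing code that only calls source.read() works with either source.
// Here it returns the first 8 bytes (the compound file signature) as hex.
function readSignature(source) {
  return source.read(0, 8).toString('hex');
}
```

With this shape, a downloaded buffer could be passed as `new BufferSource(buffer)` wherever the parser currently opens a file, and the parser itself would not need to know which one it received.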
I implemented buffer support at gmr-fms/node-word-extractor, if you guys are willing to switch to the npm package `@gmr-fms/word-extractor`. I didn't want to work with CoffeeScript, hence the JS source and slightly modified API.

I really appreciate this library though; the code was very clean and easy to follow.