WordprocessingDocument.Open is very slow
Description
WordprocessingDocument.Open is very slow when reading a large .docx document. I'm trying to read a 10 MB .docx document and it takes about 1 minute just to open it.
Information
- .NET Target: .NET Core 2.2
- DocumentFormat.OpenXml Version: 2.9.0
Repro
Console.WriteLine("Creating filter");
using (var doc = WordprocessingDocument.Open(path, false))
{
    // Reached only after Open returns; the delay is measured up to this line
    Console.WriteLine("Creating BodyReader");
    _bodyReader = OpenXmlReader.Create(doc.MainDocumentPart.Document);
}
Link to the file: https://drive.google.com/file/d/1_InQLbZ19KCUgkuePAiLXvUuLcZl6Qu7/view?usp=sharing
Uploaded to GitHub: 10mb_file.docx
Observed
I added two Console.WriteLine calls; the time between “Creating filter” and “Creating BodyReader” is about 1 minute. It doesn’t matter whether I open the file from a memory stream or pass a real path to the file.
Expected
Instant open expected 😃
Issue Analytics
- Created 4 years ago
- Comments: 15 (8 by maintainers)
Top GitHub Comments
I’m going to reopen this, as I recently made a change to System.IO.Packaging that may help here and I want to verify it. See https://github.com/dotnet/runtime/pull/35978
FYI the fix that I got into System.IO.Packaging fixes this!
[Before/after images not included]
Since the package is still in preview, we won’t be upgrading it for the project at this time. Note that because the package’s major version is changing, we won’t be able to bring it into the repo until v3.0 (due to semantic versioning). That will probably happen fairly soon, but not right away. In the meantime, you can manually reference version 5.0.0-preview.6.20305.6 or later of the package to get the benefit (although it will only help if you are on .NET Core).
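For anyone who wants to try the workaround, here is a minimal sketch of what the manual reference might look like. It assumes an SDK-style .csproj; the DocumentFormat.OpenXml version is the one from the report above and the System.IO.Packaging version is the one named in this comment. The general NuGet behavior relied on here is that a direct package reference takes precedence over the version pulled in transitively.

```xml
<ItemGroup>
  <!-- Version from the original report -->
  <PackageReference Include="DocumentFormat.OpenXml" Version="2.9.0" />
  <!-- Direct reference to the preview package named in the comment above;
       this overrides the older version that DocumentFormat.OpenXml brings in transitively -->
  <PackageReference Include="System.IO.Packaging" Version="5.0.0-preview.6.20305.6" />
</ItemGroup>
```

As the comment notes, this should only make a difference when running on .NET Core, and since it is a preview package it is worth re-measuring the open time before relying on it.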