WordprocessingDocument.Open is very slow

See original GitHub issue

Description

WordprocessingDocument.Open is very slow when reading a large .docx document. I'm trying to read a 10 MB .docx document, and it takes about 1 minute just to open it.

Information

  • .NET Target: .NET Core 2.2
  • DocumentFormat.OpenXml Version: 2.9.0

Repro

Console.WriteLine("Creating filter");
// Open the document read-only (isEditable: false); this call is the slow step.
using (var doc = WordprocessingDocument.Open(path, false))
{
    Console.WriteLine("Creating BodyReader");
    _bodyReader = OpenXmlReader.Create(doc.MainDocumentPart.Document);
}

Link to the file: https://drive.google.com/file/d/1_InQLbZ19KCUgkuePAiLXvUuLcZl6Qu7/view?usp=sharing

Uploaded to GitHub: 10mb_file.docx

Observed

I added two Console.WriteLine calls, and the time between “Creating filter” and “Creating BodyReader” is about 1 minute. It doesn't matter whether I open the file from a memory stream or pass a real file path.

Expected

Instant open expected 😃

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 15 (8 by maintainers)

Top GitHub Comments

3 reactions
twsouthwick commented, May 30, 2020

I'm going to open this, as I recently made a change to System.IO.Packaging that may help here and I want to verify it. See https://github.com/dotnet/runtime/pull/35978

1 reaction
twsouthwick commented, Jul 10, 2020

FYI the fix that I got into System.IO.Packaging fixes this!

Before:

| Method |    Mean |    Error |   StdDev |
|------- |--------:|---------:|---------:|
|   Open | 9.167 s | 0.1791 s | 0.2265 s |

After:

| Method |     Mean |   Error |  StdDev |
|------- |---------:|--------:|--------:|
|   Open | 184.5 ms | 3.64 ms | 4.85 ms |

Since the package is still in preview, we won't be upgrading it for the project at this time. Note that because the package's major version is changing, we won't be able to bring it into the repo until v3.0 (due to semantic versioning). That will probably happen fairly soon, but not right away. However, you can manually reference version 5.0.0-preview.6.20305.6 or later of the package to get the benefit (although it will only help if you are on .NET Core).
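
The manual upgrade described above could look roughly like the following project-file fragment. This is a minimal sketch, assuming an SDK-style .csproj targeting .NET Core; the version shown is the minimum preview build named in the comment, and the existing DocumentFormat.OpenXml reference is left unchanged.

```xml
<!-- Assumption: SDK-style project file targeting .NET Core. -->
<ItemGroup>
  <!-- A direct reference takes precedence over the older System.IO.Packaging
       version that DocumentFormat.OpenXml brings in transitively. -->
  <PackageReference Include="System.IO.Packaging" Version="5.0.0-preview.6.20305.6" />
</ItemGroup>
```

After restoring packages, timing the Open call again on the same file should show whether the newer package is actually being picked up.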
