SAX Approach Replace Example


I’m looking for an example of using the SAX approach (instead of DOM) to open a large file and perform a replace on a given value, then save the file.

I think this would be a useful addition to the Documentation.

How to: Search and replace text in a document part (Open XML SDK) https://docs.microsoft.com/en-us/office/open-xml/how-to-search-and-replace-text-in-a-document-part

This example works on the document part through a Stream.

How to: Parse and read a large spreadsheet document (Open XML SDK) https://docs.microsoft.com/en-us/office/open-xml/how-to-parse-and-read-a-large-spreadsheet

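using System;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

// The SAX approach.
static void ReadExcelFileSAX(string fileName)
{
    using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(fileName, false))
    {
        WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
        WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();

        // Dispose the reader so the underlying part stream is released.
        using (OpenXmlReader reader = OpenXmlReader.Create(worksheetPart))
        {
            string text;
            while (reader.Read())
            {
                // Print the text of every cell value as it streams past.
                if (reader.ElementType == typeof(CellValue))
                {
                    text = reader.GetText();
                    Console.Write(text + " ");
                }
            }
        }
        Console.WriteLine();
        Console.ReadKey();
    }
}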

Suppose I read the “text” like this and want to replace it:

using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(filePath, true))
{
    // Create the reader over the part itself; going through
    // MainDocumentPart.Document would load the part's DOM first.
    using (OpenXmlReader reader = OpenXmlReader.Create(wordDoc.MainDocumentPart))
    {
        string text;
        while (reader.Read())
        {
            //OpenXmlElement element = reader.LoadCurrentElement();
            //text = element.InnerText;
            text = reader.GetText();
        }
    }
}

OpenXmlWriter expects an OpenXmlPart or Stream when you create it.

OpenXmlWriter writer = OpenXmlWriter.Create(#);
writer.WriteStartElement(reader);
writer.WriteElement(#);
writer.WriteEndElement();
writer.Close();

What is the supported approach/method for this?

Using the DOM approach on large files can cause out-of-memory exceptions.


There are a number of blog posts documenting how to use the OpenXmlWriter, but they cover creating new files or adding new elements to an existing file, not updating existing data.
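For what it's worth, the pattern those posts use for writing can probably be adapted to a replace: stream the original part through an OpenXmlReader, write every element into a new part with an OpenXmlWriter (substituting the value on the way), then point the sheet at the new part and delete the old one. The following is only a rough sketch of that idea, not a confirmed or supported method. The names ReplaceCellValuesSax, oldValue and newValue are made up for illustration, and it assumes the worksheet has no related parts (drawings, hyperlinks and so on) that would need to be carried over.

```csharp
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

static class SaxReplaceSketch
{
    // Hypothetical helper: copies the first worksheet part element by element
    // into a new part, replacing cell values equal to oldValue with newValue.
    public static void ReplaceCellValuesSax(string fileName, string oldValue, string newValue)
    {
        using (SpreadsheetDocument document = SpreadsheetDocument.Open(fileName, true))
        {
            WorkbookPart workbookPart = document.WorkbookPart;
            WorksheetPart originalPart = workbookPart.WorksheetParts.First();
            string originalPartId = workbookPart.GetIdOfPart(originalPart);

            // The writer needs a part (or stream) to write into, so add an empty one.
            WorksheetPart replacementPart = workbookPart.AddNewPart<WorksheetPart>();
            string replacementPartId = workbookPart.GetIdOfPart(replacementPart);

            using (OpenXmlReader reader = OpenXmlReader.Create(originalPart))
            using (OpenXmlWriter writer = OpenXmlWriter.Create(replacementPart))
            {
                while (reader.Read())
                {
                    if (reader.IsStartElement)
                    {
                        // Copies the element name, attributes and namespace declarations.
                        writer.WriteStartElement(reader);

                        // GetText() is empty unless this is a leaf text element.
                        string text = reader.GetText();
                        if (reader.ElementType == typeof(CellValue) && text == oldValue)
                        {
                            text = newValue;
                        }
                        if (!string.IsNullOrEmpty(text))
                        {
                            writer.WriteString(text);
                        }
                    }
                    else if (reader.IsEndElement)
                    {
                        writer.WriteEndElement();
                    }
                }
            }

            // Repoint the sheet at the rewritten part, then drop the original.
            Sheet sheet = workbookPart.Workbook.Descendants<Sheet>()
                .First(s => s.Id != null && s.Id.Value == originalPartId);
            sheet.Id.Value = replacementPartId;
            workbookPart.Workbook.Save();
            workbookPart.DeletePart(originalPart);
        }
    }
}
```

Two caveats with this sketch. Text cells in a spreadsheet usually hold an index into the shared string table rather than the text itself, so matching on CellValue only works directly for numbers and other literal values. And WorksheetParts.First() is not guaranteed to be the first sheet in tab order, same as in the read example above.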

Parsing and Reading Large Excel Files with the Open XML SDK http://blogs.msdn.com/b/brian_jones/archive/2010/05/27/parsing-and-reading-large-excel-files-with-the-open-xml-sdk.aspx [Dead Link] https://web.archive.org/web/20151205145806/http://blogs.msdn.com/b/brian_jones/archive/2010/05/27/parsing-and-reading-large-excel-files-with-the-open-xml-sdk.aspx

Writing Large Excel Files with the Open XML SDK http://blogs.msdn.com/b/brian_jones/archive/2010/06/22/writing-large-excel-files-with-the-open-xml-sdk.aspx [Dead Link] https://web.archive.org/web/20160216062257/http://blogs.msdn.com/b/brian_jones/archive/2010/06/22/writing-large-excel-files-with-the-open-xml-sdk.aspx

Performance issue while reading/writing large excel files using OpenXML SDK http://tech-turf.blogspot.com/2015/10/performance-issue-while-readingwriting.html

How to read and write Excel cells with OpenXML and C# http://fczaja.blogspot.com/2013/05/how-to-read-and-write-excel-cells-with.html

How to properly use OpenXmlWriter to write large Excel files http://polymathprogrammer.com/2012/08/06/how-to-properly-use-openxmlwriter-to-write-large-excel-files/

Issue Analytics

  • State:open
  • Created 5 years ago
  • Reactions:8
  • Comments:9 (7 by maintainers)

Top GitHub Comments

2 reactions
sorensenmatias commented, Oct 5, 2020

+1 for this - I am currently faced with a presentation containing lots of vector graphics that ends up allocating 15.000.000 objects in memory using the DOM approach. This basically makes our product unusable. I would also love to hear about a workaround that avoids loading everything into memory before elements can be manipulated.

1 reaction
twsouthwick commented, May 22, 2020

Sorry for the auto close. I enabled a bot that went through and closed issues that had no comments on it. Happy to reopen and take a look.


