SAX Approach Replace Example
See original GitHub issueI’m looking for an example of using the SAX approach (instead of DOM) to open a large file and perform a replace on a given value, then save the file.
I think this would be a useful addition to the Documentation.
How to: Search and replace text in a document part (Open XML SDK) https://docs.microsoft.com/en-us/office/open-xml/how-to-search-and-replace-text-in-a-document-part
This uses a Stream
.
How to: Parse and read a large spreadsheet document (Open XML SDK) https://docs.microsoft.com/en-us/office/open-xml/how-to-parse-and-read-a-large-spreadsheet
// The SAX approach.
static void ReadExcelFileSAX(string fileName)
{
using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(fileName, false))
{
WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
OpenXmlReader reader = OpenXmlReader.Create(worksheetPart);
string text;
while (reader.Read())
{
if (reader.ElementType == typeof(CellValue))
{
text = reader.GetText();
Console.Write(text + " ");
}
}
Console.WriteLine();
Console.ReadKey();
}
}
If I read in the “text” and wish to replace this:
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(filePath, true))
{
Document document = wordDoc.MainDocumentPart.Document;
OpenXmlReader reader = OpenXmlReader.Create(document);
while (reader.Read())
{
//OpenXmlElement element = reader.LoadCurrentElement();
//text = element.InnerText;
text = reader.GetText();
}
}
OpenXmlWriter
expects an OpenXmlPart
or Stream
when you create it.
OpenXmlWriter writer = OpenXmlWriter.Create(#);
writer.WriteStartElement(reader);
writer.WriteElement(#);
writer.WriteEndElement();
writer.Close();
What is the supported approach/method for this?
Using the DOM approach on large files can cause memory exceptions.
There are a number of blog posts documenting how to use the OpenXmlWriter
but this is for creating new files or adding new elements to an existing file, not updating existing data.
Parsing and Reading Large Excel Files with the Open XML SDK http://blogs.msdn.com/b/brian_jones/archive/2010/05/27/parsing-and-reading-large-excel-files-with-the-open-xml-sdk.aspx [Dead Link] https://web.archive.org/web/20151205145806/http://blogs.msdn.com/b/brian_jones/archive/2010/05/27/parsing-and-reading-large-excel-files-with-the-open-xml-sdk.aspx
Writing Large Excel Files with the Open XML SDK http://blogs.msdn.com/b/brian_jones/archive/2010/06/22/writing-large-excel-files-with-the-open-xml-sdk.aspx [Dead Link] https://web.archive.org/web/20160216062257/http://blogs.msdn.com/b/brian_jones/archive/2010/06/22/writing-large-excel-files-with-the-open-xml-sdk.aspx
Performance issue while reading/writing large excel files using OpenXML SDK http://tech-turf.blogspot.com/2015/10/performance-issue-while-readingwriting.html
How to read and write Excel cells with OpenXML and C# http://fczaja.blogspot.com/2013/05/how-to-read-and-write-excel-cells-with.html
How to properly use OpenXmlWriter to write large Excel files http://polymathprogrammer.com/2012/08/06/how-to-properly-use-openxmlwriter-to-write-large-excel-files/
Issue Analytics
- State:
- Created 5 years ago
- Reactions:8
- Comments:9 (7 by maintainers)
Top GitHub Comments
+1 for this - I am currently faced with a presentation containing lots of vector graphics that ends up allocating 15.000.000 objects in memory using the DOM approach. This basically makes our product unusable. I would also love to hear about workaround to avoid loading everything to memory before it is possible to do manipulation on elements.
Sorry for the auto close. I enabled a bot that went through and closed issues that had no comments on it. Happy to reopen and take a look.