OutOfMemoryException when parsing Excel document / endless while-loop
I’m trying to parse an Excel document (xls) using the OpenMcdf.Extensions package.
We call the AsOLEProperties extension method, and eventually execution gets stuck in this while loop (https://github.com/ironfede/openmcdf/blob/master/sources/OpenMcdf/CompoundFile.cs#L1493-L1514):
while (true)
{
    if (nextSecID == Sector.ENDOFCHAIN)
        break;

    Sector ms = new Sector(Sector.MINISECTOR_SIZE, sourceStream);
    byte[] temp = new byte[Sector.MINISECTOR_SIZE];

    ms.Id = nextSecID;
    ms.Type = SectorType.Mini;

    miniStreamView.Seek(nextSecID * Sector.MINISECTOR_SIZE, SeekOrigin.Begin);
    miniStreamView.Read(ms.GetData(), 0, Sector.MINISECTOR_SIZE);

    result.Add(ms);

    miniFATView.Seek(nextSecID * 4, SeekOrigin.Begin);
    nextSecID = miniFATReader.ReadInt32();
}
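When the loop is entered, nextSecID is 27. At the end of the first iteration, nextSecID is set to 0, and it stays 0 from then on because the same data is read on every iteration. Since 0 never equals Sector.ENDOFCHAIN, the loop never exits and result grows until memory runs out.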
Is 0 even a valid value for nextSecID?
Any idea, what to do about this?
The document in question can be opened fine in Excel. Unfortunately, we got it from a customer of ours, so we can’t share it.
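The behaviour described above can be reproduced without the customer file. The following is only a sketch: the in-memory table, the class name MiniChainDemo and the method TryWalk are hypothetical and are not part of OpenMcdf. It assumes a mini FAT in which entry 27 points at 0 and entry 0 points at itself, and it stops the walk once more steps have been taken than the table has entries, which can only happen if the chain loops.

```csharp
using System;
using System.Collections.Generic;

public static class MiniChainDemo
{
    // End-of-chain marker of the compound file format (0xFFFFFFFE) read as a signed int.
    // Declared locally so the sketch does not depend on the library.
    public const int EndOfChain = unchecked((int)0xFFFFFFFE);

    // Follows a chain through an in-memory mini FAT (hypothetical helper).
    // Returns false when an id is out of range or when the walk would take
    // more steps than the table has entries, which implies a loop.
    public static bool TryWalk(int[] miniFat, int startId, List<int> chain)
    {
        int next = startId;
        while (next != EndOfChain)
        {
            if (next < 0 || next >= miniFat.Length || chain.Count >= miniFat.Length)
                return false;

            chain.Add(next);
            next = miniFat[next];
        }
        return true;
    }

    public static void Main()
    {
        // Assumed table contents: every entry ends its chain,
        // except 27 -> 0 and 0 -> 0 (the self-reference from the report).
        int[] miniFat = new int[28];
        for (int i = 0; i < miniFat.Length; i++)
            miniFat[i] = EndOfChain;
        miniFat[27] = 0;
        miniFat[0] = 0;

        var chain = new List<int>();
        bool ok = TryWalk(miniFat, 27, chain);
        Console.WriteLine($"ok={ok}, sectors visited={chain.Count}");
    }
}
```

In this sketch 0 is an acceptable sector id as such; what makes the table invalid is that the entry for sector 0 names sector 0 as its successor, so the end-of-chain marker is never read.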
Issue Analytics
- Created 5 years ago
- Comments:11 (7 by maintainers)
@ironfede Of course, this doesn’t fix the underlying problem, but at least the load operation fails fast and does not cause an OutOfMemoryException. So it’s a quick fix.
That being said, I don’t even think that this PR fixed all problems - only the simple ones. In multiple places, the following check is performed:
This certainly helps when you have a corrupt file where one sector points to itself. But what if you have something like a cyclic chain?
1 -> 2 -> 3 -> 1
This wouldn’t be caught, and we would be in the same mess as before. I would propose changing that check so that we keep track of all processed sector ids, and if we hit one we have seen before, we throw the CFCorruptedFileException.
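A minimal sketch of that proposal, not the code from the PR: ChainWalker, Walk and CorruptedChainException are hypothetical names, the exception is a local stand-in for the library's CFCorruptedFileException so the example compiles on its own, and the FAT is modelled as a plain in-memory list where fat[i] holds the id that follows sector i.

```csharp
using System;
using System.Collections.Generic;

// Local stand-in for the library's CFCorruptedFileException (assumption for this sketch).
public class CorruptedChainException : Exception
{
    public CorruptedChainException(string message) : base(message) { }
}

public static class ChainWalker
{
    // End-of-chain marker of the compound file format (0xFFFFFFFE) read as a signed int.
    public const int EndOfChain = unchecked((int)0xFFFFFFFE);

    // Walks a sector chain and remembers every id it has processed.
    // Seeing an id a second time means the chain is cyclic, so the walk
    // fails fast instead of looping until memory runs out.
    public static List<int> Walk(IReadOnlyList<int> fat, int startId)
    {
        var chain = new List<int>();
        var seen = new HashSet<int>();
        int next = startId;

        while (next != EndOfChain)
        {
            if (next < 0 || next >= fat.Count)
                throw new CorruptedChainException($"Sector id {next} is out of range.");

            // HashSet<T>.Add returns false when the value is already present.
            if (!seen.Add(next))
                throw new CorruptedChainException($"Cyclic sector chain detected at id {next}.");

            chain.Add(next);
            next = fat[next];
        }
        return chain;
    }

    public static void Main()
    {
        // The cyclic chain from above: 1 -> 2 -> 3 -> 1 (index 0 is unused here).
        int[] cyclic = { EndOfChain, 2, 3, 1 };
        try
        {
            Walk(cyclic, 1);
        }
        catch (CorruptedChainException ex)
        {
            Console.WriteLine(ex.Message);
        }

        // A well-formed chain: 1 -> 2 -> 3 -> end of chain.
        int[] healthy = { EndOfChain, 2, 3, EndOfChain };
        Console.WriteLine(string.Join(" -> ", Walk(healthy, 1)));
    }
}
```

A set of seen ids also covers the simple case of a sector pointing to itself, so one check would handle both kinds of corruption, at the cost of one set entry per sector in the chain.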
As for the source file: It was created by one of our customers. I’ll try to strip out all the content and to reproduce the error. If it still happens, I might be able to share it with you.
Yes, there are potentially issues in multiple places (there is a file attached to #40 that makes it blow up in GetDifatSectorChain).