Huge memory consumption when processing file

Hello!

When I read the file manually with code like this:

```csharp
int rowCntr = 0;
List<List<object>> rows = new List<List<object>>();
List<object> row;

// `stream` is an already-open Stream over the workbook.
using (var reader = ExcelReaderFactory.CreateReader(stream))
{
    try
    {
        do
        {
            while (reader.Read())
            {
                // Count rows read so far; reported if reading fails.
                rowCntr++;
                int columnsCount = reader.FieldCount;
                row = new List<object>();
                for (int i = 0; i < columnsCount; i++)
                {
                    row.Add(reader.GetValue(i));
                }
                rows.Add(row);
            }
        } while (reader.NextResult()); // move on to the next sheet
    }
    catch (Exception)
    {
        MessageBox.Show(rowCntr.ToString(), "");
    }
}
```

my application grows to approximately 400-500 MB of memory.

When I try to read the .xls file like this:

```csharp
using (var reader = ExcelReaderFactory.CreateReader(stream))
{
    try
    {
        // Load every sheet into a DataSet in one call.
        result = reader.AsDataSet();
    }
    catch (Exception)
    {
        MessageBox.Show("", "");
    }
}
```

my application grows to approximately 800-900 MB of memory. This looks like a bug. When I debugged the first approach, I found that the rows collection contains… 1,048,567 elements! My file contains just 17 (!!!) rows. It looks like the Excel reader doesn’t detect the end of the non-empty part of the file.

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 13

Top GitHub Comments

1 reaction
appel1 commented, Aug 31, 2017

For AsDataSet, I think it makes more sense by default to include empty lines except for trailing empty lines, to better match what you see in Excel, which I think is what most people expect. As far as I know, the DataSet can’t include metadata, so trailing lines that contain only metadata should count as empty, which solves this issue. It also doesn’t hurt to have a row filter callback for more advanced scenarios.
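
To illustrate the "keep interior empty rows, drop trailing ones" behavior described here, the following is a minimal sketch that works on a `rows` list like the one built by the manual read loop in the issue. `RowTrimmer`, `IsEmptyRow`, and `TrimTrailingEmptyRows` are hypothetical names and not part of ExcelDataReader, and the sketch assumes that empty cells come back as `null`, `DBNull`, or whitespace-only strings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RowTrimmer
{
    // Assumption: an "empty" cell is null, DBNull, or a whitespace-only string.
    public static bool IsEmptyRow(IReadOnlyList<object> row)
    {
        return row.All(cell =>
            cell == null ||
            cell is DBNull ||
            (cell is string s && string.IsNullOrWhiteSpace(s)));
    }

    // Removes empty rows from the end of the list only. Empty rows that sit
    // between populated rows are kept, so row positions still match Excel.
    public static void TrimTrailingEmptyRows(List<List<object>> rows)
    {
        int last = rows.Count - 1;
        while (last >= 0 && IsEmptyRow(rows[last]))
        {
            last--;
        }
        rows.RemoveRange(last + 1, rows.Count - (last + 1));
    }
}
```

Calling `RowTrimmer.TrimTrailingEmptyRows(rows)` after the read loop would shrink the collection to the populated part of the sheet. Note that this only reduces what is kept afterwards; the empty rows are still read and allocated first, so it probably does little for peak memory.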

0 reactions
andersnm commented, Aug 31, 2017

I think we can resolve this issue (and #160) by implementing a new row filtering callback option for AsDataSet(). By default this option can exclude ALL empty rows, and the caller can override it if needed.

(It won’t allow skipping only trailing empty rows, but that isn’t a requirement.)
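
The row filtering idea can be sketched independently of the library. The example below is an illustration under stated assumptions, not the ExcelDataReader API: `FilteredLoader`, `Load`, and `SkipEmptyRows` are hypothetical names. It reads from any `System.Data.IDataReader` (assuming the Excel reader can be consumed through that interface, as the `Read`, `FieldCount`, `GetValue`, and `NextResult` calls in the issue suggest) and asks a caller-supplied callback whether to keep each row. The default callback excludes all empty rows.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class FilteredLoader
{
    // Default filter: keep a row only if at least one cell has a value.
    // Assumption: empty cells surface as null or DBNull.
    public static bool SkipEmptyRows(IDataRecord row)
    {
        for (int i = 0; i < row.FieldCount; i++)
        {
            object value = row.GetValue(i);
            if (value != null && !(value is DBNull))
            {
                return true;
            }
        }
        return false;
    }

    // Reads every result set and stores only the rows the filter accepts.
    public static List<object[]> Load(
        IDataReader reader, Func<IDataRecord, bool> filterRow = null)
    {
        if (filterRow == null)
        {
            filterRow = SkipEmptyRows;
        }

        var rows = new List<object[]>();
        do
        {
            while (reader.Read())
            {
                // Rejected rows are never copied or stored.
                if (!filterRow(reader))
                {
                    continue;
                }

                var values = new object[reader.FieldCount];
                for (int i = 0; i < values.Length; i++)
                {
                    values[i] = reader.GetValue(i);
                }
                rows.Add(values);
            }
        } while (reader.NextResult());

        return rows;
    }
}
```

With this shape, `FilteredLoader.Load(reader)` would drop the metadata-only rows as they are read instead of after the fact, and a caller who wants every row can override the default with `FilteredLoader.Load(reader, row => true)`.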
