Large .xlsx file not getting .Sheets

Hello, I am using the latest version of the xlsx.js library on the client side (browser). For smaller files, it reads xlsx files and renders them to an HTML table without problems.

Now I have a larger xlsx file with 1 million rows. I get the SheetNames attribute, but the .Sheets object is empty. A smaller subset of this file reads perfectly.

What could be the error? Thank you very much!


Update: I used the WTF: true read option and received the following error:

xlsx.self-4eba00dc4b89315f21b672499210ec51d971b33bc29739752e402683284d8353.js:17317 Uncaught RangeError: Invalid string length
    at Array.join (native)
    at Object.arrayLikeToString [as string] (http://localhost:3000/assets/jszip.self-dcc1caf9dd7a61892574ee6089fd0084b3f35682d2f9a06604edf5ba62b4c505.js?body=1:1932:19)
    at Object.exports.transformTo (http://localhost:3000/assets/jszip.self-dcc1caf9dd7a61892574ee6089fd0084b3f35682d2f9a06604edf5ba62b4c505.js?body=1:2052:50)
    at ZipObject.dataToString (http://localhost:3000/assets/jszip.self-dcc1caf9dd7a61892574ee6089fd0084b3f35682d2f9a06604edf5ba62b4c505.js?body=1:712:24)
    at ZipObject.asBinary (http://localhost:3000/assets/jszip.self-dcc1caf9dd7a61892574ee6089fd0084b3f35682d2f9a06604edf5ba62b4c505.js?body=1:760:29)
    at getdatastr (http://localhost:3000/assets/xlsx.self-4eba00dc4b89315f21b672499210ec51d971b33bc29739752e402683284d8353.js?body=1:1556:38)
    at getdata (http://localhost:3000/assets/xlsx.self-4eba00dc4b89315f21b672499210ec51d971b33bc29739752e402683284d8353.js?body=1:1573:95)
    at getzipdata (http://localhost:3000/assets/xlsx.self-4eba00dc4b89315f21b672499210ec51d971b33bc29739752e402683284d8353.js?body=1:1594:19)
    at safe_parse_sheet (http://localhost:3000/assets/xlsx.self-4eba00dc4b89315f21b672499210ec51d971b33bc29739752e402683284d8353.js?body=1:17300:14)
    at parse_zip (http://localhost:3000/assets/xlsx.self-4eba00dc4b89315f21b672499210ec51d971b33bc29739752e402683284d8353.js?body=1:17429:3)
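
The trace above ends in Array.join while JSZip is turning a whole zip entry into one string, so the failure looks like an engine string-length limit rather than a malformed file. Here is a minimal sketch of how that case could be detected in application code. The helper name `readWorkbookSafely` and its return shape are made up for illustration; `readFn` is assumed to be `XLSX.read` (or anything with the same `(data, options)` signature), and the exact error message text is engine-specific.

```javascript
// Hypothetical helper: `readFn` stands in for XLSX.read so the sketch
// does not depend on a particular build of the library.
function readWorkbookSafely(readFn, data, options) {
  try {
    // WTF: true makes the parser throw instead of silently skipping a sheet,
    // which is how the error in this issue became visible.
    const workbook = readFn(data, Object.assign({}, options, { WTF: true }));
    return { workbook, error: null, stringTooLong: false };
  } catch (err) {
    // Assumption: the engine reports the limit as a RangeError whose message
    // contains "Invalid string length" (the wording seen in Chrome above).
    const stringTooLong =
      err instanceof RangeError && /invalid string length/i.test(err.message);
    return { workbook: null, error: err, stringTooLong };
  }
}
```

With the real library this would be called as `readWorkbookSafely(XLSX.read, bytes, { type: "array" })`. A result with `stringTooLong` set to true suggests the sheet XML was too large for a single string, not that the file is corrupt.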

Issue Analytics

  • State: open
  • Created 6 years ago
  • Reactions: 6
  • Comments: 9 (2 by maintainers)

Top GitHub Comments

1 reaction
kisaragi99 commented, Dec 16, 2021

I need help. I get the same error message that jensJJ had. I tried to parse the calendar_stress_test file (74 MB, 1 million rows) from https://github.com/SheetJS/sheetjs/issues/61#issuecomment-47676371 after converting it to xlsx.

I got this error:

[image]

If I'm not mistaken, the issue with the V8 maximum string length is obsolete, so the size of sheet1 (572 MB) should be fine?

Archive:  calendar_stress_test_copy.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
     1348  01-01-1980 00:00   [Content_Types].xml
      733  01-01-1980 00:00   _rels/.rels
      831  01-01-1980 00:00   xl/_rels/workbook.xml.rels
     1393  01-01-1980 00:00   xl/workbook.xml
      383  01-01-1980 00:00   xl/sharedStrings.xml
     7646  01-01-1980 00:00   xl/theme/theme1.xml
     2285  01-01-1980 00:00   xl/styles.xml
572078353  01-01-1980 00:00   xl/worksheets/sheet1.xml
    38292  01-01-1980 00:00   docProps/thumbnail.jpeg
      792  01-01-1980 00:00   docProps/app.xml
168383066  01-01-1980 00:00   xl/calcChain.xml
      593  01-01-1980 00:00   docProps/core.xml
---------                     -------
740515715                     12 files

However, it worked perfectly in Firefox, taking approximately 2 minutes.

(No browser extensions were enabled in either Chrome or Firefox.)
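
The sizes in the listing above can be checked against per-engine string limits with a little arithmetic. This is only a sketch: the two limits are assumptions taken from MDN's notes on `String.length` (64-bit V8 and recent Firefox), not something stated in this thread, and `fitsInOneString` is a made-up helper. It also assumes one string character per byte, which matches the `asBinary` call in the stack trace earlier in the issue.

```javascript
// Assumed engine limits, in characters per string (see MDN, String.length).
const V8_64BIT_MAX_LENGTH = 2 ** 29 - 24; // 536,870,888
const FIREFOX_MAX_LENGTH = 2 ** 30 - 2;   // 1,073,741,822

// Uncompressed sizes copied from the archive listing in this comment.
const entrySizes = {
  "xl/worksheets/sheet1.xml": 572078353,
  "xl/calcChain.xml": 168383066,
};

// Hypothetical helper: true if an entry of `byteLength` bytes could be held
// in a single binary string (one character per byte) under `maxLength`.
function fitsInOneString(byteLength, maxLength) {
  return byteLength <= maxLength;
}

for (const [name, size] of Object.entries(entrySizes)) {
  console.log(
    name,
    "V8:", fitsInOneString(size, V8_64BIT_MAX_LENGTH),
    "Firefox:", fitsInOneString(size, FIREFOX_MAX_LENGTH)
  );
}
// xl/worksheets/sheet1.xml V8: false Firefox: true
// xl/calcChain.xml V8: true Firefox: true
```

Under these assumed limits, sheet1.xml is about 35 million characters over the V8 limit but well within the Firefox one, which would be consistent with the file failing in Chrome and loading in Firefox.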

1 reaction
Kaiido commented, Mar 6, 2021

Today was a holiday here, so I had time to make a simple proof of concept of my idea from https://github.com/SheetJS/sheetjs/issues/792#issuecomment-785733626

You can see it at https://xml-stream-parsing-poc.glitch.me/

Sorry, it was written very quickly since I lack time, and seeing how bad my xlsx parser is might make your eyes pop out of their sockets (I really just wanted to somehow log the values in my sheet). Still, it at least shows that it should be possible to implement such an XML stream parser in your library too.
Using it, I was able to parse 60 MB+ files in Chrome, while SheetJS fails to read them.

The parsing speed obviously suffers with this streaming method, but I believe a slower parser is preferable to one that cannot parse some files at all, especially since streaming also makes it possible to add some kind of progress event.
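
To make the streaming idea concrete, here is a minimal sketch of scanning sheet XML chunk by chunk, so that the full document never has to exist as one string. It is not the code from the proof of concept linked above and it is not a real XML parser. `createCellValueScanner` is a made-up name, and the scanner only pulls out the raw text between `<v>` and `</v>`, ignoring shared strings, inline strings, entities and cell references.

```javascript
// Hypothetical chunked scanner: reports the text of every <v>...</v> element,
// even when a tag or a value is split across two chunks.
function createCellValueScanner(onValue) {
  const OPEN = "<v>";
  const CLOSE = "</v>";
  let buffer = ""; // only the unprocessed tail of the input is kept

  return {
    push(chunk) {
      buffer += chunk;
      let pos = 0;
      while (true) {
        const start = buffer.indexOf(OPEN, pos);
        if (start === -1) {
          // No opening tag left: keep just enough characters to complete
          // a "<v>" that may have been cut off at the chunk boundary.
          buffer = buffer.slice(Math.max(pos, buffer.length - (OPEN.length - 1)));
          return;
        }
        const end = buffer.indexOf(CLOSE, start + OPEN.length);
        if (end === -1) {
          // The value is incomplete: keep it until the next chunk arrives.
          buffer = buffer.slice(start);
          return;
        }
        onValue(buffer.slice(start + OPEN.length, end));
        pos = end + CLOSE.length;
      }
    },
  };
}

// Usage: feed decoded text chunks as they arrive.
const values = [];
const scanner = createCellValueScanner((v) => values.push(v));
['<row><c r="A1"><v>1', '2</v></c><c r="B1"><', "v>3.5</v></c></row>"]
  .forEach((chunk) => scanner.push(chunk));
console.log(values); // [ '12', '3.5' ]
```

Where the chunks come from (unzipping the worksheet entry and decoding it incrementally) is left out of the sketch. Counting chunks or bytes inside `push` would be one way to drive the progress event mentioned above.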
