Allow skipping EOF validation in row block processing

Because of this piece of code:

https://github.com/nissl-lab/npoi/blob/408ec5aaa9cff0e359c1e837074b161de18ed609/main/HSSF/Model/RecordOrderer.cs#L436-L438

We are having trouble reading XLS files generated by a third party. This has been discussed, for example, here. Excel itself and the validators accept this invalid format and fix it on save, but in POI and NPOI this validation kicks in and throws an error that cannot easily be circumvented.

Would you accept a PR that checks a global static flag, for example RecordOrderer.ValidateEofOnEndOfRowBlock? It would default to true, and setting it to false would allow reading files that do not fully follow the spec.
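The proposed flag can be sketched roughly as follows. This is an illustrative model only, written in JavaScript to match the other snippet in this thread; it is not NPOI's C# code. It assumes the linked check throws when an EOF record appears at the end of a row block before a WINDOW2 record, and that lenient mode would simply treat EOF as the end of the row block. The function name isEndOfRowBlock, the error message, and the lenient behaviour are assumptions; only the flag name comes from the proposal above.

```javascript
// Illustrative model, NOT NPOI's implementation. All names except the
// flag name are hypothetical.
const SID_EOF = 0x000a;     // BIFF EOF record identifier
const SID_WINDOW2 = 0x023e; // BIFF8 WINDOW2 record identifier

const RecordOrderer = {
  // Proposed opt-out switch; true keeps the current strict behaviour.
  ValidateEofOnEndOfRowBlock: true,

  // Returns true when the record id marks the end of a row block.
  isEndOfRowBlock(sid) {
    if (sid === SID_WINDOW2) {
      return true;
    }
    if (sid === SID_EOF) {
      if (RecordOrderer.ValidateEofOnEndOfRowBlock) {
        // Strict mode: assumed to mirror the existing validation.
        throw new Error("Found EOF before WINDOW2 was encountered");
      }
      // Lenient mode (assumption): accept EOF as the end of the row block.
      return true;
    }
    return false;
  },
};
```

Under this sketch, a caller would set RecordOrderer.ValidateEofOnEndOfRowBlock = false once before opening a non-conforming file. A global static flag affects every workbook read in the process, which is the trade-off of the approach proposed here.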

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 14 (7 by maintainers)

Top GitHub Comments

3 reactions
SheetJSDev commented, Mar 13, 2022

@tonyqus there are errors in the specs and errors in plenty of third party software, so it’s generally best to start with Excel’s behavior. Attached is a sample BIFF5 XLS file generated using SheetJS https://jsfiddle.net/bk3pjs74/

var ws = XLSX.utils.aoa_to_sheet([
  ["a","b","c"],
  [1,2,3]
]);
var wb = XLSX.utils.book_new();
XLSX.utils.book_append_sheet(wb, ws, "Sheet1");
XLSX.writeFile(wb, "npoi776.xls", { bookType: "biff5" });

BIFF5 file: npoi776.xls

To see a BIFF8 version of this, the library has to be patched to remove the current Window2 write call:

$ git clone --depth=1 https://github.com/sheetjs/sheetjs
$ cd sheetjs
$ <xlsx.flow.js sed '/if.*write_Window2/d' > testlib.js
$ node -pe 'var XLSX = require("./testlib"); var ws = XLSX.utils.aoa_to_sheet([["a","b","c"],[1,2,3]]);var wb = XLSX.utils.book_new();XLSX.utils.book_append_sheet(wb, ws, "Sheet1");XLSX.writeFile(wb, "npoi776.biff8.xls", { bookType: "biff8" });'

BIFF8 file: npoi776.biff8.xls

Both files pass Microsoft’s Binary File Format Validator and open in Excel 2019 for Windows and for Mac. It’s reasonable to assume this is an error in the spec.

PS: From our old test machine, the program files have the following checksums:

MD5 (BFFWrapper.dll) = ae7fca7869a62b88b3e595aafbd8c81d
MD5 (docValidation.dll) = ef0da1baf8dea3b635ae4be478f643b6
MD5 (pptValidation.dll) = 28171b2b28ea74e03bd1ae07f36d50fd
MD5 (xlsValidation.dll) = 79a2a96d28fb630509b2d88e6a6c8676
MD5 (BFFValidator.exe) = b643c993e358c87423f47c8e8e99aa6f

1 reaction
tonyqus commented, Apr 12, 2023

Can you create a PR for this? I think I can review the code. @lahma
