Unexpected Automatic Date Conversion
See original GitHub issueIssue Overview
One of our systems which uses sheetjs
for parsing csv/xlsx data produced an unexpected result this week. The data which caused the issue was "Mayslanding, NJ 08234"
, which the library coerced to a date
as { t: 'n', v: 2313568, w: '5/1/34' }
. My hunch is that the code which attempts to guess if a cell might be a date is too lax given some inputs.
This happens when raw: false
, which is the default. We have worked around it by simply setting raw: true
and doing our own type conversions. Still, this unexpected result seems likely to trip up others using the default setting.
I am willing to volunteer to submit a PR to fix this issue but would like some feedback from a maintainer before writing any code. My guess is that any changes to this date conversion code could very easily cause other unexpected side effects, so advice would be appreciated.
Steps to Reproduce:
Given the (somewhat fabricated) csv data below, the following code will produce the result:
const wkbk = xlsx.readFile('./datesville.csv')
CSV input:
Line 1
"Januaryville, NJ 08234"
"Februaryville, NJ 08234"
"Marchville, NJ 08234"
"Aprilville, NJ 08234"
"Mayslanding, NJ 08234"
"Juneville, NJ 08234"
"Julyville, NJ 08234"
"Augustville, NJ 08234"
"Septemberville, NJ 08234"
"Octoberville, NJ 08234"
"Novemberlanding, NJ 08330"
"Decembertown, NJ 08330"
Unexpected result:
{
A1: { t: 's', v: 'Line 1' },
A2: { t: 'n', v: 2313448, w: '1/1/34' },
A3: { t: 'n', v: 2313479, w: '2/1/34' },
A4: { t: 'n', v: 2313507, w: '3/1/34' },
A5: { t: 'n', v: 2313538, w: '4/1/34' },
A6: { t: 'n', v: 2313568, w: '5/1/34' },
A7: { t: 'n', v: 2313599, w: '6/1/34' },
A8: { t: 'n', v: 2313629, w: '7/1/34' },
A9: { t: 'n', v: 2313660, w: '8/1/34' },
A10: { t: 'n', v: 2313691, w: '9/1/34' },
A11: { t: 'n', v: 2313721, w: '10/1/34' },
A12: { t: 'n', v: 2348815, w: '11/1/30' },
A13: { t: 'n', v: 2348845, w: '12/1/30' },
'!ref': 'A1:A13'
}
Issue Analytics
- State:
- Created 3 years ago
- Reactions:2
- Comments:10 (5 by maintainers)
@jbull328 If this is showing up when you write a CSV, that’s Excel’s automatic conversion. For example, consider the CSV
If you write that as plaintext to a file and open in Excel, it will interpret those as dates:
In any case that’s an issue unrelated to this one.
To be clear, this ultimately boils down to how V8 (Chrome/Node) handles dates:
We try to correct for it in
fuzzydate
by looking for one of the month labels. I suspect the month name regex can be cleaned up a bit to explicitly match the short or long form for names, like