How to set the data type for a whole Table
Hello Tablesaw team. I have a question about setting the data type for a Table when I don’t know the number of columns. I have a 30_000x2000 feature CSV file containing 0.0 and a number of other Double values. If I parse the CSV via:
CsvReadOptions options = CsvReadOptions.builder(csv)
.header(false)
.maxNumberOfColumns(50_000).build();
Table t = Table.read().csv(options);
I get a NumberFormatException, because all the 0.0 values are treated as Short 0. So when the reader reaches a real number like 13.5, it throws an NFE.
But if I add sample(false) to the reader options, it takes about 2:40 to parse the file.
How can I set the data type for the whole Table? As far as I can see, the only way is to set columnTypes in the parser options, but that won’t work because I don’t know the number of columns in the CSV file.
P.S. I used com.univocity.parsers.csv.CsvParser separately to read the same file: parser.parseAll takes 2:20, and parsing the file row by row takes 1:20.
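Since every cell in a file like this is a double, one workaround that sidesteps type detection altogether is to parse the rows directly. Below is a minimal sketch using only the JDK, not Tablesaw or univocity. The class name `DoubleCsvReader` and the method `readAll` are hypothetical, and the sketch assumes a header-less, comma-separated file with no quoted fields and no empty cells. It is not tuned or benchmarked against the timings quoted above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reads an all-double CSV without any type detection.
final class DoubleCsvReader {

    // Assumes: no header row, ',' as the separator, no quoted fields.
    // An empty or non-numeric cell makes Double.parseDouble throw
    // NumberFormatException, which is deliberately not caught here.
    static double[][] readAll(Reader source) {
        List<double[]> rows = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                String[] cells = line.split(",", -1);
                double[] row = new double[cells.length];
                for (int i = 0; i < cells.length; i++) {
                    // "0.0" and "13.5" both parse as doubles, so the column
                    // type never has to be guessed from a sample.
                    row[i] = Double.parseDouble(cells[i].trim());
                }
                rows.add(row);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows.toArray(new double[0][]);
    }

    public static void main(String[] args) {
        double[][] data = readAll(new StringReader("0.0,0.0,13.5\n0.0,2.25,0.0\n"));
        System.out.println(data.length + " rows x " + data[0].length + " columns");
    }
}
```

For a real file, a `java.io.FileReader` or `Files.newBufferedReader(path)` can be passed in place of the `StringReader`. The resulting rows would still have to be copied into Tablesaw columns if a Table is needed afterwards.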
Top GitHub Comments
Yes, exactly!
If you’re reading a CSV file, there is an option in CsvReadOptions that lets you specify the types to be used when you read the file. If you provide just one type (e.g. STRING), all columns will be read as that type. IDK if this is implemented for Excel, however.
On Tue, Jan 25, 2022 at 9:28 AM Wesley @.***> wrote: