Comments in data file break parsing
- Operating System: Mac OS X
- Node Version: 10.15.0
- NPM Version: 6.4.1
- csv-parser Version: 2.1.0
Expected Behavior
USGS files should be parsed, but their leading hash (#) comment lines break the parsing. A simple data file and the code that parses it are linked below.
I think skipLines might let me ignore those leading hash lines, but what if the number of comment lines changes (e.g. when the government shutdown finally ends, hopefully)? Is there any way to ignore or filter out the lines that start with a hash?
[
  Row { agency_cd: '5s', site_no: '15s', datetime: '20d', tz_cd: '6s', '174907_72019': '14n', '174907_72019_cd': '10s' },
  Row { agency_cd: 'USGS', site_no: '174237064474900', datetime: '2019-01-21 00:00', tz_cd: 'AST', '174907_72019': '14.09', '174907_72019_cd': 'P' },
  Row { agency_cd: 'USGS', site_no: '174237064474900', datetime: '2019-01-21 01:00', tz_cd: 'AST', '174907_72019': '14.09', '174907_72019_cd': 'P' },
  Row { agency_cd: 'USGS', site_no: '174237064474900', datetime: '2019-01-21 02:00', tz_cd: 'AST', '174907_72019': '14.07', '174907_72019_cd': 'P' },
  Row { agency_cd: 'USGS', site_no: '174237064474900', datetime: '2019-01-21 03:00', tz_cd: 'AST', '174907_72019': '14.07', '174907_72019_cd': 'P' },
  Row { agency_cd: 'USGS', site_no: '174237064474900', datetime: '2019-01-21 04:00', tz_cd: 'AST', '174907_72019': '14.07', '174907_72019_cd': 'P' }
]
Actual Behavior
[
  Row { '# ---------------------------------- WARNING ----------------------------------------': '# Some of the data that you have obtained from this U.S. Geological Survey database' },
  Row { '# ---------------------------------- WARNING ----------------------------------------': '# may not have received Director\'s approval. Any such data values are qualified' },
  Row { '# ---------------------------------- WARNING ----------------------------------------': '# as provisional and are subject to revision. Provisional data are released on the' },
  Row { '# ---------------------------------- WARNING ----------------------------------------': '# condition that neither the USGS nor the United States Government may be held liable' },
  Row { '# ---------------------------------- WARNING ----------------------------------------': '# for any damages resulting from its use.' },
  Row { '# ---------------------------------- WARNING ----------------------------------------': '#' },
  Row { '# ---------------------------------- WARNING ----------------------------------------': 'agency_cd' },
  Row { '# ---------------------------------- WARNING ----------------------------------------': '5s' },
  Row { '# ---------------------------------- WARNING ----------------------------------------': 'USGS' },
  Row { '# ---------------------------------- WARNING ----------------------------------------': 'USGS' },
  Row { '# ---------------------------------- WARNING ----------------------------------------': 'USGS' },
  Row { '# ---------------------------------- WARNING ----------------------------------------': 'USGS' },
  Row { '# ---------------------------------- WARNING ----------------------------------------': 'USGS' }
]
How Do We Reproduce?
Super simple test data file: https://gist.githubusercontent.com/mishawagon/d8047ae5aaf29e63bcf1b6348318b7a4/raw/c9ddff257eef6ba430e6024bdea8813c00dd7f2a/USGS%2520Test%2520file
Super simple node.js program to parse it: https://gist.githubusercontent.com/mishawagon/6ff68b94c87c8e3fddd703c92a92e4af/raw/afc6b0858f4bf2e452beffc03d00d7df2d7dd922/gistfile1.txt
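One possible workaround for the question above (a varying number of leading hash lines) is to filter those lines out before the parser ever sees them. The sketch below uses only Node's built-in stream modules. The helper names dropCommentLines and stripCommentLines are hypothetical and not part of csv-parser, and the sketch assumes a comment is any line whose first character is '#'.

```javascript
const { Transform } = require('stream');
const { StringDecoder } = require('string_decoder');

// Hypothetical helpers; neither is part of csv-parser.

// Synchronous variant: remove every line of `text` that starts with `prefix`.
function dropCommentLines(text, prefix = '#') {
  return text
    .split('\n')
    .filter((line) => !line.startsWith(prefix))
    .join('\n');
}

// Streaming variant: a Transform that drops comment lines chunk by chunk.
function stripCommentLines(prefix = '#') {
  const decoder = new StringDecoder('utf8'); // keeps multi-byte characters intact across chunks
  let pending = ''; // incomplete last line carried over to the next chunk

  return new Transform({
    transform(chunk, _encoding, callback) {
      const lines = (pending + decoder.write(chunk)).split('\n');
      pending = lines.pop(); // the last piece may be a partial line
      const kept = lines.filter((line) => !line.startsWith(prefix));
      if (kept.length > 0) this.push(kept.join('\n') + '\n');
      callback();
    },
    flush(callback) {
      // Emit the final line if the input did not end with a newline.
      const last = pending + decoder.end();
      if (last !== '' && !last.startsWith(prefix)) this.push(last);
      callback();
    },
  });
}
```

Assuming the reproduction script reads the file with fs.createReadStream and pipes it into csv-parser, the filter would sit between the two, for example fs.createReadStream(file).pipe(stripCommentLines()).pipe(csv({ separator: '\t' })), where file is a placeholder path and the tab separator is an assumption about the USGS format. Note that this drops every line starting with '#', not only the leading ones, so it would also discard a data row whose first field begins with a hash. Newer csv-parser releases list a skipComments option in their README; whether it is available in 2.1.0 is not established here.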
Issue Analytics
- Created 5 years ago
- Comments: 6
Top GitHub Comments
Thank you very much and totally amazing how you can do this while daddying. Truly you are a gentleperson and a scholar. Thanks again
I realized I never thanked you for taking the time to look into this, thanks again!