duplicate columns in ascii tables quietly break column name and format parsing
I have found that when a column name is duplicated in an ASCII table, the column name and format parsing quietly breaks, e.g.
This works:
from astropy.io import ascii
data = """
day precip type
Mon 1.5 rain
Tues 0.0 rain
Wed 1.1 snow
"""
table = ascii.read(data)
table.info()
<Table length=3>
name dtype
------ -------
day str4
precip float64
type str4
If you have a duplicate column name, the parsing quietly fails: the header line is read as a data row (note length=4) and the columns get default names.
data = """
day precip type day
Mon 1.5 rain Mon
Tues 0.0 rain Tues
Wed 1.1 snow Wed
"""
table = ascii.read(data)
table.info()
<Table length=4>
name dtype
---- -----
col1 str4
col2 str6
col3 str4
col4 str3
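Because the failure is silent, it may help to check the header for repeated names before reading. The following is a minimal sketch using only the standard library. `duplicate_names` is a hypothetical helper, not part of astropy, and it assumes a whitespace-delimited header on the first non-blank line.

```python
from collections import Counter


def duplicate_names(text):
    """Return header names that occur more than once, in first-seen order."""
    # Assumption: the header is the first non-blank line, split on whitespace.
    header = next(line for line in text.splitlines() if line.strip())
    counts = Counter(header.split())
    return [name for name in counts if counts[name] > 1]


data = """
day precip type day
Mon 1.5 rain Mon
Tues 0.0 rain Tues
Wed 1.1 snow Wed
"""
print(duplicate_names(data))
```

An empty list means every header name is distinct.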
Issue Analytics
- State:
- Created 7 years ago
- Comments:5 (4 by maintainers)
Since this is an expected feature, can we close the issue?
I think that the original issue here, namely reading the file as a different format from expected, has been resolved.
io.ascii is behaving correctly and as expected, given the requirement of unique column names. So I'm closing this, with the follow-on issue #5374 to consider modifying that requirement and allowing duplicates in the input.
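Until duplicates are allowed, one possible workaround is to make the header names unique before handing the table to the reader. The sketch below is standard-library only. `make_unique` and its `_1`, `_2` suffix scheme are illustrative choices, not astropy behavior.

```python
def make_unique(names):
    """Append _1, _2, ... to repeated names so every name is distinct."""
    seen = {}           # name -> last suffix tried for that name
    taken = set(names)  # every name already present in the header
    unique = []
    for name in names:
        if name not in seen:
            # First occurrence keeps its original name.
            seen[name] = 0
            unique.append(name)
            continue
        # Bump the counter until the candidate collides with nothing.
        while True:
            seen[name] += 1
            candidate = f"{name}_{seen[name]}"
            if candidate not in taken:
                break
        taken.add(candidate)
        unique.append(candidate)
    return unique


header = "day precip type day".split()
print(make_unique(header))
```

The resulting list could then be supplied through the `names` argument of `ascii.read` while skipping the original header line, for example with `format='no_header'` and `data_start=1`. That combination is an assumption and has not been tested against this input.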