Duplicate columns in ASCII tables quietly break column name and format parsing

I have found that when a column name is duplicated in an ASCII table, column name and format parsing quietly break. For example:

This works:

from astropy.io import ascii

# Whitespace-delimited table with three uniquely named columns
data = """
day precip type
Mon  1.5   rain
Tues 0.0  rain
Wed  1.1 snow
"""

table = ascii.read(data)
table.info()

<Table length=3>
 name   dtype
------ -------
   day    str4
precip float64
  type    str4

If a column name is duplicated, the parsing quietly fails: the header row is read as data and the columns get default names.

data = """
day precip type day
Mon  1.5   rain  Mon
Tues 0.0  rain   Tues
Wed  1.1 snow    Wed
"""

table = ascii.read(data)
table.info()


<Table length=4>
name dtype
---- -----
col1  str4
col2  str6
col3  str4
col4  str3
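One possible workaround, sketched here as an assumption rather than anything proposed in the issue, is to make the header names unique before handing the table to the reader. The helper `make_unique` below is hypothetical (it is not part of astropy), and the `_1`, `_2` suffix scheme is an arbitrary choice.

```python
from collections import Counter


def make_unique(names):
    """Return a copy of `names` where repeated entries get a numeric suffix.

    Hypothetical helper for illustration; not an astropy function.
    """
    seen = Counter()
    taken = set(names)
    unique = []
    for name in names:
        seen[name] += 1
        if seen[name] == 1:
            # First occurrence keeps its original name
            unique.append(name)
            continue
        # Later occurrences get _1, _2, ... skipping names already in use
        suffix = seen[name] - 1
        candidate = f"{name}_{suffix}"
        while candidate in taken:
            suffix += 1
            candidate = f"{name}_{suffix}"
        taken.add(candidate)
        unique.append(candidate)
    return unique


data = """
day precip type day
Mon  1.5   rain  Mon
Tues 0.0  rain   Tues
Wed  1.1 snow    Wed
"""

# The first non-blank line is the header row
header = next(line for line in data.splitlines() if line.strip())
names = make_unique(header.split())
print(names)  # ['day', 'precip', 'type', 'day_1']
```

With unique names in hand, one option would be `ascii.read(data, format='no_header', data_start=1, names=names)`, which skips the header row and labels the columns explicitly. That call is a suggestion that was not tested in the original report.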

Issue Analytics

  • State: closed
  • Created 7 years ago
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

pllim commented, Oct 3, 2016 (1 reaction)

Since this is an expected feature, can we close the issue?

taldcroft commented, Oct 3, 2016

I think that the original issue here, namely reading the file as a different format from expected, has been resolved. io.ascii is behaving correctly and as expected, given the requirement of unique column names.

So I’m closing this, but with the follow-on issue #5374 to consider modifying that requirement and allowing duplicates in the input.
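The behavior described in this comment can be pictured with a toy model of format guessing. This is a simplified sketch in plain Python, not io.ascii's actual code, and the function names are hypothetical. It assumes the reader tries a header-based format first, rejects it because the names are not unique, and then falls back to a no-header format.

```python
def read_with_header(lines):
    """Treat the first row as column names; reject duplicate names."""
    names = lines[0].split()
    if len(set(names)) != len(names):
        raise ValueError("duplicate column names")
    return names, [line.split() for line in lines[1:]]


def read_no_header(lines):
    """Treat every row as data and generate col1, col2, ... names."""
    rows = [line.split() for line in lines]
    names = [f"col{i}" for i in range(1, len(rows[0]) + 1)]
    return names, rows


def guess_read(text):
    """Try each reader in order and return the first result that succeeds."""
    lines = [line for line in text.splitlines() if line.strip()]
    for reader in (read_with_header, read_no_header):
        try:
            return reader(lines)
        except ValueError:
            continue
    raise ValueError("no reader could parse the table")


data = """
day precip type day
Mon  1.5   rain  Mon
Tues 0.0  rain   Tues
Wed  1.1 snow    Wed
"""

names, rows = guess_read(data)
print(names)      # ['col1', 'col2', 'col3', 'col4']
print(len(rows))  # 4
```

In this model the header row ends up as the first data row, which is consistent with the reported `Table length=4` and the `col1` to `col4` names.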
