Support nulls in ECSV boolean columns?

Description

The ECSV format is described, and as far as I know defined, in https://github.com/astropy/astropy-APEs/blob/master/APE6.rst. That document does not discuss the encoding of blank values in any detail. In practice, an empty string value in an int32-typed column can be read by the Astropy implementation and is represented by a masked value. However, an empty string value in a bool-typed column causes a read error. If file refers to the following text:

# %ECSV 0.9
# ---
# datatype: [
#   { name: AA, datatype: int32   },
#   { name: BB, datatype: bool    },
#   { name: CC, datatype: float64 },
# ]
AA BB CC
1 True 2.1
2 False 5.4
"" "" nan
9 True -9.9

then astropy.table.Table.read(file, format='ascii.ecsv') gives me a ValueError: Column BB failed to convert: bool input strings must be only False or True. This is Astropy 4.0, Python 3.6.

Is this intended behaviour? There are cases where a boolean column could contain nulls as well as Trues and Falses (I came across this myself when trying to represent certain Gaia data in ECSV - the table in question had boolean columns containing null values, which meant the Astropy reader could not read my ECSV serialisation of the table). APE6 does say “Boolean fields are represented as the case-sensitive string False or True”, but since nulls are not discussed elsewhere (e.g. it doesn’t mention nulls in integer columns) it’s not clear whether this counts as a declaration that nulls are intentionally outlawed for the bool type.

The ECSV I/O handlers I have written for STIL/TOPCAT do work with blanks in bool-typed columns. I request that the Astropy ECSV implementation considers doing the same.
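
To make the requested behaviour concrete, here is a minimal sketch in plain Python of a tolerant boolean conversion. It is not Astropy's or STIL's actual code: the function name parse_ecsv_bools and the (values, mask) return shape are hypothetical, chosen only to illustrate treating an empty string as a masked entry while still rejecting anything other than the case-sensitive strings True and False.

```python
from typing import List, Sequence, Tuple


def parse_ecsv_bools(tokens: Sequence[str]) -> Tuple[List[bool], List[bool]]:
    """Convert ECSV boolean tokens into parallel (values, mask) lists.

    Hypothetical helper for illustration only. An empty token is treated
    as a null: its mask entry is True and its value is a False placeholder.
    Any token other than "True", "False" or "" raises ValueError.
    """
    values: List[bool] = []
    mask: List[bool] = []
    for tok in tokens:
        if tok == "":
            # Null cell: store a placeholder value and flag it as masked.
            values.append(False)
            mask.append(True)
        elif tok == "True":
            values.append(True)
            mask.append(False)
        elif tok == "False":
            values.append(False)
            mask.append(False)
        else:
            # APE6 makes the strings case-sensitive, so "true" is rejected.
            raise ValueError(
                f"bool input strings must be only False, True or empty: {tok!r}"
            )
    return values, mask


# The BB column from the example file above, after quote removal.
values, mask = parse_ecsv_bools(["True", "False", "", "True"])
```

With the BB column from the example file, the third entry comes back flagged in mask rather than raising an error, which mirrors how the empty int32 cell in column AA is already handled.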

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

mbtaylor commented, Nov 4, 2020 (1 reaction)

Sure. What I’d expect, and would like to see, is this:

>>> from astropy.table import Table
>>> t2 = Table.read(s, format='ascii.ecsv')
>>> t2
  AA    BB     CC
int32  bool float64
----- ----- -------
    1  True     2.1
    2 False     5.4
   --    --     nan
    9  True    -9.9
>>> t2[2][0]
masked
>>> t2[2][1]
masked
>>> type(t2[1][1])
<class 'numpy.bool_'>
>>> type(t2[2][1])
<class 'numpy.bool_'>

For the io.ascii.read option, the BB column header does not report type bool and the cell types are not reported as numpy.bool_.

mbtaylor commented, Nov 5, 2020

Great! Many thanks for the quick service.
