ascii.write fails with nondescript error on structured (multi-D) columns

Description

When trying to write a table with structured columns (having additional dimensions) to an ASCII format with the default fast writer, the write fails with a rather confusing error.

Expected behavior

Since the fast writer does not support such dtypes, it should either raise an exception clarifying that the fast_writer=False option should be used, or fall back to the Python writer directly (with an appropriate warning).
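Until something like that exists in the library, the fallback can be approximated on the caller's side. The helper below is only a sketch: write_ascii_with_fallback is a hypothetical name, not an astropy function, and it assumes that the fast writer's failure surfaces as a TypeError (as in the traceback in this report) and that the table's write method accepts fast_writer=False. Whether the Python writer then accepts the column depends on the astropy version.

```python
import warnings


def write_ascii_with_fallback(table, path, **kwargs):
    """Write `table`, retrying with the pure-Python writer on TypeError.

    Hypothetical helper, not part of astropy. `table` only needs an
    astropy-style ``write(path, **kwargs)`` method.
    """
    try:
        table.write(path, **kwargs)
    except TypeError as exc:
        if kwargs.get("fast_writer") is False:
            # The caller already asked for the Python writer; nothing to retry.
            raise
        warnings.warn(
            f"fast writer failed ({exc}); retrying with fast_writer=False",
            RuntimeWarning,
        )
        kwargs["fast_writer"] = False  # kwargs is this function's own dict
        table.write(path, **kwargs)
```

With the table from the reproduction steps this would be called as write_ascii_with_fallback(t, 'test.txt', format='ascii.basic'). If the failed first attempt leaves a partial file behind, passing overwrite=True may also be necessary.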

Steps to Reproduce

>>> import numpy as np
>>> from astropy.table import Table, Column
>>> t = Table([Column(np.ones(3, dtype=('f', (2,))), name='R')])
>>> t
<Table length=3>
  R [2]   
 float32  
----------
1.0 .. 1.0
1.0 .. 1.0
1.0 .. 1.0
>>> t.write('test.txt', format='ascii.basic')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/lib/python3.8/site-packages/astropy/table/connect.py", line 127, in __call__
    registry.write(instance, *args, **kwargs)
  File "/opt/lib/python3.8/site-packages/astropy/io/registry.py", line 563, in write
    writer(data, *args, **kwargs)
  File "/opt/lib/python3.8/site-packages/astropy/io/ascii/connect.py", line 26, in io_write
    return write(table, filename, **kwargs)
  File "/opt/lib/python3.8/site-packages/astropy/io/ascii/ui.py", line 851, in write
    writer.write(table, output)
  File "/opt/lib/python3.8/site-packages/astropy/io/ascii/fastbasic.py", line 167, in write
    self._write(table, output, {})
  File "/opt/lib/python3.8/site-packages/astropy/io/ascii/fastbasic.py", line 181, in _write
    writer.write(output, header_output, output_types)
  File "astropy/io/ascii/cparser.pyx", line 1125, in astropy.io.ascii.cparser.FastWriter.write
TypeError: unhashable type: 'list'

Note that it would probably be wise to warn the user about the limited support for such columns in ASCII formats (the Python writer will still write them as str columns like "1.0 .. 1.0"), advising them to replace them with individual 1D columns if needed. See also https://mail.python.org/pipermail/astropy/2020-December/004839.html

Even nicer would of course be a (semi-)automatic “unfolding” functionality for such columns. 😉

System Details

macOS-10.14.6-x86_64-i386-64bit
Python 3.8.6 (default, Oct 15 2020, 01:14:04) [Clang 11.0.0 (clang-1100.0.33.17)]
Numpy 1.20.0rc1
astropy 4.2

Issue Analytics

Created 3 years ago
Comments: 6 (6 by maintainers)

Top GitHub Comments

taldcroft commented, Dec 9, 2020

@dhomeier - good idea to provide a workaround. Here is a generalized version that makes a new table:

import itertools

def flatten_nd_columns(tbl):
    """Return a new table with any multidimensional columns flattened to a set
    of 1-d columns.

    Example
    -------

      >>> import numpy as np
      >>> from astropy.table import Table
      >>> t = Table()
      >>> t['a'] = np.arange(12).reshape(2, 2, 3)
      >>> t['b'] = ['a', 'b']
      >>> flatten_nd_columns(t)
      <Table length=2>
      a.0_0 a.0_1 a.0_2 a.1_0 a.1_1 a.1_2  b
      int64 int64 int64 int64 int64 int64 str1
      ----- ----- ----- ----- ----- ----- ----
          0     1     2     3     4     5    a
          6     7     8     9    10    11    b
    """
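    out = {}
    for col in tbl.itercols():
        if col.ndim == 1:
            # 1-d columns are passed through unchanged
            out[col.info.name] = col
        else:
            # One range per trailing axis; the first axis is the table row
            ranges = [range(ii) for ii in col.shape[1:]]
            for dims in itertools.product(*ranges):
                # e.g. dims == (0, 2) gives the name 'a.0_2'
                name = col.info.name + '.' + '_'.join(str(ii) for ii in dims)
                # Ellipsis keeps the row axis and fixes every trailing index
                out[name] = col[(...,) + dims]
    return tbl.__class__(out)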

taldcroft commented, Dec 8, 2020

I have thought about adding support in ECSV for N-d columns. In theory one can make column names like R.0 or R.1.2. This would be the right format for such support since there can be appropriate metadata in the file to allow round-trip.
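To make the round-trip idea concrete, here is a rough sketch using plain NumPy arrays rather than astropy tables. The functions unfold and fold and the layout of the meta dict are assumptions for illustration only; they are not how ECSV stores anything. The point is just that dotted names like R.1.2 plus the recorded trailing shape are enough to rebuild the original N-d column.

```python
import itertools

import numpy as np


def unfold(name, arr):
    """Split an N-d array into 1-d columns named like 'R.0' or 'R.1.2'."""
    arr = np.asarray(arr)
    cols = {}
    for idx in itertools.product(*(range(n) for n in arr.shape[1:])):
        key = ".".join([name, *map(str, idx)])
        cols[key] = arr[(..., *idx)]  # keep the row axis, fix trailing indices
    # Hypothetical metadata: the trailing shape is all that fold() needs.
    meta = {"name": name, "shape": list(arr.shape[1:])}
    return cols, meta


def fold(cols, meta):
    """Rebuild the N-d array from the 1-d columns and the recorded shape."""
    name, shape = meta["name"], tuple(meta["shape"])
    parts = [
        cols[".".join([name, *map(str, idx)])]
        for idx in itertools.product(*(range(n) for n in shape))
    ]
    # product() varies the last index fastest, matching C-order reshape.
    return np.stack(parts, axis=-1).reshape(len(parts[0]), *shape)


R = np.arange(12).reshape(2, 2, 3)
cols, meta = unfold("R", R)
print(list(cols))  # ['R.0.0', 'R.0.1', 'R.0.2', 'R.1.0', 'R.1.1', 'R.1.2']
assert np.array_equal(fold(cols, meta), R)
```

Without the metadata, a reader could only guess the shape from the column names, which is why a format with a metadata header is the natural home for this.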

Note that ECSV does already detect N-d columns and give a good message, so this can be a template for other readers:

ValueError: ECSV format does not support multidimensional column 'd'
One can filter out such columns using:
names = [name for name in tbl.colnames if len(tbl[name].shape) <= 1]
tbl[names].write(...)