question-mark
Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

ak.Array constructor from {"key": whole_array}

See original GitHub issue

In one of my analysis frameworks I introduced a Table class, which is a thin wrapper to numpy.recarray and has also a constructor which takes a dictionary as input. It became quite popular among our users due to the ability to quickly create type-safe 2D recarrays. Here is an example

In [1]: import km3pipe as kp

In [2]: t = kp.Table({'a': [1,2,3], 'b': [4.5, 6.7, 8.9], 'c': False})

In [3]: t
Out[3]: Generic Table <class 'km3pipe.dataclasses.Table'> (rows: 3)

In [4]: print(t)
Generic Table <class 'km3pipe.dataclasses.Table'>
HDF5 location: /misc (no split)
<i8 (dtype: a) = [1 2 3]
<f8 (dtype: b) = [4.5 6.7 8.9]
|b1 (dtype: c) = [False False False]

In [5]: t.dtype
Out[5]: dtype((numpy.record, [('a', '<i8'), ('b', '<f8'), ('c', '?')]))

In [6]: t[0]
Out[6]: (1, 4.5, False)

In [7]: type(t[0])
Out[7]: numpy.record

In [8]: t[0].b
Out[8]: 4.5

In [9]: t[1:3]
Out[9]: Generic Table <class 'km3pipe.dataclasses.Table'> (rows: 2)

The inspiration came from Pandas which offers a similar constructor:

In [5]: import pandas as pd

In [6]: df = pd.DataFrame({'a': [1,2,3], 'b': [4,5,6], 'c': True})

In [7]: df
Out[7]:
   a  b     c
0  1  4  True
1  2  5  True
2  3  6  True

With awkward array, the corresponding constructor only does a “for element in dict”-like iteration, which in Python boils down to an iteration over the dictionary .keys(). This might confuse people who already worked with pandas.DataFrames or similar classes. For the sake of completeness, here is what we get when passing a dictionary (no surprise for us but I post it here in for future reference):

In [1]: import awkward1 as ak

In [2]: arr = ak.Array({'a': [1,2,3], 'b': [4,5,6,7,8]})

In [3]: arr
Out[3]: <Array ['a', 'b'] type='2 * string'>

In [4]: arr.a
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-4-7be3c5184803> in <module>
----> 1 arr.a

~/Dev/awkward-1.0/awkward1/highlevel.py in __getattr__(self, where)
     95                     raise AttributeError("while trying to get field {0}, an exception occurred:\n{1}: {2}".format(repr(where), type(err), str(err)))
     96             else:
---> 97                 raise AttributeError("no field named {0}".format(repr(where)))
     98
     99     def __dir__(self):

AttributeError: no field named 'a'

With pandas it’

Of course the DataFrame or Table classes are mostly simple 2D table with the requirement of equal lengths for each field (that’s why it e.g. auto-repeat-expands single value attributes) but I think this constructor would be a handy addition to akward.Arrays, especially since it accepts awkward data shapes and layouts 😉

What do you think? I am not sure if I find time to come up with a solid PR until 1.0.

Issue Analytics

  • State:closed
  • Created 4 years ago
  • Comments:5 (4 by maintainers)

github_iconTop GitHub Comments

1reaction
jpivarskicommented, Mar 10, 2020

I’m going to open a PR to handle these constructor issues. I’m not sure how much it will try to address, but there will be a place to look.

  • The ak.Array and ak.Record constructor bugs (iterating over the dict keys and implementing that FIXME)
  • Make fromiter iterate over py::array (mentioned in #146)
  • Possibly, but not necessarily, ak.zip.
0reactions
jpivarskicommented, Mar 11, 2020

I really should have linked these issues to the PR. The signature you were asking for is available in ak.zip, but it’s not an ak.Array signature.

>>> import awkward1 as ak
>>> array = ak.zip({"x": ak.Array([[1, 2], [], [3]]), "y": ak.Array([1.1, 2.2, 3.3])})

>>> print(array)
[{x: [1, 2], y: 1.1}, {x: [], y: 2.2}, {x: [3], y: 3.3}]

>>> ak.typeof(array)
3 * {"x": var * int64, "y": float64}

Since Python lists are “array-like,” they can be used for small examples:

>>> array = ak.zip({"x": [[1, 2], [], [3]], "y": [1.1, 2.2, 3.3]})

>>> print(array)
[{x: [1, 2], y: 1.1}, {x: [], y: 2.2}, {x: [3], y: 3.3}]

>>> ak.typeof(array)
3 * {"x": var * int64, "y": float64}

It would be confusing to allow this syntax in the ak.Array constructor because that constructor argument is interpreted as the inverse of ak.list:

>>> not_array = ak.Record({"x": [[1, 2], [], [3]], "y": [1.1, 2.2, 3.3]})
>>> print(not_array)
{x: [[1, 2], [], [3]], y: [1.1, 2.2, 3.3]}
>>> ak.typeof(not_array)
{"x": var * var * int64, "y": var * float64}

>>> ak.Array({"x": [[1, 2], [], [3]], "y": [1.1, 2.2, 3.3]})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jpivarski/irishep/awkward-1.0/awkward1/highlevel.py", line 34, in __init__
    raise TypeError("could not convert dict into an awkward1.Array; try awkward1.Record")
TypeError: could not convert dict into an awkward1.Array; try awkward1.Record
Read more comments on GitHub >

github_iconTop Results From Across the Web

ak.Array — Awkward Array 2.0.0 documentation
The behavior parameter passed into this Array's constructor. ... See ak.behavior for a list of recognized key patterns and their meanings. ak.Array.mask#.
Read more >
ak.pad_none — Awkward Array 2.0.2 documentation
At axis=0 , this operation pads the whole array, adding None at the ... is significant, for instance in broadcasting rules (see ak.broadcast_arrays...
Read more >
Whole Array and Additional Array Features | SpringerLink
To introduce ways in which we can initialise arrays using array constructors. To introduce the where statement and array masking.
Read more >
Array() constructor - JavaScript - MDN Web Docs
The Array() constructor is used to create Array objects. Syntax. new Array(element0, element1, /* … , ...
Read more >
6.2. Traversing Arrays with For Loops — AP CSAwesome
This allows our code to generalize to work for the whole array. Coding Exercise ... It is created in the constructor and changed...
Read more >

github_iconTop Related Medium Post

No results found

github_iconTop Related StackOverflow Question

No results found

github_iconTroubleshoot Live Code

Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free

github_iconTop Related Reddit Thread

No results found

github_iconTop Related Hackernoon Post

No results found

github_iconTop Related Tweet

No results found

github_iconTop Related Dev.to Post

No results found

github_iconTop Related Hashnode Post

No results found