ak.count with axis=0 is producing invalid arrays
I am trying to count the number of elements in a nested data structure using awkward1.count. Following the docs, I tried different axis values to count elements at different levels of nesting.
axis (None or int) – If None, combine all values from the array into a single scalar result; if an int, group by that axis: 0 is the outermost, 1 is the first level of nested lists, etc., and negative axis counts from the innermost: -1 is the innermost, -2 is the next level up, etc.
Here is how I proceed:
import uproot
import numpy as np
import awkward1 as ak1
fit = ak1.Array(uproot.open(my_file)["E/Evt/trks/trks.fitinf"].array())
print(fit)
>>> [[[0.00496, 0.00342, -295, 142, 99.1, 1.8e+308, 4.24e-12, 10, ... [], [], [], []]]
type(fit)
>>> awkward1.highlevel.Array
- Counting the outermost level of nesting:
axis=0
I tried the following to get the outermost level of nesting:
ak1.count(fit, axis=0)
this raises the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/IPython/core/formatters.py in __call__(self, obj)
700 type_pprinters=self.type_printers,
701 deferred_pprinters=self.deferred_printers)
--> 702 printer.pretty(obj)
703 printer.flush()
704 return stream.getvalue()
~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/IPython/lib/pretty.py in pretty(self, obj)
392 if cls is not object \
393 and callable(cls.__dict__.get('__repr__')):
--> 394 return _repr_pprint(obj, self, cycle)
395
396 return _default_pprint(obj, self, cycle)
~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle)
682 """A pprint that just redirects to the normal repr function."""
683 # Find newlines and replace them with p.break_()
--> 684 output = repr(obj)
685 lines = output.splitlines()
686 with p.group():
~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/awkward1/highlevel.py in __repr__(self, limit_value, limit_total)
1134 value = awkward1._util.minimally_touching_string(limit_value,
1135 self._layout,
-> 1136 self._behavior)
1137
1138 name = getattr(self, "__name__", type(self).__name__)
~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/awkward1/_util.py in minimally_touching_string(limit_length, layout, behavior)
997 leftlen += len(l)
998
--> 999 r = next(rightgen)
1000 if r is not None:
1001 if (leftlen + rightlen + len(r)
~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/awkward1/_util.py in forever(iterable)
976
977 def forever(iterable):
--> 978 for token in iterable:
979 yield token
980 while True:
~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/awkward1/_util.py in backward(x, space, brackets, wrap)
938 sp = ""
939 for i in range(len(x) - 1, -1, -1):
--> 940 for token in backward(x[i], sp):
941 yield token
942 sp = ", "
ValueError: in ListArray64 attempting to get 27, starts[i] != stops[i] and stops[i] > len(content)
The error seems to come from the repr of the returned object, so I tried the following:
t = ak1.count(fit, axis=0)
print(t.shape)
>>> (56,)
So this did not raise an error, but the shape of t can’t be 56.
The expected result of the outermost nesting level count is 10:
ak1.num(fit, axis=0)
>>> 10
same with
len(fit)
>>> 10
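For comparison, here is a plain-Python sketch of the difference between "length of the outermost level" (what ak1.num and len return) and "reduce over the outermost level" (how the quoted docs describe axis=0 for a reducer). It is not the library's implementation; the helper names num_axis0 and count_axis0 and the toy data are made up for illustration, and the reducer semantics are an assumption based on the docs quoted above.

```python
# Plain-Python sketch, NOT awkward1's implementation.
# Assumption: a reducer with axis=0 combines values across the outermost
# entries, position by position, instead of returning the outer length.

def num_axis0(nested):
    """Length of the outermost list, like len(nested)."""
    return len(nested)

def count_axis0(nested):
    """For a list of lists of lists, count how many outermost entries
    have a value at each inner position [j][k]."""
    result = []
    for event in nested:
        for j, track in enumerate(event):
            if j == len(result):          # first time position j is seen
                result.append([])
            for k in range(len(track)):
                if k == len(result[j]):   # first time position [j][k] is seen
                    result[j].append(0)
                result[j][k] += 1
    return result

# Hypothetical toy data: 2 outer entries with 2 and 3 sublists.
toy = [
    [[1.0, 2.0, 3.0], [4.0]],
    [[5.0, 6.0], [], [7.0]],
]

print(num_axis0(toy))    # 2
print(count_axis0(toy))  # [[2, 2, 1], [1], [1]]
```

Under this reading, the length of an axis=0 reduction follows the longest inner list rather than the number of outer entries, so it need not equal len(fit).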
- Counting the first level of nesting:
axis=1
counts = ak1.count(fit, axis=1)
counts
>>> <Array [[20, 20, 20, 20, 20, ... 1, 1, 1, 1]] type='10 * var * int64'>
np.array(counts)
>>> array([[20, 20, 20, 20, 20, 20, 20, 20, 2, 2, 2, 1, 1, 1, 1, 1,
1],
[19, 19, 19, 19, 19, 19, 19, 19, 2, 2, 2, 1, 1, 1, 1, 1,
1],
[20, 20, 20, 20, 20, 20, 20, 20, 2, 2, 2, 1, 1, 1, 1, 1,
1],
[20, 20, 20, 20, 20, 20, 20, 20, 2, 2, 2, 1, 1, 1, 1, 1,
1],
[20, 20, 20, 20, 20, 20, 20, 20, 2, 2, 2, 1, 1, 1, 1, 1,
1],
[20, 20, 20, 20, 20, 20, 20, 20, 2, 2, 2, 1, 1, 1, 1, 1,
1],
[20, 20, 20, 20, 20, 20, 20, 20, 2, 2, 2, 1, 1, 1, 1, 1,
1],
[20, 20, 20, 20, 20, 20, 20, 20, 2, 2, 2, 1, 1, 1, 1, 1,
1],
[18, 18, 18, 18, 18, 18, 18, 18, 2, 2, 2, 1, 1, 1, 1, 1,
1],
[20, 20, 20, 20, 20, 20, 20, 20, 2, 2, 2, 1, 1, 1, 1, 1,
1]])
From this result I understand that fit has 10 elements (the outermost nesting level) and that element 0 contains 17 elements (let’s call this the first nesting level):
counts[0]
>>> <Array [20, 20, 20, 20, 20, ... 1, 1, 1, 1, 1] type='17 * int64'>
But I am not sure this result makes sense; I am not able to interpret these values.
To verify whether this counting is correct, I do the following:
np.array(ak1.num(fit, axis=1))
>>> array([56, 55, 56, 56, 56, 56, 56, 56, 54, 56])
The two results don’t match. Am I misunderstanding something here?
The same numbers come out of a plain loop:
[len(i) for i in fit]
>>> [56, 55, 56, 56, 56, 56, 56, 56, 54, 56]
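One possible reading of the axis=1 output, sketched in plain Python rather than with the library: entry j of the result counts how many sublists have a j-th element, while num simply returns the number of sublists. The helper count_axis1 is hypothetical, and the list lengths are reconstructed from the simple-loop output for fit[0] shown further down in this issue (17, 11, eighteen 8s, then empty lists).

```python
# Plain-Python sketch, NOT awkward1's implementation.
# Assumption: count with axis=1 reduces across the sublists of one event,
# position by position.

def count_axis1(event):
    """Entry j = number of sublists in `event` that have a j-th element."""
    width = max((len(track) for track in event), default=0)
    return [sum(1 for track in event if len(track) > j) for j in range(width)]

# Sublist lengths for fit[0], taken from the loop output in this issue.
lengths = [17, 11] + [8] * 18 + [0] * 36
event = [[0.0] * n for n in lengths]   # placeholder values, only lengths matter

num_result = len(event)            # what num(axis=1) reports for this event
count_result = count_axis1(event)  # what a position-wise count would report

print(num_result)         # 56
print(len(count_result))  # 17
print(count_result)       # eight 20s, three 2s, six 1s
```

With these lengths the sketch reproduces the 17 values printed for counts[0], which suggests the two functions answer different questions rather than disagreeing.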
- Counting the second level of nesting:
axis=2
This one seems to work fine:
counts = ak1.count(fit, axis=2)
>>> <Array [[17, 11, 8, 8, 8, ... 0, 0, 0, 0, 0]] type='10 * var * int64'>
counts[0]
>>> <Array [17, 11, 8, 8, 8, 8, ... 0, 0, 0, 0, 0] type='56 * int64'>
Checking the results with awkward1.num:
counts_with_num = ak1.Array([ak1.num(i, axis=1) for i in fit])
counts_with_num
>>> <Array [[17, 11, 8, 8, 8, ... 0, 0, 0, 0, 0]] type='10 * var * int64'>
counts_with_num[0]
>>> <Array [17, 11, 8, 8, 8, 8, ... 0, 0, 0, 0, 0] type='56 * int64'>
and with a simple loop:
[len(i) for i in fit[0]]
>>> [17, 11, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
I also tried to apply awkward1.count on one element of my fit data using axis=1 and it worked just fine:
ak1.count(fit[0], axis=1)
>>> <Array [17, 11, 8, 8, 8, 8, ... 0, 0, 0, 0, 0] type='56 * int64'>
I also noticed that the same behaviour is true for awkward1.count_nonzero, when axis=1 is specified:
np.array(ak1.count_nonzero(fit, axis=1))
>>> array([[20, 20, 20, 20, 1, 1, 20, 20, 2, 2, 2, 0, 0, 1, 1, 1,
1],
[19, 19, 19, 19, 1, 1, 19, 19, 2, 2, 2, 0, 0, 1, 1, 1,
1],
[20, 20, 20, 20, 1, 1, 20, 20, 2, 2, 2, 0, 0, 1, 1, 1,
1],
[20, 20, 20, 20, 1, 1, 20, 20, 2, 2, 2, 0, 0, 1, 1, 1,
1],
[20, 20, 20, 20, 1, 1, 20, 20, 2, 2, 2, 0, 0, 1, 1, 1,
1],
[20, 20, 20, 20, 1, 1, 20, 20, 2, 2, 2, 0, 0, 1, 1, 1,
1],
[20, 20, 20, 20, 1, 1, 20, 20, 2, 2, 2, 0, 0, 1, 1, 1,
1],
[20, 20, 20, 20, 1, 1, 20, 20, 2, 2, 2, 0, 0, 1, 1, 1,
1],
[17, 17, 18, 18, 1, 1, 18, 18, 2, 2, 2, 0, 0, 1, 1, 1,
1],
[20, 20, 20, 20, 1, 1, 20, 20, 2, 2, 2, 0, 0, 1, 1, 1,
1]])
When axis=2, it works just fine:
counts_nonzero = ak1.count_nonzero(fit, axis=2)
counts_nonzero[0]
>>> <Array [15, 9, 6, 6, 6, 6, ... 0, 0, 0, 0, 0] type='56 * int64'>
When axis=0, an error is raised:
ak1.count_nonzero(fit, axis=0)
>>> ---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/IPython/core/formatters.py in __call__(self, obj)
700 type_pprinters=self.type_printers,
701 deferred_pprinters=self.deferred_printers)
--> 702 printer.pretty(obj)
703 printer.flush()
704 return stream.getvalue()
~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/IPython/lib/pretty.py in pretty(self, obj)
392 if cls is not object \
393 and callable(cls.__dict__.get('__repr__')):
--> 394 return _repr_pprint(obj, self, cycle)
395
396 return _default_pprint(obj, self, cycle)
~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle)
682 """A pprint that just redirects to the normal repr function."""
683 # Find newlines and replace them with p.break_()
--> 684 output = repr(obj)
685 lines = output.splitlines()
686 with p.group():
~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/awkward1/highlevel.py in __repr__(self, limit_value, limit_total)
1134 value = awkward1._util.minimally_touching_string(limit_value,
1135 self._layout,
-> 1136 self._behavior)
1137
1138 name = getattr(self, "__name__", type(self).__name__)
~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/awkward1/_util.py in minimally_touching_string(limit_length, layout, behavior)
997 leftlen += len(l)
998
--> 999 r = next(rightgen)
1000 if r is not None:
1001 if (leftlen + rightlen + len(r)
~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/awkward1/_util.py in forever(iterable)
976
977 def forever(iterable):
--> 978 for token in iterable:
979 yield token
980 while True:
~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/awkward1/_util.py in backward(x, space, brackets, wrap)
938 sp = ""
939 for i in range(len(x) - 1, -1, -1):
--> 940 for token in backward(x[i], sp):
941 yield token
942 sp = ", "
ValueError: in ListArray64 attempting to get 27, starts[i] > stops[i]
The file I am reading data from is attached to this git issue.
Thank you for your help.
I ping @tamasgal
Issue created 3 years ago; 10 comments (7 by maintainers).

It will be in the next release, which will probably wait for #216. I’m deploying releases a little less often than I used to because these include a lot of binaries and PyPI is instituting a quota.
What you want for your analysis/framework is probably not ak.count with axis=0. I don’t know exactly what you want to do, but it probably involves ak.num.
In fact, all reducers do; it shows up when axis is less than the nesting depth and there are trailing empty lists in the output. The starts and stops of the trailing lists were left uninitialized, rather than being set to equal values (indicating an empty list).
I just fixed it. Actually, the new system is quite nice to debug! Since there are for loops at the deepest level, I can insert print statements.
:)
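To illustrate the starts/stops explanation above, here is a minimal plain-Python model of a list array stored as a flat content buffer plus starts and stops offsets. It is only a sketch: get_list and the sample buffers are hypothetical, and the two checks merely mirror the error messages in the tracebacks, not the library's actual code.

```python
# Plain-Python sketch of a starts/stops list layout, NOT the library's code.

def get_list(content, starts, stops, i):
    """Return list i, i.e. content[starts[i]:stops[i]], after validating offsets."""
    start, stop = starts[i], stops[i]
    if start > stop:
        raise ValueError("starts[i] > stops[i]")
    if start != stop and stop > len(content):
        raise ValueError("starts[i] != stops[i] and stops[i] > len(content)")
    return content[start:stop]

content = [20, 20, 2, 1]

# Well-formed offsets: the trailing list is empty because start == stop.
starts = [0, 2, 4]
stops = [2, 4, 4]
print(get_list(content, starts, stops, 0))  # [20, 20]
print(get_list(content, starts, stops, 2))  # []

# A trailing entry left uninitialized may hold any value; 999 stands in for garbage.
bad_stops = [2, 4, 999]
try:
    get_list(content, starts, bad_stops, 2)
except ValueError as err:
    print("invalid trailing list:", err)
```

An array built with such offsets can be created without complaint and only fails when an element is read, for example while building the repr, which matches the symptom reported in this issue.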