ak.count with axis=0 is producing invalid arrays


I am trying to count the number of elements in a nested data structure using awkward1.count. Following the docs, I tried different axis values to count elements at different levels of nesting.

axis (None or int) – If None, combine all values from the array into a single scalar result; if an int, group by that axis: 0 is the outermost, 1 is the first level of nested lists, etc., and negative axis counts from the innermost: -1 is the innermost, -2 is the next level up, etc.
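To make the axis description concrete, here is a minimal pure-Python sketch for a list nested two levels deep. It does not use awkward1 and is not how the library implements its reducers; the helper names count_all, count_axis1 and count_axis0 and the sample data are made up for illustration.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel for positions a shorter list does not have


def count_all(nested):
    """axis=None: combine everything into a single scalar."""
    return sum(len(inner) for inner in nested)


def count_axis1(nested):
    """axis=1 (the same as axis=-1 at this depth): one count per inner list."""
    return [len(inner) for inner in nested]


def count_axis0(nested):
    """axis=0: one count per position, taken across the inner lists."""
    columns = zip_longest(*nested, fillvalue=_MISSING)
    return [sum(value is not _MISSING for value in column) for column in columns]


data = [[1.1, 2.2, 3.3], [], [4.4, 5.5]]
print(count_all(data))    # 5
print(count_axis1(data))  # [3, 0, 2]
print(count_axis0(data))  # [2, 2, 1]
```

Note that the axis=0 result is as long as the longest inner list (3 here), which only happens to equal the number of outer entries in this sample.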

Here is how I proceed:

import uproot
import numpy as np
import awkward1 as ak1

fit = ak1.Array(uproot.open(my_file)["E/Evt/trks/trks.fitinf"].array())
print(fit)
>>> [[[0.00496, 0.00342, -295, 142, 99.1, 1.8e+308, 4.24e-12, 10, ... [], [], [], []]]
type(fit)
>>> awkward1.highlevel.Array

  1. Counting the outermost level of nesting: axis=0

I tried the following to get the outermost level of nesting:

ak1.count(fit, axis=0)

this raises the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/IPython/core/formatters.py in __call__(self, obj)
    700                 type_pprinters=self.type_printers,
    701                 deferred_pprinters=self.deferred_printers)
--> 702             printer.pretty(obj)
    703             printer.flush()
    704             return stream.getvalue()

~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/IPython/lib/pretty.py in pretty(self, obj)
    392                         if cls is not object \
    393                                 and callable(cls.__dict__.get('__repr__')):
--> 394                             return _repr_pprint(obj, self, cycle)
    395 
    396             return _default_pprint(obj, self, cycle)

~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle)
    682     """A pprint that just redirects to the normal repr function."""
    683     # Find newlines and replace them with p.break_()
--> 684     output = repr(obj)
    685     lines = output.splitlines()
    686     with p.group():

~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/awkward1/highlevel.py in __repr__(self, limit_value, limit_total)
   1134         value = awkward1._util.minimally_touching_string(limit_value,
   1135                                                          self._layout,
-> 1136                                                          self._behavior)
   1137 
   1138         name = getattr(self, "__name__", type(self).__name__)

~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/awkward1/_util.py in minimally_touching_string(limit_length, layout, behavior)
    997             leftlen += len(l)
    998 
--> 999         r = next(rightgen)
   1000         if r is not None:
   1001             if (leftlen + rightlen + len(r)

~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/awkward1/_util.py in forever(iterable)
    976 
    977     def forever(iterable):
--> 978         for token in iterable:
    979             yield token
    980         while True:

~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/awkward1/_util.py in backward(x, space, brackets, wrap)
    938                 sp = ""
    939                 for i in range(len(x) - 1, -1, -1):
--> 940                     for token in backward(x[i], sp):
    941                         yield token
    942                     sp = ", "

ValueError: in ListArray64 attempting to get 27, starts[i] != stops[i] and stops[i] > len(content)

It seems the error comes from the repr of the returned object, so I tried the following:

t = ak1.count(fit, axis=0)
print(t.shape)
>>> (56,)

So this did not raise an error, but the shape of t can’t be 56.

The expected result of the outermost nesting level count is 10:

ak1.num(fit, axis=0)
>>> 10

same with

len(fit)
>>> 10
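One possible reading of the 56 above, sketched in plain Python under the assumption that axis=0 lines values up by position across the outermost entries: the result would then be as long as the longest list at the first nesting level, not as long as fit itself. The function count_axis0_depth3 and the events data are hypothetical and only imitate the semantics described in the docs.

```python
def count_axis0_depth3(events):
    """For every (i, j) position, count how many events have a value there."""
    n_outer = max((len(event) for event in events), default=0)
    result = []
    for i in range(n_outer):
        # The i-th inner list of every event that has one.
        lists_i = [event[i] for event in events if i < len(event)]
        n_inner = max((len(lst) for lst in lists_i), default=0)
        result.append([sum(j < len(lst) for lst in lists_i) for j in range(n_inner)])
    return result


# Two events: the first holds 2 inner lists, the second holds 3.
events = [[[1, 2], [3]], [[4], [5, 6, 7], [8]]]
counts = count_axis0_depth3(events)
print(counts)       # [[2, 1], [2, 1, 1], [1]]
print(len(counts))  # 3, the longest event, not the number of events
print(len(events))  # 2
```

If this reading is right, a length of 56 would correspond to the longest list at the first nesting level, while the number of outermost entries still comes from ak1.num(fit, axis=0) or len(fit).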

  2. Counting the first level of nesting: axis=1

counts = ak1.count(fit, axis=1)
counts
>>> <Array [[20, 20, 20, 20, 20, ... 1, 1, 1, 1]] type='10 * var * int64'>
np.array(counts)
>>> array([[20, 20, 20, 20, 20, 20, 20, 20,  2,  2,  2,  1,  1,  1,  1,  1,
         1],
       [19, 19, 19, 19, 19, 19, 19, 19,  2,  2,  2,  1,  1,  1,  1,  1,
         1],
       [20, 20, 20, 20, 20, 20, 20, 20,  2,  2,  2,  1,  1,  1,  1,  1,
         1],
       [20, 20, 20, 20, 20, 20, 20, 20,  2,  2,  2,  1,  1,  1,  1,  1,
         1],
       [20, 20, 20, 20, 20, 20, 20, 20,  2,  2,  2,  1,  1,  1,  1,  1,
         1],
       [20, 20, 20, 20, 20, 20, 20, 20,  2,  2,  2,  1,  1,  1,  1,  1,
         1],
       [20, 20, 20, 20, 20, 20, 20, 20,  2,  2,  2,  1,  1,  1,  1,  1,
         1],
       [20, 20, 20, 20, 20, 20, 20, 20,  2,  2,  2,  1,  1,  1,  1,  1,
         1],
       [18, 18, 18, 18, 18, 18, 18, 18,  2,  2,  2,  1,  1,  1,  1,  1,
         1],
       [20, 20, 20, 20, 20, 20, 20, 20,  2,  2,  2,  1,  1,  1,  1,  1,
         1]])

From this result I understand that fit has 10 elements (the outermost nesting level) and that element 0 contains 17 elements (let’s call this the first nesting level):

counts[0]
>>> <Array [20, 20, 20, 20, 20, ... 1, 1, 1, 1, 1] type='17 * int64'>

But I am not sure this result actually makes sense; I am not able to interpret these values.

When I check to verify if this counting is correct, I do the following:

np.array(ak1.num(fit, axis=1))
>>> array([56, 55, 56, 56, 56, 56, 56, 56, 54, 56])

The two results don’t match. Am I misunderstanding something here?

same with

[len(i) for i in fit]
>>> [56, 55, 56, 56, 56, 56, 56, 56, 54, 56]
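A possible way to read the axis=1 output, sketched in plain Python: if count with axis=1 reduces over the lists at the first nesting level, position j of counts[0] would be the number of those lists that have a j-th value. The lengths below are those of the 56 lists in fit[0] as reported in this issue (17, 11, eighteen 8s, the rest 0); count_by_position is a hypothetical helper, not an awkward1 function.

```python
def count_by_position(lengths):
    """For each position j, count how many lists are long enough to have it."""
    longest = max(lengths, default=0)
    return [sum(n > j for n in lengths) for j in range(longest)]


# Lengths of the 56 lists in fit[0], taken from the issue text.
lengths = [17, 11] + [8] * 18 + [0] * 36

per_position = count_by_position(lengths)
print(len(lengths))       # 56
print(per_position)       # [20, 20, 20, 20, 20, 20, 20, 20, 2, 2, 2, 1, 1, 1, 1, 1, 1]
print(len(per_position))  # 17
```

Under this assumption the 17 values of counts[0] and the 56 reported by ak1.num(fit, axis=1) describe different things: the first is a count per innermost position, the second is the number of lists.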

  3. Counting the second level of nesting: axis=2

This one seems to work fine:

counts = ak1.count(fit, axis=2)
counts
>>> <Array [[17, 11, 8, 8, 8, ... 0, 0, 0, 0, 0]] type='10 * var * int64'>
counts[0]
>>> <Array [17, 11, 8, 8, 8, 8, ... 0, 0, 0, 0, 0] type='56 * int64'>

checking the results with awkward1.num

counts_with_num = ak1.Array([ak1.num(i, axis=1) for i in fit])
counts_with_num
>>> <Array [[17, 11, 8, 8, 8, ... 0, 0, 0, 0, 0]] type='10 * var * int64'>
counts_with_num[0]
>>> <Array [17, 11, 8, 8, 8, 8, ... 0, 0, 0, 0, 0] type='56 * int64'>

and with a simple loop

[len(i) for i in fit[0]]
>>> [17, 11, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

I also tried to apply awkward1.count on one element of my fit data using axis=1 and it worked just fine:

ak1.count(fit[0], axis=1)
>>> <Array [17, 11, 8, 8, 8, 8, ... 0, 0, 0, 0, 0] type='56 * int64'>

I also noticed the same behaviour with awkward1.count_nonzero when axis=1 is specified:

np.array(ak1.count_nonzero(fit, axis=1))
>>> array([[20, 20, 20, 20,  1,  1, 20, 20,  2,  2,  2,  0,  0,  1,  1,  1,
         1],
       [19, 19, 19, 19,  1,  1, 19, 19,  2,  2,  2,  0,  0,  1,  1,  1,
         1],
       [20, 20, 20, 20,  1,  1, 20, 20,  2,  2,  2,  0,  0,  1,  1,  1,
         1],
       [20, 20, 20, 20,  1,  1, 20, 20,  2,  2,  2,  0,  0,  1,  1,  1,
         1],
       [20, 20, 20, 20,  1,  1, 20, 20,  2,  2,  2,  0,  0,  1,  1,  1,
         1],
       [20, 20, 20, 20,  1,  1, 20, 20,  2,  2,  2,  0,  0,  1,  1,  1,
         1],
       [20, 20, 20, 20,  1,  1, 20, 20,  2,  2,  2,  0,  0,  1,  1,  1,
         1],
       [20, 20, 20, 20,  1,  1, 20, 20,  2,  2,  2,  0,  0,  1,  1,  1,
         1],
       [17, 17, 18, 18,  1,  1, 18, 18,  2,  2,  2,  0,  0,  1,  1,  1,
         1],
       [20, 20, 20, 20,  1,  1, 20, 20,  2,  2,  2,  0,  0,  1,  1,  1,
         1]])

With axis=2 it works just fine:

counts_nonzero = ak1.count_nonzero(fit, axis=2)
counts_nonzero[0]
>>> <Array [15, 9, 6, 6, 6, 6, ... 0, 0, 0, 0, 0] type='56 * int64'>

With axis=0 an error is raised:

ak1.count_nonzero(fit, axis=0)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/IPython/core/formatters.py in __call__(self, obj)
    700                 type_pprinters=self.type_printers,
    701                 deferred_pprinters=self.deferred_printers)
--> 702             printer.pretty(obj)
    703             printer.flush()
    704             return stream.getvalue()

~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/IPython/lib/pretty.py in pretty(self, obj)
    392                         if cls is not object \
    393                                 and callable(cls.__dict__.get('__repr__')):
--> 394                             return _repr_pprint(obj, self, cycle)
    395 
    396             return _default_pprint(obj, self, cycle)

~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle)
    682     """A pprint that just redirects to the normal repr function."""
    683     # Find newlines and replace them with p.break_()
--> 684     output = repr(obj)
    685     lines = output.splitlines()
    686     with p.group():

~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/awkward1/highlevel.py in __repr__(self, limit_value, limit_total)
   1134         value = awkward1._util.minimally_touching_string(limit_value,
   1135                                                          self._layout,
-> 1136                                                          self._behavior)
   1137 
   1138         name = getattr(self, "__name__", type(self).__name__)

~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/awkward1/_util.py in minimally_touching_string(limit_length, layout, behavior)
    997             leftlen += len(l)
    998 
--> 999         r = next(rightgen)
   1000         if r is not None:
   1001             if (leftlen + rightlen + len(r)

~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/awkward1/_util.py in forever(iterable)
    976 
    977     def forever(iterable):
--> 978         for token in iterable:
    979             yield token
    980         while True:

~/miniconda3/envs/km3pipe/lib/python3.7/site-packages/awkward1/_util.py in backward(x, space, brackets, wrap)
    938                 sp = ""
    939                 for i in range(len(x) - 1, -1, -1):
--> 940                     for token in backward(x[i], sp):
    941                         yield token
    942                     sp = ", "

ValueError: in ListArray64 attempting to get 27, starts[i] > stops[i]

The file I am reading data from is attached to this GitHub issue.

aanet_v2.0.0.root.zip

Thank you for your help.

I ping @tamasgal

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 10 (7 by maintainers)

Top GitHub Comments

jpivarski commented, Apr 20, 2020 (1 reaction)

It will be in the next release, which will probably wait for #216. I’m deploying releases a little less often than I used to because these include a lot of binaries and PyPI is instituting a quota.

What you want for your analysis/framework is probably not ak.count with axis=0. I don’t know exactly what you want to do, but it probably involves ak.num.

jpivarski commented, Apr 20, 2020 (1 reaction)

In fact, all reducers do; it shows up when axis is less than the nesting depth and there are trailing empty lists in the output. The starts and stops of the trailing lists were left uninitialized, rather than being set to equal values (indicating an empty list).

I just fixed it. Actually, the new system is quite nice to debug! Since there are for loops at the deepest level, I can insert print statements. :)
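The explanation above can be illustrated with a small pure-Python sketch of a list array described by starts, stops and a content length. This is an imitation under stated assumptions, not Awkward's validity code: find_invalid_list and the sample numbers are hypothetical, and only the two message texts are taken from the errors quoted in the issue.

```python
def find_invalid_list(starts, stops, content_length):
    """Return a description of the first invalid list, or None if all are fine."""
    for i, (start, stop) in enumerate(zip(starts, stops)):
        if start == stop:
            continue  # equal values mean an empty list, which is always valid
        if start > stop:
            return f"list {i}: starts[i] > stops[i]"
        if stop > content_length:
            return f"list {i}: starts[i] != stops[i] and stops[i] > len(content)"
    return None


content_length = 5

# Trailing empty lists written explicitly as equal starts and stops.
print(find_invalid_list([0, 3, 5, 5], [3, 5, 5, 5], content_length))

# Trailing entries left as arbitrary leftover values (made-up numbers).
print(find_invalid_list([0, 3, 7, 9], [3, 5, 912, 4], content_length))
print(find_invalid_list([0, 3, 9], [3, 5, 4], content_length))
```

The first call finds nothing wrong, while the other two stop at the first trailing entry whose leftover values do not describe a real slice of the content. Which of the two messages appears depends on whatever values happen to be in memory, which would be consistent with count and count_nonzero failing with different messages.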
