Rename counts (and counts64) to "sizes" and get the axis right

(It’s important that it’s plural: sizes!)

This is a follow-on from https://github.com/scikit-hep/awkward-1.0/pull/115#issuecomment-586623849.

Given an array like

[[[  2,   3,   5,   7,  11],
  [ 13,  17,  19,  23,  29],
  [ 31,  37,  41,  43,  47]],
 [[ 53,  59,  61,  67,  71],
  [ 73,  79,  83,  89,  97],
  [101, 103, 107, 109, 113]]]
  • array.sizes(axis=0) should return [3, 3] (as a NumpyArray)
  • array.sizes(axis=1) should return [[5, 5, 5], [5, 5, 5]] (as a RegularArray/ListArray/ListOffsetArray of NumpyArray)
  • array.sizes(axis=2) should be illegal.
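
A minimal pure-Python sketch of the three cases above, using nested lists as a stand-in for Awkward layouts. The free function `sizes` and the choice of `ValueError` are assumptions for illustration only, not the library's implementation, and negative axis values are left out here:

```python
def sizes(array, axis):
    """Lengths of the lists found one level below `axis` (original proposal)."""
    if axis < 0:
        raise ValueError("negative axis is not handled in this sketch")
    # Every element at this level must itself be a list, otherwise there is
    # nothing to measure and the requested axis is too deep.
    if not all(isinstance(item, list) for item in array):
        raise ValueError("axis exceeds the depth of the array")
    if axis == 0:
        return [len(item) for item in array]
    return [sizes(item, axis - 1) for item in array]


array = [[[2, 3, 5, 7, 11], [13, 17, 19, 23, 29], [31, 37, 41, 43, 47]],
         [[53, 59, 61, 67, 71], [73, 79, 83, 89, 97], [101, 103, 107, 109, 113]]]

print(sizes(array, axis=0))  # [3, 3]
print(sizes(array, axis=1))  # [[5, 5, 5], [5, 5, 5]]

try:
    sizes(array, axis=2)
except ValueError as err:
    print("illegal:", err)
```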

I guess a negative axis should count from the deepest possible level, so for this array axis=-1 would mean axis=1 and axis=-2 would mean axis=0. Whatever this does, flatten should do the same, so this issue might supersede #51.

Let’s consider some examples with branching, to figure out what they should do. An array like

[[{"x": [], "y": [[]]},
  {"x": [1], "y": [[], [1]]},
  {"x": [2, 2], "y": [[], [1], [2, 2]]}],
 [],
 [{"x": [3, 3, 3], "y": [[], [1], [2, 2], [3, 3, 3]]},
  {"x": [4, 4, 4, 4], "y": [[], [1], [2, 2], [3, 3, 3], [4, 4, 4, 4]]}]]

has depth 3 along "x" and depth 4 along "y".

  • array.sizes(axis=0) should return [3, 0, 2].
  • array.sizes(axis=1) should return [[{"x": 0, "y": 1}, {"x": 1, "y": 2}, {"x": 2, "y": 3}], [], [{"x": 3, "y": 4}, {"x": 4, "y": 5}]]
  • array.sizes(axis=2) should be illegal because it fails for "x". An exception occurs during the recursive descent; we don’t have to check for it before descending.
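
The branching cases above can be sketched in pure Python by treating records (dicts) as transparent: they pass the axis through to each field without consuming a level. The names `sizes_with_records` and `size_of` are hypothetical, and raising `ValueError` is an assumption. Note that the failure for axis=2 surfaces during the descent, on the first non-empty "x", with no up-front check:

```python
def size_of(item):
    """Length of a list, or a record of lengths if `item` is a record."""
    if isinstance(item, dict):
        return {name: size_of(value) for name, value in item.items()}
    if isinstance(item, list):
        return len(item)
    raise ValueError("axis is deeper than this branch")


def sizes_with_records(node, axis):
    # Records do not count as a level: descend into every field unchanged.
    if isinstance(node, dict):
        return {name: sizes_with_records(value, axis) for name, value in node.items()}
    if not isinstance(node, list):
        raise ValueError("axis is deeper than this branch")
    if axis == 0:
        return [size_of(item) for item in node]
    return [sizes_with_records(item, axis - 1) for item in node]


array = [[{"x": [], "y": [[]]},
          {"x": [1], "y": [[], [1]]},
          {"x": [2, 2], "y": [[], [1], [2, 2]]}],
         [],
         [{"x": [3, 3, 3], "y": [[], [1], [2, 2], [3, 3, 3]]},
          {"x": [4, 4, 4, 4], "y": [[], [1], [2, 2], [3, 3, 3], [4, 4, 4, 4]]}]]

print(sizes_with_records(array, axis=0))  # [3, 0, 2]
print(sizes_with_records(array, axis=1))

try:
    sizes_with_records(array, axis=2)
except ValueError as err:
    print("illegal:", err)  # raised while descending into "x"
```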

Now for the negative axis values:

  • array.sizes(axis=-1) should return [[{"x": 0, "y": [0]}, {"x": 1, "y": [0, 1]}, {"x": 2, "y": [0, 1, 2]}], [], [{"x": 3, "y": [0, 1, 2, 3]}, {"x": 4, "y": [0, 1, 2, 3, 4]}]] because it represents the deepest axis along each branch.
  • array.sizes(axis=-2) should be illegal because it wants to be a non-record along "x" and a record along "y".
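
For axis=-1, one way to picture "the deepest axis along each branch" is a two-pass sketch: first find how many list levels each field has anywhere in the array, then replace the lists at that depth with their lengths. This is only an illustration under several assumptions: a single level of records, every field holding at least one list level, and hypothetical helper names (`list_depth`, `iter_records`, `sizes_at`, `sizes_deepest`). It does not attempt axis=-2 and is not how the library does it:

```python
def list_depth(node):
    """Number of nested list levels at and below `node` (0 for a scalar)."""
    if isinstance(node, list):
        return 1 + max((list_depth(child) for child in node), default=0)
    return 0


def iter_records(node):
    """Yield every record (dict) reachable through nested lists."""
    if isinstance(node, dict):
        yield node
    elif isinstance(node, list):
        for child in node:
            yield from iter_records(child)


def sizes_at(node, levels):
    """Replace the lists sitting `levels` list-levels down with their lengths."""
    if levels == 1:
        return len(node)
    return [sizes_at(child, levels - 1) for child in node]


def sizes_deepest(array):
    # Pass 1: the depth of each field, taken over all records so that empty
    # lists do not hide the field's true depth.
    depths = {}
    for record in iter_records(array):
        for name, value in record.items():
            depths[name] = max(depths.get(name, 0), list_depth(value))

    # Pass 2: walk down to the records and measure each field at its own depth.
    def descend(node):
        if isinstance(node, dict):
            return {name: sizes_at(value, depths[name]) for name, value in node.items()}
        return [descend(child) for child in node]

    return descend(array)


array = [[{"x": [], "y": [[]]},
          {"x": [1], "y": [[], [1]]},
          {"x": [2, 2], "y": [[], [1], [2, 2]]}],
         [],
         [{"x": [3, 3, 3], "y": [[], [1], [2, 2], [3, 3, 3]]},
          {"x": [4, 4, 4, 4], "y": [[], [1], [2, 2], [3, 3, 3], [4, 4, 4, 4]]}]]

result = sizes_deepest(array)
print(result)
```

The first pass is needed because a plain Python `[]` carries no type information, whereas an Awkward layout knows its depth even when empty.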

So, negative axis values aren’t just synonyms for non-negative axis values, as they are for rectilinear data. The reducer operations (PR #115) have a similar meaning, though reducers can be applied one level deeper than sizes and flatten can, so it’s a little different.

I’m not sure whether @ianna or I will do this. We might need to figure it out together.

@nsmith-: you might have opinions about the usefulness of negative axis having a different meaning than positive axis minus depth.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

jpivarski commented, Mar 7, 2020 (1 reaction)

Some changes in definition, implemented in #152: (1) I shifted the meaning of axis by one, so that it’s always describing lengths of the specified axis, rather than one deeper. So in my original example,

[[[  2,   3,   5,   7,  11],
  [ 13,  17,  19,  23,  29],
  [ 31,  37,  41,  43,  47]],
 [[ 53,  59,  61,  67,  71],
  [ 73,  79,  83,  89,  97],
  [101, 103, 107, 109, 113]]]
  • axis=0 now returns 2, the scalar length of the array;
  • axis=1 now returns [3, 3] (what the old axis=0 would have returned)
  • axis=2 now returns [[5, 5, 5], [5, 5, 5]] (what the old axis=1 would have returned).
  • every axis value that corresponds to an axis in the array (this one has only three: 0, 1, and 2) is legal, although axis=0 is qualitatively different from the rest. (That’s a recurring theme in these operations.)

Major change number (2) is that I’ve changed the name again, from “sizes” to “num”. This will allow physicists to write very readable code, like

ak.num(muons)     # implicit axis=1

and

muons[ak.num(muons) >= 2]

There is no collision in the NumPy namespace with a function named “num”, and it’s far from any of NumPy’s existing concepts (unlike “size”).
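
To make the shifted axis concrete, here is a pure-Python stand-in for the revised semantics, where the result describes the lengths at the specified axis itself. `num_sketch` is a hypothetical name, the `muons` data is made up, and the list comprehension only mimics what the boolean-mask selection above does; none of this is the library's code:

```python
def num_sketch(array, axis=1):
    """Lengths at `axis`: a scalar for axis=0, nested lists of lengths otherwise."""
    if axis < 0:
        raise ValueError("negative axis is not handled in this sketch")
    if not isinstance(array, list):
        raise ValueError("axis exceeds the depth of the array")
    if axis == 0:
        return len(array)
    return [num_sketch(item, axis - 1) for item in array]


primes = [[[2, 3, 5, 7, 11], [13, 17, 19, 23, 29], [31, 37, 41, 43, 47]],
          [[53, 59, 61, 67, 71], [73, 79, 83, 89, 97], [101, 103, 107, 109, 113]]]

print(num_sketch(primes, axis=0))  # 2
print(num_sketch(primes, axis=1))  # [3, 3]
print(num_sketch(primes, axis=2))  # [[5, 5, 5], [5, 5, 5]]

# Made-up events, each holding a variable number of muon values.
muons = [[1.0, 2.0], [], [3.0], [4.0, 5.0, 6.0]]
n_muons = num_sketch(muons)  # implicit axis=1 -> [2, 0, 1, 3]

# Keep only events with at least two muons.
selected = [event for event, n in zip(muons, n_muons) if n >= 2]
print(selected)  # [[1.0, 2.0], [4.0, 5.0, 6.0]]
```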

ianna commented, Feb 17, 2020 (0 reactions)

Oh, I remember that. It was implemented as described above, then I extended it to an extra depth to return a one-dimensional NumpyArray, as in counts(axis) from study/flatten.py.

Either lengths or sizes is less confusing than counts.

Intuitively, I’d say length is constant while size isn’t, but that’s from the Java/C++ world.
