calculating cumsums on a groupby object


How do I go about calculating cumsums on a groupby object?

I have a Dataset that looks like the following:

import numpy as np
import pandas as pd
import xarray as xr

# Monthly time steps on a regular lat/lon grid, filled with random values
lat = np.linspace(-5.175003, 5.9749985, 224)
lon = np.linspace(33.524994, 42.274994, 176)
time = pd.date_range(start='1981-01-31', end='2019-04-30', freq='M')
data = np.random.randn(len(time), len(lat), len(lon))
dims = ['time', 'lat', 'lon']
coords = {'time': time, 'lat': lat, 'lon': lon}

ds = xr.Dataset({'precip': (dims, data)}, coords=coords)

Out[]:
<xarray.Dataset>
Dimensions:  (lat: 224, lon: 176, time: 460)
Coordinates:
  * time     (time) datetime64[ns] 1981-01-31 1981-02-28 ... 2019-04-30
  * lat      (lat) float64 -5.175 -5.125 -5.075 -5.025 ... 5.875 5.925 5.975
  * lon      (lon) float64 33.52 33.57 33.62 33.67 ... 42.12 42.17 42.22 42.27
Data variables:
    precip   (time, lat, lon) float64 0.006328 0.2969 1.564 ... 0.6675 2.32

I need to group by year and calculate the cumsum within each year. That way I will have a value for each month (time step) and each pixel (lat-lon pair).

But the cumsum operation doesn't work on a groupby object:

ds.groupby('time.year').cumsum(dim='time')

Out[]:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-12-dceee5f5647c> in <module>
      9 display(ds_)
     10 
---> 11 ds_.groupby('time.year').cumsum(dim='time')

AttributeError: 'DatasetGroupBy' object has no attribute 'cumsum'

Is there a workaround?

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.0 | packaged by conda-forge | (default, Nov 12 2018, 12:34:36) 
[Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 18.2.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2

xarray: 0.12.2
pandas: 0.24.2
numpy: 1.16.4
scipy: 1.3.0
netCDF4: 1.5.1.2
pydap: None
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: 1.0.17
cfgrib: 0.9.7
iris: None
bottleneck: 1.2.1
dask: 1.2.2
distributed: 1.28.1
matplotlib: 3.1.0
cartopy: 0.17.0
seaborn: 0.9.0
numbagg: None
setuptools: 41.0.1
pip: 19.1
conda: None
pytest: 4.5.0
IPython: 7.1.1
sphinx: 2.0.1

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 1
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

3 reactions
shoyer commented, Jul 18, 2019

It looks like ds.groupby('time.year').apply(lambda x: x.cumsum(dim='time')) mostly works for now.

But yes, it would be great to add this.
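
For illustration, here is a minimal sketch of that workaround on a small toy dataset. The toy data and the helper name cumsum_within_year are assumptions made for this example and are not from the issue. In newer xarray releases the grouped call is spelled .map; .apply is the older name used in the comment above. Depending on the xarray version, the time coordinate may not survive the cumsum, which could be one reason the workaround only "mostly works".

```python
import numpy as np
import pandas as pd
import xarray as xr

# Toy data (an assumption for this sketch): two full years of monthly
# values on a 2 x 3 grid, all ones so the running total is easy to read.
time = pd.date_range("2000-01-01", periods=24, freq="MS")
data = np.ones((len(time), 2, 3))
ds = xr.Dataset(
    {"precip": (("time", "lat", "lon"), data)},
    coords={"time": time, "lat": [0.0, 1.0], "lon": [10.0, 20.0, 30.0]},
)


def cumsum_within_year(group):
    """Hypothetical helper: cumulative sum along time for one year's data."""
    return group.cumsum(dim="time")


# Split by calendar year, accumulate inside each year, then recombine.
# On older xarray versions, replace .map with .apply.
cumulative = ds.groupby("time.year").map(cumsum_within_year)

# Inspect a single pixel: the running total should restart every January.
print(cumulative["precip"].isel(lat=0, lon=0).values)
```

Because each group is accumulated independently, the totals do not carry over from one year into the next.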


