datashading with datetime axis fails unexpectedly
I am getting the following error:
Error
/usr/local/anaconda3/envs/arara/lib/python3.6/site-packages/holoviews/element/raster.py in __init__(self, data, kdims, vdims, bounds, extents, xdensity, ydensity, rtol, **params)
324 'density.')
325 SheetCoordinateSystem.__init__(self, bounds, xdensity, ydensity)
--> 326 self._validate(data_bounds, supplied_bounds)
327
328
/usr/local/anaconda3/envs/arara/lib/python3.6/site-packages/holoviews/element/raster.py in _validate(self, data_bounds, supplied_bounds)
392 not_close = True
393 if not_close:
--> 394 raise ValueError('Supplied Image bounds do not match the coordinates defined '
395 'in the data. Bounds only have to be declared if no coordinates '
396 'are supplied, otherwise they must match the data. To change '
ValueError: Supplied Image bounds do not match the coordinates defined in the data. Bounds only have to be declared if no coordinates are supplied, otherwise they must match the data. To change the displayed extents set the range on the x- and y-dimensions.
I tried to understand what was happening with pdb, but didn't go that deep into HoloViews. What I found is that self.interface is not an ImageInterface but an XArrayInterface, and data is an xarray object, not an np.ndarray, so the statement below is not executed.
https://github.com/pyviz/holoviews/blob/aa134e1d11b456e5f712c1bcfbc9306f1b69dc1c/holoviews/element/raster.py#L316
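For intuition, here is a hypothetical sketch (not HoloViews' actual implementation) of the kind of relative-tolerance comparison that _validate performs between the supplied bounds and the bounds derived from the data coordinates:

```python
import numpy as np

def bounds_match(data_bounds, supplied_bounds, rtol=1e-10):
    """Hypothetical stand-in for the closeness check in Image._validate:
    every component of the supplied bounds must be within a relative
    tolerance of the bounds computed from the data."""
    return all(np.isclose(d, s, rtol=rtol, atol=0)
               for d, s in zip(data_bounds, supplied_bounds))

# A tiny relative discrepancy passes; a larger one takes the ValueError path.
print(bounds_match((0.0, 0.0, 1.0, 1.0), (0.0, 0.0, 1.0, 1.0 + 1e-12)))  # True
print(bounds_match((0.0, 0.0, 1.0, 1.0), (0.0, 0.0, 1.0, 1.0 + 1e-6)))   # False
```

The function name, signature, and default rtol here are illustrative assumptions; the point is only that a fixed relative tolerance decides whether the bounds "match".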
Here’s a stripped-down reproducible example. I tried different datetime frequencies to work out whether the failure depends on the frequency or on the number of points (originally I was using dask and thought I was plotting too much data). Changing the initial start date also has an impact.
Reproducible Example
import pandas as pd # 0.24.1
import numpy as np # 1.15.4
import holoviews as hv # 1.11.3
from holoviews.operation.datashader import datashade # datashader 0.6.9
hv.extension('bokeh') # bokeh 1.0.4
def test_plot(size, start_date, freq):
    """size: number of points
    freq: frequency of the datetime index
    """
    df = pd.DataFrame(data={'a': np.random.normal(0, 0.3, size=size).cumsum() + 50},
                      index=pd.date_range(start_date, periods=size, freq=freq))
    print(f'First date: {df.index.min()}\nLast date: {df.index.max()}')
    return datashade(hv.Scatter(df))
# base case
test_plot(70119, "1980-01-01", '1H') # this works
test_plot(70120, "1980-01-01", '1H') # this won't
# fewer points than base case
test_plot(35060, "1980-01-01", '2H') # this works
test_plot(35061, "1980-01-01", '2H') # this won't
# more points than base case
test_plot(4207105, "1980-01-01", '1T') # this works
test_plot(4207106, "1980-01-01", '1T') # this won't
# base case one day ahead
test_plot(70120, "1980-01-02", '1H') # this works
# previous with double points
test_plot(140240, "1980-01-02", '1H') # this won't
# previous 10 years ahead
test_plot(140240, "1990-01-02", '1H') # this works
This is specific to datashading, and it doesn’t matter whether the x-axis is originally an index or just a column, although in the example we use an index.
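One plausible explanation for why the start date and point count matter (my inference; not confirmed by the maintainers in this thread) is float64 precision: when the datetime axis is converted to numbers, nanosecond timestamps from 1980 onward are around 3e17, far beyond the 2^53 ≈ 9e15 integers that float64 can represent exactly, so bounds that pass through floating point can drift relative to the original coordinates:

```python
import numpy as np

# Nanoseconds since the Unix epoch for 1980-01-01: about 3.16e17.
t0 = np.datetime64('1980-01-01', 'ns').astype('int64')

# At this magnitude the spacing between adjacent float64 values is 64 ns,
# so an int64 nanosecond timestamp generally cannot round-trip exactly.
t1 = t0 + 1
roundtrip = np.int64(np.float64(t1))
print(t1 - roundtrip)  # nonzero rounding error (here: 1 ns)
```

Whether a particular bounds value happens to round-trip exactly depends on its bit pattern, which would explain why shifting the start date or the number of points flips a case between working and failing.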
Issue Analytics
- Created 5 years ago
- Reactions: 1
- Comments: 9 (2 by maintainers)
Top GitHub Comments
Does anyone have any updates on workarounds or progress toward fixes for this issue? Converting datetimes to ints didn’t solve the problem for me: I still run into "Supplied Image bounds do not match the coordinates defined in the data" when attempting to rasterize a curve.
edit: increasing rtol seems to work for my particular use-case 🤷♂️: hv.extension("bokeh", config=dict(image_rtol=1000))
I can confirm this bug (still) exists in the following configuration: Python 3.7.4, Pandas 0.25.1, Numpy 1.16.4, Holoviews 1.12.3, Datashader 0.7.0, Bokeh 1.3.4.
The problem appears not to be in the length of the time series, nor in the sampling. A simple script in a Jupyter Notebook (very similar to @neuronist's) shows this: plotting 100000 elements works fine (one per minute, starting at 1990-01-01), but the same plot fails if I start at 1980-01-01 instead of 1990-01-01. [The original comment included screenshots of the code, plots, and error, which are not reproduced here.]
After that I tried to reproduce the series in the original report, with slightly different results: everything that failed for @neuronist fails for me as well, while the cases that worked for him only briefly showed the correct result on my side, although trivial adjustments to the configuration make them work.