Memory leak with bokeh serve and datashader
I believe I am encountering a memory leak when using datashader in combination with a ‘bokeh serve’ style application.
Here is a simplified case that will reproduce the issue:
    import datashader as ds
    import numpy as np
    import holoviews as hv
    from itertools import cycle
    from holoviews.operation.datashader import datashade
    from bokeh.plotting import curdoc
    import logging

    log = logging.getLogger(__name__)


    def setup_doc():
        log.info('Loading module.')
        cat = cycle(range(22))

        # Generate a bunch of data. The keys cycle through 22 categories,
        # so the dict ends up with 22 curves of 500,000 points each.
        module_data = [
            {
                # next(cat) works on both Python 2 and 3 (cat.next() is Python 2 only)
                next(cat): hv.Curve((np.linspace(0, 5000, 500000), np.random.normal(np.ones(500000))))
                for i in range(514)
            }
        ]

        # Use datashader and create a layout
        hv_layout = []
        for item in module_data:
            hv_layout.append(
                datashade(hv.NdOverlay(item, kdims='k'), aggregator=ds.count_cat('k'))
                .opts(plot=dict(width=800))
            )
        final_layout = hv.Layout(hv_layout).cols(1)
        log.info('Done loading module.')

        # Create plot
        plot = hv.renderer('bokeh').instance(mode='server').get_plot(final_layout)
        doc = curdoc()
        doc.add_root(plot.state)


    setup_doc()
Add this to a folder named test and run:
mprof run bokeh serve test
Then try opening a tab or two. Close the tabs, wait a minute or so for the sessions to be destroyed, and plot the recorded memory usage:
mprof plot
Afterward it appears that memory is not reclaimed.
This is hard to test with non-datashaded lines, but if you run the same experiment with fewer data points and a normal overlay, the memory does appear to be reclaimed. I tried to dig into the persistent objects in the on_session_destroyed hook, and it seems that at least the datashader object is not garbage collected when the session is destroyed. I am not sure what else might be hanging around.
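For reference, here is a minimal sketch of the kind of check described above, using only the standard library gc module. The helper names count_live_objects and report_growth are hypothetical and are not part of Bokeh, HoloViews, or datashader. The default module prefixes are an assumption about where the lingering objects live.

```python
import gc
from collections import Counter


def count_live_objects(module_prefixes=("datashader", "holoviews")):
    """Count gc-tracked objects, grouped by fully qualified type name.

    Only types whose defining module starts with one of module_prefixes
    are counted. The default prefixes are an assumption for this sketch.
    """
    gc.collect()  # discard anything that is merely waiting to be collected
    prefixes = tuple(module_prefixes)
    counts = Counter()
    for obj in gc.get_objects():
        cls = type(obj)
        module = getattr(cls, "__module__", None)
        if isinstance(module, str) and module.startswith(prefixes):
            counts["%s.%s" % (module, cls.__qualname__)] += 1
    return counts


def report_growth(before, after):
    """Return the types whose live-instance count grew between two snapshots."""
    return {
        name: after[name] - before[name]
        for name in after
        if after[name] > before[name]
    }
```

One way to use this would be to take a snapshot before any session is opened and another inside the on_session_destroyed hook, then log report_growth(before, after). Types that keep growing as sessions come and go are candidates for whatever is holding the memory. Note that gc.get_objects() only sees objects tracked by the garbage collector, so this gives a rough picture rather than a full accounting.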
Am I missing something in the implementation?
Issue Analytics
- Created 6 years ago
- Reactions: 1
- Comments: 7 (4 by maintainers)
Top GitHub Comments
Thank you both for your examples. After a marathon debugging session, I’ve finally addressed the issue.
Sure, but please include detailed information about your environment (versions of packages, how installed, platform, etc) and how to reproduce (e.g. one of the scripts above).