Significant performance regression for datashader in numba 0.49.x

Using the latest released numba (0.49.1), I’m seeing a significant performance regression compared to numba 0.48 in a simple datashader aggregation example. To make sure no other packages change, I’m downgrading from 0.49.1 to 0.48 with conda install numba=0.48 --no-deps. The simple test case I’m using is the following:

import timeit

from functools import partial

import datashader as ds
import numba
import numpy as np
import pandas as pd

canvas = ds.Canvas(plot_height=1000, plot_width=1000)

def agg(df):
    canvas.points(df, 'x', 'y', agg=ds.mean('value'))

def test_agg_performance(N, repeats=10):
    df = pd.DataFrame({'x': np.random.randn(N), 'y': np.random.randn(N), 'value': np.random.rand(N)})
    agg(df) # Warm up JIT
    return timeit.timeit(partial(agg, df), number=repeats)/repeats

print(f'Numba version: {numba.__version__}')
[(n, test_agg_performance(int(n))) for n in np.logspace(0, 8, 9)]

Here is a graph of the performance difference by the number of points being aggregated:

[Figure: Bokeh plot of the performance difference by number of points aggregated]

And here is the notebook which I used to generate the plot: https://anaconda.org/philippjfr/profiling_numba/notebook
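To put a number on the gap shown in the plot, the per-N timings from the two environments can be divided. This is a minimal sketch, assuming the list of (n, seconds) pairs produced by the test case above has been saved once under each numba version. The helper name `slowdown` and its arguments are hypothetical and not part of datashader or numba.

```python
def slowdown(baseline, candidate):
    """Return (n, candidate_seconds / baseline_seconds) for each n in both runs.

    `baseline` and `candidate` are lists of (n, seconds) pairs, for example
    the timings collected under numba 0.48 and numba 0.49.1 respectively.
    """
    # Index the baseline timings by point count for direct lookup.
    base = {int(n): seconds for n, seconds in baseline}
    ratios = []
    for n, seconds in candidate:
        n = int(n)
        # Skip sizes missing from the baseline and guard against division by zero.
        if n in base and base[n] > 0:
            ratios.append((n, seconds / base[n]))
    return ratios
```

A ratio above 1 means the candidate run was slower than the baseline at that N. Sizes that appear in only one of the two runs are dropped rather than guessed at.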

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 15 (12 by maintainers)

Top GitHub Comments

1 reaction
esc commented, May 25, 2020

FWIW I’m bisecting now.

0 reactions
stuartarchibald commented, Jun 3, 2020

Fixed in #5795. Will make it into 0.50.0.
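Until an environment can move to a release containing the fix, a script can at least warn when it is running on the affected series. This is a rough sketch under one assumption: that the whole 0.49.x series is affected, as the issue title suggests, although only 0.49.1 was measured against 0.48 in the report. The helper name `in_affected_series` is hypothetical.

```python
import re


def in_affected_series(version):
    """Return True if `version` is a 0.49.x release string such as '0.49.1'."""
    # Only the leading major.minor pair matters for this check.
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Assumption: every 0.49.x release carries the regression.
    return (major, minor) == (0, 49)
```

In practice this would be called as `in_affected_series(numba.__version__)`, with the caller deciding whether to warn or to pin numba to 0.48 as described in the report.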
