Slow performance with plotly chart builder
Hello all,
I need to use plotly as the backend of a microservice that generates charts dynamically.
Unfortunately, after a little benchmarking, I found that plotly.express
is very slow: it takes around 5 seconds to generate a chart from a 500-line dataset.
Here is the script I use to generate a scatter matrix:
import sys
import os
import traceback
import json
import time

# Make the locally installed packages importable.
sys.path.append(r'c:\statwolf\python\packages\Lib\site-packages')

# Request parameters (note: this name shadows the built-in input()).
input = json.loads('{"file":"/tmp/data.tsv","color":"club_country","dimensions":["nolo","tolo","yolo"]}')

def action():
    def run():
        # Imported inside run(), so the first iteration also pays the import cost.
        import plotly.express as px
        from pandas import read_csv
        color = None if input['color'] == "" else input['color']
        first = time.time()
        d = read_csv(input['file'], sep='\t')
        second = time.time()
        fig = px.scatter_matrix(d, dimensions=input['dimensions'], color=color)
        third = time.time()
        j = fig.to_json()
        fourth = time.time()
        print('read: ' + str(second - first))
        print('plot: ' + str(third - second))
        print('json: ' + str(fourth - third))
        return j

    # Run three times to compare cold and warm timings.
    for i in range(0, 3):
        start = time.time()
        result = run()
        end = time.time()
        print('iteration: ' + str(i) + '\ntime: ' + str(end - start))
    return result

result = None
try:
    result = { 'outcome': action() }
except Exception as e:
    traceback.print_exc()
    result = { 'error': str(e) }

# Write the outcome (or the error) next to this script.
resultDir = os.path.dirname(os.path.realpath(__file__))
with open(resultDir + '/result.json', 'w') as resultFile:
    json.dump(result, resultFile)
The dataset is available here: https://www.dropbox.com/s/cm9i3pfv10exbba/data.tsv?dl=1
The timing report is here: https://www.dropbox.com/s/l2x3jqzea4i4xqw/report.txt?dl=1
Now:
- Is there any tweak I can apply to improve performance?
- Do you plan to focus on speed in upcoming releases?
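One possible tweak, sketched here under assumptions rather than taken from the thread: if the microservice only needs to hand figure JSON to plotly.js, the JSON can be assembled directly and plotly.express skipped altogether. The sketch below assumes the consumer accepts a plain figure with "data" and "layout" keys and "splom" traces whose dimensions carry "label" and "values", and that every plotted column is numeric. The helper name splom_figure_json is hypothetical, and nothing here is validated the way go.Figure would validate it.

```python
import json
from collections import defaultdict


def splom_figure_json(rows, dimensions, color=None):
    """Build scatter-matrix figure JSON from row dicts without plotly.express.

    rows: iterable of dicts, e.g. from csv.DictReader(handle, delimiter="\t")
    dimensions: names of the numeric columns to plot against each other
    color: optional column name; one "splom" trace is emitted per distinct value
    """
    # Group rows by the color column (a single group when no color is given).
    groups = defaultdict(list)
    for row in rows:
        groups[row[color] if color else None].append(row)

    traces = []
    for key, members in groups.items():
        trace = {
            "type": "splom",
            "dimensions": [
                # Assumption: every value in these columns parses as a float.
                {"label": dim, "values": [float(r[dim]) for r in members]}
                for dim in dimensions
            ],
        }
        if color:
            trace["name"] = str(key)
        traces.append(trace)

    return json.dumps({"data": traces, "layout": {}})


# Tiny in-memory example using the column names from the script above.
rows = [
    {"nolo": "1", "tolo": "2", "yolo": "3", "club_country": "IT"},
    {"nolo": "4", "tolo": "5", "yolo": "6", "club_country": "ES"},
    {"nolo": "7", "tolo": "8", "yolo": "9", "club_country": "IT"},
]
figure_json = splom_figure_json(rows, ["nolo", "tolo", "yolo"], color="club_country")
```

This trades away the styling defaults that plotly.express applies (colors, legend titles, axis labels), so it is only worth considering if build time matters more than those defaults.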
Issue Analytics
- State:
- Created 4 years ago
- Comments: 7 (2 by maintainers)
Top GitHub Comments
Hey @ievgennaida,
I ran your code snippet on my PC, and it finished in 2.866 seconds. The largest chunk (> 90%) of that time is spent in px.scatter.

I also tried plotly.express for my use case (scattering a lot of points; in the following test example there are 2 million points). However, when comparing plotly.express against plotly.graph_objects, I observe that px.scatter is 100x slower than its plotly.graph_objects equivalent.

When further line-profiling px.scatter for this simple use case (i.e., plotting 1 large sequence), I observed that most of the time is spent in a groupby operation. As far as I know, such an operation is not required for creating a scatter plot of a 1D array?

In case you wonder why I am interested in this: for plotly-resampler I am looking to create a registration method that adds scalability (under the hood) to the plotly.express interface (see https://github.com/predict-idlab/plotly-resampler/issues/68). However, it seems that plotly.express is significantly slower than plotly's graph_objects interface. I would love to hear some more advice on how to use the plotly.express functionality efficiently for very large scatter / line plots.

Profiling of px.scatter: [image: Profiling result]
plotly==5.8.0 python==3.8.10
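For anyone who wants to reproduce this kind of comparison, here is a rough sketch, not the code used for the numbers above. The helper names best_of and compare_scatter_backends are hypothetical, and go.Scattergl is assumed to be the fair graph_objects counterpart because plotly.express switches to WebGL rendering for large inputs. Results will vary by machine and plotly version.

```python
import time


def best_of(func, repeats=3):
    """Return the fastest wall-clock time in seconds over `repeats` calls."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return min(timings)


def compare_scatter_backends(n_points=2_000_000, repeats=3):
    """Time px.scatter against a hand-built go.Scattergl figure."""
    # Imported lazily so best_of() stays usable without plotly or numpy installed.
    import numpy as np
    import plotly.express as px
    import plotly.graph_objects as go

    x = np.arange(n_points)
    y = np.random.default_rng(0).standard_normal(n_points)

    return {
        "plotly.express": best_of(lambda: px.scatter(x=x, y=y), repeats),
        "graph_objects": best_of(
            lambda: go.Figure(go.Scattergl(x=x, y=y, mode="markers")), repeats
        ),
    }
```

Calling compare_scatter_backends() returns a dict of best-case build times per interface. It measures figure construction only, not serialization or rendering.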
This thing is horrendously slow with financial data spanning more than a few years.
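When the slowdown comes from sheer point count on a long series, one common workaround is to thin the data before building the figure. The sketch below is a simple min/max bucket decimation for line plots. It is an illustration only, not the algorithm plotly-resampler uses, and minmax_decimate is a hypothetical helper name. Candlestick data would need per-bucket open/high/low/close aggregation instead.

```python
def minmax_decimate(xs, ys, n_buckets):
    """Keep the lowest and highest point of each bucket, in x order.

    Returns at most 2 * n_buckets points, so peaks and troughs survive
    while the number of points sent to the chart stays bounded.
    """
    if n_buckets < 1:
        raise ValueError("n_buckets must be at least 1")
    n = len(ys)
    if n <= 2 * n_buckets:
        # Already small enough: nothing to drop.
        return list(xs), list(ys)

    out_x, out_y = [], []
    for bucket in range(n_buckets):
        start = bucket * n // n_buckets
        stop = (bucket + 1) * n // n_buckets
        indices = range(start, stop)
        lowest = min(indices, key=lambda i: ys[i])
        highest = max(indices, key=lambda i: ys[i])
        # Emit the two extremes in their original order.
        for i in sorted({lowest, highest}):
            out_x.append(xs[i])
            out_y.append(ys[i])
    return out_x, out_y


xs = list(range(10))
ys = [0, 5, 1, 4, 2, 9, 3, 8, 7, 6]
small_x, small_y = minmax_decimate(xs, ys, n_buckets=2)
# small_x == [0, 1, 5, 6] and small_y == [0, 5, 9, 3]
```

The decimated lists can then be passed to whichever plotting call is in use. The bucket count is a tuning knob: roughly one or two buckets per horizontal pixel is a reasonable starting point.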