Performance issue on import - chatty database communication
Hi,
I am importing CSVs with 10k rows and noticed that it is very, very slow: roughly 10 minutes per import. This is against a Django app running on my local machine, connected to a Postgres instance also running locally.
Upon further inspection, it looks like for every row in the imported CSV a `select * from table where id = ?` is performed, thereby slowing down the import. It may be better to do a single `select * from table where id in (?, ?, ?, ...)` instead to speed up the process.
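For illustration, the ORM can do that batching directly; a minimal sketch, assuming a hypothetical `Book` model and a `rows` list parsed from the CSV:

```python
from myapp.models import Book  # hypothetical model standing in for the imported table

ids = [row["id"] for row in rows]  # "rows" holds one dict per CSV line

# Chatty: one round trip per row, i.e. 10k separate SELECT ... WHERE id = ? queries
instances = {pk: Book.objects.get(pk=pk) for pk in ids}

# Batched: a single SELECT ... WHERE id IN (...) for the whole import
instances = Book.objects.in_bulk(ids)  # returns {pk: instance}
```

`in_bulk()` issues the `IN` query and returns a dict keyed by primary key, so each per-row lookup becomes a dict access.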
Thanks
Issue Analytics
- Created: 5 years ago
- Comments: 5 (2 by maintainers)
@andrewgy8 This is my workaround: a `BulkQueryMixin` that I use in my `Resource` to fetch all data related to the uploaded import once, so that on every `get_instance()` I just retrieve from that cached dict. The code is not pretty, but it seems to work and has greatly increased the performance on my end. A rough sketch of the idea follows.
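Roughly, the idea looks like this (sketch only, assuming django-import-export's `ModelResource` API; the `Book` model and the `id` column header are placeholders):

```python
from import_export import resources
from myapp.models import Book  # placeholder model


class BulkQueryMixin:
    """Fetch every instance touched by the import in one query and
    serve get_instance() lookups from an in-memory dict."""

    def before_import(self, dataset, *args, **kwargs):
        ids = dataset["id"]  # tablib column access; assumes the CSV has an "id" header
        # One SELECT ... WHERE id IN (...) instead of one SELECT per row.
        self._instance_cache = {
            str(obj.pk): obj
            for obj in self._meta.model.objects.filter(pk__in=ids)
        }
        return super().before_import(dataset, *args, **kwargs)

    def get_instance(self, instance_loader, row):
        # CSV values arrive as strings, so key the cache by str(pk).
        return self._instance_cache.get(str(row.get("id")))


class BookResource(BulkQueryMixin, resources.ModelResource):
    class Meta:
        model = Book
```

The mixin has to come before `ModelResource` in the bases so its `get_instance()` wins in the MRO.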
But aside from that, I had to do a few more things (ranked in terms of performance gain; see the `Meta` sketch after the list):

- `Resource`'s `skip_unchanged` - removes the update statements, which can otherwise be one for every row
- overriding `__deepcopy__` (and `__copy__`) on my `Model`s - the default deep copying also slows things down substantially
- `Resource`'s `report_skipped` - speeds up rendering of the confirmation page; also, with 10k records uploaded, it makes spotting the differences easier
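Both of those are `Meta` flags on the resource; a minimal sketch (model name is a placeholder):

```python
from import_export import resources
from myapp.models import Book  # placeholder model


class BookResource(resources.ModelResource):
    class Meta:
        model = Book
        skip_unchanged = True   # skip rows whose values are unchanged: no UPDATE per row
        report_skipped = False  # leave skipped rows out of the result/confirmation report
```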
@andrewgy8 Good point. I'll try to see if I can change it to use querysets and filters instead.