ENH: option for efficient memory use by downcasting
We use pandas-gbq a lot for our daily analyses. Memory consumption is a known pain point; see e.g. https://www.dataquest.io/blog/pandas-big-data/
I have started to write a patch, which could be integrated into an enhancement for `read_gbq` (rough idea, details TBD):
- Provide a boolean `optimize_memory` option.
- If `True`, the source table is inspected with a query to get the min, max, and presence of nulls, and the percentage of unique strings, for INTEGER and STRING columns, respectively.
- When calling `to_dataframe`, this information is passed to the `dtypes` option, downcasting integers to the appropriate numpy (u)int type, and converting strings to the pandas `category` type at some threshold (less than 50% unique values).
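To make the rough idea concrete, here is a minimal sketch of the dtype selection step only. It is not the monkey-patch itself: the function names `smallest_int_dtype` and `string_dtype` and the sample column statistics are hypothetical. Using pandas' nullable integer dtypes (`Int8`, `UInt8`, ...) for columns that contain nulls is also an assumption on my part, since plain numpy integer types cannot hold missing values.

```python
def smallest_int_dtype(min_value, max_value, has_nulls=False):
    """Return the name of the narrowest integer dtype that holds [min_value, max_value].

    Hypothetical helper. With has_nulls=True it returns a pandas nullable
    integer dtype name (an assumption; numpy ints cannot represent nulls).
    """
    signed = min_value < 0
    for bits in (8, 16, 32, 64):
        if signed:
            low, high = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
        else:
            low, high = 0, 2 ** bits - 1
        if low <= min_value and max_value <= high:
            if has_nulls:
                # pandas nullable dtype names: "Int8", "UInt8", ...
                return ("Int" if signed else "UInt") + str(bits)
            # numpy dtype names: "int8", "uint8", ...
            return ("int" if signed else "uint") + str(bits)
    raise ValueError("value range does not fit in a 64-bit integer")


def string_dtype(unique_ratio, threshold=0.5):
    """Pick "category" when fewer than `threshold` of the values are unique."""
    return "category" if unique_ratio < threshold else "object"


# Hypothetical statistics, as the inspection query might return them.
column_stats = {
    "age": {"min": 0, "max": 120, "has_nulls": False},
    "delta": {"min": -40000, "max": 40000, "has_nulls": True},
}
string_stats = {"country": 0.01, "comment": 0.97}

dtypes = {
    name: smallest_int_dtype(s["min"], s["max"], s["has_nulls"])
    for name, s in column_stats.items()
}
dtypes.update({name: string_dtype(ratio) for name, ratio in string_stats.items()})
print(dtypes)
```

The resulting mapping of column name to dtype is the kind of value that would be handed to the `dtypes` option when calling `to_dataframe`, as described in the last bullet.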
I already have a working monkey-patch, which is still a bit rough. If there is enough interest I’d happily make it more robust and submit a PR. Would be my first significant contribution to an open source project, so some help and feedback would be appreciated.
Curious to hear your views on this.
Issue Analytics
- State:
- Created 4 years ago
- Reactions:1
- Comments:23 (6 by maintainers)
Oops, I should read more closely, I think I just proposed your user story. 😃
I can probably clean it up if you mail the PR and check the “allow edits from maintainers” box. Don’t worry about it too much.