Using object.size() in init.R could be expensive
Describe the bug
87f9bb6f8b450d897fb191780b086de07805fd4e in #544 uses object.size(obj) in inspect_env in init.R to support showing object size in the workspace viewer. However, this function can be extremely slow if the object contains character vectors.
library(data.table)
dt <- data.table(id = 1:5000000)
for (i in 1:20) {
dt[, paste0("x", i) := rep("hello", .N)]
}
system.time(object.size(dt))
user system elapsed
1.068 0.687 1.741
This means that, with the session watcher enabled, whenever this data table is present in the global environment, the user has to wait about 1.7s after evaluating each top-level expression.
Therefore, I don’t think we should call object.size() so eagerly in this way.
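For anyone who wants to check the effect without data.table, the following is a minimal base-R sketch that times object.size() on an integer vector and a character vector of the same length. The variable names and the vector length are illustrative assumptions, and timings will differ by machine and R version, so no expected numbers are given.

```r
# Compare object.size() on an integer vector and a character vector of the
# same length. The length n is an arbitrary choice for illustration.
n <- 1e6
ints <- rep(1L, n)
chars <- rep("hello", n)

# system.time() returns user/system/elapsed timings for each call.
t_int <- system.time(size_int <- object.size(ints))
t_chr <- system.time(size_chr <- object.size(chars))

# Print the elapsed times and the reported sizes side by side.
print(c(int = t_int[["elapsed"]], chr = t_chr[["elapsed"]]))
print(c(int = format(size_int, units = "Mb"),
        chr = format(size_chr, units = "Mb")))
```

If the character case is noticeably slower on your machine, that matches the behaviour reported above for the data.table example.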
Issue Analytics
- State:
- Created 3 years ago
- Comments:10 (6 by maintainers)
With #581 merged, object.size() is now called less frequently: only when a new symbol appears or when its memory address or length has changed. We also have an option, vsc.show_object_size, to opt in. It is disabled by default to minimize possible delay.
It might add too much complexity if we take into account nested lists, object attributes, etc., where character vectors could appear anywhere. If we don’t handle these cases, it won’t work: for example, a multi-level nested list with some big character vectors will still trigger long waits.
Also, even if we had a robust way to do this, the resulting object size might not be useful, since it would omit the size of character vectors and so be misleading.
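The opt-in and "only recompute when something changed" behaviour mentioned for #581 could be sketched roughly as follows. This is not the extension's actual code: size_cache and cached_object_size are hypothetical names, and the sketch only compares length, leaving out the memory-address check described above. Only the option name vsc.show_object_size comes from this thread.

```r
# Hypothetical cache: maps a symbol name to list(length, size).
size_cache <- new.env(parent = emptyenv())

# Return the size of the object bound to `name` in `env`, recomputing it only
# when the symbol is new or its length has changed. The memory-address
# comparison mentioned in the thread is deliberately omitted in this sketch.
cached_object_size <- function(name, env = globalenv()) {
  # Opt-in: do nothing unless the option is explicitly TRUE.
  if (!isTRUE(getOption("vsc.show_object_size", FALSE))) {
    return(NA_real_)
  }
  obj <- get(name, envir = env, inherits = FALSE)
  entry <- size_cache[[name]]  # NULL when the symbol has not been seen yet
  if (is.null(entry) || entry$length != length(obj)) {
    entry <- list(length = length(obj),
                  size = as.numeric(object.size(obj)))
    assign(name, entry, envir = size_cache)
  }
  entry$size
}

# Usage: enable the option, then query a symbol in a demo environment.
demo_env <- new.env()
assign("x", rep("hello", 10), envir = demo_env)
options(vsc.show_object_size = TRUE)
print(cached_object_size("x", demo_env))  # computed on first call
print(cached_object_size("x", demo_env))  # served from the cache
```

Comparing length alone misses in-place modifications that keep the length the same, which is presumably why the address is also checked in the merged change.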