size of serialized DOM
I’m seeing 10x the character size for the serialization of the initial DOM state (EventType.FullSnapshot) compared with a plain HTML representation of the same thing.
Is minimizing the size of this on the agenda as a design goal?
I’m thinking that it could be reduced as follows:
- simple things like renaming attributes to attrs
- not storing empty childNodes/attributes lists/objects (making them implicit)
- removing type: 2 (type: NodeType.Element) and similar, as that can be inferred from the presence of childNodes
- only setting the isSVG/isStyle boolean attributes if they are unusual (i.e. true)
Are there any strong reasons not to do any of the above?
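To make the proposal above concrete, here is a minimal TypeScript sketch of dropping the implicit fields on write and restoring them on read. The SerializedNode shape is a simplified assumption rather than rrweb's actual type, and compactNode/restoreNode are hypothetical names. Because empty childNodes are dropped as well, this sketch treats a missing type as Element instead of inferring it from the presence of childNodes.

```typescript
// Simplified, assumed node shape; the real snapshot format has more fields.
interface SerializedNode {
  type: number;
  id: number;
  tagName?: string;
  attributes?: Record<string, string>;
  childNodes?: SerializedNode[];
  textContent?: string;
  isSVG?: boolean;
  isStyle?: boolean;
}

// Same fields, but `type` is optional: a missing type means Element.
interface CompactNode {
  id: number;
  type?: number;
  tagName?: string;
  attributes?: Record<string, string>;
  childNodes?: CompactNode[];
  textContent?: string;
  isSVG?: boolean;
  isStyle?: boolean;
}

const ELEMENT_TYPE = 2; // NodeType.Element, as quoted in the issue

function compactNode(node: SerializedNode): CompactNode {
  const out: CompactNode = { id: node.id };
  // Store the type only when it is not the (most common) Element type.
  if (node.type !== ELEMENT_TYPE) out.type = node.type;
  if (node.tagName !== undefined) out.tagName = node.tagName;
  // Empty objects and lists become implicit.
  if (node.attributes && Object.keys(node.attributes).length > 0) {
    out.attributes = node.attributes;
  }
  if (node.childNodes && node.childNodes.length > 0) {
    out.childNodes = node.childNodes.map(compactNode);
  }
  if (node.textContent !== undefined) out.textContent = node.textContent;
  // Booleans are written only when true.
  if (node.isSVG) out.isSVG = true;
  if (node.isStyle) out.isStyle = true;
  return out;
}

function restoreNode(node: CompactNode): SerializedNode {
  const type = node.type ?? ELEMENT_TYPE;
  const out: SerializedNode = { type, id: node.id };
  if (node.tagName !== undefined) out.tagName = node.tagName;
  if (type === ELEMENT_TYPE) {
    // Elements always get their containers back, even if they were omitted.
    out.attributes = node.attributes ?? {};
    out.childNodes = (node.childNodes ?? []).map(restoreNode);
  } else if (node.childNodes) {
    out.childNodes = node.childNodes.map(restoreNode);
  }
  if (node.textContent !== undefined) out.textContent = node.textContent;
  if (node.isSVG) out.isSVG = true;
  if (node.isStyle) out.isStyle = true;
  return out;
}
```

One caveat of this sketch: a false isSVG/isStyle in the input is not written back on restore, so the round trip is semantically equivalent rather than byte-identical for such nodes.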
Issue Analytics
- State:
- Created 4 years ago
- Reactions:1
- Comments:37 (29 by maintainers)
Top GitHub Comments
Sorry for the delay. After finishing a lot of work last month, I finally have time to start working on rrweb again!
I think this issue is the most important one at the current stage, and I would like to provide a solution in the next major release.
With the ideas that I illustrated above, I have written some POC code in this repo.
Currently, I have implemented an analysis framework and several packers:
Right now the msgpack packer is not working as intended and I’m still checking my implementation. The other two show good results when tested on two real-world event logs.
I’m using two real-world event logs to benchmark the packers:
[Benchmark table: the simple and pako packers, each measured on event logs e1 and e2]
Just a reminder that my original proposal was about being a bit more careful/efficient in the JSON format itself. Reducing the repetitive aspects of the original JSON would provide advantages in transmission, as well as preempt much of the need for zipping on either the client side or the server side.
Here’s a quick analysis of a sample JSON DOM structure showing repetitive keys:
{ type: 560, childNodes: 218, name: 1, publicId: 1, systemId: 1, id: 560, tagName: 217, attributes: 217, textContent: 341, isStyle: 1 }
And here are the counts of empty values, e.g. { ... attributes: {}, ... }:
{ attributes: 79, childNodes: 47 }
(Here’s the code I executed at the console to come up with these figures:
)
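The console snippet referred to above does not appear in this copy of the thread. As a rough stand-in, here is a minimal TypeScript sketch of how such figures could be gathered. It is not the commenter's code: SnapshotNode, KeyStats, and countKeys are hypothetical names, and the sketch assumes that only node-level keys are counted (it recurses through childNodes but does not look inside attributes), which is consistent with the absence of attribute names in the figures quoted above.

```typescript
// Assumed minimal node shape: any keys, with optional nested childNodes.
interface SnapshotNode {
  childNodes?: SnapshotNode[];
  [key: string]: unknown;
}

interface KeyStats {
  keys: Map<string, number>;  // how often each node-level key appears
  empty: Map<string, number>; // how often that key holds {} or []
}

// True for [] and {}; false for everything else, including null.
function isEmptyContainer(value: unknown): boolean {
  if (Array.isArray(value)) return value.length === 0;
  return typeof value === "object" && value !== null && Object.keys(value).length === 0;
}

function bump(counter: Map<string, number>, key: string): void {
  counter.set(key, (counter.get(key) ?? 0) + 1);
}

function countKeys(
  node: SnapshotNode,
  stats: KeyStats = { keys: new Map(), empty: new Map() }
): KeyStats {
  for (const key of Object.keys(node)) {
    bump(stats.keys, key);
    if (isEmptyContainer(node[key])) bump(stats.empty, key);
  }
  // Walk the tree through childNodes only, so attribute names are not counted.
  for (const child of node.childNodes ?? []) countKeys(child, stats);
  return stats;
}
```

Calling countKeys on the root node of a full snapshot returns two maps that can be compared against the total JSON length to estimate how much of the payload is repeated key names.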
So by e.g. abbreviating attributes -> a, textContent -> t, tagName -> n, childNodes -> c you’d effectively be doing a lot of what I imagine gzip is doing ‘for free’, and I don’t think it would be any less legible to someone browsing the structure, as you’d usually be able to infer the meaning from the context (the value).
This could be done in a backwards-compatible way so that it’s still possible to play back non-abbreviated content.
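A minimal TypeScript sketch of that abbreviation scheme, with a reader that accepts both forms, might look like the following. The four abbreviations are the ones proposed in this comment; renameKeys, abbreviate, and expand are hypothetical helpers, not part of rrweb. Only node-level keys are renamed, so an HTML attribute that happens to be called a or t is left untouched.

```typescript
type NodeLike = { [key: string]: unknown };

// Long key -> short key, as proposed in the comment.
const SHORT_KEYS = new Map<string, string>([
  ["attributes", "a"],
  ["textContent", "t"],
  ["tagName", "n"],
  ["childNodes", "c"],
]);

// Short key -> long key, derived from the table above.
const LONG_KEYS = new Map<string, string>();
SHORT_KEYS.forEach((shortKey, longKey) => LONG_KEYS.set(shortKey, longKey));

function renameKeys(node: NodeLike, table: Map<string, string>): NodeLike {
  const out: NodeLike = {};
  for (const key of Object.keys(node)) {
    const value = node[key];
    const renamed = table.get(key) ?? key;
    // Recurse only into the child list; other values (including the
    // attributes object) are copied unchanged.
    const isChildList = key === "childNodes" || renamed === "childNodes";
    out[renamed] =
      isChildList && Array.isArray(value)
        ? value.map((child) => renameKeys(child as NodeLike, table))
        : value;
  }
  return out;
}

// Writer: produce the abbreviated form.
const abbreviate = (node: NodeLike): NodeLike => renameKeys(node, SHORT_KEYS);

// Reader: accepts abbreviated and non-abbreviated nodes alike, because
// keys missing from the table pass through unchanged.
const expand = (node: NodeLike): NodeLike => renameKeys(node, LONG_KEYS);
```

Since expand leaves unknown keys alone, recordings stored in the old long-key format go through it unchanged, which is what keeps playback backwards compatible in this sketch.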