Efficient bulk updates for DataView
Issue Description
We’re currently facing a significant performance issue with SlickGrid when doing a lot of updates in the DataView.
From a data stream we’re receiving thousands of log entries, which we want to feed into SlickGrid via a DataView. As part of these incoming events we need to delete some items and add some new ones. The nastiest thing we might be doing is inserting all items at the beginning of the list because we want to have the newest items first.
We’re calling `beginUpdate` and `endUpdate`, but this only suspends the `refresh` triggers. Any call to `insertItem`, `addItem` or `deleteItem` causes `updateIdxById` to be called, which in the worst case causes a loop over all items again.
So assume you have a DataView with 3000 items, and you get 300 new items, each of which will call either `insertItem(0, item)` or `deleteItem(item.id)` based on some rules. In the worst case we will loop 300 times over 3000 items to update the index table. As `updateIdxById` and `idxById` are not public, it is hardly possible to optimize here on our level with some known constraints.
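To make the cost concrete, here is a toy model of an id-to-index table that is rebuilt from the insertion point onward after every insert. It is not SlickGrid's actual implementation: `ToyView` and its `steps` counter are hypothetical names used only to count index-table writes under the scenario described above.

```javascript
// Toy model only: NOT SlickGrid's real DataView code.
class ToyView {
  constructor(items) {
    this.items = items.slice();
    this.idxById = new Map();
    this.steps = 0; // counts index-table writes
    this.updateIdxById(0);
  }

  // Rebuild the id -> index table from startingIndex to the end.
  updateIdxById(startingIndex) {
    for (let i = startingIndex; i < this.items.length; i++) {
      this.idxById.set(this.items[i].id, i);
      this.steps++;
    }
  }

  insertItem(insertBefore, item) {
    this.items.splice(insertBefore, 0, item);
    this.updateIdxById(insertBefore); // full rebuild when insertBefore is 0
  }
}

const existing = Array.from({ length: 3000 }, (_, i) => ({ id: i }));
const view = new ToyView(existing);
view.steps = 0; // ignore the initial build

for (let i = 0; i < 300; i++) {
  view.insertItem(0, { id: 3000 + i });
}
console.log(view.steps); // 945150
```

In this model, 300 inserts at position 0 cost close to a million index writes, while a single rebuild at the end would cost 3300.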
The insert part I was able to patch already by modifying the original items array and calling insertItem at the very end:
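dataView.beginUpdate();
for (let i = 0; i < logs.length; i++) {
  const item = logs[i];
  // All but the last item go straight into the backing array...
  if (i < logs.length - 1) { dataView.getItems().unshift(item); }
  // ...and one insertItem() call at the end rebuilds the index table once.
  else { dataView.insertItem(0, item); }
}
dataView.endUpdate();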
This way I ensure that we have 1 loop over 3000 items at the last item. But it gets tricky if you start deleting items because the internal index table is not correct anymore and also delete calls cannot easily be translated.
dataView.beginUpdate();
for (let i = 0; i < logs.length; i++) {
  const item = logs[i];
  // Problem: the index table is stale here because of the earlier unshift() calls.
  if (shouldRemove(item)) { dataView.deleteItem(item.id); }
  else if (i < logs.length - 1) { dataView.getItems().unshift(item); }
  else { dataView.insertItem(0, item); }
}
dataView.endUpdate();
My idea on how SlickGrid could handle this would be something like this:
- We add something like a `beginBulkUpdate()` and call it before the update.
- Add/Insert operations are done instantly and the internal items array is changed; the id-to-index mapping is NOT updated yet.
- The delete operations are remembered in a separate `Set` instance (or as object keys if you want to support old browsers).
- At the end we call an `endBulkUpdate()` which loops once through the whole dataset and either updates the index of the current element, or deletes the item at the current location if it is contained in the `Set`.
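The steps above can be sketched on a small stand-alone class. This is only an illustration of the proposal: `BulkView`, `pendingDeletes` and `reindex` are hypothetical names, and nothing here is an existing SlickGrid API.

```javascript
// Hypothetical sketch of the proposal; not an existing SlickGrid API.
class BulkView {
  constructor(items) {
    this.items = items.slice();
    this.idxById = new Map();
    this.bulk = false;
    this.pendingDeletes = new Set();
    this.reindex();
  }

  reindex() {
    this.idxById.clear();
    this.items.forEach((item, i) => this.idxById.set(item.id, i));
  }

  beginBulkUpdate() { this.bulk = true; }

  insertItem(insertBefore, item) {
    this.items.splice(insertBefore, 0, item);
    if (!this.bulk) this.reindex(); // normal path: reindex on every call
  }

  deleteItem(id) {
    if (this.bulk) { this.pendingDeletes.add(id); return; } // only remember it
    const idx = this.idxById.get(id);
    if (idx === undefined) return;
    this.items.splice(idx, 1);
    this.reindex();
  }

  endBulkUpdate() {
    // Single pass: skip remembered deletes, compact the array, rebuild the index.
    let write = 0;
    this.idxById.clear();
    for (let read = 0; read < this.items.length; read++) {
      const item = this.items[read];
      if (this.pendingDeletes.has(item.id)) continue;
      this.items[write] = item;
      this.idxById.set(item.id, write);
      write++;
    }
    this.items.length = write;
    this.pendingDeletes.clear();
    this.bulk = false;
  }
}

const view = new BulkView([{ id: 1 }, { id: 2 }, { id: 3 }]);
view.beginBulkUpdate();
view.insertItem(0, { id: 4 });
view.deleteItem(2);
view.insertItem(0, { id: 5 });
view.endBulkUpdate();
console.log(view.items.map((item) => item.id)); // [ 5, 4, 1, 3 ]
```

One caveat of this sketch: an id that is deleted and then re-added within the same bulk block would be dropped at the end, so a real implementation would need a rule for that case.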
I would have added this on my level of the code, but as the index loop is private I cannot enforce this properly.
It would be great if the DataView could handle such kind of add or update operations. SlickGrid in general is operating fine, it’s the streaming update capabilities which are not available. Reimplementing various functionalities of the DataView we need is also not really a good option.
Alternatively, I would also be fine with just exposing the currently private parts of the lookups so that I can change them. This way I can load the original list and the idx lookup, insert new items and delete old ones, and finally either call `setItems()` with the final list or update `idxById` on my own.
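The `setItems()` variant of this fallback could look roughly like the sketch below. `applyBatch` is a hypothetical helper, `shouldRemove` is the rule function from the snippets above, and the only DataView methods used are `getItems()` and `setItems()`. A tiny stub stands in for the real DataView so the snippet runs on its own, and the sketch assumes each id appears at most once per batch.

```javascript
// Hypothetical helper: build the final list outside the DataView, hand it over once.
function applyBatch(dataView, logs, shouldRemove) {
  const removedIds = new Set();
  const added = [];
  for (const item of logs) {
    if (shouldRemove(item)) removedIds.add(item.id);
    else added.push(item);
  }
  added.reverse(); // newest first, same order as repeated insertItem(0, item)

  const next = added
    .concat(dataView.getItems())
    .filter((item) => !removedIds.has(item.id));
  dataView.setItems(next); // one index rebuild for the whole batch
}

// Stub standing in for Slick.Data.DataView, for demonstration only.
const stubView = {
  items: [{ id: 1 }, { id: 2 }],
  getItems() { return this.items; },
  setItems(data) { this.items = data; },
};

applyBatch(
  stubView,
  [{ id: 3 }, { id: 1, remove: true }, { id: 4 }],
  (item) => item.remove === true
);
console.log(stubView.items.map((item) => item.id)); // [ 4, 3, 2 ]
```

The trade-off is that `setItems()` replaces the whole list, so it only pays off when a batch is large enough to outweigh one full rebuild.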
Looking forward to your feedback.
Issue Analytics
- Created 2 years ago
- Comments:6 (4 by maintainers)
I ported my changes into the lib and opened #572. @ghiscoding, if you like, you can have a closer look, especially at point 2 with the proposal on how to activate the new feature.
For the tests I will likely adapt a bunch of existing DataView tests to simply ensure that things work as expected. Unfortunately it will be hard to test the performance aspects. Before I start with tests it would be good to get confirmation that the changes are good as they are. For me, using this patched DataView gives the same performance boost as the logic on my code level (as described above).
Unfortunately for the wiki there is no way to open PRs. But I will create a fork with my changes and then you can pull it manually.
`void 0` is just a shorthand for `undefined`. It’s like the pre-ES6 default parameter handling.
When filing the PR I will try to stick to ES5 practices. `Set` can be replaced with an object, and the other things are just syntactic sugar 😉
`Array.prototype.splice.apply` is simply calling the method. The special thing is that for `splice` you cannot pass in an array which you want to insert. It will insert the array as a new item, but not “concatenate” the array.
Will try to provide a PR soon once we have confirmed that this logic does not have side effects. For SlickGrid this logic will ultimately only be active if you call the right `start*` method; old code should remain working as it is.
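The two idioms mentioned above can be checked in isolation with a few lines of plain JavaScript, independent of SlickGrid. The variable names are made up for the example.

```javascript
// `void 0` always evaluates to undefined.
console.log(void 0 === undefined); // true

const newItems = ['a', 'b'];

// Passing the array directly inserts it as ONE nested element.
const nested = [1, 2, 3];
nested.splice(1, 0, newItems);
console.log(JSON.stringify(nested)); // [1,["a","b"],2,3]

// apply() spreads the array into separate arguments: splice(1, 0, 'a', 'b').
const flat = [1, 2, 3];
Array.prototype.splice.apply(flat, [1, 0].concat(newItems));
console.log(JSON.stringify(flat)); // [1,"a","b",2,3]
```

With ES6 available, `flat.splice(1, 0, ...newItems)` does the same thing as the `apply` call.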