Efficient bulk updates for DataView

We’re currently facing quite a significant performance issue with Slickgrid when doing a lot of updates in the DataView.

From a data stream we’re receiving thousands of log entries, which we want to feed into SlickGrid via a DataView. As part of these incoming events we need to delete some items and add some new ones. The nastiest thing we might be doing is inserting all items at the beginning of the list, because we want the newest items first.

We’re calling beginUpdate and endUpdate, but this only suspends the refresh triggers. Any call to insertItem, addItem or deleteItem still causes updateIdxById to be called, which in the worst case loops over all items again.

So assume you have a DataView with 3000 items, and you get 300 new items, each of which will trigger either insertItem(0, item) or deleteItem(item.id) based on some rules. In the worst case we loop 300 times over 3000 items to update the index table. As updateIdxById and idxById are not public, it is hardly possible to optimize this on our level, even with some known constraints.
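To make the cost concrete, here is a rough model of the worst case. It is only a sketch: `rebuildIndexFrom` and `countIndexWrites` are hypothetical names, and the rebuild is a simplified stand-in for the private index update, not SlickGrid's actual code.

```javascript
// Simplified stand-in for the private id-to-index rebuild (assumption, not library code).
function rebuildIndexFrom(items, idxById, startIdx) {
  let touched = 0;
  for (let i = startIdx; i < items.length; i++) {
    idxById.set(items[i].id, i);
    touched++;
  }
  return touched;
}

// Counts how many index entries get rewritten when every new item
// is inserted at position 0 and the index is rebuilt after each insert.
function countIndexWrites(initialSize, insertCount) {
  const items = [];
  for (let i = 0; i < initialSize; i++) items.push({ id: i });
  const idxById = new Map();
  rebuildIndexFrom(items, idxById, 0);

  let writes = 0;
  for (let n = 0; n < insertCount; n++) {
    items.unshift({ id: initialSize + n });
    // Inserting at the front shifts every existing item, so the rebuild starts at 0.
    writes += rebuildIndexFrom(items, idxById, 0);
  }
  return writes;
}

console.log(countIndexWrites(3000, 300)); // 945150 index writes
```

Under this model, 300 front inserts into 3000 items rewrite 945,150 index entries, whereas a single rebuild after all inserts would touch only 3,300.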

The insert part I was able to patch already by modifying the original items array and calling insertItem at the very end:

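dataView.beginUpdate();
for (let i = 0; i < logs.length; i++) {
    const item = logs[i];
    // Prepend directly to the backing array for all but the last item;
    // the final insertItem call then rebuilds the id-to-index lookup once.
    if (i < logs.length - 1) { dataView.getItems().unshift(item); }
    else { dataView.insertItem(0, item); }
}
dataView.endUpdate();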

This way I ensure that we have only one loop over the 3000 items, triggered by the last item. But it gets tricky once you start deleting items: the internal index table is no longer correct at that point, and the delete calls cannot easily be translated into direct array operations. The following therefore does not work reliably:

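dataView.beginUpdate();
for (let i = 0; i < logs.length; i++) {
    const item = logs[i];
    // deleteItem relies on the id-to-index lookup, which is stale
    // after the direct unshift calls below.
    if (shouldRemove(item)) { dataView.deleteItem(item.id); }
    else if (i < logs.length - 1) { dataView.getItems().unshift(item); }
    else { dataView.insertItem(0, item); }
}
dataView.endUpdate();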

My idea on how SlickGrid could handle this would be something like this:

  1. We add something like a beginBulkUpdate() and call it before the update.
  2. Add/Insert operations are done instantly and the internal items array is changed, the id to index mapping is NOT updated yet.
  3. The delete operations are remembered in a separate Set instance (or as object key if you want to support old browsers).
  4. At the end we call an endBulkUpdate() which loops once through the whole dataset and either updates the index of the current element, or deletes the item at the current location if it is contained in the Set.
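The four steps above can be sketched on a small standalone class. This is a minimal illustration under stated assumptions: `BulkList` and its members are hypothetical and only model the idea, they are not SlickGrid's DataView or its eventual API.

```javascript
// Hypothetical, simplified model of the proposal; not SlickGrid's DataView.
class BulkList {
  constructor(items) {
    this.items = items.slice();
    this.idxById = new Map();
    this.pendingDeletes = null; // non-null only while a bulk update is active
    this.reindex();
  }

  // Full rebuild of the id-to-index lookup (the expensive step).
  reindex() {
    this.idxById.clear();
    this.items.forEach((item, i) => this.idxById.set(item.id, i));
  }

  beginBulkUpdate() {
    this.pendingDeletes = new Set();
  }

  insertItem(index, item) {
    this.items.splice(index, 0, item);
    // Outside bulk mode, keep the lookup correct after every call.
    if (this.pendingDeletes === null) this.reindex();
  }

  deleteItem(id) {
    if (this.pendingDeletes !== null) {
      this.pendingDeletes.add(id); // only remember the id for now
      return;
    }
    const idx = this.idxById.get(id);
    if (idx === undefined) throw new Error("Invalid id: " + id);
    this.items.splice(idx, 1);
    this.reindex();
  }

  // Single pass: drop remembered deletes, compact the array, rebuild the lookup.
  endBulkUpdate() {
    if (this.pendingDeletes === null) throw new Error("No bulk update active");
    const deletes = this.pendingDeletes;
    this.pendingDeletes = null;
    this.idxById.clear();
    let write = 0;
    for (let read = 0; read < this.items.length; read++) {
      const item = this.items[read];
      if (deletes.has(item.id)) continue;
      this.items[write] = item;
      this.idxById.set(item.id, write);
      write++;
    }
    this.items.length = write;
  }
}
```

With this shape, a batch of inserts and deletes costs one pass over the data in endBulkUpdate() instead of one pass per call. The sketch assumes an id is not deleted and re-inserted within the same batch.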

I would have added this on my level of the code, but as the index loop is private I cannot enforce this properly.

It would be great if the DataView could handle these kinds of bulk add and delete operations. SlickGrid in general works fine for us; it’s only efficient streaming updates that are missing. Reimplementing the various DataView functionalities we need is also not really a good option.

Alternatively, I would also be fine with just exposing the currently private parts of the lookups so that I can change them. This way I could load the original list and the idx lookup, insert new items and delete old ones, and finally either call setItems() with the final list or update idxById on my own.
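The setItems() variant could look roughly like the sketch below. `mergeLogs` is a hypothetical helper, `shouldRemove` stands for the same rule function as in the snippets above, and the sketch assumes deletes only target items that were already in the list before the batch arrived.

```javascript
// Hypothetical helper: computes the final list for one batch without
// touching the DataView, so the lookup only has to be rebuilt once.
function mergeLogs(existing, incoming, shouldRemove) {
  const removedIds = new Set();
  const added = [];
  for (const item of incoming) {
    if (shouldRemove(item)) removedIds.add(item.id);
    else added.push(item);
  }
  // Repeated insertItem(0, item) leaves the last processed item first,
  // so reverse the new items to keep the same "newest first" order.
  added.reverse();
  // Assumption: deletes only apply to items that existed before this batch.
  const kept = existing.filter((item) => !removedIds.has(item.id));
  return added.concat(kept);
}
```

The result could then be handed over in one call, for example `dataView.setItems(mergeLogs(dataView.getItems(), logs, shouldRemove))`, at the price of a full refresh of the view.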

Looking forward to your feedback.

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

Danielku15 commented, Jan 21, 2021

I ported my changes into the lib and opened #572. @ghiscoding if you like you can have a closer look, especially at point 2 with the proposal on how to activate the new feature.

For the tests I will likely adapt a bunch of existing DataView tests to simply ensure that things work as expected. Unfortunately it will be hard to test the performance aspects. Before I start with the tests it would be good to get confirmation that the changes are good as they are. For me, using this patched DataView gives the same performance boost as the logic on my code level (as described above).

Unfortunately there is no way to open PRs against the wiki. But I will create a fork with my changes and then you can pull it manually.

Danielku15 commented, Jan 20, 2021

void 0 is just shorthand for undefined. It’s like the pre-ES6 default parameter handling.

When filing the PR I will try to stick to ES5 practices. Set can be replaced with an object, and the other things are just syntactic sugar 😉

Array.prototype.splice.apply simply calls the method, taking its arguments from an array. The special thing about splice is that you cannot pass in an array of items you want to insert: it will insert the array itself as a single new item instead of “concatenating” its elements.

var a = [];
a.splice(0,0,1,2,3,4,5); 
// a will be [1,2,3,4,5]
var a = [];
var b = [1,2,3,4,5];
a.splice(0,0, b); 
// a will be [ [1,2,3,4,5] ] with a[0] being [1,2,3,4,5]

// but apply works (the first two entries of b are the start index and delete count):
var a = [];
var b = [0,0,1,2,3,4,5];
Array.prototype.splice.apply(a, b);
// a will be [1,2,3,4,5]

I will try to provide a PR soon, once we have confirmed that this logic does not have side effects. In SlickGrid this logic will ultimately only be active if you call the right start* method; old code should keep working as it is.
