
Concurrency issues


When doing heavy concurrent writes/persists whose payloads progressively decrease in size, files can sometimes become corrupted: a newer, smaller write only partially overwrites the older, larger data on disk, since there is no locking mechanism in place.
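To make that concrete, here's a minimal sketch of the failure mode (hypothetical code, not from the issue; it assumes node-persist's promise-based API and a writable .cache directory):

import storage from "node-persist";

async function main() {
  await storage.init({ dir: ".cache" });

  // Start several writes to the same key concurrently, each smaller than
  // the last. With no lock around the underlying file writes, a smaller
  // write can land mid-flight and leave trailing bytes of a larger one
  // behind in the file.
  const sizes = [1000, 100, 10, 1];
  await Promise.all(
    sizes.map((n) => storage.setItem("list", new Array(n).fill("x")))
  );
}

main();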

I can write a testcase later if needed, as I’m currently rather busy, but I figured I’d file an issue before I forget.

I did write a very hacky monkeypatch (in CoffeeScript) that at least ensures write queueing:

writeQueue = []
currentlyWriting = false
_setItem = persist.setItem

persist.setItem = (key, value) ->
    new Promise (resolve, reject) ->
        # Apply the value via the original setItem (its result is not awaited),
        # then queue the actual disk persist so writes never overlap
        _setItem.call(persist, key, value)
        addItemToQueue key, value, resolve, reject
        triggerWrite()

addItemToQueue = (key, value, resolveFunc, rejectFunc) ->
    writeQueue.push [key, value, resolveFunc, rejectFunc]

triggerWrite = ->
    if not currentlyWriting and writeQueue.length > 0
        currentlyWriting = true
        [key, value, resolveFunc, rejectFunc] = writeQueue.shift()

        # Persist one key at a time; kick off the next queued write when done
        Promise.resolve(persist.persistKey(key))
            .then (result) -> resolveFunc(result)
            .catch (err) -> rejectFunc(err)
            .finally ->
                currentlyWriting = false
                triggerWrite()

It’s very inefficient: it doesn’t attempt to ‘deduplicate’ queued writes, and since it always persists the last known version of a key at the time of persisting, under heavy concurrency it’ll likely end up writing identical data many times… but at least it doesn’t corrupt the files on disk 😃
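For comparison, here is a sketch of the deduplication step that patch skips (a hypothetical TypeScript helper, not a node-persist API): keep only the latest pending value per key, so a burst of writes to one key is persisted once, with its final value:

import storage from "node-persist";

const pending = new Map<string, any>();
let writing = false;

export function queueSetItem(key: string, value: any): void {
  // Later writes to the same key replace the queued value instead of
  // appending a duplicate job
  pending.set(key, value);
  void drain();
}

async function drain(): Promise<void> {
  if (writing) return;
  writing = true;
  try {
    let entry: IteratorResult<[string, any]>;
    while (!(entry = pending.entries().next()).done) {
      const [key, value] = entry.value;
      pending.delete(key);
      // One write at a time; errors are swallowed here for brevity, a real
      // version would reject the corresponding callers like the queue above
      await storage.setItem(key, value);
    }
  } finally {
    writing = false;
  }
}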

EDIT: For clarification, these monkeypatched methods I wrote are what caused me to discover the issue:

persist.addListItem = (key, item) ->
    # Read-modify-write: prepend the new item to the stored list
    newList = [item].concat (persist.getItem(key) ? [])

    persist.setItem key, newList

persist.removeListItem = (key, item) ->
    # Keep everything except the item being removed
    newList = (persist.getItem(key) ? [])
        .filter (existingItem) ->
            return (existingItem != item)

    persist.setItem key, newList

persist.removeListItemByFilter = (key, filter) ->
    # Keep everything the filter does not match
    newList = (persist.getItem(key) ? [])
        .filter (item) ->
            return !filter(item)

    persist.setItem key, newList

Rapidly calling removeListItem several times (e.g. as would happen when draining a queue) results in the aforementioned writes that progressively decrease in size.
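To spell out the race, here is the same read-modify-write pattern in hypothetical TypeScript (names and keys invented for illustration; assumes the promise-based node-persist API):

import storage from "node-persist";

async function removeListItem(key: string, item: string): Promise<void> {
  // Read-modify-write with no lock: the read and the write can interleave
  // with another caller's read and write
  const list: string[] = (await storage.getItem(key)) ?? [];
  await storage.setItem(key, list.filter((existing) => existing !== item));
}

async function demo(): Promise<void> {
  await storage.init({ dir: ".cache" });
  await storage.setItem("queue", ["a", "b", "c"]);

  // Both calls read ["a", "b", "c"] before either write lands, so one
  // removal is lost, and the two different-sized writes race on the file
  await Promise.all([
    removeListItem("queue", "a"),
    removeListItem("queue", "b"),
  ]);
}

demo();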

Issue Analytics

  • State: open
  • Created: 8 years ago
  • Reactions: 1
  • Comments: 18 (2 by maintainers)

Top GitHub Comments

3 reactions
rasgo-cc commented, Oct 7, 2020

I don’t have time right now to create a PR, but in my case I apparently fixed the issue by using Bottleneck. All you have to do is wrap the functions with a limiter. Something like this:

import Bottleneck from "bottleneck";
import storage from "node-persist";

// CacheOptions is the commenter's own options type; any per-item options
// bag accepted by storage.setItem works here
type CacheOptions = Record<string, any>;

// maxConcurrent: 1 serializes every wrapped call, so writes no longer overlap
const limiter = new Bottleneck({
  minTime: 1,
  maxConcurrent: 1,
});

export async function createCache(cachePath: string) {
  await storage.init({
    dir: cachePath,
    ttl: 1000 * 60 * 60 * 24, // 24hrs
  });
}

export const put = limiter.wrap(
  async (key: string, value: any, opts: CacheOptions = {}) => {
    return await storage.setItem(key, value, opts as any);
  }
);

export const get = limiter.wrap(async (key: string) => {
  const value = await storage.getItem(key);
  return value;
});
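A quick usage sketch under the same assumptions (key names invented for illustration):

async function demo() {
  await createCache("./.cache");

  // With maxConcurrent: 1 these overlapping puts run one at a time, in
  // submission order, instead of racing on the same file
  await Promise.all([
    put("jobs", ["a", "b", "c"]),
    put("jobs", ["b", "c"]),
    put("jobs", ["c"]),
  ]);

  const jobs = await get("jobs"); // ["c"], the last queued write
}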
1 reaction
akhoury commented, Jul 4, 2017

At this point, I am not so sure a queue is enough; we might need some job-processing mechanism that also spawns child processes, writes a batch, and then closes them? But that seems like overkill. Batch writing also means that if the app crashes, some write requests may get lost.
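For concreteness, a rough sketch of that batching idea (a hypothetical helper, not anything node-persist provides), which shows the crash-loss tradeoff:

import storage from "node-persist";

const buffer = new Map<string, any>();

export function bufferedSetItem(key: string, value: any): void {
  buffer.set(key, value); // coalesce: only the latest value per key is kept
}

// Flush the buffered batch every 100 ms. Anything still buffered when the
// process crashes is lost, which is exactly the tradeoff noted above
setInterval(async () => {
  for (const [key, value] of buffer) {
    buffer.delete(key);
    await storage.setItem(key, value);
  }
}, 100);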

Also, that doesn’t solve the issue with different apps using the same storage dir.

The best solution is to run it as a separate service with the same Node API to talk to, but now we’re implementing a database engine, which I really don’t want to do.

Really, node-persist is not designed for this; it’s more like the browser’s localStorage. If you want a database, I think you should use a database.

But going back to the original issue filed, a smart queue might solve the OP’s issue ONLY.
