
Concurrency issues


When doing heavy concurrent writes/persists whose payloads progressively decrease in size, files can sometimes become corrupted: a newer, smaller write only partially overwrites the older, larger data on disk, since there is no locking mechanism in place.
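To make that concrete, here's a minimal sketch of the failure mode (hypothetical code, not from the issue; it assumes node-persist's promise-based API and a writable .cache directory):

import storage from "node-persist";

async function main() {
  await storage.init({ dir: ".cache" });

  // Start several writes to the same key concurrently, each smaller than
  // the last. With no lock around the underlying file writes, a smaller
  // write can land mid-flight and leave trailing bytes of a larger one
  // behind in the file.
  const sizes = [1000, 100, 10, 1];
  await Promise.all(
    sizes.map((n) => storage.setItem("list", new Array(n).fill("x")))
  );
}

main();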

I can write a testcase later if needed, as I’m currently rather busy, but I figured I’d file an issue before I forget.

I did write a very hacky monkeypatch (in CoffeeScript) that at least ensures write queueing:

writeQueue = []
currentlyWriting = false
_setItem = persist.setItem

persist.setItem = (key, value) ->
    new Promise (resolve, reject) ->
        # Apply the value via the original setItem (its result is not awaited),
        # then queue the actual disk persist so writes never overlap
        _setItem.call(persist, key, value)
        addItemToQueue key, value, resolve, reject
        triggerWrite()

addItemToQueue = (key, value, resolveFunc, rejectFunc) ->
    writeQueue.push [key, value, resolveFunc, rejectFunc]

triggerWrite = ->
    if not currentlyWriting and writeQueue.length > 0
        currentlyWriting = true
        [key, value, resolveFunc, rejectFunc] = writeQueue.shift()

        # Persist one key at a time; kick off the next queued write when done
        Promise.resolve(persist.persistKey(key))
            .then (result) -> resolveFunc(result)
            .catch (err) -> rejectFunc(err)
            .finally ->
                currentlyWriting = false
                triggerWrite()

It’s very inefficient: it doesn’t attempt to ‘deduplicate’ queued writes, and since it always persists the last known version of a key at the time of persisting, under heavy concurrency it’ll likely end up writing identical data many times… but at least it doesn’t corrupt the files on disk 😃
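For comparison, here is a sketch of the deduplication step that patch skips (a hypothetical TypeScript helper, not a node-persist API): keep only the latest pending value per key, so a burst of writes to one key is persisted once, with its final value:

import storage from "node-persist";

const pending = new Map<string, any>();
let writing = false;

export function queueSetItem(key: string, value: any): void {
  // Later writes to the same key replace the queued value instead of
  // appending a duplicate job
  pending.set(key, value);
  void drain();
}

async function drain(): Promise<void> {
  if (writing) return;
  writing = true;
  try {
    let entry: IteratorResult<[string, any]>;
    while (!(entry = pending.entries().next()).done) {
      const [key, value] = entry.value;
      pending.delete(key);
      // One write at a time; errors are swallowed here for brevity, a real
      // version would reject the corresponding callers like the queue above
      await storage.setItem(key, value);
    }
  } finally {
    writing = false;
  }
}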

EDIT: For clarification, these monkeypatched methods I wrote are what caused me to discover the issue:

persist.addListItem = (key, item) ->
    # Read-modify-write: prepend the new item to the stored list
    newList = [item].concat (persist.getItem(key) ? [])

    persist.setItem key, newList

persist.removeListItem = (key, item) ->
    # Keep everything except the item being removed
    newList = (persist.getItem(key) ? [])
        .filter (existingItem) ->
            return (existingItem != item)

    persist.setItem key, newList

persist.removeListItemByFilter = (key, filter) ->
    # Keep everything the filter does not match
    newList = (persist.getItem(key) ? [])
        .filter (item) ->
            return !filter(item)

    persist.setItem key, newList

Rapidly calling removeListItem several times (e.g. as would happen when draining a queue) results in the aforementioned writes that progressively decrease in size.
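To spell out the race, here is the same read-modify-write pattern in hypothetical TypeScript (names and keys invented for illustration; assumes the promise-based node-persist API):

import storage from "node-persist";

async function removeListItem(key: string, item: string): Promise<void> {
  // Read-modify-write with no lock: the read and the write can interleave
  // with another caller's read and write
  const list: string[] = (await storage.getItem(key)) ?? [];
  await storage.setItem(key, list.filter((existing) => existing !== item));
}

async function demo(): Promise<void> {
  await storage.init({ dir: ".cache" });
  await storage.setItem("queue", ["a", "b", "c"]);

  // Both calls read ["a", "b", "c"] before either write lands, so one
  // removal is lost, and the two different-sized writes race on the file
  await Promise.all([
    removeListItem("queue", "a"),
    removeListItem("queue", "b"),
  ]);
}

demo();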

Issue Analytics

  • State: open
  • Created: 8 years ago
  • Reactions: 1
  • Comments: 18 (2 by maintainers)

Top GitHub Comments

3 reactions
rasgo-cc commented, Oct 7, 2020

I don’t have time right now to create a PR, but in my case I apparently fixed the issue by using Bottleneck. All you have to do is wrap the functions with a limiter. Something like this:

import Bottleneck from "bottleneck";
import storage from "node-persist";

// CacheOptions is the commenter's own options type; any per-item options
// bag accepted by storage.setItem works here
type CacheOptions = Record<string, any>;

// maxConcurrent: 1 serializes every wrapped call, so writes no longer overlap
const limiter = new Bottleneck({
  minTime: 1,
  maxConcurrent: 1,
});

export async function createCache(cachePath: string) {
  await storage.init({
    dir: cachePath,
    ttl: 1000 * 60 * 60 * 24, // 24hrs
  });
}

export const put = limiter.wrap(
  async (key: string, value: any, opts: CacheOptions = {}) => {
    return await storage.setItem(key, value, opts as any);
  }
);

export const get = limiter.wrap(async (key: string) => {
  const value = await storage.getItem(key);
  return value;
});
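A quick usage sketch under the same assumptions (key names invented for illustration):

async function demo() {
  await createCache("./.cache");

  // With maxConcurrent: 1 these overlapping puts run one at a time, in
  // submission order, instead of racing on the same file
  await Promise.all([
    put("jobs", ["a", "b", "c"]),
    put("jobs", ["b", "c"]),
    put("jobs", ["c"]),
  ]);

  const jobs = await get("jobs"); // ["c"], the last queued write
}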
1 reaction
akhoury commented, Jul 4, 2017

At this point, I am not so sure a queue is enough; we might need some job-processing mechanism that also spawns child processes, writes a batch, and then closes them? But that seems like overkill. Batch writing also means that if the app crashes, some write requests may get lost.
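For concreteness, a rough sketch of that batching idea (a hypothetical helper, not anything node-persist provides), which shows the crash-loss tradeoff:

import storage from "node-persist";

const buffer = new Map<string, any>();

export function bufferedSetItem(key: string, value: any): void {
  buffer.set(key, value); // coalesce: only the latest value per key is kept
}

// Flush the buffered batch every 100 ms. Anything still buffered when the
// process crashes is lost, which is exactly the tradeoff noted above
setInterval(async () => {
  for (const [key, value] of buffer) {
    buffer.delete(key);
    await storage.setItem(key, value);
  }
}, 100);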

Also, that doesn’t solve the issue with different apps using the same storage dir.

The best solution is to run it as a separate service with the same Node API to talk to, but now we’re implementing a database engine, which I really don’t want to do.

Really, node-persist is not designed for this; it’s more like the browser’s localStorage. If you want a database, I think you should use a database.

But going back to the original issue filed, a smart queue might solve the OP’s issue ONLY.
