Concurrency issues
When doing heavy concurrent writes/persists that progressively decrease in size, files can sometimes become corrupted, as the newer (smaller) data only partially overwrites the older (larger) data; there is no locking mechanism in place to prevent this.
I can write a test case later if needed, as I’m currently rather busy, but I figured I’d file an issue before I forget.
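To illustrate the pattern (a hypothetical sketch, not an actual test case), firing off several unawaited setItem calls whose values serialize shorter on each call is enough to hit the race:

```coffeescript
persist = require 'node-persist'
persist.initSync()

# Each value serializes to a shorter JSON string than the previous one, and
# none of the calls wait for the write before them to reach the disk.
persist.setItem 'key', new Array(1000).join('x')
persist.setItem 'key', new Array(100).join('x')
persist.setItem 'key', new Array(10).join('x')
```

When this goes wrong, the newer, shorter payload only partially overwrites the older, longer one, leaving trailing bytes of the old JSON behind and producing an unparseable file.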
I did write a very hacky monkeypatch (in CoffeeScript) that at least ensures write queueing:
```coffeescript
writeQueue = []
currentlyWriting = false

# Keep the original setItem (which updates the in-memory value) and replace
# it with a version that funnels the actual disk write through a queue.
_setItem = persist.setItem
persist.setItem = (key, value) ->
  new Promise (resolve, reject) ->
    _setItem.call(persist, key, value)
    addItemToQueue key, value, resolve, reject
    triggerWrite()

addItemToQueue = (key, value, resolveFunc, rejectFunc) ->
  writeQueue.push [key, value, resolveFunc, rejectFunc]

# Persist at most one key at a time; the next queued write only starts once
# the previous one has finished (or failed).
triggerWrite = ->
  if not currentlyWriting and writeQueue.length > 0
    currentlyWriting = true
    [key, value, resolveFunc, rejectFunc] = writeQueue.shift()
    Promise.resolve(persist.persistKey(key))
      .then (result) -> resolveFunc(result)
      .catch (err) -> rejectFunc(err)
      .finally ->
        currentlyWriting = false
        triggerWrite()
```
It’s very inefficient - it doesn’t attempt to ‘deduplicate’ queued writes and it always persists the last known version at the time of persisting, so under heavy concurrency it’ll likely end up writing identical data many times… but at least it doesn’t corrupt the files on disk 😃
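If the redundant writes ever became a problem, one possible tweak (an untested sketch, not part of the patch above) would be to queue at most one pending write per key and let every caller for that key share its outcome, with `triggerWrite` resolving lists of callbacks instead of a single pair:

```coffeescript
addItemToQueue = (key, value, resolveFunc, rejectFunc) ->
  queued = writeQueue.find (entry) -> entry[0] is key
  if queued?
    # This key is already waiting to be persisted: piggyback this caller's
    # callbacks on the pending entry instead of scheduling a second write.
    queued[2].push resolveFunc
    queued[3].push rejectFunc
  else
    writeQueue.push [key, value, [resolveFunc], [rejectFunc]]

triggerWrite = ->
  if not currentlyWriting and writeQueue.length > 0
    currentlyWriting = true
    [key, value, resolveFuncs, rejectFuncs] = writeQueue.shift()
    Promise.resolve(persist.persistKey(key))
      .then (result) ->
        fn(result) for fn in resolveFuncs
      .catch (err) ->
        fn(err) for fn in rejectFuncs
      .finally ->
        currentlyWriting = false
        triggerWrite()
```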
EDIT: For clarification, these monkeypatched methods I wrote are what caused me to discover the issue:
```coffeescript
persist.addListItem = (key, item) ->
  newList = [item].concat (persist.getItem(key) ? [])
  persist.setItem key, newList

persist.removeListItem = (key, item) ->
  newList = (persist.getItem(key) ? [])
    .filter (existingItem) ->
      # Keep every entry except the one being removed.
      return (item isnt existingItem)
  persist.setItem key, newList

persist.removeListItemByFilter = (key, filter) ->
  newList = (persist.getItem(key) ? [])
    .filter (item) ->
      return !filter(item)
  persist.setItem key, newList
```
Rapidly calling removeListItem several times (e.g. as would happen in a queue) would result in the aforementioned writes that progressively decrease in size.
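For instance, draining a list stored under a hypothetical 'queue' key (an illustrative sketch, not the original application code) produces exactly that pattern:

```coffeescript
# Each call reads the current list, filters one entry out, and kicks off a
# write of the now-shorter list without waiting for the previous write.
for queuedItem in (persist.getItem('queue') ? [])
  persist.removeListItem 'queue', queuedItem
```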
Top GitHub Comments
I don’t have time right now to create a PR, but in my case I apparently fixed the issue by using Bottleneck. All you have to do is wrap the functions with a limiter. Something like this:
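A minimal sketch of that approach (not necessarily the commenter’s original snippet), assuming `persist` is an already-initialized node-persist instance and that a single-concurrency limiter is enough to serialize the writes:

```coffeescript
Bottleneck = require 'bottleneck'

# Allow only one write at a time; further calls queue up behind it.
limiter = new Bottleneck maxConcurrent: 1

# Route every setItem call through the limiter while keeping the same
# promise-based API for callers.
_setItem = persist.setItem.bind(persist)
persist.setItem = limiter.wrap(_setItem)
```

Callers still get a promise back from `persist.setItem(key, value)`, but each underlying write only starts once the previous one has settled.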
At this point, I am not so sure a queue is enough; we might need some job-processing mechanism that also spawns child processes, writes a batch, and then closes them, but that seems like overkill. Also, batch writing means that if the app crashes, some write requests may get lost.
Also, that doesn’t solve the issue with different apps using the same storage dir.
The best solution is to run it as a separate service with the same Node API to talk to, but now we’re implementing a database engine, which I really don’t want to do.
Really, node-persist is not designed for this; it’s more like the browser’s localStorage. If you want a database, I think you should use a database.
But going back to the original issue filed, a smart queue might solve OP’s issue ONLY.