workspace.add_file without incremental refcounts


I don’t know if this is intentional: when looping over large workspaces and adding files to them, the workspace.mets._file_by_id dict keeps growing and thus holds on to references to every new OcrdFile and METS file element etree. This can create extreme memory overhead and eventually slows things down (because each new file has to be compared against ever more existing IDs).

Perhaps one can at least try to sever the references to the file etrees?

But an opt-out to the general _file_by_id dict mechanism is probably best seen as part of #416, right?
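
To make the effect concrete, here is a minimal toy sketch of an ID-indexed cache that holds strong references, next to one that holds only weak references. Everything in it is hypothetical: FakeFile, StrongCache and WeakCache are stand-ins I made up for illustration, not the OCR-D classes, and the sketch says nothing about how the real METS tree itself keeps its elements alive.

```python
import gc
import weakref


class FakeFile:
    """Hypothetical stand-in for a file object wrapping an element tree."""

    def __init__(self, file_id):
        self.ID = file_id
        self.payload = bytearray(1024)  # simulates the weight of the wrapped tree


class StrongCache:
    """Toy cache: a plain dict keeps every added object alive."""

    def __init__(self):
        self._file_by_id = {}

    def add_file(self, file_id):
        f = FakeFile(file_id)
        self._file_by_id[file_id] = f
        return f


class WeakCache:
    """Toy cache: entries vanish once nobody else references the object."""

    def __init__(self):
        self._file_by_id = weakref.WeakValueDictionary()

    def add_file(self, file_id):
        f = FakeFile(file_id)
        self._file_by_id[file_id] = f
        return f


def count_cached_after_loop(cache, n):
    """Add n files, drop each returned object, report how many stay cached."""
    for i in range(n):
        cache.add_file("FILE_%04d" % i)  # caller does not keep the result
    gc.collect()  # make the outcome independent of refcounting details
    return len(cache._file_by_id)


strong_count = count_cached_after_loop(StrongCache(), 1000)
weak_count = count_cached_after_loop(WeakCache(), 1000)
print(strong_count, weak_count)
```

In this toy model the plain dict still holds all 1000 objects after the loop, while the weak-value variant holds none. The obvious trade-off, assuming something like this were ever considered, is that a weak cache can no longer answer "does this ID already exist?" by itself, so uniqueness checks would need another source.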

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

1 reaction
kba commented, Feb 24, 2020

I noticed that as well when removing the trivial caching. On my TODO list.

0 reactions
bertsky commented, Apr 4, 2020

BTW, this is a common cause of inefficiency, and not just when growing workspaces by adding more and more files: any processor on a large workspace has to pay the penalty before it can start.
