parser.apply does not return for a long time even though the progress bar indicates that parsing has finished

Description of the bug

This is a performance issue rather than a bug. It is not noticeable when parsing a small number of documents, but with a large number, parser.apply still has not returned long after the progress bar indicates that parsing finished (an hour or more).

To Reproduce

Steps to reproduce the behavior:

  1. Parse many documents (my case: ~2500)

Expected behavior

parser.apply returns as soon as the progress bar indicates that all the documents have been parsed.
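
One way to quantify the gap described above is to record when progress last advanced and when the call actually returned. The following is a minimal sketch using only the standard library. `ReturnLagTimer` and `fake_apply` are hypothetical names introduced here for illustration; `fake_apply` is a stand-in that simulates work continuing after the last progress update, and it is not Fonduer's `parser.apply` nor a description of how Fonduer reports progress.

```python
import time
from typing import Callable, List, Optional, Sequence


class ReturnLagTimer:
    """Hypothetical helper: measures time between the last progress
    update and the moment the wrapped call returns."""

    def __init__(self) -> None:
        self.last_progress: Optional[float] = None
        self.returned: Optional[float] = None

    def on_progress(self) -> None:
        # Call this each time one unit of work is reported as done.
        self.last_progress = time.monotonic()

    def run(self, func: Callable, *args, **kwargs):
        # Run the call and note when it actually returns.
        result = func(*args, **kwargs)
        self.returned = time.monotonic()
        return result

    @property
    def lag_seconds(self) -> float:
        if self.last_progress is None or self.returned is None:
            raise RuntimeError("no progress update or return recorded yet")
        return self.returned - self.last_progress


def fake_apply(
    docs: Sequence[str],
    on_progress: Callable[[], None],
    post_delay: float = 0.05,
) -> List[str]:
    """Stand-in for a long-running call (NOT Fonduer's API).

    Progress reaches 100% inside the loop, then extra work happens
    before the function returns, which is the symptom in this issue.
    """
    parsed = []
    for doc in docs:
        parsed.append(doc.upper())  # simulated per-document parsing
        on_progress()
    time.sleep(post_delay)  # simulated work after the last progress update
    return parsed


if __name__ == "__main__":
    timer = ReturnLagTimer()
    timer.run(fake_apply, ["a", "b", "c"], timer.on_progress)
    print(f"returned {timer.lag_seconds:.3f}s after the last progress update")
```

With a real workload, a lag that grows with the number of documents would suggest that the time is being spent after the per-document work is reported as complete, which is the behavior this issue describes.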

Environment (please complete the following information)

  • OS: Debian Buster
  • PostgreSQL Version: 12.1
  • Poppler Utils Version: N/A
  • Fonduer Version: 0.8.3+dev (01e0d9319b523aff7aa7f5c583a9f330b0705ecc)

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 14 (10 by maintainers)

Top GitHub Comments

robbieculkin commented, May 12, 2021 (2 reactions)

I’ve tested master on ~200 docs and can confirm that these changes fix OOM errors and slow performance. Many thanks for the fix.

lukehsiao commented, Jan 29, 2022 (1 reaction)

Your description sounds correct to me, and this is definitely a real bottleneck several have run into. We would love to try and resolve this, but I suspect it wouldn’t be a quick fix.

I’m going to reopen this issue.
