Bug: If a TemporarilyBanned error happens right away, the output .json file gets deleted


Bug Report Summary

If you run the script with a .json output file once, it works well until it hits a TemporarilyBanned error, then quits as expected. If you run it again immediately, the output JSON file is overwritten with a 2-byte string, which effectively deletes its contents.

Repro

  1. Run the script, like python3 -m facebook_scraper --filename output.json --pages 1000 --cookies cookiefile.txt -fmt json --use-youtube-dl --source --resume-file resume_file.txt --dump ./html_dump_folder --group 123123123
  2. Wait until it gets a TemporarilyBanned error.
  3. See that the output.json file contains a valid, large json file.
  4. Immediately re-run the script as in Step 1.
  5. See that it gets the same TemporarilyBanned error almost immediately (before loading a single page successfully).
  6. See that output.json is now basically empty, containing only 2 bytes of invalid json content.

What I expect to happen

output.json should not be overwritten with garbage if a TemporarilyBanned error happens right away. Instead, the script should simply exit without changing the output.json file at all.

Notes

The script works as expected (adds onto the output.json file) when one or more pages are successfully loaded in subsequent runs (i.e., repro Step 4).
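One way to get the expected behavior is to defer touching the output file until at least one post has been collected, and then to write through a temporary file. The sketch below is only an illustration of that idea, not facebook_scraper's actual code: `write_posts_safely` is a hypothetical helper, and the `TemporarilyBanned` class defined here is a local stand-in for the scraper's error. Note that it replaces the file with the new batch rather than appending to it.

```python
import json
import os
import tempfile


class TemporarilyBanned(Exception):
    """Local stand-in for the scraper's ban error (hypothetical, for this sketch only)."""


def write_posts_safely(path, posts):
    """Write `posts` to `path` as a JSON list; leave `path` untouched if nothing was scraped.

    Returns the number of posts written.
    """
    collected = []
    try:
        for post in posts:
            collected.append(post)
    except TemporarilyBanned:
        # Keep whatever was collected before the ban and continue to the write step.
        pass

    if not collected:
        # Nothing new: never open (and thereby truncate) the existing file.
        return 0

    # Write to a temporary file in the same directory, then rename it over the target.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(collected, tmp)
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if writing or renaming failed.
        os.unlink(tmp_path)
        raise
    return len(collected)
```

With this shape, a run that is banned before the first page loads returns 0 and the previous output.json keeps its contents.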

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 5

Top GitHub Comments

neon-ninja commented, Aug 17, 2022 (1 reaction)

The resume file just contains a URL to start scraping from. There’s no dependency between the resume file and the output file - a user could input their desired URL to start from, and start collecting posts, without having any prior output file.

DeflateAwning commented, Aug 16, 2022 (0 reactions)

Would be nice if it handled that concatenation itself, considering it accepts a --resume-file argument (which appears to be wholly misleading).
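Until the tool does this itself, the concatenation can be done by hand after the fact. The following is a minimal sketch, assuming each run was written to its own file and each file holds a JSON list of post dictionaries carrying a `post_id` key; `merge_json_outputs` is a hypothetical helper, not part of facebook_scraper. Files that are missing or contain invalid JSON (such as the 2-byte output described in the report) are skipped.

```python
import json


def merge_json_outputs(paths, key="post_id"):
    """Combine several JSON list files into one list, dropping repeated posts.

    Assumes each file holds a JSON list of dicts; `key` is the assumed
    identifier field used to detect duplicates.
    """
    merged = []
    seen = set()
    for path in paths:
        try:
            with open(path, encoding="utf-8") as fh:
                posts = json.load(fh)
        except (OSError, json.JSONDecodeError):
            # Unreadable or truncated file: skip it rather than fail the merge.
            continue
        if not isinstance(posts, list):
            continue
        for post in posts:
            ident = post.get(key) if isinstance(post, dict) else None
            if ident is not None:
                if ident in seen:
                    continue
                seen.add(ident)
            merged.append(post)
    return merged
```

The merged list can then be written out with `json.dump` to a new file, leaving the per-run files as they were.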
