Bug: If a TemporarilyBanned error happens right away, the output .json file gets deleted
Summary
If you run the script with a .json output file once, it works well until it hits a TemporarilyBanned error, and then it quits as expected. If you run it again immediately, the output JSON file is effectively deleted: it gets overwritten with a 2-byte string.
Repro
1. Run the script, for example:
   python3 -m facebook_scraper --filename output.json --pages 1000 --cookies cookiefile.txt -fmt json --use-youtube-dl --source --resume-file resume_file.txt --dump ./html_dump_folder --group 123123123
2. Wait until it gets a TemporarilyBanned error.
3. See that the output.json file contains a valid, large JSON file.
4. Immediately re-run the script as in Step 1.
5. See that it gets the same TemporarilyBanned error almost immediately (before loading a single page successfully).
6. See that output.json is now basically empty, containing only 2 bytes of invalid JSON content.
What I expect to happen
output.json should not be overwritten with garbage if a TemporarilyBanned error happens right away. Instead, the script should simply exit without changing the output.json file at all.
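One way to get the expected behavior is to avoid touching the output file until at least one post has been collected, and then to swap the new file in atomically. The sketch below is only an illustration of that idea, not the project's actual code: `TemporarilyBanned` is redefined here as a hypothetical stand-in for the scraper's exception, and `save_posts` and its `posts` iterable are made-up names.

```python
import json
import os
import tempfile


class TemporarilyBanned(Exception):
    """Hypothetical stand-in for the scraper's ban error."""


def save_posts(posts, filename):
    """Write posts to filename only if at least one post was collected.

    Returns the number of posts written; 0 means the file was left untouched.
    """
    collected = []
    try:
        for post in posts:
            collected.append(post)
    except TemporarilyBanned:
        # Stop scraping, but keep whatever was collected before the ban.
        pass

    if not collected:
        # Banned before the first post: do not open or modify the output file.
        return 0

    # Write to a temporary file in the same directory, then replace the
    # target in one step so a crash cannot leave a half-written file behind.
    directory = os.path.dirname(os.path.abspath(filename))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(collected, tmp)
        os.replace(tmp_path, filename)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return len(collected)
```

Note that this sketch replaces the file contents with the newly collected posts; it does not try to reproduce the script's behavior of adding onto an existing output.json.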
Notes
The script works as expected (adds onto the output.json file) when one or more pages are successfully loaded in subsequent runs (i.e., after repro Step 4).
Issue Analytics
- State:
- Created a year ago
- Comments:5
Top GitHub Comments
The resume file just contains a URL to start scraping from. There’s no dependency between the resume file and the output file - a user could input their desired URL to start from, and start collecting posts, without having any prior output file.
It would be nice if it handled that concatenation itself, considering it accepts a --resume-file argument (which appears to be wholly misleading).
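Until the tool combines runs itself, the concatenation can be done by hand on the parsed JSON. The following is a rough sketch under stated assumptions: each run's output is a JSON list of post dictionaries, and each post carries a unique `post_id` key. The function name `merge_posts` is hypothetical and not part of the scraper.

```python
def merge_posts(old_posts, new_posts, key="post_id"):
    """Combine two lists of post dicts, keeping one entry per key value.

    Posts keep the position of their first appearance; when the same key
    appears again, the later copy replaces the earlier one.
    """
    merged = {}
    for post in list(old_posts) + list(new_posts):
        merged[post[key]] = post
    return list(merged.values())
```

For example, one could load a saved copy of an earlier output.json with `json.load`, pass it in as `old_posts` together with the posts from the latest run, and write the result back out.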