Unable to collect posts beyond a certain number due to Temporary Block
Hi,
First of all, this is a really great tool, so thank you very much for your work!
I want to scrape some private groups. However, every time I try, I get the message `You are Temporarily Blocked` after scraping anywhere from 100 to 9000 posts, even though the group I’m trying to scrape has far more posts. I have tested alt accounts too. Is there any way to keep Facebook from blocking me so quickly? Or is there a way to continue from where I left off after being blocked?
Furthermore, I’m using `"allow_extra_requests": True` since I want to download all photos at max quality. Could you add `get_photos` for groups to speed up scraping, or is there any other way I could get the link to the first photo (at max quality) of every post faster, without using `allow_extra_requests`, which is slow?
Issue Analytics
- State:
- Created 2 years ago
- Reactions: 1
- Comments: 18 (2 by maintainers)
Top GitHub Comments
Increasing `posts_per_page` might help, as then you’d make fewer requests. Adding some `time.sleep` lines might help reduce the rate at which you’re making requests. Yes, you can continue from a pagination URL, by passing the URL as the `start_url` argument to `get_posts`. These pagination URLs can be seen in the logs if you have debug logging enabled, or you can pass a callback function as `request_url_callback` to `get_posts` to handle extracting these pagination URLs. Here’s some sample code:

Note: https://github.com/kevinzg/facebook-scraper/commit/f3c8948ae04414932899686c89e696306f37ce1f simplifies this code a bit by making it possible to pass a `start_url` of `None`.

AFAIK, Facebook only provides the high-resolution image URL if you click on the photo, which involves an extra request for each photo.
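For the resume-from-a-pagination-URL idea described above, here is a minimal sketch rather than the maintainer's original sample. It assumes only what the comment states: the fetch function accepts `start_url` and `request_url_callback`, and a `start_url` of `None` is allowed (per the linked commit). The helper name `scrape_with_resume`, its parameters, the `is_block` predicate, and deduplication on a `post_id` key are all hypothetical choices made for illustration.

```python
import time
from typing import Any, Callable, Iterable, Iterator, Optional


def scrape_with_resume(
    fetch_posts: Callable[..., Iterable[Any]],
    is_block: Callable[[Exception], bool],
    start_url: Optional[str] = None,
    pause_between_posts: float = 0.0,
    cooldown: float = 600.0,
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Any]:
    """Yield posts, resuming from the last seen pagination URL after a block.

    fetch_posts is assumed to accept `start_url` and `request_url_callback`
    keyword arguments, as described for get_posts in the comment above.
    """
    last_url = start_url
    seen_ids = set()  # the resumed page is fetched again, so skip repeats

    def remember_url(url: str) -> None:
        # Called by the fetcher for each pagination URL it requests.
        nonlocal last_url
        last_url = url

    retries = 0
    while True:
        try:
            for post in fetch_posts(
                start_url=last_url, request_url_callback=remember_url
            ):
                post_id = post["post_id"]  # assumption: posts are dicts with this key
                if post_id in seen_ids:
                    continue
                seen_ids.add(post_id)
                yield post
                if pause_between_posts > 0:
                    # Slows consumption, which delays the next page request.
                    sleep(pause_between_posts)
            return  # fetcher finished without being blocked
        except Exception as exc:
            # Re-raise anything that is not a block, or once retries run out.
            if not is_block(exc) or retries >= max_retries:
                raise
            retries += 1
            sleep(cooldown)  # wait out the block, then resume from last_url
```

With facebook-scraper itself, `fetch_posts` could plausibly be `functools.partial(get_posts, group="<group id>", options={"posts_per_page": 200})`, and `is_block` something like `lambda e: isinstance(e, exceptions.TemporarilyBanned)`. That exception name is recalled from the library rather than taken from this thread, so check it against your installed version. The cooldown and retry values are placeholders, not recommendations.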
Probably nothing to worry about, so long as that image URL extraction worked.