
Unable to collect posts beyond a certain number due to Temporary Block


Hi,

First of all, this is a really great tool, so thank you very much for your work! I want to scrape some private groups. However, every time I try, I get the message You are Temporarily Blocked after scraping anywhere from 100 to 9,000 posts, even though the group I’m trying to scrape has far more posts than that. I have tested alt accounts too. Is there any way to keep Facebook from blocking me so quickly? Or is there a way to continue from where I left off after being blocked?

Furthermore, I’m using "allow_extra_requests": True since I want to download all photos at max quality. Could you add get_photos for groups to speed up scraping, or is there any other way to get the link of the first photo (at max quality) of every post without using allow_extra_requests, which is slow?

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Reactions: 1
  • Comments: 18 (2 by maintainers)

Top GitHub Comments

4 reactions
neon-ninja commented, Jun 13, 2021

Increasing posts_per_page might help, since you’d make fewer requests for the same number of posts. Adding some time.sleep calls might also help by lowering the rate at which you make requests. Yes, you can continue from a pagination URL by passing it as the start_url argument to get_posts. These pagination URLs appear in the logs if you have debug logging enabled, or you can pass a callback function as request_url_callback to get_posts to capture them as they are requested. Here’s some sample code:

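import time

from facebook_scraper import exceptions, get_posts, set_cookies

results = []
start_url = None  # most recent pagination URL; None starts from the first page


def handle_pagination_url(url):
    # Passed to get_posts as request_url_callback; remembers the latest pagination URL.
    global start_url
    start_url = url


set_cookies("cookies.txt")
while True:
    try:
        # After a temporary ban, this resumes from the last pagination URL seen.
        for post in get_posts("Nintendo", page_limit=None, start_url=start_url, request_url_callback=handle_pagination_url):
            results.append(post)
            print(len(results))
        print("All done")
        break
    except exceptions.TemporarilyBanned:
        print("Temporarily banned, sleeping for 10m")
        time.sleep(600)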

Note: https://github.com/kevinzg/facebook-scraper/commit/f3c8948ae04414932899686c89e696306f37ce1f simplifies this code a bit by making it possible to pass a start_url of None.
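
The sample above keeps the pagination URL only in memory, so it is lost if the script itself exits. Below is a rough sketch of two small helpers that might complement it: one persists the latest pagination URL to disk, the other spaces out iteration with short random pauses, along the lines of the time.sleep suggestion. The names save_checkpoint, load_checkpoint, throttled and the checkpoint file name are made up for illustration and are not part of facebook_scraper. The get_posts call appears only in a comment and relies on start_url=None being accepted, as the note above describes.

```python
import json
import random
import time
from pathlib import Path
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")

# Hypothetical file name, chosen for this sketch only.
CHECKPOINT = Path("pagination_checkpoint.json")


def save_checkpoint(url: str, path: Path = CHECKPOINT) -> None:
    """Write the most recent pagination URL to disk."""
    path.write_text(json.dumps({"start_url": url}))


def load_checkpoint(path: Path = CHECKPOINT) -> Optional[str]:
    """Return the saved pagination URL, or None if no checkpoint file exists."""
    if not path.exists():
        return None
    return json.loads(path.read_text()).get("start_url")


def throttled(
    items: Iterable[T],
    min_delay: float = 1.0,
    max_delay: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[T]:
    """Yield items one at a time, pausing a random interval after each one."""
    for item in items:
        yield item
        sleep(random.uniform(min_delay, max_delay))


# Possible use with the sample above (not executed here):
#
#   posts = get_posts("Nintendo", page_limit=None,
#                     start_url=load_checkpoint(),
#                     request_url_callback=save_checkpoint)
#   for post in throttled(posts):
#       results.append(post)
```

Whether pauses of this size are enough to avoid a temporary block is not something this thread establishes, so treat the delay values as placeholders to tune.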

AFAIK, Facebook only provides the high-resolution image URL when you click on the photo, which means one extra request per photo.
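
For the "first photo of every post" part of the question, here is a small sketch of picking one URL out of a post once it has been scraped. It does not avoid the extra request described above; it only selects from whatever the scraper returned. The key names image, images, image_lowquality and images_lowquality are assumptions about the post dict layout, and first_image_url is a hypothetical helper, not a library function.

```python
from typing import Any, Mapping, Optional


def first_image_url(post: Mapping[str, Any]) -> Optional[str]:
    """Return the first available image URL of a post, high quality first.

    The key names below are assumptions about the post dict layout.
    """
    candidates = [post.get("image")]
    candidates.extend(post.get("images") or [])
    candidates.append(post.get("image_lowquality"))
    candidates.extend(post.get("images_lowquality") or [])
    # Return the first non-empty candidate, or None if the post has no images.
    return next((url for url in candidates if url), None)
```

Falling back to the low-quality keys means a post still yields a usable link when the high-quality lookup was skipped or failed.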

2 reactions
neon-ninja commented, Jun 2, 2021

Probably nothing to worry about, so long as that image URL extraction worked
