get_posts returns only 20 posts when logged in

I am trying to scrape group=120778514747417, which is a public group.

print(len(list(get_posts(group=120778514747417))))

If I don’t run set_cookies first, the scraper works fine, but I get banned pretty quickly, probably because I am not logged in. If I run set_cookies first with my account, which is a member of this group, the scraper returns only 20 posts and stops there. The logger says the following at the end: No raw posts (<article> elements) were found in this page. Page parser did not find next page URL 20

How can I scrape more posts while logged in? If that is not possible, what is the recommended throttling for scraping without cookies? Thanks, and great work btw!
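For the throttling part of the question, here is a minimal sketch of pausing between items pulled from a lazy iterator. It is not taken from the library or from this thread: the throttled helper, the fake_posts generator, and the 2.0 second delay are all hypothetical placeholders, and no particular delay is known to avoid a ban.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def throttled(
    items: Iterable[T],
    delay_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[T]:
    """Yield items one by one, pausing before the next item is requested."""
    for item in items:
        yield item
        # This line runs when the consumer asks for the next item, so the
        # pause happens before the underlying iterable does any more work.
        sleep(delay_seconds)


def fake_posts(count: int) -> Iterator[dict]:
    """Hypothetical stand-in for a lazy post iterator such as get_posts(...)."""
    for i in range(count):
        yield {"post_id": i}


# Record the pauses instead of actually sleeping, so the demo runs instantly.
pauses = []
collected = list(throttled(fake_posts(3), delay_seconds=2.0, sleep=pauses.append))
print(len(collected), pauses)
```

In real use the sleep argument would be left at its default and fake_posts(3) replaced by the actual post iterator. Whether a per-item pause is enough depends on how many requests the iterator makes per item, which this sketch does not model.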

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Comments: 14

Top GitHub Comments

2 reactions
neon-ninja commented, Oct 28, 2021

Try loading https://m.facebook.com/groups/120778514747417/ in your browser and scrolling down. Posts don’t load there either. This looks like a bug in Facebook.

1 reaction
neon-ninja commented, Nov 21, 2021

Oh, I’ve just realised - this kind of group pagination URL is already handled by this regex: href[=:]"(\/groups\/[^"]+bac=[^"]+)". All group pagination should be handled by the GroupPageParser class. The fact that you’ve put it in PageParser suggests you’re not calling get_posts correctly, @chribell. For a group, call get_posts like this: get_posts(group="group_name")
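To see what the regex quoted above captures, here is a small standalone sketch using Python's re module. Only the pattern comes from the comment. The find_next_page helper and the sample markup are hypothetical, and real Facebook HTML will look different.

```python
import re

# Pattern quoted in the comment above: a group link that carries a bac= cursor.
GROUP_NEXT_PAGE = re.compile(r'href[=:]"(\/groups\/[^"]+bac=[^"]+)"')

# Hypothetical markup for illustration only.
sample_with_cursor = (
    '<a href="/groups/120778514747417?bac=EXAMPLECURSOR&amp;multi_permalinks">'
    "See more posts</a>"
)
sample_without_cursor = '<a href="/groups/120778514747417">Group home</a>'


def find_next_page(html: str):
    """Return the captured pagination path, or None when no bac= link exists."""
    match = GROUP_NEXT_PAGE.search(html)
    return match.group(1) if match else None


next_url = find_next_page(sample_with_cursor)
no_url = find_next_page(sample_without_cursor)
print(next_url)
print(no_url)
```

The second sample has a group link but no bac= parameter, so the pattern does not match. That is the kind of page on which a parser relying on this pattern would find no next page URL.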
