get_posts returns only 20 posts when logged in
I am trying to scrape group=120778514747417, which is a public group.
print(len(list(get_posts(group=120778514747417))))
If I don’t run set_cookies first, the scraper works fine but I get banned pretty quickly, probably because I am not logged in.
If I run set_cookies first with my account, which is a member of this group, the scraper only returns 20 posts and stops there.
Logger says the following at the end:
No raw posts (<article> elements) were found in this page.
Page parser did not find next page URL
20
How can I scrape more posts while logged in? If that is not possible, what is the recommended throttling for scraping without cookies? Thanks and great work btw!
Top GitHub Comments
Try loading https://m.facebook.com/groups/120778514747417/ in your browser and scroll down. Posts don’t load there either. This looks like a bug in Facebook.
Oh, I’ve just realised - this kind of group pagination URL is already handled by this regex:
href[=:]"(\/groups\/[^"]+bac=[^"]+)"
All group pagination should be handled by the GroupPageParser class. Given that you’ve put it in PageParser, this suggests you’re not calling get_posts correctly, @chribell. For a group, call get_posts like this: get_posts(group="group_name")
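To see what that regex picks up, here is a minimal sketch using only Python's standard re module. The pattern is copied verbatim from the comment above. The helper name find_next_group_page and both HTML snippets are made up for illustration and are not taken from the library or from a real Facebook page.

```python
import re
from typing import Optional

# Pattern quoted in the comment above, copied verbatim.
GROUP_NEXT_PAGE = re.compile(r'href[=:]"(\/groups\/[^"]+bac=[^"]+)"')


def find_next_group_page(html: str) -> Optional[str]:
    """Hypothetical helper: return the first group pagination path, or None."""
    match = GROUP_NEXT_PAGE.search(html)
    return match.group(1) if match else None


# Made-up markup for illustration only; real pages will look different.
with_cursor = (
    '<a href="/groups/120778514747417?bac=EXAMPLE_CURSOR&amp;multi_permalinks">'
    "See more posts</a>"
)
without_cursor = '<a href="/groups/120778514747417/about">About</a>'

print(find_next_group_page(with_cursor))
print(find_next_group_page(without_cursor))
```

The second call returns None because that link has no bac= parameter, which is roughly the situation the "Page parser did not find next page URL" log line describes: no pagination link on the page, so nothing for the scraper to follow.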