Understanding get_posts() parameters pages and posts_per_page?


I don’t know whether I’m misunderstanding facebook-scraper, misreading my own code, or seeing a bug. When I ran the code fragment:

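from facebook_scraper import get_posts

# usernames, frequent_poster and cookie_file are defined elsewhere.
for username in usernames:
    # Request more posts for the frequent poster.
    total_posts = 12
    if username == frequent_poster:
        total_posts = 24

    posts_per_page = 4

    # Ceiling division: pages needed to cover total_posts.
    pages = (total_posts + posts_per_page - 1) // posts_per_page

    print()
    print('total_posts = %d, posts_per_page = %d, pages = %d' % (total_posts, posts_per_page, pages))

    posts = get_posts(username, cookies=cookie_file, pages=pages, extra_info=True,
                      options={'posts_per_page': posts_per_page, 'allow_extra_requests': False, 'HQ_images': False})

    # posts is consumed by iterating over it, so count it in one pass.
    print('Actual number of posts = %d' % sum(1 for _ in posts))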

I got the output:

total_posts = 12, posts_per_page = 4, pages = 3
Actual number of posts = 9

total_posts = 12, posts_per_page = 4, pages = 3
Actual number of posts = 9

total_posts = 12, posts_per_page = 4, pages = 3
Actual number of posts = 30

...

total_posts = 12, posts_per_page = 4, pages = 3
Actual number of posts = 30

total_posts = 24, posts_per_page = 4, pages = 6
Actual number of posts = 63

total_posts = 12, posts_per_page = 4, pages = 3
Actual number of posts = 30

total_posts = 12, posts_per_page = 4, pages = 3
Actual number of posts = 9

total_posts = 12, posts_per_page = 4, pages = 3
Actual number of posts = 30

...

total_posts = 12, posts_per_page = 4, pages = 3
Actual number of posts = 30

total_posts = 12, posts_per_page = 4, pages = 3
Actual number of posts = 9

total_posts = 12, posts_per_page = 4, pages = 3
Actual number of posts = 30

...

total_posts = 12, posts_per_page = 4, pages = 3
Actual number of posts = 30

total_posts = 12, posts_per_page = 4, pages = 3
Actual number of posts = 6

total_posts = 12, posts_per_page = 4, pages = 3
Actual number of posts = 31

total_posts = 12, posts_per_page = 4, pages = 3
Actual number of posts = 30

I’m trying to understand how post extraction and its parameters work, and I expected the actual number of posts to be less than or equal to total_posts in each case. My goal is to make as few requests as possible.

What am I misunderstanding?

As always, I appreciate your help.
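
One possible reading of the output above is that pages bounds the number of page requests rather than the number of posts, and that a page can contain more or fewer posts than posts_per_page asks for. That reading is an assumption; nothing in this thread confirms it. Under that assumption, a minimal sketch of capping the count on the consumer side with itertools.islice follows. fake_get_posts is a hypothetical stand-in written for this example, not part of facebook-scraper.

```python
from itertools import islice
from typing import Iterator, List


def fake_get_posts(page_sizes: List[int]) -> Iterator[dict]:
    """Hypothetical stand-in for get_posts(): yields posts lazily, page by page.

    Each entry in page_sizes is how many posts that page happens to contain,
    which need not match the posts_per_page value that was requested.
    """
    post_id = 0
    for size in page_sizes:
        for _ in range(size):
            post_id += 1
            yield {"post_id": post_id}


total_posts = 12
posts_per_page = 4
# Ceiling division: pages needed to cover total_posts (3 here).
pages = (total_posts + posts_per_page - 1) // posts_per_page

# Assumed scenario: every page returns 10 posts instead of the requested 4.
posts = fake_get_posts([10] * pages)

# islice stops pulling from the generator after total_posts items.
capped = list(islice(posts, total_posts))
print("Actual number of posts = %d" % len(capped))
```

With the real library the same pattern would be islice(get_posts(...), total_posts). Whether that also reduces the number of requests depends on get_posts fetching pages lazily, which is assumed here rather than verified.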

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 32

Top GitHub Comments

1 reaction
curiousier-george commented, Mar 31, 2022

Hmm, after commenting out

# set_user_agent("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)")

I get 100 and everything seems to work. 😊

Thanks.

0 reactions
neon-ninja commented, Apr 1, 2022

The warning is safe to ignore.
