Reactors not populating in get_post
Hi. I’m trying a simple request on a public page. I only need the reactors’ names (or any other identifying info) for a post, plus the same info (user names and so on) for all comments on the post. If you can suggest a complete request that does this, even better.
I’m using current master
from pprint import pprint
from facebook_scraper import get_posts

post = next(get_posts(post_urls=["160806872879026"], options={"reactors": True}))
pprint(post)
Fetching 3000 reactors
Found 0 reactors
'reaction_count': None, 'reactions': None, 'reactors': None,
The other info seems OK.
Note: I receive a locale warning: UserWarning: Locale detected as it_IT - for best results, set to en_US
In a browser I noticed that the reactors div is referred to as “reaction_profile_browser1”,
while in the extractors I see “reaction_profile_browser”:
elems = list(response.html.find("div[id^='reaction_profile_browser']>div"))
Anyway, changing it doesn’t work.
Thanks.
Issue Analytics
- Created 2 years ago
- Comments: 6
Top GitHub Comments
You can filter out just the keys you want, here’s some sample code:
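As a minimal sketch of that kind of filtering, assuming the post returned by get_posts is a plain dict: the helper name pick_keys and the sample values below are made up for illustration and are not part of facebook_scraper.

```python
def pick_keys(post, wanted):
    """Return a new dict holding only the wanted keys (missing keys become None)."""
    return {key: post.get(key) for key in wanted}


# Made-up sample data standing in for a scraped post.
sample_post = {
    "post_id": "160806872879026",
    "text": "example text",
    "reaction_count": None,
    "reactors": None,
}

slim = pick_keys(sample_post, ["post_id", "reactors"])
print(slim)
```

Using post.get rather than post[key] means a key that the scraper did not populate simply comes back as None instead of raising KeyError.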
Fetching this post takes ~1.81 seconds. Setting "allow_extra_requests": False reduces that by 0.5815s, as it means the scraper doesn’t make a request to https://m.facebook.com/WeedpayOfficial/photos/a.108611961431851/160806839545696/?type=3&source=57&refid=52&__tn__=EH-R.
In terms of where the time is spent: time to get the post (0.6976s) + time to get reactions (0.3653s) = total (1.0629s). Processing only takes ~137ms. This is negligible compared to the time taken to make the requests.
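One rough way to reproduce this kind of measurement yourself is sketched below. The helper names timed and fetch_post are hypothetical; only get_posts and the "reactors" and "allow_extra_requests" options come from this thread. It measures wall-clock time for the whole fetch, not the per-request breakdown quoted above.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def fetch_post(post_id, allow_extra_requests):
    """Fetch one post; hypothetical wrapper around get_posts."""
    # Imported lazily so the timing helper works without the library installed.
    from facebook_scraper import get_posts

    options = {"reactors": True, "allow_extra_requests": allow_extra_requests}
    return next(get_posts(post_urls=[post_id], options=options))


# Usage (makes real network requests, so it is left commented out):
# post, seconds = timed(fetch_post, "160806872879026", False)
# print(f"fetched in {seconds:.4f}s")
```

Running it once with allow_extra_requests set to True and once with False should show roughly the difference described, though network latency will vary between runs.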
Yes, you can pass a dict to set_cookies.
You will need to pass cookies as per the readme for this.
You can pass "comments": True to extract comments.
^= means “starts with”, so reaction_profile_browser1 will match, as it starts with reaction_profile_browser.
The code:
outputs:
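Putting the suggestions in this thread together, a request for reactor and commenter names might look roughly like the sketch below. It is not the code from the comment above. The helper names fetch_post and collect_names are hypothetical, and the result keys "name", "comments_full" and "commenter_name" are assumptions about the shape of the scraper's output, so check them against a real post before relying on them.

```python
def fetch_post(post_id, cookies):
    """Fetch one post with reactors and comments; hypothetical wrapper."""
    # Imported lazily so collect_names can be used without the library installed.
    from facebook_scraper import get_posts, set_cookies

    set_cookies(cookies)  # a dict of cookies, as suggested above
    options = {"reactors": True, "comments": True}
    return next(get_posts(post_urls=[post_id], options=options))


def collect_names(post):
    """Pull reactor and commenter names out of a post dict.

    Key names here are assumptions; None values are treated as empty lists.
    """
    reactors = [r.get("name") for r in (post.get("reactors") or [])]
    commenters = [c.get("commenter_name") for c in (post.get("comments_full") or [])]
    return {"reactors": reactors, "commenters": commenters}


# Usage (needs real cookies and makes network requests, so it is commented out):
# post = fetch_post("160806872879026", my_cookie_dict)
# print(collect_names(post))
```

The `or []` guards matter because, as in the original report, "reactors" can come back as None when no cookies are supplied.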