Issue received while scraping with comments
post urls = [10158609033378741, 10159694338583054, 1200387097127839, 3176654032562645, 204010715100946]
Running the code below gives me this error:
NameError: name 'time' is not defined
from facebook_scraper import *

set_cookies("cookies.txt")
results = []
start_url = None
post_result = []


def handle_pagination_url(url):
    global start_url
    start_url = url


while True:
    try:
        post = next(
            get_posts(
                post_urls=[10158609033378741],
                options={
                    "comments": "generator",
                    "comment_start_url": start_url,
                    "comment_request_url_callback": handle_pagination_url,
                },
            )
        )
        comments = list(post["comments_full"])
        for comment in comments:
            comment["replies"] = list(comment["replies"])
            replies_list = []
            if comment["replies"]:
                for replies in comment["replies"]:
                    replies_list.append(replies)
                comment.update({"replies": replies_list})
            results.append(comment)
        print("All done")
        post.update({"comments_full": results})
        post_result.append(post)
        break
    except exceptions.TemporarilyBanned:
        print("Temporarily banned, sleeping for 10m")
        time.sleep(600)
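
The error message is consistent with `time.sleep(600)` being reached without `import time` at the top of the script, which also suggests the scraper was temporarily banned at that point. Below is a minimal sketch of the same retry-and-sleep pattern with the import in place. `retry_on_ban`, `max_attempts` and the local `TemporarilyBanned` class are hypothetical stand-ins so the snippet runs without facebook_scraper; in the real script you would catch `exceptions.TemporarilyBanned` instead.

```python
import time


class TemporarilyBanned(Exception):
    """Hypothetical stand-in for facebook_scraper's exceptions.TemporarilyBanned."""


def retry_on_ban(fetch, wait_seconds=600, max_attempts=5, sleep=time.sleep):
    """Call fetch() until it succeeds, sleeping after each temporary ban.

    Re-raises the ban exception once max_attempts has been used up.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except TemporarilyBanned:
            if attempt == max_attempts:
                raise
            print(f"Temporarily banned, sleeping for {wait_seconds}s")
            sleep(wait_seconds)
```

Passing `sleep` as a parameter is only there so the wait can be replaced in a quick test; the default is the real `time.sleep`.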
Issue Analytics
- State:
- Created 2 years ago
- Comments: 23 (13 by maintainers)
Top GitHub Comments
@fashandatafields: this code example for private groups uses very cautious sleep timers (and, unfortunately, pretty shoddy Python), but you can try it this way and adjust the timers as necessary. I have yet to be banned with it. It appends to pandas DataFrames as it goes, so even if you are temporarily banned you should still be able to write out the comments parsed so far. It writes to CSV at the end, but you could change the pandas "to_csv" call to "to_json".
Lots of ways to do this.
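
The code from this comment is not reproduced on this page, so here is one possible sketch of the "save as you go" idea. It is not the commenter's pandas code: it swaps in the standard `csv` module and appends each batch to disk immediately, so rows already written survive a temporary ban. The `append_rows` helper and the column names in `FIELDS` are assumptions for illustration.

```python
import csv
import os

# Hypothetical column names; adjust to the keys your comment dicts contain.
FIELDS = ["post_id", "comment_id", "comment_text"]


def append_rows(path, rows, fields=FIELDS):
    """Append rows (dicts) to a CSV file, writing the header only once."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as fh:
        # extrasaction="ignore" drops keys that are not listed in fields.
        writer = csv.DictWriter(fh, fieldnames=fields, extrasaction="ignore")
        if is_new:
            writer.writeheader()
        writer.writerows(rows)
```

Calling `append_rows("comments.csv", batch)` after each post keeps the file up to date without holding everything in memory.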
@milesgratz can we put it all in one JSON file, i.e. posts, comments and replies together? The comments and replies need to be nested in the proper chain.
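
One way this could be done is sketched below, assuming each post dict carries its comments under "comments_full" and each comment carries "replies", as in the code from the issue. The helper names `materialize` and `posts_to_json` are hypothetical. Generators are turned into plain lists so that replies sit inside their comment and comments sit inside their post.

```python
import json


def materialize(post):
    """Return a copy of post with comments and replies as nested lists."""
    comments = []
    for comment in post.get("comments_full") or []:
        comment = dict(comment)
        # Replies may be a generator or a list; both become a list of dicts.
        comment["replies"] = [dict(r) for r in comment.get("replies") or []]
        comments.append(comment)
    out = dict(post)
    out["comments_full"] = comments
    return out


def posts_to_json(posts):
    """Serialize posts, with nested comments and replies, to one JSON string."""
    # default=str converts values json cannot encode (e.g. datetime) to strings.
    return json.dumps(
        [materialize(p) for p in posts],
        default=str,
        ensure_ascii=False,
        indent=2,
    )
```

The resulting string can be written to a single file with an ordinary `open(..., "w", encoding="utf-8")`.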