HTTP Error, Gives 404 but the URL is working

See original GitHub issue

Hi, I had a script running over the past weeks and earlier today it stopped working. I keep receiving HTTPError 404, but the link provided in the error still brings me to a valid page. The code is below (all variables mentioned are defined, and debugging shows the error happens specifically in the Manager):

tweetCriteria = got.manager.TweetCriteria().setQuerySearch(term)\
    .setMaxTweets(max_count)\
    .setSince(begin_timeframe)\
    .setUntil(end_timeframe)
scraped_tweets = got.manager.TweetManager.getTweets(tweetCriteria)

The error message is the standard 404 error, “An error occured during an HTTP request: HTTP Error 404: Not Found Try to open in browser:”, followed by the valid link.

Since I have not changed anything in the folder, I suspect something may have happened to my configuration, but I am also wondering whether others are experiencing this.

Issue Analytics

  • State: open
  • Created 3 years ago
  • Reactions: 74
  • Comments: 144

Top GitHub Comments

14 reactions
herdemo commented, Sep 19, 2020

Maybe it has something to do with this: https://blog.twitter.com/developer/en_us/topics/tips/2020/understanding-the-new-tweet-payload.html

Unfortunately, the Twitter API does not fully meet our needs, because we need full-history search without any limitations. With the Twitter API you can search only 5000 tweets per month.
I hope GetOldTweets starts working again as soon as possible; otherwise I cannot complete my master's thesis.

10 reactions
HuifangYeo commented, Sep 25, 2020

Any alternative solution for it? My master's thesis is on hold because of it. I tried snscrape as mentioned in the above comment, but it does not return results based on a search query string.

I used the search query below, and it returns the links of the tweets.

snscrape twitter-search "#XRP since:2019-12-31 until:2020-09-25" > XRP_Sept_tweets.txt
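
The step from the saved links to tweet IDs could look roughly like the sketch below. It assumes that each line of the output file is a tweet URL whose path contains `/status/<numeric id>`; the helper name `extract_tweet_ids` is made up for illustration and is not part of snscrape or tweepy.

```python
import re

# Assumption: every tweet link contains "/status/<numeric id>".
STATUS_ID = re.compile(r"/status/(\d+)")


def extract_tweet_ids(lines):
    """Return the numeric tweet IDs found in an iterable of URL lines."""
    ids = []
    for line in lines:
        match = STATUS_ID.search(line)
        if match:  # silently skip lines that are not tweet links
            ids.append(int(match.group(1)))
    return ids
```

Passing an open file handle for XRP_Sept_tweets.txt (for example `extract_tweet_ids(open("XRP_Sept_tweets.txt"))`) would then yield the list of IDs, since a file object iterates over its lines.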

I obtained the tweet IDs and then used tweepy to fetch the tweets, since I needed more attributes (this may not be the best way to do it):

import datetime
import os
import time

import pandas as pd

# Assumes `api` is an authenticated tweepy.API instance created elsewhere.


def get_tweets(tweet_ids, currency):
    # Fetch the full tweet objects for the given IDs.
    # (Tweepy 4.x renamed statuses_lookup to lookup_statuses.)
    statuses = api.statuses_lookup(tweet_ids, tweet_mode="extended")
    rows = []
    for status in statuses:
        # Keep English tweets only.
        if status.lang == "en":
            rows.append({
                "tweet_id": status.id,
                "name": status.user.name,
                "screen_name": status.user.screen_name,
                "retweet_count": status.retweet_count,
                "text": status.full_text,
                "mined_at": datetime.datetime.now(),
                "created_at": status.created_at,
                "favourite_count": status.favorite_count,
                "hashtags": status.entities["hashtags"],
                "status_count": status.user.statuses_count,
                "followers_count": status.user.followers_count,
                "location": status.place,
                "source_device": status.source,
                "coin_symbol": currency,
            })

    # DataFrame.append was removed in pandas 2.0, so build the frame from the rows.
    data = pd.DataFrame(rows)
    print(currency, "outputting to tweets", len(data))
    out_path = "Extracted_TWEETS.csv"
    if rows:
        # Append to the CSV; write the header only if the file does not exist yet.
        data.to_csv(out_path, mode="a", header=not os.path.exists(out_path), index=False)
    print("..... going to sleep 20s")
    time.sleep(20)

Note that tweet_ids is a list of 100 tweet ids.
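
Because each call takes a list of 100 IDs, a longer ID list has to be split first. A minimal sketch of that splitting, assuming the IDs are already in a Python list (the helper name `batches` is hypothetical):

```python
def batches(items, size=100):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Each batch could then be passed on in a loop, for example `for batch in batches(all_ids): get_tweets(batch, "XRP")`, where `all_ids` and the "XRP" label are placeholders.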
