`twarc2 counts --archive` returning incomplete results

I’m running twarc2 counts with the Academic Track of the Twitter API, and for some reason it’s returning incomplete results (only going back a few years) with the following query:

!twarc2 counts "(\"this is how\" \"not with a\" but) OR (\"not with a bang\") OR (\"not with a\" \"but a whimper\") OR (\"not with a\" \"but with a whimper\") OR (\"this is the way\" \"not with a\" but)" --archive --granularity day --csv > tweets.csv

(For reference I’m doing a project about T.S. Eliot’s “not with a bang but a whimper.”)

When I check twarc.log, I can see that an empty page is returned, which is interpreted as the end of the search:

2022-02-28 15:59:21,700 INFO Retrieved an empty page of results
2022-02-28 15:59:21,700 INFO No more results for search.
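
One way to confirm this pattern in a longer twarc.log is to scan for an empty-page message immediately followed by the end-of-search message. The sketch below is only an illustration: it assumes the log line layout matches the two lines quoted above, and `find_early_stops` is a hypothetical helper name, not part of twarc.

```python
import re

# Assumed layout, taken from the two log lines quoted above:
# "YYYY-MM-DD HH:MM:SS,mmm LEVEL message"
LOG_LINE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+ (\w+) (.*)$")

EMPTY_PAGE = "Retrieved an empty page of results"
END_OF_SEARCH = "No more results for search."


def find_early_stops(lines):
    """Return timestamps where an empty page was directly followed by end of search."""
    stops = []
    previous_message = None
    for raw in lines:
        match = LOG_LINE.match(raw.strip())
        if not match:
            continue  # skip lines that do not follow the assumed layout
        timestamp, _level, message = match.groups()
        if message == END_OF_SEARCH and previous_message == EMPTY_PAGE:
            stops.append(timestamp)
        previous_message = message
    return stops
```

To use it on a real log, pass the open file object, for example `find_early_stops(open("twarc.log"))`.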

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 1
  • Comments: 8 (8 by maintainers)

Top GitHub Comments

SamHames commented, Mar 4, 2022 (2 reactions)

I did put a workaround together in the linked #604. It turned out to be easier than I thought it was going to be: as long as start-time is specified in the query, we can check whether we’ve reached that point or not.

@melaniewalsh - you’re welcome to test against that branch, or if it would be easier for you I can push a pre-release to PyPI on Monday.
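
The idea in the comment above (keep going until the requested start time is reached, rather than trusting an empty page) can be sketched roughly as follows. This is not the code from #604. It is a minimal illustration under stated assumptions: `fetch_pages` is a hypothetical callable that yields pages of count buckets for a time window and may stop early, and each bucket is a dict with `start` and `end` datetimes plus a `tweet_count`.

```python
from datetime import datetime, timedelta, timezone


def counts_until_start(fetch_pages, start_time, end_time):
    """Collect count buckets, re-querying until start_time is actually reached.

    fetch_pages(start_time, end_time) is a hypothetical stand-in for the API
    call: it yields lists of buckets and may end early with an empty page.
    """
    buckets = []
    cursor = end_time  # upper bound of the window still to be covered
    while cursor > start_time:
        oldest = cursor
        for page in fetch_pages(start_time, cursor):
            for bucket in page:
                buckets.append(bucket)
                oldest = min(oldest, bucket["start"])
        if oldest == cursor:
            # No earlier bucket came back, so there is nothing more to fetch.
            break
        # Results stopped before start_time: retry from the oldest point seen.
        cursor = oldest
    return buckets
```

The loop ends either when the cursor reaches the requested start time or when a request makes no progress, so a source that truly has no earlier data cannot cause an endless retry.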

edsu commented, Mar 4, 2022 (1 reaction)

I greatly appreciate the API detective work here @melaniewalsh & @SamHames – we all benefit from your efforts here.

I tried running @melaniewalsh’s query above and it works now 🥳

(twarc) ➜  twarc git:(workaround_counts_API_zero_count) ✗ twarc2 counts "(\"this is how\" \"not with a\" but) OR (\"not with a bang\") OR (\"not with a\" \"but a whimper\") OR (\"not with a\" \"but with a whimper\") OR (\"this is the way\" \"not with a\" but)" --archive --granularity day --csv tweets.csv
100%|█████████████████| Processed 15 years/15 years [16:33<00:00, 354018 tweets total ]