`twarc2 counts --archive` returning incomplete results
I’m running `twarc2 counts` with the Academic Track of the Twitter API, and for some reason it’s returning incomplete results (only going back a few years) with the following query:
!twarc2 counts "(\"this is how\" \"not with a\" but) OR (\"not with a bang\") OR (\"not with a\" \"but a whimper\") OR (\"not with a\" \"but with a whimper\") OR (\"this is the way\" \"not with a\" but)" --archive --granularity day --csv > tweets.csv
(For reference I’m doing a project about T.S. Eliot’s “not with a bang but a whimper.”)
When I check twarc.log, I can see that an empty page is returned, which is interpreted as the end of the search:
2022-02-28 15:59:21,700 INFO Retrieved an empty page of results
2022-02-28 15:59:21,700 INFO No more results for search.
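To illustrate the failure mode these two log lines suggest, here is a minimal sketch of a pager that treats the first empty page as the end of the search. It is not twarc's code. The function name and the page dictionaries are hypothetical, and the assumption that older, non-empty pages exist beyond the empty one is inferred from the incomplete results described above.

```python
def count_until_empty(pages):
    """Sum tweet counts, stopping at the first empty page.

    `pages` is a hypothetical list of response pages ordered newest to
    oldest; each page holds a "data" list of buckets with a "tweet_count".
    """
    total = 0
    for page in pages:
        if not page["data"]:
            # An empty page is interpreted as "no more results".
            break
        total += sum(bucket["tweet_count"] for bucket in page["data"])
    return total


# Made-up pages: an empty page sits between two pages that hold counts.
pages = [
    {"data": [{"tweet_count": 5}]},
    {"data": []},
    {"data": [{"tweet_count": 7}]},
]

# The 7 older tweets are never counted because the loop stops early.
print(count_until_empty(pages))
```

If the archive really does contain gaps like this, any tweets older than the first gap would be silently dropped, which would match the "only going back a few years" symptom.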
Issue Analytics
- State:
- Created 2 years ago
- Reactions:1
- Comments:8 (8 by maintainers)
Top GitHub Comments
I did put a workaround together in the linked #604. It turned out to be easier than I thought it was going to be: as long as start-time is specified in the query, we can check whether we’ve reached that point or not.
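The idea in this comment can be sketched roughly as follows. This is not the code from #604, only an illustration of the stopping rule it describes: an empty page ends the search only once the requested start time has been reached. The function name, the page dictionaries, and the use of plain dates for the bucket `start` values are all assumptions made for the example.

```python
from datetime import date


def collect_counts(pages, start_time):
    """Collect count buckets, skipping over empty pages until start_time is reached.

    `pages` is a hypothetical list of response pages ordered newest to oldest.
    Each bucket has a "start" (ISO date string) and a "tweet_count".
    """
    buckets = []
    earliest = None  # oldest bucket start seen so far
    for page in pages:
        data = page.get("data", [])
        if data:
            buckets.extend(data)
            page_earliest = min(date.fromisoformat(b["start"]) for b in data)
            earliest = page_earliest if earliest is None else min(earliest, page_earliest)
        # Stop only when the requested start has been reached,
        # not merely because this page happened to be empty.
        if earliest is not None and earliest <= start_time:
            break
    return buckets


# Made-up pages: the empty page in the middle no longer ends the search.
pages = [
    {"data": [{"start": "2022-02-01", "tweet_count": 5}]},
    {"data": []},
    {"data": [{"start": "2010-01-01", "tweet_count": 7}]},
]

result = collect_counts(pages, start_time=date(2010, 1, 1))
print(sum(b["tweet_count"] for b in result))
```

The check depends on a known start time, which is why the comment notes that start-time has to be specified in the query.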
@melaniewalsh - you’re welcome to test against that branch, or if it would be easier for you I can push a pre-release to PyPI on Monday.
I greatly appreciate the API detective work here @melaniewalsh & @SamHames – we all benefit from your efforts here.
I tried running @melaniewalsh’s query above and it works now 🥳