Inject metadata about when a tweet was retrieved

One thing that isn’t present in Twitter API responses is any information about when the tweet was retrieved or the call was made. This is important metadata, and critical to some use cases: for example, if you’re looking at engagement metrics, there’s a big difference between a tweet collected one minute after posting and one collected a day after. This information is currently stored in the collection log, but that isn’t very friendly because any programmatic use requires parsing the log.

I propose we inject an ISO 8601 string representing the time the call was made into either the existing meta dictionary of some API responses or (maybe less likely to collide) a top-level key like collection_meta or _twarc. This avoids needing to manage a side channel of information, such as the log file, and also stores the data exactly where it’s needed for processing.
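As a rough illustration of the proposal, here is a minimal sketch that assumes each API response page has already been parsed into a Python dict. The function name `inject_retrieved_at` and the field name `retrieved_at` are hypothetical and are not part of twarc; `_twarc` is one of the key names suggested above.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def inject_retrieved_at(
    response: Dict[str, Any],
    retrieved_at: Optional[datetime] = None,
    key: str = "_twarc",  # hypothetical top-level key, as floated in the proposal
) -> Dict[str, Any]:
    """Return a shallow copy of `response` with retrieval metadata added."""
    if retrieved_at is None:
        # Timezone-aware UTC time, so the ISO 8601 string is unambiguous.
        retrieved_at = datetime.now(timezone.utc)
    if key in response:
        # Refuse to overwrite anything the API itself returned.
        raise KeyError(f"response already has a {key!r} key")
    annotated = dict(response)
    annotated[key] = {"retrieved_at": retrieved_at.isoformat()}
    return annotated


page = {"data": [{"id": "1", "text": "hello"}], "meta": {"result_count": 1}}
annotated = inject_retrieved_at(
    page, datetime(2021, 2, 23, 12, 0, tzinfo=timezone.utc)
)
print(annotated["_twarc"]["retrieved_at"])  # 2021-02-23T12:00:00+00:00
```

Using a separate top-level key rather than the API's own meta dictionary keeps the injected data clearly distinguishable from what Twitter returned, at the cost of one extra key per response.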

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 1
  • Comments: 12 (11 by maintainers)

Top GitHub Comments

3 reactions
lwrubel commented, Mar 2, 2021

Catching up on this, documenting the request also brought to mind the WARC approach we’re using in SFM. We’re still using that approach, but it does indeed introduce overhead.

The WARC request records provide documentation of when the HTTP request was made, which API was called, the query parameters, auth info and more. But I think the info about which API was called and when the WARC was created are the pieces of info we use the most and are most relevant to twarc. So, agreed it is helpful/necessary to know when the request was made, in case you want to only work with data captured in a certain timeframe or want the context of the metrics, as mentioned above.
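To make the timeframe use case concrete, here is a hedged sketch of filtering stored responses by when they were captured. It assumes each record carries a timezone-aware ISO 8601 string under a hypothetical `_twarc.retrieved_at` field; neither that layout nor the helper name `captured_between` comes from twarc or SFM.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Iterable, Iterator


def captured_between(
    records: Iterable[Dict[str, Any]],
    start: datetime,
    end: datetime,
    key: str = "_twarc",  # hypothetical metadata key
) -> Iterator[Dict[str, Any]]:
    """Yield records whose retrieval time falls in the half-open range [start, end)."""
    for record in records:
        stamp = record.get(key, {}).get("retrieved_at")
        if stamp is None:
            continue  # no retrieval metadata, so the capture time is unknown
        # Assumes the stored string includes a UTC offset such as +00:00.
        if start <= datetime.fromisoformat(stamp) < end:
            yield record


records = [
    {"id": "a", "_twarc": {"retrieved_at": "2021-02-23T12:00:00+00:00"}},
    {"id": "b", "_twarc": {"retrieved_at": "2021-03-02T09:30:00+00:00"}},
    {"id": "c"},  # collected before any metadata was injected
]
february = list(
    captured_between(
        records,
        datetime(2021, 2, 1, tzinfo=timezone.utc),
        datetime(2021, 3, 1, tzinfo=timezone.utc),
    )
)
print([r["id"] for r in february])  # ['a']
```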

3 reactions
edsu commented, Feb 23, 2021

I was going to say, it’s a slippery slope, but I kinda like it. It almost starts to remind me of work that GWU was doing to write Twitter data using the WARC format, which wraps the HTTP response in a WARC Record, which has metadata and is linked to an HTTP Request. That’s definitely one end of the spectrum. I think they switched away from doing that because it made downstream processing more difficult.

I like the idea of a top level property __twarc that is an object which contains whatever we deem relevant. Maybe with the version of twarc that was used?
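One possible shape for such an object is sketched below. This is only an assumption about what it might contain: the helper `build_twarc_meta` and the field names `retrieved_at` and `version` are hypothetical, and the version lookup relies on the standard library's `importlib.metadata` (Python 3.8+) rather than on anything twarc itself provides.

```python
from datetime import datetime, timezone
from importlib import metadata
from typing import Any, Dict


def build_twarc_meta(package: str = "twarc") -> Dict[str, Any]:
    """Build a candidate value for a top-level `__twarc` property."""
    try:
        # Version of the installed distribution, if there is one.
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None  # package not installed in this environment
    return {
        "retrieved_at": datetime.now(timezone.utc).isoformat(),
        "version": version,
    }


response = {"data": [], "meta": {"result_count": 0}}
response["__twarc"] = build_twarc_meta()
print(sorted(response["__twarc"]))  # ['retrieved_at', 'version']
```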
