Add official support for taking multiple snapshots of websites over time
This is by far the most requested feature.
People want an easy way to take multiple snapshots of websites over time.
This will be easier to do once we’ve added pywb support since we’ll be able to use timestamped de-duped WARCs to save each snapshot: #130
For people finding this issue via Google / incoming links, if you want a hacky solution to take a second snapshot of a site, you can add the link with a new hash and it will be treated as a new page and a new snapshot will be taken:
echo https://example.com/some/page.html#archivedate=2019-03-18 | archivebox add
# then to re-snapshot it on another day...
echo https://example.com/some/page.html#archivedate=2019-03-22 | archivebox add
Edit: as of v0.6 there is now a button in the UI to do this ^
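For anyone scripting the workaround above (e.g. from cron), here is a minimal sketch that builds the dated URL for you. `resnapshot_url` is a hypothetical helper name, not part of ArchiveBox, and dropping any fragment already on the URL is an assumption that may not suit pages which use the hash for routing.

```shell
# Hypothetical helper (not part of ArchiveBox): print a URL with a dated
# "#archivedate=YYYY-MM-DD" fragment so it is treated as a new page.
resnapshot_url() {
    url="${1%%#*}"                    # assumption: drop any existing fragment
    stamp="${2:-$(date +%Y-%m-%d)}"   # optional date argument, defaults to today
    printf '%s#archivedate=%s\n' "$url" "$stamp"
}

# Usage, mirroring the commands above:
# resnapshot_url https://example.com/some/page.html | archivebox add
```

Running the usage line on different days yields a different hash each time, so each run is added as a separate snapshot.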
Issue Analytics
- State:
- Created 5 years ago
- Reactions: 31
- Comments: 12 (6 by maintainers)
Top GitHub Comments
This is now added in v0.6. It’s not full support, but it’s a step in the right direction. I just added a UI button labeled Re-snapshot that automates the process of creating a new snapshot with a bumped timestamp in the URL hash. I could also add a flag called --resnapshot or --duplicate that automates this step when archiving via the CLI too.

Then later when we add better real multi-snapshot support, we can migrate all the Snapshots with timestamps in their hashes to the new system automatically.
You can still accomplish this right now by adding a hash at the end of the URL (as in the archivedate example in the issue description).
Official first-class support for multiple snapshots is still on the roadmap, but don’t expect it in the next month or two: it’s quite a large feature with big implications for how we store and dedupe snapshot data internally.