archivebox 0.4.2 init fails parsing old json (ValueError: year 1586476777 is out of range/dateutil.parser._parser.ParserError: year 1586476777 is out of range)
Describe the bug
archivebox init fails with the following error:
ValueError: year 1586476777 is out of range
dateutil.parser._parser.ParserError: year 1586476777 is out of range
Steps to reproduce
create virtual environment
mkcd /home/kangus/src/archivebox0.4/
pew new -p /usr/bin/python3.8 -a $(pwd) archivebox0.4
clone
git clone https://github.com/pirate/ArchiveBox
cd ArchiveBox
git branch -a
check out the relevant branch
git checkout remotes/origin/v0.4.3
install dependencies
pip install -e .
config ENV
eval export $(grep -v '^#' /home/kangus/.ArchiveBox.conf)
migration
/home/kangus/src/archivebox0.4/ArchiveBox/bin/archivebox init
Screenshots or log output
/home/kangus/src/archivebox0.4/ArchiveBox/bin/archivebox init
[*] Updating existing ArchiveBox collection in this folder...
/data/Zalohy/archivebox
------------------------------------------------------------------
[*] Verifying archive folder structure...
√ /data/Zalohy/archivebox/sources
√ /data/Zalohy/archivebox/archive
√ /data/Zalohy/archivebox/logs
√ /data/Zalohy/archivebox/ArchiveBox.conf
[*] Verifying main SQL index and running migrations...
√ /data/Zalohy/archivebox/index.sqlite3
Operations to perform:
Apply all migrations: admin, auth, contenttypes, core, sessions
Running migrations:
No migrations to apply.
[*] Collecting links from any existing indexes and archive folders...
√ Loaded 28875 links from existing main index.
Traceback (most recent call last):
  File "/home/kangus/.local/share/virtualenvs/archivebox0.4/lib/python3.8/site-packages/dateutil/parser/_parser.py", line 655, in parse
    ret = self._build_naive(res, default)
  File "/home/kangus/.local/share/virtualenvs/archivebox0.4/lib/python3.8/site-packages/dateutil/parser/_parser.py", line 1241, in _build_naive
    naive = default.replace(**repl)
ValueError: year 1586476777 is out of range

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/kangus/src/archivebox0.4/ArchiveBox/bin/archivebox", line 14, in <module>
    archivebox.main(args=sys.argv[1:], stdin=sys.stdin)
  File "/home/kangus/src/archivebox0.4/ArchiveBox/archivebox/cli/archivebox.py", line 54, in main
    run_subcommand(
  File "/home/kangus/src/archivebox0.4/ArchiveBox/archivebox/cli/__init__.py", line 55, in run_subcommand
    module.main(args=subcommand_args, stdin=stdin, pwd=pwd) # type: ignore
  File "/home/kangus/src/archivebox0.4/ArchiveBox/archivebox/cli/archivebox_init.py", line 32, in main
    init(
  File "/home/kangus/src/archivebox0.4/ArchiveBox/archivebox/util.py", line 105, in typechecked_function
    return func(*args, **kwargs)
  File "/home/kangus/src/archivebox0.4/ArchiveBox/archivebox/main.py", line 321, in init
    fixed, cant_fix = fix_invalid_folder_locations(out_dir=out_dir)
  File "/home/kangus/src/archivebox0.4/ArchiveBox/archivebox/index/__init__.py", line 572, in fix_invalid_folder_locations
    link = parse_json_link_details(entry.path)
  File "/home/kangus/src/archivebox0.4/ArchiveBox/archivebox/util.py", line 105, in typechecked_function
    return func(*args, **kwargs)
  File "/home/kangus/src/archivebox0.4/ArchiveBox/archivebox/index/json.py", line 100, in parse_json_link_details
    return Link.from_json(link_json)
  File "/home/kangus/src/archivebox0.4/ArchiveBox/archivebox/index/schema.py", line 190, in from_json
    info['updated'] = parse_date(info.get('updated'))
  File "/home/kangus/src/archivebox0.4/ArchiveBox/archivebox/util.py", line 105, in typechecked_function
    return func(*args, **kwargs)
  File "/home/kangus/src/archivebox0.4/ArchiveBox/archivebox/util.py", line 144, in parse_date
    return dateparser.parse(date)
  File "/home/kangus/.local/share/virtualenvs/archivebox0.4/lib/python3.8/site-packages/dateutil/parser/_parser.py", line 1374, in parse
    return DEFAULTPARSER.parse(timestr, **kwargs)
  File "/home/kangus/.local/share/virtualenvs/archivebox0.4/lib/python3.8/site-packages/dateutil/parser/_parser.py", line 657, in parse
    six.raise_from(ParserError(e.args[0] + ": %s", timestr), e)
  File "<string>", line 3, in raise_from
dateutil.parser._parser.ParserError: year 1586476777 is out of range: 1586476777.093312
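The value in the last line of the traceback, 1586476777.093312, looks like a Unix epoch timestamp (seconds since 1970) stored as a string, and dateutil seems to read its leading digits as a year. Below is a minimal sketch of a more tolerant parser, assuming the old JSON index stores the "updated" field as epoch seconds. The names parse_date_tolerant and fallback are hypothetical and this is not ArchiveBox's actual parse_date. Returning a UTC-aware datetime is also an assumption.

```python
from datetime import datetime, timezone


def parse_date_tolerant(value, fallback=None):
    """Hypothetical helper: try epoch seconds first, then a general date parser.

    `fallback` is any callable taking a string and returning a datetime.
    If it is omitted, dateutil's parser is imported and used.
    """
    if value is None:
        return None
    if isinstance(value, datetime):
        return value

    try:
        # Old index entries may hold epoch seconds such as "1586476777.093312".
        return datetime.fromtimestamp(float(value), tz=timezone.utc)
    except (TypeError, ValueError, OverflowError, OSError):
        # Not a usable number: fall through to the general parser.
        pass

    if fallback is None:
        from dateutil import parser as dateparser  # same library as in the traceback
        fallback = dateparser.parse
    return fallback(str(value))


if __name__ == "__main__":
    print(parse_date_tolerant("1586476777.093312"))
```

One caveat with this ordering: a bare numeric string such as "2020" would be treated as epoch seconds rather than as a year, so a real fix may need a stricter test for what counts as a timestamp.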
Software versions
- OS: (Linux Mint 19 Tara / Ubuntu 18.04)
- ArchiveBox version: (374dd39)
- Python version: (3.8.2, also tested on 3.7.2)
Issue Analytics
- State:
- Created 3 years ago
- Comments:15 (7 by maintainers)
Top GitHub Comments
I like the idea of a warning. Since you are still recording the raw value, it's really a superficial problem. On the frontend, if the parsed date fails the
1960-01-01 < date < $CURRENT_YEAR+1
check, it could display a placeholder value in addition to logging a warning to stdout or a file.

I understand you are busy. I am waiting for the rewrite to be merged before I start hacking away. 😃
We can also warn the user or bail out if the parsed date falls outside a range like this:
1960-01-01 < date < $CURRENT_YEAR+1
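The range check suggested in these comments could be sketched roughly as follows. This is only an illustration: date_in_plausible_range and display_date are hypothetical names, the upper bound is assumed to mean January 1 of the year after the current one, and naive datetimes are assumed to be in UTC.

```python
import logging
from datetime import datetime, timezone

logger = logging.getLogger(__name__)

# Lower bound taken from the comments above.
MIN_DATE = datetime(1960, 1, 1, tzinfo=timezone.utc)


def date_in_plausible_range(date, now=None):
    """Return True if 1960-01-01 < date < Jan 1 of (current year + 1)."""
    now = now or datetime.now(timezone.utc)
    upper = datetime(now.year + 1, 1, 1, tzinfo=timezone.utc)
    if date.tzinfo is None:
        # Assumption: naive datetimes are treated as UTC.
        date = date.replace(tzinfo=timezone.utc)
    return MIN_DATE < date < upper


def display_date(date, raw_value, placeholder="unknown date", now=None):
    """Return a string for the frontend, warning if the date looks wrong."""
    if date is None or not date_in_plausible_range(date, now=now):
        logger.warning("Implausible date parsed from %r", raw_value)
        return placeholder
    return date.strftime("%Y-%m-%d %H:%M")
```

Because the raw value is kept alongside the parsed date, showing a placeholder loses no information; the warning only points at entries worth inspecting.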