archivebox 0.4.2 init fails parsing old json (ValueError: year 1586476777 is out of range/dateutil.parser._parser.ParserError: year 1586476777 is out of range)

Describe the bug

archivebox init produces error

ValueError: year 1586476777 is out of range
dateutil.parser._parser.ParserError: year 1586476777 is out of range

Steps to reproduce

create virtual environment

mkcd /home/kangus/src/archivebox0.4/
pew new -p /usr/bin/python3.8 -a $(pwd) archivebox0.4

clone

git clone https://github.com/pirate/ArchiveBox
cd ArchiveBox
git branch -a

checkout relevant

git checkout remotes/origin/v0.4.3

install dependencies

pip install -e .

config ENV

eval export $(grep -v '^#' /home/kangus/.ArchiveBox.conf)

migration

/home/kangus/src/archivebox0.4/ArchiveBox/bin/archivebox init

Screenshots or log output

/home/kangus/src/archivebox0.4/ArchiveBox/bin/archivebox init
[*] Updating existing ArchiveBox collection in this folder...
    /data/Zalohy/archivebox
------------------------------------------------------------------

[*] Verifying archive folder structure...   
    √ /data/Zalohy/archivebox/sources
    √ /data/Zalohy/archivebox/archive
    √ /data/Zalohy/archivebox/logs
    √ /data/Zalohy/archivebox/ArchiveBox.conf

[*] Verifying main SQL index and running migrations...
    √ /data/Zalohy/archivebox/index.sqlite3

    Operations to perform:
      Apply all migrations: admin, auth, contenttypes, core, sessions
    Running migrations:
    No migrations to apply.

[*] Collecting links from any existing indexes and archive folders...
    √ Loaded 28875 links from existing main index.
Traceback (most recent call last):
  File "/home/kangus/.local/share/virtualenvs/archivebox0.4/lib/python3.8/site-packages/dateutil/parser/_parser.py", line 655, in parse
    ret = self._build_naive(res, default)
  File "/home/kangus/.local/share/virtualenvs/archivebox0.4/lib/python3.8/site-packages/dateutil/parser/_parser.py", line 1241, in _build_naive
    naive = default.replace(**repl)
ValueError: year 1586476777 is out of range

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/kangus/src/archivebox0.4/ArchiveBox/bin/archivebox", line 14, in <module>
    archivebox.main(args=sys.argv[1:], stdin=sys.stdin)
  File "/home/kangus/src/archivebox0.4/ArchiveBox/archivebox/cli/archivebox.py", line 54, in main
    run_subcommand(
  File "/home/kangus/src/archivebox0.4/ArchiveBox/archivebox/cli/__init__.py", line 55, in run_subcommand
    module.main(args=subcommand_args, stdin=stdin, pwd=pwd)    # type: ignore
  File "/home/kangus/src/archivebox0.4/ArchiveBox/archivebox/cli/archivebox_init.py", line 32, in main
    init(
  File "/home/kangus/src/archivebox0.4/ArchiveBox/archivebox/util.py", line 105, in typechecked_function
    return func(*args, **kwargs)
  File "/home/kangus/src/archivebox0.4/ArchiveBox/archivebox/main.py", line 321, in init
    fixed, cant_fix = fix_invalid_folder_locations(out_dir=out_dir)
  File "/home/kangus/src/archivebox0.4/ArchiveBox/archivebox/index/__init__.py", line 572, in fix_invalid_folder_locations
    link = parse_json_link_details(entry.path)
  File "/home/kangus/src/archivebox0.4/ArchiveBox/archivebox/util.py", line 105, in typechecked_function
    return func(*args, **kwargs)
  File "/home/kangus/src/archivebox0.4/ArchiveBox/archivebox/index/json.py", line 100, in parse_json_link_details
    return Link.from_json(link_json)
  File "/home/kangus/src/archivebox0.4/ArchiveBox/archivebox/index/schema.py", line 190, in from_json
    info['updated'] = parse_date(info.get('updated'))
  File "/home/kangus/src/archivebox0.4/ArchiveBox/archivebox/util.py", line 105, in typechecked_function
    return func(*args, **kwargs)
  File "/home/kangus/src/archivebox0.4/ArchiveBox/archivebox/util.py", line 144, in parse_date
    return dateparser.parse(date)
  File "/home/kangus/.local/share/virtualenvs/archivebox0.4/lib/python3.8/site-packages/dateutil/parser/_parser.py", line 1374, in parse
    return DEFAULTPARSER.parse(timestr, **kwargs)
  File "/home/kangus/.local/share/virtualenvs/archivebox0.4/lib/python3.8/site-packages/dateutil/parser/_parser.py", line 657, in parse
    six.raise_from(ParserError(e.args[0] + ": %s", timestr), e)
  File "<string>", line 3, in raise_from
dateutil.parser._parser.ParserError: year 1586476777 is out of range: 1586476777.093312
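
The failing value, 1586476777.093312, appears to be a Unix epoch timestamp (seconds since 1970) rather than a formatted date string, which would explain why dateutil reads the leading digits as a year. Below is a minimal sketch of a more lenient parser, assuming old index files store `updated` as epoch seconds. The name `parse_date_lenient` is hypothetical and this is not ArchiveBox's actual `parse_date`; it uses only the standard library so it runs on its own.

```python
from datetime import datetime, timezone


def parse_date_lenient(value):
    """Hypothetical helper: accept epoch seconds or an ISO 8601 string.

    Assumption: purely numeric input is a Unix timestamp in seconds.
    A bare year such as "2020" would therefore be misread as an epoch
    value, so a real implementation may need a stricter rule.
    """
    if value is None:
        return None
    if isinstance(value, datetime):
        return value

    text = str(value).strip()
    try:
        seconds = float(text)
    except ValueError:
        # Not numeric: fall back to ISO 8601 parsing from the standard library.
        return datetime.fromisoformat(text)

    # Numeric: interpret as seconds since the Unix epoch, in UTC.
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


if __name__ == "__main__":
    print(parse_date_lenient("1586476777.093312"))
    print(parse_date_lenient("2020-04-09T23:59:37"))
```

With this approach the value from the traceback resolves to a moment on 9 April 2020 (UTC), which is consistent with when the archive was last updated.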

Software versions

  • OS: (Linux Mint 19 Tara/ Ubuntu 18.4)
  • ArchiveBox version: (374dd39)
  • Python version: (3.8.2, also tested on 3.7.2)

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 15 (7 by maintainers)

Top GitHub Comments

mdhowle commented, Apr 30, 2020 (1 reaction)

I like that idea of warning. Since you are still recording the raw value, it’s really a superficial problem. On the frontend, if the parsed date fails the 1960-01-01 < date < $CURRENT_YEAR+1 check, it could display a placeholder value in addition to logging a warning to stdout/file.

I understand you are busy. I am waiting for the rewrite to be merged before I start hacking away. 😃

pirate commented, Apr 30, 2020 (1 reaction)

We can also warn the user or bail out if the parsed date is outside of something like this: 1960-01-01 < date < $CURRENT_YEAR+1.
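
The range check discussed in these comments could look roughly like the sketch below. It is only an illustration: the function names, the placeholder text, and the reading of `$CURRENT_YEAR+1` as "before 1 January of next year" are assumptions, not code from ArchiveBox.

```python
import logging
from datetime import datetime, timezone

logger = logging.getLogger(__name__)

# Lower bound taken from the comments: 1960-01-01.
MIN_DATE = datetime(1960, 1, 1, tzinfo=timezone.utc)


def is_plausible_date(date, now=None):
    """Return True if 1960-01-01 < date < start of next year (assumed reading)."""
    if now is None:
        now = datetime.now(timezone.utc)
    upper = datetime(now.year + 1, 1, 1, tzinfo=timezone.utc)
    if date.tzinfo is None:
        # Assumption: naive datetimes are treated as UTC for the comparison.
        date = date.replace(tzinfo=timezone.utc)
    return MIN_DATE < date < upper


def date_or_placeholder(date, placeholder="unknown date"):
    """Hypothetical display helper: show a placeholder and log a warning
    when the parsed date is missing or fails the range check."""
    if date is not None and is_plausible_date(date):
        return date.isoformat()
    logger.warning("Parsed date %r is outside the expected range", date)
    return placeholder


if __name__ == "__main__":
    print(date_or_placeholder(datetime(2020, 4, 9, tzinfo=timezone.utc)))
    print(date_or_placeholder(datetime(1900, 1, 1)))
```

Because the raw value is still recorded, a warning plus a placeholder keeps `init` running instead of aborting on a single bad entry; bailing out instead would only require raising in place of the warning.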
