Bugfix: django.db.utils.IntegrityError: UNIQUE constraint failed: core_snapshot.timestamp

Describe the bug

Y’all helped me with upgrading my super old archive to the django branch before the official 0.4.9 release. I recently upgraded to the newest version so I could start adding links. archivebox said I had to re-init. archivebox init gives me the following error and will not let me add new links.

django.db.utils.IntegrityError: UNIQUE constraint failed: core_snapshot.timestamp

Full log/error below.

Steps to reproduce

  1. git checkout master to switch from django branch.
  2. git pull origin master to pull new release.
  3. pip install -e . (also tried with pip uninstall archivebox && pip install .)
  4. Navigate to archivebox-output directory.
  5. Run archivebox init.
  6. Observe the error.

Screenshots or log output

[i] [2020-07-31 17:34:44] ArchiveBox v0.4.9: archivebox init
    > /.archivebox-output/archive-working

[*] Updating existing ArchiveBox collection in this folder...
    /.archivebox-output/archive-working
------------------------------------------------------------------

[*] Verifying archive folder structure...
    √ /.archivebox-output/archive-working/sources
    √ /.archivebox-output/archive-working/archive
    √ /.archivebox-output/archive-working/logs
    √ /.archivebox-output/archive-working/ArchiveBox.conf

[*] Verifying main SQL index and running migrations...
    √ /.archivebox-output/archive-working/index.sqlite3

    Operations to perform:
      Apply all migrations: admin, auth, contenttypes, core, sessions
    Running migrations:
    Applying core.0005_auto_20200728_0326... OK

[*] Collecting links from any existing indexes and archive folders...
    √ Loaded 1376 links from existing main index.
    √ Added 347 orphaned links from existing archive directories.
    ! Skipped adding 239 invalid link data directories.

    X /* SNIP A BUNCH OF BROKEN ARCHIVES /*

    Hint: For more information about the link data directories that were skipped, run:
        archivebox status
        archivebox list --status=invalid

[*] [2020-07-31 18:01:50] Writing 1723 links to main index...
Traceback (most recent call last):
  File "/home/USERNAME/.local/lib/python3.7/site-packages/django/db/models/query.py", line 575, in update_or_create
    obj = self.select_for_update().get(**kwargs)
  File "/home/USERNAME/.local/lib/python3.7/site-packages/django/db/models/query.py", line 417, in get
    self.model._meta.object_name
core.models.DoesNotExist: Snapshot matching query does not exist.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/USERNAME/.local/lib/python3.7/site-packages/django/db/backends/utils.py", line 86, in _execute
    return self.cursor.execute(sql, params)
  File "/home/USERNAME/.local/lib/python3.7/site-packages/django/db/backends/sqlite3/base.py", line 396, in execute
    return Database.Cursor.execute(self, query, params)
sqlite3.IntegrityError: UNIQUE constraint failed: core_snapshot.timestamp

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/USERNAME/.local/bin/archivebox", line 33, in <module>
    sys.exit(load_entry_point('archivebox', 'console_scripts', 'archivebox')())
  File "/home/USERNAME/datahoard/ArchiveBox/archivebox/cli/__init__.py", line 126, in main
    pwd=pwd or OUTPUT_DIR,
  File "/home/USERNAME/datahoard/ArchiveBox/archivebox/cli/__init__.py", line 62, in run_subcommand
    module.main(args=subcommand_args, stdin=stdin, pwd=pwd)    # type: ignore
  File "/home/USERNAME/datahoard/ArchiveBox/archivebox/cli/archivebox_init.py", line 35, in main
    out_dir=pwd or OUTPUT_DIR,
  File "/home/USERNAME/datahoard/ArchiveBox/archivebox/util.py", line 109, in typechecked_function
    return func(*args, **kwargs)
  File "/home/USERNAME/datahoard/ArchiveBox/archivebox/main.py", line 369, in init
    write_main_index(list(all_links.values()), out_dir=out_dir)
  File "/home/USERNAME/datahoard/ArchiveBox/archivebox/util.py", line 109, in typechecked_function
    return func(*args, **kwargs)
  File "/home/USERNAME/datahoard/ArchiveBox/archivebox/index/__init__.py", line 235, in write_main_index
    write_sql_main_index(links, out_dir=out_dir)
  File "/home/USERNAME/datahoard/ArchiveBox/archivebox/util.py", line 109, in typechecked_function
    return func(*args, **kwargs)
  File "/home/USERNAME/datahoard/ArchiveBox/archivebox/index/sql.py", line 42, in write_sql_main_index
    Snapshot.objects.update_or_create(url=link.url, defaults=info)
  File "/home/USERNAME/.local/lib/python3.7/site-packages/django/db/models/manager.py", line 82, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/home/USERNAME/.local/lib/python3.7/site-packages/django/db/models/query.py", line 580, in update_or_create
    obj, created = self._create_object_from_params(kwargs, params, lock=True)
  File "/home/USERNAME/.local/lib/python3.7/site-packages/django/db/models/query.py", line 604, in _create_object_from_params
    raise e
  File "/home/USERNAME/.local/lib/python3.7/site-packages/django/db/models/query.py", line 596, in _create_object_from_params
    obj = self.create(**params)
  File "/home/USERNAME/.local/lib/python3.7/site-packages/django/db/models/query.py", line 433, in create
    obj.save(force_insert=True, using=self.db)
  File "/home/USERNAME/.local/lib/python3.7/site-packages/django/db/models/base.py", line 746, in save
    force_update=force_update, update_fields=update_fields)
  File "/home/USERNAME/.local/lib/python3.7/site-packages/django/db/models/base.py", line 784, in save_base
    force_update, using, update_fields,
  File "/home/USERNAME/.local/lib/python3.7/site-packages/django/db/models/base.py", line 887, in _save_table
    results = self._do_insert(cls._base_manager, using, fields, returning_fields, raw)
  File "/home/USERNAME/.local/lib/python3.7/site-packages/django/db/models/base.py", line 926, in _do_insert
    using=using, raw=raw,
  File "/home/USERNAME/.local/lib/python3.7/site-packages/django/db/models/manager.py", line 82, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/home/USERNAME/.local/lib/python3.7/site-packages/django/db/models/query.py", line 1204, in _insert
    return query.get_compiler(using=using).execute_sql(returning_fields)
  File "/home/USERNAME/.local/lib/python3.7/site-packages/django/db/models/sql/compiler.py", line 1392, in execute_sql
    cursor.execute(sql, params)
  File "/home/USERNAME/.local/lib/python3.7/site-packages/django/db/backends/utils.py", line 68, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/home/USERNAME/.local/lib/python3.7/site-packages/django/db/backends/utils.py", line 77, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/home/USERNAME/.local/lib/python3.7/site-packages/django/db/backends/utils.py", line 86, in _execute
    return self.cursor.execute(sql, params)
  File "/home/USERNAME/.local/lib/python3.7/site-packages/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/home/USERNAME/.local/lib/python3.7/site-packages/django/db/backends/utils.py", line 86, in _execute
    return self.cursor.execute(sql, params)
  File "/home/USERNAME/.local/lib/python3.7/site-packages/django/db/backends/sqlite3/base.py", line 396, in execute
    return Database.Cursor.execute(self, query, params)
django.db.utils.IntegrityError: UNIQUE constraint failed: core_snapshot.timestamp
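
The traceback shows update_or_create looking a Snapshot up by url, finding none, and then failing on insert because the timestamp is already taken. A minimal sketch of that failure mode using only the standard library's sqlite3 module follows. The table layout, the upsert_by_url and reproduce helpers, and the example URLs are assumptions made for illustration; they are not ArchiveBox's actual schema or code.

```python
import sqlite3


def upsert_by_url(conn, url, timestamp):
    """Hypothetical stand-in for update_or_create(url=..., defaults=...)."""
    # Step 1: look the row up by url only, as the traceback's call does.
    row = conn.execute(
        "SELECT id FROM core_snapshot WHERE url = ?", (url,)
    ).fetchone()
    if row is not None:
        conn.execute(
            "UPDATE core_snapshot SET timestamp = ? WHERE id = ?",
            (timestamp, row[0]),
        )
    else:
        # Step 2: no row for this url, so insert. This is where a timestamp
        # already used by a *different* url violates the UNIQUE constraint.
        conn.execute(
            "INSERT INTO core_snapshot (url, timestamp) VALUES (?, ?)",
            (url, timestamp),
        )


def reproduce():
    """Return the error message raised when two urls share one timestamp."""
    conn = sqlite3.connect(":memory:")
    # Simplified, assumed schema: both url and timestamp must be unique.
    conn.execute(
        "CREATE TABLE core_snapshot ("
        "id INTEGER PRIMARY KEY, "
        "url TEXT UNIQUE NOT NULL, "
        "timestamp TEXT UNIQUE NOT NULL)"
    )
    upsert_by_url(conn, "https://example.com/a", "1596216884")
    try:
        # Different url, same timestamp: the lookup misses, the insert fails.
        upsert_by_url(conn, "https://example.com/b", "1596216884")
    except sqlite3.IntegrityError as exc:
        return str(exc)
    finally:
        conn.close()
    return None


if __name__ == "__main__":
    print(reproduce())
```

If this sketch matches what happens in the real index, the problem is in the data being written (two links sharing a timestamp) rather than in the database file itself.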

Software versions

  • OS: Ubuntu 18.04
  • ArchiveBox version: 0.4.9 (0ac4e12)
  • Python version: Python 3.7.8

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 15 (7 by maintainers)

Top GitHub Comments

pirate commented, Aug 11, 2020 (5 reactions)

Very helpful @karlicoss! This is high on our priority list of things to fix.

I’ll check in with an update once we’ve started working on this. I suspect it’s a relatively simple bug in the timestamp deduping code; most of the work will be QA and testing to make sure we don’t introduce any regressions while we fix it.

For context, timestamp deduping has been one of the most brittle parts of ArchiveBox over the past few years, and we already have plans to remove the need for it in a refactoring in the next major version.
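
To make "timestamp deduping" concrete, here is a minimal sketch of one way to give every link a unique timestamp before writing it to an index with a UNIQUE timestamp column. The function name assign_unique_timestamps and the numeric-suffix scheme are assumptions for illustration only; this is not ArchiveBox's implementation and not the fix that was eventually merged.

```python
from typing import Iterable, List, Set, Tuple


def assign_unique_timestamps(
    links: Iterable[Tuple[str, str]]
) -> List[Tuple[str, str]]:
    """Return (url, timestamp) pairs in which every timestamp is unique.

    Hypothetical helper: the first occurrence of each url is kept, and a
    colliding timestamp gets a numeric suffix (".1", ".2", ...) appended.
    """
    seen_urls: Set[str] = set()
    used_timestamps: Set[str] = set()
    result: List[Tuple[str, str]] = []

    for url, timestamp in links:
        if url in seen_urls:
            # Same url seen before: skip it so each url maps to one row.
            continue
        seen_urls.add(url)

        candidate = timestamp
        suffix = 0
        # Bump the suffix until the candidate is not taken by another url.
        while candidate in used_timestamps:
            suffix += 1
            candidate = f"{timestamp}.{suffix}"

        used_timestamps.add(candidate)
        result.append((url, candidate))

    return result


if __name__ == "__main__":
    sample = [
        ("https://example.com/a", "1596216884"),
        ("https://example.com/b", "1596216884"),
        ("https://example.com/a", "1596216999"),
    ]
    for url, ts in assign_unique_timestamps(sample):
        print(ts, url)
```

The sketch checks every candidate against all timestamps handed out so far, so a suffixed value cannot collide with a timestamp that arrives later in the input.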

drpfenderson commented, Sep 2, 2020 (1 reaction)

The changes in the cdvv7788:sql_index branch, reflected in PR #452, fixed my issue! I was able to run archivebox init on the old index; it updated with some broken directories, but ultimately wrote everything to the index. Looks to be intact! I’ll just add the “invalid link data directories” through a .txt file.
