
Double insert of rows while importing

See original GitHub issue

When importing a CSV via the admin interface, the rows in the sheet are imported twice: once when loading the preview and once when confirming the import. I'm importing with the id column left empty. I've used the library before and never run into this issue. Super confused!

Django==2.1.3
django-import-export==2.1.0

from django.db import models
from import_export import resources


class ReviewsResource(resources.ModelResource):
    class Meta:
        model = Review
        fields = ('id', ...)  # plus the remaining fields in the model


class Review(models.Model):
    # ... a few field definitions

    class Meta:
        managed = False
        db_table = 'review'

    def __str__(self):
        return str(self.id)
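
For context, the admin import flow runs the dataset through the resource twice by design: the preview step is a dry run wrapped in a transaction that is rolled back, and the confirm step performs the real import. A rough sketch of the equivalent calls through the resource API (the CSV filename and surrounding setup are placeholders):

import tablib

with open('reviews.csv') as f:
    dataset = tablib.Dataset().load(f.read(), format='csv')

resource = ReviewsResource()

# Admin "preview" step: dry run, any writes are expected to be rolled back
preview = resource.import_data(dataset, dry_run=True)

# Admin "confirm" step: the real import
if not preview.has_errors():
    resource.import_data(dataset, dry_run=False)

If the dry run's rollback does not actually hit the database the rows were written to, the preview pass leaves real rows behind and the confirm pass inserts them a second time, which is the failure mode the workaround below addresses.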

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 11 (6 by maintainers)

Top GitHub Comments

3 reactions
nishanthvijayan commented, May 19, 2020

I wrote a subclass of ModelResource to fix this and it's working fine.

import logging
import traceback

from django.core.exceptions import ImproperlyConfigured
from django.db import DEFAULT_DB_ALIAS, connections, transaction
from django.db.transaction import savepoint, savepoint_commit, savepoint_rollback
from import_export import resources
from import_export.results import RowResult
from import_export.utils import atomic_if_using_transaction

logger = logging.getLogger(__name__)


class DatabaseAwareAtomicIfUsingTransaction(atomic_if_using_transaction):
    """Same as atomic_if_using_transaction, but opens the atomic block on a specific database alias."""

    def __init__(self, using_transactions, using):
        self.using_transactions = using_transactions
        if using_transactions:
            self.context_manager = transaction.atomic(using=using)


class DatabaseAwareResource(resources.ModelResource):
    """
    DatabaseAwareResource class is a subclass of resources.ModelResource of the django_import_export library
    that fixes a double import bug in it. The bug is caused by the resource class not being aware of multiple databases
    When rows are imported for preview, a transaction is created and rolled back. This is to identify all the potential
    errors in the data being imported. However, the transaction and its associated operations are always run
    against the default database instead of the specific db to which data is being imported

    This class reads the :using resource meta parameter to determine the database against which operations should be run

    TODO: Stop using this hack when this issue is fixes: https://github.com/django-import-export/django-import-export/issues/1137
    """

    def import_data(self, dataset, dry_run=False, raise_errors=False,
                    use_transactions=None, collect_failed_rows=False, **kwargs):

        if use_transactions is None:
            use_transactions = self.get_use_transactions()

        db_alias = self._meta.using
        if db_alias is None:
            db_alias = DEFAULT_DB_ALIAS

        connection = connections[db_alias]
        supports_transactions = getattr(connection.features, "supports_transactions", False)

        if use_transactions and not supports_transactions:
            raise ImproperlyConfigured

        using_transactions = (use_transactions or dry_run) and supports_transactions

        with DatabaseAwareAtomicIfUsingTransaction(using_transactions, db_alias):
            return self.import_data_inner(dataset, dry_run, raise_errors, using_transactions, collect_failed_rows, **kwargs)

    def import_data_inner(self, dataset, dry_run, raise_errors, using_transactions, collect_failed_rows, **kwargs):
        result = self.get_result_class()()
        result.diff_headers = self.get_diff_headers()
        result.total_rows = len(dataset)

        db_alias = self._meta.using
        if db_alias is None:
            db_alias = DEFAULT_DB_ALIAS

        if using_transactions:
            # when transactions are used we want to create/update/delete objects,
            # as the transaction will be rolled back if dry_run is set
            sp1 = savepoint(db_alias)

        try:
            with DatabaseAwareAtomicIfUsingTransaction(using_transactions, db_alias):
                self.before_import(dataset, using_transactions, dry_run, **kwargs)
        except Exception as e:
            logger.debug(e, exc_info=e)
            tb_info = traceback.format_exc()
            result.append_base_error(self.get_error_result_class()(e, tb_info))
            if raise_errors:
                raise

        instance_loader = self._meta.instance_loader_class(self, dataset)
        # Update the total in case the dataset was altered by before_import()
        result.total_rows = len(dataset)

        if collect_failed_rows:
            result.add_dataset_headers(dataset.headers)

        for i, row in enumerate(dataset.dict, 1):
            with DatabaseAwareAtomicIfUsingTransaction(using_transactions, db_alias):
                row_result = self.import_row(
                    row,
                    instance_loader,
                    using_transactions=using_transactions,
                    dry_run=dry_run,
                    **kwargs
                )
            result.increment_row_result_total(row_result)

            if row_result.errors:
                if collect_failed_rows:
                    result.append_failed_row(row, row_result.errors[0])
                if raise_errors:
                    raise row_result.errors[-1].error
            elif row_result.validation_error:
                result.append_invalid_row(i, row, row_result.validation_error)
                if collect_failed_rows:
                    result.append_failed_row(row, row_result.validation_error)
                if raise_errors:
                    raise row_result.validation_error
            if (row_result.import_type != RowResult.IMPORT_TYPE_SKIP or
                    self._meta.report_skipped):
                result.append_row_result(row_result)

        try:
            with DatabaseAwareAtomicIfUsingTransaction(using_transactions, db_alias):
                self.after_import(dataset, result, using_transactions, dry_run, **kwargs)
        except Exception as e:
            logger.debug(e, exc_info=e)
            tb_info = traceback.format_exc()
            result.append_base_error(self.get_error_result_class()(e, tb_info))
            if raise_errors:
                raise

        if using_transactions:
            if dry_run or result.has_errors():
                savepoint_rollback(sp1, db_alias)
            else:
                savepoint_commit(sp1, db_alias)

        return result
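
A minimal sketch of how this workaround might be wired up, assuming a second database alias is declared in settings.DATABASES (here 'reviews_db', a placeholder) and that the custom 'using' Meta attribute ends up on resource._meta the way the self._meta.using lookup above expects:

from django.contrib import admin
from import_export.admin import ImportExportModelAdmin


class ReviewsResource(DatabaseAwareResource):
    class Meta:
        model = Review
        using = 'reviews_db'  # placeholder alias from settings.DATABASES
        fields = ('id',)      # plus the rest of the fields you import


@admin.register(Review)
class ReviewAdmin(ImportExportModelAdmin):
    resource_class = ReviewsResource

With that in place, the admin preview's dry-run transaction is opened and rolled back against the same alias the rows are written to, so the preview no longer leaves committed rows behind.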

1 reaction
andrewgy8 commented, May 19, 2020

At the moment I would say no, unless this is a common use case that many people are running into. Otherwise, I think they can benefit from your snippet.

Read more comments on GitHub.

Top Results From Across the Web

  • Preventing duplicate row insertion in MySQL while importing a CSV file
  • Duplicate record while importing and inserting CSV file data into ...
  • 13.2.7.2 INSERT ... ON DUPLICATE KEY UPDATE Statement
  • Import - Duplicate row error (Teradata Support Community)
  • To insert several rows in one cell in an import CSV file
