"KeyError: None" when using JOIN and aggregate_rows in 2.10.2
There is a case where using joins together with aggregate_rows causes an uncaught exception. It appears to be related to joining tables through a junction table: there are two tables, A and C, and a third table, B, that links them. B has two rows, both pointing to the same A. One of the B rows points to a C object and the other has a null C, like this:
    B - C
   /
  A
   \
    B - NULL
When attempting to join and aggregate the rows, we see the following stack trace:
Error
Traceback (most recent call last):
  File "/Users/ericely/workspace/test/test_peewee.py", line 48, in test_peewee
    print len(records)
  File "/Users/ericely/workspace/env/lib/python2.7/site-packages/peewee.py", line 3298, in __len__
    return len(self.execute())
  File "/Users/ericely/workspace/env/lib/python2.7/site-packages/peewee.py", line 2334, in __len__
    return self.count
  File "/Users/ericely/workspace/env/lib/python2.7/site-packages/peewee.py", line 2330, in count
    self.fill_cache()
  File "/Users/ericely/workspace/env/lib/python2.7/site-packages/peewee.py", line 2377, in fill_cache
    next(self)
  File "/Users/ericely/workspace/env/lib/python2.7/site-packages/peewee.py", line 2363, in next
    obj = self.iterate()
  File "/Users/ericely/workspace/env/lib/python2.7/site-packages/peewee.py", line 2761, in iterate
    instance._data[metadata.foreign_key.name]]
KeyError: None
The code to generate this is:
import unittest
from peewee import *

db = SqliteDatabase(':memory:')

# create a base model class that our application's models will extend
class BaseModel(Model):
    class Meta:
        database = db

class A(BaseModel):
    val = FloatField(default=3.14)

class C(BaseModel):
    val = IntegerField(default=42)

class B(BaseModel):
    a = ForeignKeyField(A, null=True, default=None)
    c = ForeignKeyField(C, null=True, default=None)

class Osha_Violation_Model_Tests(unittest.TestCase):
    def test_peewee(self):
        A.create_table()
        B.create_table()
        C.create_table()

        # save the first record chain
        A().save(force_insert=True)
        a = A.get(A.id == 1)
        C().save(force_insert=True)
        c = C.get(C.id == 1)
        B(a=a, c=c).save(force_insert=True)

        # save the second record chain, starting with the same A, without a C link
        a = A.get(A.id == 1)
        B(a=a).save(force_insert=True)

        records = A\
            .select(A, B, C)\
            .join(B, JOIN_LEFT_OUTER)\
            .join(C, JOIN_LEFT_OUTER)\
            .aggregate_rows()
        print len(records)
This is peewee 2.10.2. Although 2.10.2 isn’t the latest release, we are still using it because 3.X introduced backward-incompatible changes. I know aggregate_rows was removed in 3.X and prefetch is now recommended, but we still have a relatively large codebase on 2.10.2.
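Reading the last frame of the traceback, the failing expression uses instance._data[metadata.foreign_key.name] as a key into some mapping, and for the B row without a C that value is presumably None. The following is a minimal plain-Python sketch of that failure mode and of a guard against it. It is not peewee's actual code: identity_map, b_rows, and the two resolve_* helpers are hypothetical names made up for illustration.

```python
# Hypothetical stand-in for the lookup in the last traceback frame.
# None of these names are peewee internals.
identity_map = {1: "C instance with id 1"}  # joined C rows keyed by primary key

b_rows = [
    {"id": 1, "c": 1},     # B row linked to a C
    {"id": 2, "c": None},  # B row whose C is NULL after the LEFT OUTER JOIN
]


def resolve_unguarded(row):
    # Mirrors the failing pattern: index directly by the foreign-key value.
    return identity_map[row["c"]]


def resolve_guarded(row):
    # Skip the lookup entirely when the foreign key is NULL.
    key = row["c"]
    return identity_map[key] if key is not None else None


resolved = [resolve_guarded(row) for row in b_rows]

try:
    resolve_unguarded(b_rows[1])
    error = None
except KeyError as exc:
    error = repr(exc)  # the same kind of error as in the traceback
```

The guarded version simply treats a NULL foreign key as "no related object", which is the usual meaning of an unmatched row in a LEFT OUTER JOIN.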
Issue Analytics
- Created 5 years ago
- Comments: 13 (7 by maintainers)
Top GitHub Comments
For anyone who stumbles upon this, here is how we ultimately solved the issue:
This allows us to order A by columns from the joined tables while deduplicating by grouping on A’s primary key. We then prefetch the other values through subqueries, so there are no duplicate A objects and all other fields are prefetched.
Performance can still be a concern, since you are now joining multiple tables, grouping, and then running subqueries on top of that, so your mileage may vary depending on your situation.
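The commenter's own code is not preserved in this copy of the thread, so the following is only a sketch of the pattern described above, written against the standard-library sqlite3 module rather than peewee. The table and column names (a, b, c, a_id, c_id) and the ordering column are assumptions chosen to mirror the models in the issue.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, val REAL);
    CREATE TABLE c (id INTEGER PRIMARY KEY, val INTEGER);
    CREATE TABLE b (id INTEGER PRIMARY KEY,
                    a_id INTEGER REFERENCES a(id),
                    c_id INTEGER REFERENCES c(id));
    INSERT INTO a (id, val) VALUES (1, 3.14);
    INSERT INTO c (id, val) VALUES (1, 42);
    INSERT INTO b (id, a_id, c_id) VALUES (1, 1, 1), (2, 1, NULL);
""")

# Step 1: join so A can be ordered by a joined column, but group on A's
# primary key so each A comes back exactly once.
a_rows = conn.execute("""
    SELECT a.id, a.val
    FROM a
    LEFT OUTER JOIN b ON b.a_id = a.id
    LEFT OUTER JOIN c ON c.id = b.c_id
    GROUP BY a.id
    ORDER BY MAX(c.val) DESC, a.id
""").fetchall()

# Step 2: fetch the related rows separately, restricted by a subquery on A.
b_rows = conn.execute("""
    SELECT b.id, b.a_id, b.c_id, c.val
    FROM b
    LEFT OUTER JOIN c ON c.id = b.c_id
    WHERE b.a_id IN (SELECT a.id FROM a)
    ORDER BY b.id
""").fetchall()

# Step 3: attach the related rows to their parent A in Python.
children = defaultdict(list)
for b_id, a_id, c_id, c_val in b_rows:
    c = None if c_id is None else {"id": c_id, "val": c_val}
    children[a_id].append({"id": b_id, "c": c})

records = [
    {"id": a_id, "val": val, "b_set": children.get(a_id, [])}
    for a_id, val in a_rows
]
```

Because the related rows are loaded in a second query, a B row with a NULL C is just a row whose c is None, and the single A is never duplicated.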
Exactly. All aggregate_rows() did was to roll up the duplicates and accumulate joined rows as related instances. That’s why I suggested using something like itertools.groupby.
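As a rough illustration of the itertools.groupby suggestion, the sketch below rolls flat joined rows up into one record per A. The row layout (a_id, a_val, b_id, c_id, c_val), the roll_up helper, and the dict-based records are assumptions for the example and do not reproduce what aggregate_rows() returned.

```python
from itertools import groupby
from operator import itemgetter

# Flat rows as a LEFT OUTER JOIN of A -> B -> C might return them.
# Assumed layout: (a_id, a_val, b_id, c_id, c_val).
# groupby only merges adjacent rows, so the input must be sorted by a_id.
joined_rows = [
    (1, 3.14, 1, 1, 42),
    (1, 3.14, 2, None, None),
]


def roll_up(rows):
    """Collapse duplicate A rows and collect their B (and C) rows."""
    records = []
    for (a_id, a_val), group in groupby(rows, key=itemgetter(0, 1)):
        b_list = []
        for _, _, b_id, c_id, c_val in group:
            if b_id is None:
                continue  # an A with no B rows at all
            # A NULL C becomes None instead of being looked up anywhere.
            c = None if c_id is None else {"id": c_id, "val": c_val}
            b_list.append({"id": b_id, "c": c})
        records.append({"id": a_id, "val": a_val, "b_set": b_list})
    return records


records = roll_up(joined_rows)
```

With the data from the issue this yields a single record for A whose b_set holds both B rows, the second with c set to None.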