Efficient lookup in many-many relationship

I have the following schema, where the relationship between Executable and Symbol is many-to-many.

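    from datetime import datetime
    from pony.orm import Database, Required, Optional, Set

    db = Database()

    class File(db.Entity):
        loc = Required(str, unique=True)
        tim = Optional(datetime)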

    class Executable(File):
        sym = Set("Symbol")

    class Symbol(db.Entity):
        sig = Required(str, 5000, encoding='utf-8')
        exe = Set(Executable)

Pony creates an intermediate table called Executable_Symbol to store this relationship, but there seems to be no way to check through the ORM whether a particular link exists, short of dropping down to raw SQL, i.e.

    eid, sid = exe.id, sym.id
    db.select('* from Executable_Symbol where executable=$eid and symbol=$sid')

The best ORM-level approach I have found is this: given a Symbol called sym and an Executable called exe, I can use the expression:

    exe in sym.exe

But this seems to be very slow. In comparison, accessing the Executable_Symbol table using raw SQL is much faster, but dropping to raw SQL is not very desirable. My application would check this a few hundred thousand times, so every bit of efficiency would be useful.

Is there a better way to do this?

thanks!

Issue Analytics

  • State: closed
  • Created 10 years ago
  • Comments: 13 (7 by maintainers)

Top GitHub Comments

kozlovsky commented, Jul 6, 2016

Hi @hfaran, if a Set attribute does not have the lazy=True option, Pony assumes that the collection is probably not too big and that it is more efficient to load all items together as soon as any item is requested. Pony operates this way in order to reduce the total number of queries. For example, if you evaluate something like course1 in student1.courses and the courses collection is not lazy, then all items will be loaded. After that, course2 in student1.courses will not trigger a new query and will return the answer immediately.
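To make the trade-off between fewer queries and fewer rows concrete, here is a minimal sketch that uses only the standard-library sqlite3 module rather than Pony itself. It does not reproduce Pony's internals; the table layout mirrors the Executable_Symbol table from the question, and the ids and the helper names load_all and is_linked are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Executable_Symbol ("
    " executable INTEGER NOT NULL,"
    " symbol INTEGER NOT NULL,"
    " PRIMARY KEY (executable, symbol))"
)
# Hypothetical data: symbol 1 is linked to executables 10, 11 and 12.
conn.executemany(
    "INSERT INTO Executable_Symbol (executable, symbol) VALUES (?, ?)",
    [(10, 1), (11, 1), (12, 1)],
)


def load_all(symbol_id):
    """Eager strategy: one query fetches every linked executable id."""
    rows = conn.execute(
        "SELECT executable FROM Executable_Symbol WHERE symbol = ?",
        (symbol_id,),
    )
    return {row[0] for row in rows}


def is_linked(executable_id, symbol_id):
    """Per-check strategy: one query per test, at most one row fetched."""
    row = conn.execute(
        "SELECT 1 FROM Executable_Symbol"
        " WHERE executable = ? AND symbol = ? LIMIT 1",
        (executable_id, symbol_id),
    ).fetchone()
    return row is not None


cached = load_all(1)          # one query, three rows
eager_hit = 10 in cached      # answered from memory, no query
eager_miss = 99 in cached     # answered from memory, no query
lazy_hit = is_linked(10, 1)   # one query, one row
lazy_miss = is_linked(99, 1)  # one query, no rows
```

The eager variant pays once for the whole collection and then answers every later check from memory; the per-check variant never loads more than one row but issues a query each time. Which one is cheaper depends on the collection size and on how many checks hit the same collection.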

If the lazy=True option is specified, Pony loads only the items that are necessary to perform the operation. This way Pony minimizes the number of rows loaded instead of the number of queries. With lazy=True, course1 in student1.courses will load a single row only. Some operations will still load the full collection content, for example looping over all items of the collection:

    for c in student1.courses:
        print(c.name)

This loads all items in a single query, but you can restrict the number of loaded items by filtering:

    for c in student1.courses.filter(lambda c: c.credits > 4):
        print(c.name)

or

    for c in student1.courses.filter(lambda c: c.credits > 4) \
                             .order_by(lambda c: c.name).page(1, pagesize=10):
        print(c.name)

This way only the filtered items will be loaded. Also note that len(student1.courses) loads all items, while student1.courses.count() just calculates COUNT(*) in SQL.

hfaran commented, Jul 5, 2016

Hi @kozlovsky, I know this issue is now over 3 years old so I just wanted to clarify current Pony behaviour.

Let’s say I have a Set which I expect may grow to several million entries. If I do not create the Set attribute with lazy=True, will Pony attempt to load all of those several million entities into the cache?
