apply omission of "local" columns when applying aliasing to join conditions around a secondary
Hello! My issue is about creating a many-to-many relationship based on a column that holds an array of ids (using PostgreSQL). I’ve seen https://github.com/sqlalchemy/sqlalchemy/issues/4472, and one of the last messages there shows an error that looks like mine, but I’m not sure it is the same problem, so here’s a new issue.
TL;DR:
When a many-to-many relationship is based on a column that holds an array of ids (using PostgreSQL) and I load it with joinedload, SQLAlchemy generates an invalid query (the raw SQL it generates is listed below). Yes, I know that storing ids in an array violates normal SQL design, but this is part of a huge project and I’m unable to change it; I still want to load the relations in one query.
I’ve also seen this question and used some of its answers as the base for my solution: https://stackoverflow.com/questions/9729381/sqlalchemy-relationships-with-postgresql-array
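To make the idea concrete, here is a minimal plain-Python sketch of the rows that unnesting the array column is meant to produce. It only illustrates the expected shape of the data and is not how PostgreSQL or SQLAlchemy implement it; the `unnest_pairs` helper and the sample rows are assumptions made for this example (the ids mirror the test data created in the script below).

```python
# Hypothetical sample rows mirroring the test data in this report:
# each author row stores the ids of its books in an array column.
authors = [
    {"id": 1, "name": "A1", "books": [1, 2]},
    {"id": 2, "name": "A2", "books": [2, 3]},
]


def unnest_pairs(author_rows):
    """Expand each array into (book_id, author_id) rows.

    Imitates: SELECT unnest(authors.books) AS book_id,
                     authors.id AS author_id
              FROM authors
    """
    return [
        (book_id, author["id"])
        for author in author_rows
        for book_id in author["books"]
    ]


pairs = unnest_pairs(authors)
print(pairs)
```

Each pair plays the role of one row in an ordinary association table, which is why the subquery can be passed as the secondary of a relationship.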
Tested using SQLAlchemy==1.3.10
The fully-working standalone code:
from sqlalchemy import Column, Integer, String, create_engine, select, func
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.dialects.postgresql import ARRAY
from sqlalchemy.orm import relationship, sessionmaker, scoped_session, joinedload, contains_eager

#### Creating connection
engine = create_engine('postgres://test:test@127.0.0.1:5432/test')
# Session = sessionmaker(autocommit=False, autoflush=False, bind=engine)
Session = scoped_session(
    sessionmaker(autocommit=False, autoflush=False, bind=engine)
)
Base = declarative_base()
Base.metadata.bind = engine
##

## Helping util
from sqlalchemy.dialects.postgresql.psycopg2 import PGDialect_psycopg2
psycopg_dialect = PGDialect_psycopg2()

def compile_query(q):
    compiled = q.statement.compile(dialect=psycopg_dialect)
    return str(compiled) % compiled.params
##
class Author(Base):
    __tablename__ = 'authors'

    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False, unique=True)
    books = Column(ARRAY(Integer), default=[], nullable=False)

    def __repr__(self):
        return f'Author(name={self.name!r}, books={self.books!r})'

class Book(Base):
    __tablename__ = 'books'

    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False, unique=True)

    def __repr__(self):
        return f'Book(name={self.name!r})'
### creating relationships
## idea from:
# https://stackoverflow.com/questions/9729381/sqlalchemy-relationships-with-postgresql-array
authors_books_selectable = select([
    func.unnest(Author.books).label('book_id'),
    Author.id.label('author_id'),
]).alias()

join_primary = Book.id == authors_books_selectable.c.book_id  # book_id is a label for unnested ids
join_secondary = authors_books_selectable.c.author_id == Author.id  # author_id label from selectable

Book._relationship_inverse_authors_ = relationship(
    Author,
    secondary=authors_books_selectable,
    primaryjoin=join_primary,
    secondaryjoin=join_secondary,
    viewonly=True,
)
Author._relationship_books_ = relationship(
    Book,
    secondary=authors_books_selectable,
    primaryjoin=join_primary,
    secondaryjoin=join_secondary,
    viewonly=True,
    # # Does not work with backref :(
    # backref='_relationship_inverse_authors_',
    #
    # backref=backref(
    #     '_relationship_inverse_authors_',
    #     uselist=True,
    #     viewonly=True,
    # )
)
#
def create_tables():
    print('[Re]Creating tables')
    # # uncomment if wanna clear tables
    # comment out for preserving the tables for each run
    # Base.metadata.drop_all()
    Base.metadata.create_all()

def create_entities():
    print('Creating entities')
    b1 = Book(name='First')
    b2 = Book(name='Second')
    b3 = Book(name='Third')
    bn = Book(name='Nth')
    Session.add(b1)
    Session.add(b2)
    Session.add(b3)
    Session.add(bn)
    Session.commit()
    a1 = Author(name='A1', books=[b1.id, b2.id])
    a2 = Author(name='A2', books=[b2.id, b3.id])
    Session.add(a1)
    Session.add(a2)
    Session.commit()

def print_all_of_model(model):
    print()
    print('Querying', model.__name__, 'from', model.__tablename__)
    items = Session.query(model).all()
    print('The result is:')
    print(items)
def show_whats_inside():
    """
    Outputs
    Querying Author from authors
    The result is:
    [Author(name='A1', books=[1, 2]), Author(name='A2', books=[2, 3])]
    Querying Book from books
    The result is:
    [Book(name='First'), Book(name='Second'), Book(name='Third'), Book(name='Nth')]
    :return:
    """
    for model in (Author, Book):
        print_all_of_model(model)

def show_query_for_one_author(q):
    print('\nCompiled query:')
    print(compile_query(q))
    a = q.one()
    print('.\nauthor:', a)
    print('his books:', a._relationship_books_)

def just_show_one_author():
    """
    Outputs
    Querying author:
    Compiled query:
    SELECT authors.id, authors.name, authors.books
    FROM authors
    WHERE authors.id = 2
    .
    author: Author(name='A2', books=[2, 3])
    his books: [Book(name='First'), Book(name='Second'), Book(name='Third')]
    :return:
    """
    print('Querying author:')
    q = Session.query(Author).filter(Author.id == 2)
    show_query_for_one_author(q)
def get_author_and_books_using_joinedload():
    """
    This one fails, output is:
    getting one author and his books using joined load
    Compiled query:
    SELECT authors.id, authors.name, authors.books, books_1.id, books_1.name
    FROM authors LEFT OUTER JOIN ((SELECT unnest(authors.books) AS book_id, authors.id AS author_id
    FROM authors) AS anon_1 JOIN books AS books_1 ON anon_1.author_id = anon_1.author_id) ON books.id = anon_1.book_id
    WHERE authors.id = 2
    failed: (psycopg2.ProgrammingError) invalid reference to FROM-clause entry for table "books"
    LINE 3: ...ooks_1 ON anon_1.author_id = anon_1.author_id) ON books.id =...
    ^
    HINT: Perhaps you meant to reference the table alias "books_1".
    [SQL: SELECT authors.id AS authors_id, authors.name AS authors_name, authors.books AS authors_books, books_1.id AS books_1_id, books_1.name AS books_1_name
    FROM authors LEFT OUTER JOIN ((SELECT unnest(authors.books) AS book_id, authors.id AS author_id
    FROM authors) AS anon_1 JOIN books AS books_1 ON anon_1.author_id = anon_1.author_id) ON books.id = anon_1.book_id
    WHERE authors.id = %(id_1)s]
    [parameters: {'id_1': 2}]
    (Background on this error at: http://sqlalche.me/e/f405)
    :return:
    """
    print('-----')
    print('getting one author and his books using joined load')
    q = Session.query(Author).filter(Author.id == 2)
    q = q.options(joinedload(Author._relationship_books_))
    show_query_for_one_author(q)
def get_author_and_books_using_contains_eager():
    """
    So. this is my solution. Output:
    getting one author and his books using contains_eager
    Compiled query:
    SELECT authors.id, authors.name, authors.books, books.id, books.name
    FROM books JOIN (SELECT unnest(authors.books) AS book_id, authors.id AS author_id
    FROM authors) AS anon_1 ON books.id = anon_1.book_id JOIN authors ON anon_1.author_id = authors.id
    WHERE authors.id = 2
    .
    author: Author(name='A2', books=[2, 3])
    his books: [Book(name='Second'), Book(name='Third')]
    :return:
    """
    print('---')
    print('getting one author and his books using contains_eager')
    q = (
        Session.query(Author).filter(Author.id == 2)
        .join(Book._relationship_inverse_authors_)
        .options(
            contains_eager(Author._relationship_books_)
        )
    )
    show_query_for_one_author(q)
def main():
    # # We need these two only for the first run
    create_tables()
    create_entities()
    show_whats_inside()
    just_show_one_author()
    try:
        get_author_and_books_using_joinedload()
    except Exception as e:
        Session.rollback()
        print('failed:', e)
        # outputs:
        """
        failed: (psycopg2.ProgrammingError) invalid reference to FROM-clause entry for table "books"
        LINE 3: ...ooks_1 ON anon_1.author_id = anon_1.author_id) ON books.id =...
        ^
        HINT: Perhaps you meant to reference the table alias "books_1".
        [SQL: SELECT authors.id AS authors_id, authors.name AS authors_name, authors.books AS authors_books, books_1.id AS books_1_id, books_1.name AS books_1_name
        FROM authors LEFT OUTER JOIN ((SELECT unnest(authors.books) AS book_id, authors.id AS author_id
        FROM authors) AS anon_1 JOIN books AS books_1 ON anon_1.author_id = anon_1.author_id) ON books.id = anon_1.book_id
        WHERE authors.id = %(id_1)s]
        [parameters: {'id_1': 2}]
        (Background on this error at: http://sqlalche.me/e/f405)
        """
    # This one works!
    get_author_and_books_using_contains_eager()

if __name__ == '__main__':
    main()
So! The wrong compiled SQL (created using joinedload) is:
q = Session.query(Author).filter(Author.id == 2)
q = q.options(joinedload(Author._relationship_books_))
SELECT authors.id, authors.name, authors.books, books_1.id, books_1.name
FROM authors
LEFT OUTER JOIN ((SELECT unnest(authors.books) AS book_id, authors.id AS author_id
FROM authors) AS anon_1 JOIN books AS books_1 ON anon_1.author_id = anon_1.author_id)
ON books.id = anon_1.book_id
WHERE authors.id = 2
As you can see, in the ON clause it uses books instead of the alias books_1: ON books.id.
It also uses LEFT OUTER JOIN, which would return all of the books.
Also, there’s a useless ON clause: ON anon_1.author_id = anon_1.author_id
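The self-comparing ON clause is more than cosmetic, and that can be checked on any SQL engine. The following is a minimal sketch using Python’s built-in sqlite3 module rather than PostgreSQL; the `pairs` table is a hypothetical, hand-built stand-in for the unnest subquery (anon_1), filled with the rows the test data would yield.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    -- Hand-built stand-in for the unnest() subquery (anon_1 in the report).
    CREATE TABLE pairs (book_id INTEGER, author_id INTEGER);
    INSERT INTO authors VALUES (1, 'A1'), (2, 'A2');
    INSERT INTO pairs VALUES (1, 1), (2, 1), (2, 2), (3, 2);
""")

# The condition compares a column with itself, so it is true for every
# non-NULL row and author 2 is paired with every row of pairs.
self_compare = conn.execute(
    "SELECT pairs.book_id FROM authors "
    "JOIN pairs ON pairs.author_id = pairs.author_id "
    "WHERE authors.id = 2 ORDER BY pairs.rowid"
).fetchall()

# The condition links the two tables, so only author 2's rows remain.
linked = conn.execute(
    "SELECT pairs.book_id FROM authors "
    "JOIN pairs ON pairs.author_id = authors.id "
    "WHERE authors.id = 2 ORDER BY pairs.rowid"
).fetchall()

print(self_compare)
print(linked)
```

With the self-comparison, author 2 picks up book ids that belong to author 1 as well, so the clause silently widens the result instead of restricting it.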
And the compiled SQL query for the working solution (created using contains_eager) is:
q = (
    Session.query(Author).filter(Author.id == 2)
    .join(Book._relationship_inverse_authors_)
    .options(
        contains_eager(Author._relationship_books_)
    )
)
SELECT authors.id, authors.name, authors.books, books.id, books.name
FROM books
JOIN (SELECT unnest(authors.books) AS book_id, authors.id AS author_id
FROM authors) AS anon_1 ON books.id = anon_1.book_id
JOIN authors ON anon_1.author_id = authors.id
WHERE authors.id = 2
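The shape of this working query can be tried without a PostgreSQL server. The sketch below is an approximation on SQLite via Python’s built-in sqlite3 module: SQLite has no unnest(), so the hypothetical `unnested` table holds the rows the subquery would return for the test data. It checks only the join logic, not SQLAlchemy’s behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, name TEXT);
    -- SQLite has no unnest(); this table holds the rows the subquery would yield.
    CREATE TABLE unnested (book_id INTEGER, author_id INTEGER);
    INSERT INTO authors VALUES (1, 'A1'), (2, 'A2');
    INSERT INTO books VALUES (1, 'First'), (2, 'Second'), (3, 'Third'), (4, 'Nth');
    INSERT INTO unnested VALUES (1, 1), (2, 1), (2, 2), (3, 2);
""")

# Same join order as the contains_eager query: books -> anon_1 -> authors.
rows = conn.execute("""
    SELECT authors.name, books.name
    FROM books
    JOIN (SELECT book_id, author_id FROM unnested) AS anon_1
        ON books.id = anon_1.book_id
    JOIN authors ON anon_1.author_id = authors.id
    WHERE authors.id = 2
    ORDER BY books.id
""").fetchall()

print(rows)
```

Both ON clauses here tie anon_1 to a different table, which is what keeps the result limited to the books listed in author 2’s array.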
So, is it OK to use it like this? Am I missing something? Maybe I’m using joinedload incorrectly? Do I have to add more aliases manually? Is this an actual issue, or just my mistake?
Thanks in advance!
you can use relationship to secondary but the issue with “anon_1.id == anon_1.id” is fixed in the above gerrit and will be in 1.3.11.
Mike Bayer has proposed a fix for this issue in the master branch:
Exclude local columns when adapting secondary in a join condition https://gerrit.sqlalchemy.org/1571