Creating tables with SQLAlchemy models supported?

The first issue I encountered:

sqlalchemy.exc.CompileError: You need to specify the storage location for the table using the awsathena_location dialect keyword argument

I was able to address this with something like:

Base.metadata.tables['my_table'].dialect_options['awsathena']['location'] = location

There is probably a better solution, but this is all I could come up with so far.

Once I fixed this, I encountered this:

sqlalchemy.exc.DatabaseError: (pyathena.error.DatabaseError) An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: line 1:8: mismatched input 'EXTERNAL'. Expecting: 'OR', 'SCHEMA', 'TABLE', 'VIEW'

I debugged this in the Athena console and determined that the generated SQL contains a primary key constraint, which Athena does not support. Removing the primary key from the SQLAlchemy model gives:

sqlalchemy.exc.ArgumentError: Mapper mapped class Institution->institution could not assemble any primary key columns for mapped table 'my_table'

Am I in unsupported territory or am I doing something wrong?
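
One way to reconcile the two errors above is to leave the primary key off the table definition and declare it only to the ORM mapper. The sketch below is an illustration, not the fix adopted in the original issue: it uses SQLAlchemy's `primary_key` mapper argument and renders the DDL with SQLAlchemy's default dialect, so it shows only that no PRIMARY KEY clause is generated. Whether the PyAthena dialect then produces DDL that Athena accepts is not confirmed here, and the `awsathena_location` option is omitted so the example runs without PyAthena installed.

```python
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import registry
from sqlalchemy.schema import CreateTable

mapper_registry = registry()
Base = mapper_registry.generate_base()


class MyTable(Base):
    __tablename__ = "mytable"

    # No primary_key=True here, so the Table carries no PRIMARY KEY constraint.
    pk = Column("pk", Integer)
    name = Column("name", String)

    # Tell only the ORM mapper which column identifies a row.
    __mapper_args__ = {"primary_key": [pk]}


# Render the CREATE TABLE statement with the default dialect (assumption:
# the Athena dialect is not involved in this rendering).
ddl = str(CreateTable(MyTable.__table__))
print(ddl)
```

The mapper still has an identity column for loading and tracking objects, while the emitted CREATE TABLE statement lists only the two columns.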

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 23 (11 by maintainers)

Top GitHub Comments

terrylimnsv commented, Mar 1, 2022

OK, great! The next step on my side is to try our full application with all our tables and data types.

terrylimnsv commented, Feb 16, 2022

Without the executed code and the stack trace, I can’t examine anything. Please share the code you executed and the stack trace that resulted in the error first.

I created a small test program to reproduce this:

import os
import urllib.parse  # import the submodule explicitly; quote_plus lives here
import sqlalchemy
from sqlalchemy.orm import registry
from sqlalchemy import Column, Integer, String

BUCKET = 's3://terry-athena-bucket/'

mapper_registry = registry()
Base = mapper_registry.generate_base()

class MyTable(Base):
    __tablename__ = "mytable"
    __table_args__ = { 'awsathena_location': BUCKET + "mytable" }
    
    pk = Column('pk', Integer, primary_key=True)
    name = Column('name', String)
    
    
conn_str =  "awsathena+rest://{aws_access_key_id}:{aws_secret_access_key}@" \
    "athena.{region_name}.amazonaws.com:443/" \
    "{schema_name}?s3_staging_dir={s3_staging_dir}&work_group=primary&aws_session_token={sess_token}"

frmt_string = conn_str.format(
    aws_access_key_id=urllib.parse.quote_plus(os.environ['AWS_ACCESS_KEY_ID']),
    aws_secret_access_key=urllib.parse.quote_plus(os.environ['AWS_SECRET_ACCESS_KEY']),
    region_name='us-east-1',
    schema_name='terry_db',
    s3_staging_dir=urllib.parse.quote_plus(BUCKET),
    sess_token=urllib.parse.quote_plus(os.environ['AWS_SESSION_TOKEN'])
)

athena_engine = sqlalchemy.create_engine(frmt_string, echo=False)

Base.metadata.tables['mytable'].create(athena_engine)
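
The test program above passes the credentials and the staging directory through `urllib.parse.quote_plus` before placing them in the connection URL. A minimal sketch of why that matters, using made-up placeholder values (the `secret` string below is hypothetical, not a real key): characters such as `/`, `+`, `=` and `:` would otherwise be read as URL syntax.

```python
from urllib.parse import quote_plus

# Hypothetical placeholder values for illustration only.
secret = "abc/def+ghi="
staging_dir = "s3://terry-athena-bucket/"

# Percent-encode every character that has a special meaning in a URL.
encoded_secret = quote_plus(secret)
encoded_staging_dir = quote_plus(staging_dir)

print(encoded_secret)       # abc%2Fdef%2Bghi%3D
print(encoded_staging_dir)  # s3%3A%2F%2Fterry-athena-bucket%2F
```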

Here is the trace:

Failed to execute query.
Traceback (most recent call last):
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/pyathena/common.py", line 417, in _execute
    **request
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/pyathena/util.py", line 84, in retry_api_call
    return retry(func, *args, **kwargs)
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/tenacity/__init__.py", line 404, in __call__
    do = self.iter(retry_state=retry_state)
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/tenacity/__init__.py", line 349, in iter
    return fut.result()
  File "/usr/lib64/python3.7/concurrent/futures/_base.py", line 428, in result
    return self.__get_result()
  File "/usr/lib64/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/tenacity/__init__.py", line 407, in __call__
    result = fn(*args, **kwargs)
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/botocore/client.py", line 391, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/botocore/client.py", line 719, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.InvalidRequestException: An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: line 1:8: mismatched input 'EXTERNAL'. Expecting: 'OR', 'SCHEMA', 'TABLE', 'VIEW'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1803, in _execute_context
    cursor, statement, parameters, context
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/pyathena/util.py", line 37, in _wrapper
    return wrapped(*args, **kwargs)
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/pyathena/cursor.py", line 96, in execute
    cache_expiration_time=cache_expiration_time,
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/pyathena/common.py", line 421, in _execute
    raise DatabaseError(*e.args) from e
pyathena.error.DatabaseError: An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: line 1:8: mismatched input 'EXTERNAL'. Expecting: 'OR', 'SCHEMA', 'TABLE', 'VIEW'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "pyathena_test.py", line 35, in <module>
    Base.metadata.tables['mytable'].create(athena_engine)
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/sqlalchemy/sql/schema.py", line 950, in create
    bind._run_ddl_visitor(ddl.SchemaGenerator, self, checkfirst=checkfirst)
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 3117, in _run_ddl_visitor
    conn._run_ddl_visitor(visitorcallable, element, **kwargs)
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 2113, in _run_ddl_visitor
    visitorcallable(self.dialect, self, **kwargs).traverse_single(element)
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 524, in traverse_single
    return meth(obj, **kw)
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/sqlalchemy/sql/ddl.py", line 898, in visit_table
    include_foreign_key_constraints,  # noqa
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1289, in execute
    return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS)
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/sqlalchemy/sql/ddl.py", line 81, in _execute_on_connection
    self, multiparams, params, execution_options
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1387, in _execute_ddl
    compiled,
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1846, in _execute_context
    e, statement, parameters, cursor, context
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 2027, in _handle_dbapi_exception
    sqlalchemy_exception, with_traceback=exc_info[2], from_=e
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1803, in _execute_context
    cursor, statement, parameters, context
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/pyathena/util.py", line 37, in _wrapper
    return wrapped(*args, **kwargs)
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/pyathena/cursor.py", line 96, in execute
    cache_expiration_time=cache_expiration_time,
  File "/home/ec2-user/environment/pyathena_test/virtenv/lib/python3.7/site-packages/pyathena/common.py", line 421, in _execute
    raise DatabaseError(*e.args) from e
sqlalchemy.exc.DatabaseError: (pyathena.error.DatabaseError) An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: line 1:8: mismatched input 'EXTERNAL'. Expecting: 'OR', 'SCHEMA', 'TABLE', 'VIEW'
[SQL: 
CREATE EXTERNAL TABLE mytable (
        pk INTEGER NOT NULL, 
        name VARCHAR, 
        PRIMARY KEY (pk)
)
STORED AS PARQUET
LOCATION 's3://terry-athena-bucket/mytable/'


]
(Background on this error at: https://sqlalche.me/e/14/4xp6)
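
The SQL echoed at the end of the trace contains a PRIMARY KEY clause, which the original poster identified as the part Athena rejects. A small, hypothetical helper like the one below could flag such clauses in generated DDL before it is sent to Athena. The function name and the pattern table are assumptions made for this sketch; only the PRIMARY KEY entry comes from the thread.

```python
import re
from typing import List

# Clauses to flag. Only PRIMARY KEY is reported in the thread; extend the
# table if other rejected clauses turn up (assumption: a simple text match
# is good enough for a quick pre-flight check).
FLAGGED_CLAUSES = {
    "PRIMARY KEY": re.compile(r"\bPRIMARY\s+KEY\b", re.IGNORECASE),
}


def find_flagged_clauses(ddl: str) -> List[str]:
    """Return the names of flagged clauses that appear in the DDL text."""
    return [name for name, pattern in FLAGGED_CLAUSES.items() if pattern.search(ddl)]


failing_ddl = """
CREATE EXTERNAL TABLE mytable (
        pk INTEGER NOT NULL,
        name VARCHAR,
        PRIMARY KEY (pk)
)
STORED AS PARQUET
LOCATION 's3://terry-athena-bucket/mytable/'
"""

print(find_flagged_clauses(failing_ddl))
```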