bug: RecursionError raised when attempting to create a duckdb table from a pandas-backed exp

Reading the documentation on Creating and Inserting Data, I was under the impression that it would be possible to create a new table from an arbitrary ibis expression, even if it did not stem from the same connection/backend.

Hence I wanted to use pandas to create an in-memory table and then store that into a duckdb database:

import ibis
import pandas as pd

df = pd.DataFrame(
    {
        "g": ["a", "a", "a", "a", "a"],
        "x": [0, 1, 2, 3, 4],
        "y": [3, 2, 0, 1, 1],
    }
)
t_pandas = ibis.pandas.connect({"t": df}).table("t")
duckdb_conn = ibis.duckdb.connect()
duckdb_conn.create_table("t", t_pandas)

However, this results in a RecursionError: maximum recursion depth exceeded:

Traceback (most recent call last):
  File "/Users/ogrisel/tmp/ibis_pandas_window.py", line 16, in <module>
    duckdb_conn.create_table("t", t_pandas)
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/alchemy/__init__.py", line 215, in create_table
    method = self._get_insert_method(expr)
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/alchemy/__init__.py", line 219, in _get_insert_method
    compiled = self.compile(expr)
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/__init__.py", line 344, in compile
    return self.compiler.to_ast_ensure_limit(expr, limit, params=params).compile()
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/compiler/base.py", line 39, in compile
    compiled_queries = [q.compile() for q in self.queries]
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/compiler/base.py", line 39, in <listcomp>
    compiled_queries = [q.compile() for q in self.queries]
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/alchemy/query_builder.py", line 176, in compile
    frag = self._compile_table_set()
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/alchemy/query_builder.py", line 203, in _compile_table_set
    result = helper.get_result()
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/alchemy/query_builder.py", line 40, in get_result
    self.join_tables.append(self._format_table(op))
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/alchemy/query_builder.py", line 134, in _format_table
    result = ctx.get_compiled_expr(op)
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/compiler/translator.py", line 73, in get_compiled_expr
    result = self._compile_subquery(node)
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/compiler/translator.py", line 34, in _compile_subquery
    return self._to_sql(op, sub_ctx)
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/compiler/translator.py", line 37, in _to_sql
    return self.compiler.to_sql(expr, ctx)
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/alchemy/query_builder.py", line 408, in to_sql
    return query.compile()
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/alchemy/query_builder.py", line 176, in compile
    frag = self._compile_table_set()
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/alchemy/query_builder.py", line 203, in _compile_table_set
    result = helper.get_result()
[...]
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/alchemy/query_builder.py", line 203, in _compile_table_set
    result = helper.get_result()
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/alchemy/query_builder.py", line 40, in get_result
    self.join_tables.append(self._format_table(op))
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/alchemy/query_builder.py", line 134, in _format_table
    result = ctx.get_compiled_expr(op)
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/compiler/translator.py", line 73, in get_compiled_expr
    result = self._compile_subquery(node)
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/compiler/translator.py", line 34, in _compile_subquery
    return self._to_sql(op, sub_ctx)
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/compiler/translator.py", line 37, in _to_sql
    return self.compiler.to_sql(expr, ctx)
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/alchemy/query_builder.py", line 405, in to_sql
    query = cls.to_ast(expr, context).queries[0]
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/compiler/query_builder.py", line 527, in to_ast
    query = cls.select_builder_class().to_select(
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/compiler/select_builder.py", line 144, in to_select
    select_query = self._build_result_query()
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/compiler/select_builder.py", line 195, in _build_result_query
    self._populate_context()
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/compiler/select_builder.py", line 218, in _populate_context
    self._make_table_aliases(self.table_set)
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/compiler/select_builder.py", line 244, in _make_table_aliases
    elif not ctx.is_extracted(node):
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/compiler/translator.py", line 130, in is_extracted
    return node in self.top_context.extracted_subexprs
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/compiler/translator.py", line 59, in top_context
    return self.parent.top_context
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/compiler/translator.py", line 59, in top_context
    return self.parent.top_context
  File "/Users/ogrisel/code/ibis/ibis/backends/base/sql/compiler/translator.py", line 59, in top_context
    return self.parent.top_context
  [Previous line repeated 105 more times]
RecursionError: maximum recursion depth exceeded

Ideally, I would have liked ibis either to import the in-memory data into the duckdb table or, alternatively, to state explicitly that creating a table from an expr backed by another backend is not possible, instead of raising a low-level RecursionError.
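
To illustrate the kind of explicit error asked for above, here is a minimal, self-contained sketch of a backend-mismatch guard. It is not ibis code: `Backend`, `TableExpr`, `BackendMismatchError` and this `create_table` function are hypothetical names invented for the illustration, and the real ibis internals may look quite different.

```python
from dataclasses import dataclass


class BackendMismatchError(TypeError):
    """Hypothetical error: the expression belongs to a different backend."""


@dataclass(frozen=True)
class Backend:
    # Hypothetical stand-in for a backend connection.
    name: str


@dataclass(frozen=True)
class TableExpr:
    # Hypothetical stand-in for a table expression bound to one backend.
    table_name: str
    backend: Backend


def create_table(target: Backend, name: str, expr: TableExpr) -> str:
    """Return a CREATE TABLE statement, or fail early with a readable error."""
    if expr.backend != target:
        # Check up front, before any compilation starts, so the user
        # sees the actual problem instead of a deep recursion failure.
        raise BackendMismatchError(
            f"cannot create table {name!r} on backend {target.name!r} from an "
            f"expression bound to backend {expr.backend.name!r}"
        )
    return f"CREATE TABLE {name} AS SELECT * FROM {expr.table_name}"


duckdb = Backend("duckdb")
pandas = Backend("pandas")

try:
    create_table(duckdb, "t", TableExpr("t", pandas))
except BackendMismatchError as exc:
    message = str(exc)
    print(message)
```

The point of the sketch is only that the check happens before compilation, so the failure names both backends rather than surfacing as a RecursionError from deep inside the compiler.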

UPDATE: I also tried with a sqlite connection: I get the same error when passing the pandas df directly, while it works when the df is wrapped in an ibis.memtable.

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 6 (5 by maintainers)

Top GitHub Comments

1 reaction
jcrist commented, Oct 19, 2022

Thanks for opening this! I agree that both the docs and error message here could be improved.

"Replying to myself, I just discovered ibis.memtable which solves my original problem:"

While this works and is one way to do it, you can also use con.register to register an external table with duckdb from multiple different common sources (pandas, pyarrow, parquet, …). Since this only registers an external table, it’s zero-copy (when possible), and won’t result in a new table being created in any persistent duckdb database.

import ibis
import pandas as pd

df = pd.DataFrame(
    {
        "g": ["a", "a", "a", "a", "a"],
        "x": [0, 1, 2, 3, 4],
        "y": [3, 2, 0, 1, 1],
    }
)
duckdb_conn = ibis.duckdb.connect()
t = duckdb_conn.register(df)  # you can optionally also pass a table name; otherwise one is generated for you
t.execute()

0 reactions
ogrisel commented, Oct 19, 2022

Related UX issue:
