set_type_codec() appears to assume a particular set_type_codec for the "char" datatype

  • asyncpg version: 0.21.0
  • PostgreSQL version: 11.8 fedora
  • Do you use a PostgreSQL SaaS? If so, which? Can you reproduce the issue with a local PostgreSQL install?: N/A
  • Python version: 3.8.3
  • Platform: Fedora 31
  • Do you use pgbouncer?: no
  • Did you install asyncpg with pip?: yes
  • If you built asyncpg locally, which version of Cython did you use?: N/A
  • Can the issue be reproduced under both asyncio and uvloop?: N/A

It appears that the implementation for set_type_codec() relies upon the results of the query TYPE_BY_NAME which itself is assumed to return a bytes value from the PostgreSQL “char” datatype.

I was previously unaware that PostgreSQL actually has two “char” variants, bpchar and char, and that the documentation at https://magicstack.github.io/asyncpg/current/usage.html#type-conversion is talking about the “bpchar” datatype. That’s fine. However, I was trying to normalize asyncpg’s behavior against that of the psycopg2 and pg8000 drivers, both of which give you back a string for both of these types. (We have determined that this is also a bug in those drivers: they fail to return arbitrary bytes for such a datatype, which was likely missed when they migrated to Python 3.) To that end, I tried setting up a type codec for “char” that would allow it to return strings:

    await conn.set_type_codec(
        "char",
        schema="pg_catalog",
        encoder=lambda value: value,
        decoder=lambda value: value,
        format="text",
    )

That works, but once you do that, you can no longer use the set_type_codec method for anything else, because the behavior of the type is redefined outside of the assumptions made by is_scalar_type.
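To make the suspected mechanism concrete, here is a minimal sketch that does not use asyncpg at all. It assumes the scalar check compares the introspected type kind against bytes constants. The names SCALAR_TYPE_KINDS and is_scalar_kind are illustrative stand-ins, not confirmed asyncpg internals. Once a text codec makes “char” values arrive as str, a membership test like this stops matching:

```python
# Hypothetical stand-in for a set of "scalar" type kinds stored as bytes.
# The real asyncpg constants and check may differ; this only shows the
# bytes-versus-str mismatch described above.
SCALAR_TYPE_KINDS = (b"b", b"d", b"e")


def is_scalar_kind(kind):
    """Return True if the introspected kind is one of the scalar kinds."""
    return kind in SCALAR_TYPE_KINDS


# With the default codec, a "char" column would be decoded as bytes.
kind_as_bytes = b"b"

# With a pass-through text codec on "char", the same value would be a str.
kind_as_text = kind_as_bytes.decode("ascii")

print(is_scalar_kind(kind_as_bytes))
print(is_scalar_kind(kind_as_text))
```

In Python 3 a str never compares equal to a bytes object, so the second check fails even though the underlying value is the same character.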

The example program below illustrates this failure when attempting to subsequently set up a codec for the JSONB datatype:

import asyncio
import json

import asyncpg


async def main(illustrate_bug):
    conn = await asyncpg.connect(
        user="scott", password="tiger", database="test"
    )

    if illustrate_bug:
        await conn.set_type_codec(
            "char",
            schema="pg_catalog",
            encoder=lambda value: value,
            decoder=lambda value: value,
            format="text",
        )

    await conn.set_type_codec(
        "jsonb",
        schema="pg_catalog",
        encoder=lambda value: value,
        decoder=json.loads,
        format="text",
    )


print("no bug")
asyncio.run(main(False))


print("bug")
asyncio.run(main(True))

output:

no bug
bug
Traceback (most recent call last):
  File "test3.py", line 35, in <module>
    asyncio.run(main(True))
  File "/opt/python-3.8.3/lib/python3.8/asyncio/runners.py", line 43, in run
    return loop.run_until_complete(main)
  File "/opt/python-3.8.3/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "test3.py", line 21, in main
    await conn.set_type_codec(
  File "/home/classic/.venv3/lib/python3.8/site-packages/asyncpg/connection.py", line 991, in set_type_codec
    raise ValueError(
ValueError: cannot use custom codec on non-scalar type pg_catalog.jsonb

Since the “char” datatype is kind of an obscure construct, it’s likely reasonable that asyncpg disallow setting up a type codec for this particular type, or perhaps it could emit a warning, but at the moment there doesn’t seem to be documentation suggesting there are limitations on what kinds of type codecs can be constructed.
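As a rough illustration of the “emit a warning” idea, the sketch below wraps set_type_codec in application code. The wrapper name set_type_codec_with_warning and the FakeConnection class are hypothetical and exist only so the example runs without a database; asyncpg itself is not confirmed to provide anything like this.

```python
import asyncio
import warnings


async def set_type_codec_with_warning(conn, typename, *, schema="public", **kwargs):
    """Hypothetical wrapper: warn before overriding the pg_catalog "char" codec."""
    if schema == "pg_catalog" and typename == "char":
        warnings.warn(
            'overriding the codec for pg_catalog."char" may break '
            "later set_type_codec() calls on this connection",
            UserWarning,
            stacklevel=2,
        )
    # Delegate to the connection's own method with the original arguments.
    await conn.set_type_codec(typename, schema=schema, **kwargs)


class FakeConnection:
    """Stand-in for a real connection; it only records the calls it receives."""

    def __init__(self):
        self.calls = []

    async def set_type_codec(self, typename, *, schema, **kwargs):
        self.calls.append((schema, typename))


async def demo():
    conn = FakeConnection()
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        await set_type_codec_with_warning(
            conn,
            "char",
            schema="pg_catalog",
            encoder=lambda value: value,
            decoder=lambda value: value,
            format="text",
        )
    return conn.calls, [str(w.message) for w in caught]


calls, messages = asyncio.run(demo())
print(calls)
print(messages)
```

The wrapper still performs the registration. It only makes the risk visible at the call site, which matches the softer of the two options suggested above.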

None of this is blocking us; it's just something we came across, and I hope it’s helpful to the asyncpg project. cc @fantix

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

2 reactions
elprans commented, Sep 18, 2020

@fantix, I submitted #619, which complements your fixes, but also creates a small conflict. Feel free to rebase onto that branch.

1 reaction
elprans commented, Sep 17, 2020

I was previously unaware that PostgreSQL actually has two “char” variants bpchar and char,

bpchar is an internal name for “(b)lank (p)added character”, which is char(n) in SQL. The other one is varchar(n), which is equivalent to text for all purposes of conversion. The docs refer to “char” and “varchar” as the standard names for those data types in SQL.
