Column names in cursor.description contain unnecessary characters

Driver version

2.0.876

Redshift version

Redshift 1.0.23412

Client Operating System

Mac OS 10.15.7

Python version

3.9.1

Table schema

bookname varchar author‎ varchar

Problem description

Expected behaviour: When the table has columns bookname and author, column names in cursor.description are also available as bookname and author.
Actual behaviour: cursor.description returned b'bookname' and b'author\xe2\x80\x8e'.
Error message/stack trace: No error message for this issue.
Any other details that can be helpful: See the reproduction code.

Python Driver trace logs

No trace for this issue.

Reproduction code

Code

# This is the same as the sample code on README
cursor: redshift_connector.Cursor = conn.cursor()
cursor.execute("create Temp table book(bookname varchar,author‎ varchar)")
cursor.executemany("insert into book (bookname, author‎) values (%s, %s)",
                    [
                      ('One Hundred Years of Solitude', 'Gabriel García Márquez'),
                      ('A Brief History of Time', 'Stephen Hawking')
                    ]
                  )
cursor.execute("select * from book")

result: tuple = cursor.fetchall()
print(result)

#  This is the piece of code that I added:
for field in cursor.description:
  print(field)

Output

(['A Brief History of Time', 'Stephen Hawking'], ['One Hundred Years of Solitude', 'Gabriel García Márquez'])
(b'bookname', 1043, None, None, None, None, None)
(b'author\xe2\x80\x8e', 1043, None, None, None, None, None)
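
The extra bytes at the end of the second column name can be inspected with the standard library. This is only a small sketch: it assumes the name is UTF-8 encoded, and the byte string is copied from the output above rather than read from a live cursor.

```python
import unicodedata

# Byte string copied from the output above (assumed to be UTF-8).
raw_name = b"author\xe2\x80\x8e"
decoded = raw_name.decode("utf-8")

# Print the code point and Unicode name of everything after "author".
for ch in decoded[len("author"):]:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}")
```

Any character listed here is part of the column name as the server stored it, even if it is invisible in an editor.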


Top GitHub Comments

meih commented, Mar 19, 2021

Hi @Brooke-white, thanks for your investigation. It was good to learn that this was an issue in my code and not in the driver. I've also just figured out that the column names are returned as bytes, so I need to decode() them before using them as strings.

Problem solved. Thank you so much for your help!
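
For anyone landing here with the same question, decoding the names might look roughly like the sketch below. The helper name column_names is made up for illustration, UTF-8 is assumed as the encoding, and the sample tuples are copied from the output in this thread so that the snippet runs without a database connection.

```python
from typing import Any, List, Sequence


def column_names(description: Sequence[Sequence[Any]]) -> List[str]:
    """Return column names as str, decoding any that arrive as bytes."""
    names = []
    for field in description:
        name = field[0]  # the first item of each description entry is the name
        if isinstance(name, bytes):
            name = name.decode("utf-8")  # assumption: names are UTF-8 encoded
        names.append(name)
    return names


# Stand-in for cursor.description, copied from the output in this thread.
sample_description = [
    (b"bookname", 1043, None, None, None, None, None),
    (b"author", 1043, None, None, None, None, None),
]
print(column_names(sample_description))
```

The isinstance check keeps the helper usable if the names are already str.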

Brooke-white commented, Mar 19, 2021

Hi @meih ,

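I experimented with this issue and pasted the queries in question into the Redshift query editor, which showed that a “hidden” Unicode character, \u200e (LEFT-TO-RIGHT MARK), is present after author. This is the cause of the issue: the character became part of the column name, which is why the trailing bytes \xe2\x80\x8e (its UTF-8 encoding) show up in cursor.description. I’ve included a screenshot below which highlights this character.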

If you try executing the statements pasted below, which do not contain this character, you should see no issue:

import redshift_connector

# db_kwargs is assumed to hold your connection parameters
with redshift_connector.connect(**db_kwargs) as conn:
    cursor: redshift_connector.Cursor = conn.cursor()
    cursor.execute("create Temp table book(bookname varchar, author varchar)")
    cursor.executemany(
        "insert into book(bookname, author) values (%s, %s)",
        [
            ('One Hundred Years of Solitude', 'Gabriel García Márquez'),
            ('A Brief History of Time', 'Stephen Hawking')
        ]
    )
    cursor.execute("select * from book")
    result: tuple = cursor.fetchall()
    print(result)

    # This is the piece of code that I added:
    for field in cursor.description:
        print(field)

(['A Brief History of Time', 'Stephen Hawking'], ['One Hundred Years of Solitude', 'Gabriel García Márquez'])
(b'bookname', 1043, None, None, None, None, None)
(b'author', 1043, None, None, None, None, None)

I will update the project README with the corrected queries.
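
If you want to check a copied query for characters like this before running it, something along these lines may help. It is a rough sketch rather than anything provided by the driver: the helper names find_format_chars and strip_format_chars are invented here, and it treats every character in Unicode category Cf (format characters, which includes \u200e) as suspicious.

```python
import unicodedata
from typing import List, Tuple


def find_format_chars(text: str) -> List[Tuple[int, str, str]]:
    """List (index, code point, name) for each invisible format character (category Cf)."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]


def strip_format_chars(text: str) -> str:
    """Return text with all category Cf characters removed."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


# The statement from the original report, with the hidden character written out explicitly.
sql = "create Temp table book(bookname varchar,author\u200e varchar)"
print(find_format_chars(sql))
print(strip_format_chars(sql))
```

Stripping every Cf character is a blunt approach. It would also remove characters such as zero-width joiners from string literals, so reporting the matches and fixing the query by hand may be the safer choice.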