Redshift Driver Truncates Important Information in Error Messages

Driver version

2.0.907

Redshift version

Redshift 1.0.38698

Client Operating System

Docker python:3.10.2 image

Python version

3.10.2

Table schema

test_column: INTEGER

Problem description

When copying a file from S3 into Redshift via awswrangler, a data type mismatch correctly raises an error. However, the error message is truncated (the context line in the trace below is cut off partway through the S3 path), which makes the issue hard to debug in non-trivial applications.

Python Driver trace logs

redshift_connector.error.ProgrammingError: {'S': 'ERROR', 'C': 'XX000', 'M': 'Spectrum Scan Error', 'D': "
error: Spectrum Scan Error 
code: 15007 
context: File 'https://s3.region.amazonaws.com/bucket/bucket_directory/subdirectory/afile.snappy.parquet' has an incompatible Parquet schema for column 's3://bucket/bucket_director 
query: 1234567 
location: dory_util.cpp:1226 
process: worker_thread [pid=12345]

Reproduction code

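import awswrangler as wr
import pandas as pd

# The target column is INTEGER, but this value is a float.
df = pd.DataFrame([[1.23]], columns=["test_column"])

# Placeholder connection name; the original report did not show how `con` was created.
con = wr.redshift.connect("my-glue-connection")

wr.redshift.copy(
  df=df,
  path="s3://bucket/bucket_directory/subdirectory/",
  con=con,
  table="test",
  schema="public",  # placeholder schema, not given in the original report
)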

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 16 (9 by maintainers)

Top GitHub Comments

1 reaction
Brooke-white commented, Nov 21, 2022

Hi @justbldwn – unfortunately I do not have any update from my end. As many engineers are on holiday this week I will ping the team again next week for an update on this.

1 reaction
WillAyd commented, Sep 30, 2022

For any readers who come across the same issue: I’ve noticed that while the Python traceback truncates this error message, you can still get the full error message by querying SVL_S3LOG. That might help as a workaround.
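The workaround above could be sketched roughly as follows. This is a minimal sketch, not the driver's own API: the helper names `extract_query_id` and `fetch_s3_log_messages` are hypothetical, and it assumes the truncated error text still contains the `query: <id>` line shown in the trace, and that the cursor follows the DB-API with `%s` placeholders (the default paramstyle for redshift_connector).

```python
import re
from typing import Any, List, Optional, Tuple

# Matches the "query: 1234567" line that appears in the error detail.
_QUERY_ID = re.compile(r"query:\s*(\d+)")


def extract_query_id(error_text: str) -> Optional[int]:
    """Return the Redshift query ID found in the error text, or None."""
    match = _QUERY_ID.search(error_text)
    return int(match.group(1)) if match else None


def fetch_s3_log_messages(cursor: Any, query_id: int) -> List[Tuple[Any, str]]:
    """Read the log rows for one query from SVL_S3LOG, oldest first.

    `cursor` is any DB-API cursor connected to the Redshift cluster.
    """
    cursor.execute(
        "SELECT eventtime, message FROM svl_s3log "
        "WHERE query = %s ORDER BY eventtime",
        (query_id,),
    )
    # The message column is fixed-width, so trailing padding is stripped.
    return [(ts, msg.rstrip()) for ts, msg in cursor.fetchall()]


# Hypothetical usage around the failing call:
#
#   try:
#       wr.redshift.copy(...)
#   except redshift_connector.error.ProgrammingError as exc:
#       query_id = extract_query_id(str(exc))
#       if query_id is not None:
#           with con.cursor() as cursor:
#               for ts, msg in fetch_s3_log_messages(cursor, query_id):
#                   print(ts, msg)
```

Passing the query ID as a bound parameter instead of formatting it into the SQL string keeps the lookup safe even if the parsed text is unexpected.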

Top Results From Across the Web

Troubleshooting queries in Amazon Redshift Spectrum
The error message might be truncated due to the limit on message length. To retrieve the complete error message, including column name and...

Issues · aws/amazon-redshift-python-driver - GitHub
Redshift Python Connector. ... Issues · aws/amazon-redshift-python-driver. ... Redshift Driver Truncates Important Information in Error Messages.

Redshift ODBC Driver error: String data right truncation on ...
The workaround to this issue is to use the PostgreSQL UNICODE driver instead of the RedShift one. It seems there might be a...

Amazon Redshift ODBC Driver Release Notes - Amazon S3
Amazon Redshift ODBC Data Connector. 1.4.56. Released July 2022. These release notes provide details of enhancements, features, known issues, and.

When Running Flow And Connecting to ... - Knowledge Base
Issue. When running flows and connecting to Amazon Redshift, the flow failed with the below error displays in Performance of Flow Runs.
