Dealing with the StringType to varchar(256) limit in Python
Hi, I have seen #118 and also read the documentation, which says:
Note: Due to limitations in Spark, metadata modification is unsupported in
the Python, SQL, and R language APIs.
While I understand this limitation, I’m wondering how a python user should deal with dataframes that have text columns with values exceeding 256 characters.
I'm trying to save a dataframe to Redshift where a string column has entries as large as 1000 characters, and any hacks/workarounds would be appreciated 😃
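A commonly suggested workaround is to attach the spark-redshift connector's `maxlength` column-metadata key from PySpark. The sketch below assumes Spark 2.2+ (where `Column.alias` accepts a `metadata` keyword argument, sidestepping the metadata-modification limitation quoted above); the function name and the 1024-character width are illustrative, not from the original issue.

```python
def widen_string_column(df, col_name, n_chars):
    """Re-alias a string column with spark-redshift's 'maxlength' metadata
    so the generated Redshift DDL uses VARCHAR(n_chars) instead of the
    default VARCHAR(256)."""
    from pyspark.sql.functions import col  # lazy import: needs a PySpark env
    return df.withColumn(
        col_name,
        col(col_name).alias(col_name, metadata={"maxlength": n_chars}),
    )
```

After widening (e.g. `df = widen_string_column(df, "comments", 1024)`), the dataframe is written through spark-redshift as usual; the connector reads the metadata when it generates the `CREATE TABLE` statement.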
Issue Analytics
- State:
- Created 7 years ago
- Reactions: 1
- Comments: 8 (1 by maintainers)
Top GitHub Comments
spark-redshift now lets you do this in version runtime-3.0. Here's an example: https://gist.github.com/pallavi/f83a45308ba8387f6b227c28aa209077
I tried using the `createTableColumnTypes` option; it seems to work with postgres but not with spark-redshift, which seems to ignore the option. The solution provided by @pallavi is working perfectly.
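For context, `createTableColumnTypes` is an option of Spark's generic JDBC sink, which is why it works against a plain Postgres JDBC connection. A minimal sketch of that usage follows; the URL, table name, and column width are placeholders, and, per the comment above, spark-redshift reportedly ignores this option.

```python
def save_with_column_types(df, jdbc_url, table):
    """Write via Spark's generic JDBC sink, overriding the DDL type of the
    'comments' column. Honored by e.g. the Postgres JDBC sink, but reportedly
    ignored by the spark-redshift connector."""
    (df.write
       .format("jdbc")
       .option("url", jdbc_url)  # e.g. "jdbc:postgresql://host/db" (placeholder)
       .option("dbtable", table)
       .option("createTableColumnTypes", "comments VARCHAR(1024)")
       .mode("overwrite")
       .save())
```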