Change in way read_glue_table addresses tables
Describe the bug
Previously, I used the following code without issue:
item_ids = wr.s3.read_parquet(path=f'{processed_data_bucket}/distinct_path')
keys = list(wr.s3.read_parquet_table(database=gluedatabase, table='path_tokens').itemid.unique())
Now when I run the same code, it appears that awswrangler has changed the way it refers to parquet file locations, as I now get:
item_ids = wr.s3.read_parquet(path=f'{processed_data_bucket}/distinct_path')
...
InvalidArgumentValue: '<redacted s3 path that does not start with s3://>/distinct_path' is not a valid path. It MUST start with 's3://'
This can obviously be fixed by changing the specified path. However, I am not sure how to resolve the read_parquet_table issue:
keys = list(wr.s3.read_parquet_table(database=gluedatabase, table='path_tokens').itemid.unique())
...
InvalidArgumentValue: '<redacted s3 path that does not start with s3://>/path_tokens' is not a valid path. It MUST start with 's3://'
Digging into the call to read_parquet_table, I see that the location is resolved via:
res = client_glue.get_table(DatabaseName=gluedatabase, Name="path_tokens")
res['Table']['StorageDescriptor']['Location']
>> <same path from above that does not include s3://>
Because of this, I figured it must be a configuration issue in the data catalogue. However, I checked, and all the tables have the full location path specified, including the s3:// prefix, as does the database, so I cannot see how to work around this issue.
wr.version = 1.9.0
boto3.version = 1.14.53
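For the first error, the path fix mentioned above could look roughly like the sketch below. It assumes the bucket variable is simply missing the scheme. `ensure_s3_scheme` is a hypothetical helper, not part of awswrangler, and the bucket value is a placeholder.

```python
def ensure_s3_scheme(path: str) -> str:
    """Return the path with an 's3://' prefix, adding it only when missing."""
    if path.startswith("s3://"):
        return path
    # Strip leading slashes so "/bucket/key" does not become "s3:///bucket/key".
    return "s3://" + path.lstrip("/")


# Placeholder value standing in for processed_data_bucket (an assumption).
processed_data_bucket = "my-bucket/processed"
fixed_path = ensure_s3_scheme(f"{processed_data_bucket}/distinct_path")
print(fixed_path)
```

The resulting string could then be passed as the path argument to wr.s3.read_parquet. This only covers the direct read_parquet call. It does not change the location that Glue returns for read_parquet_table.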
Issue Analytics
- State:
- Created 3 years ago
- Comments:12 (8 by maintainers)
Looks correct to me?
A last tip about:
It can be done using
wr.s3.list_directories()
- It is faster because it does not fetch all filenames to the client side.