PyArrow Requirement Exceeds Lambda Unzipped Max
The PandasCursor is no longer usable in AWS Lambda because it now requires PyArrow. The library is quite large, and it exceeds the AWS Lambda unzipped deployment size limit. This is preventing deploys at the moment.
Other libraries have run into similar size issues with PyArrow: https://github.com/snowflakedb/snowflake-connector-python/issues/213
I wonder if PyArrow could be made an optional dependency for the PandasCursor? The library could use the older logic as a fallback when PyArrow isn't present.
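To illustrate the kind of fallback being asked for, here is a minimal sketch of optional-dependency detection. It is not the library's actual implementation: `pick_parquet_engine` and `choose_read_path` are hypothetical names, and the "legacy" label simply stands in for whatever the older, non-PyArrow logic was.

```python
import importlib.util


def pick_parquet_engine(candidates=("pyarrow", "fastparquet")):
    """Return the name of the first installed candidate package, or None.

    Hypothetical helper: find_spec only checks whether the package can be
    located, so nothing heavy is imported at this point.
    """
    for name in candidates:
        if importlib.util.find_spec(name) is not None:
            return name
    return None


def choose_read_path(engine):
    """Pick a result-reading strategy based on the detected engine.

    Hypothetical helper: "legacy" stands in for the older code path that
    did not need a Parquet library at all.
    """
    if engine is None:
        return "legacy"
    return f"parquet:{engine}"


if __name__ == "__main__":
    # In a Lambda package built without PyArrow or FastParquet,
    # this would report the legacy path.
    print(choose_read_path(pick_parquet_engine()))
```

With a check like this, the heavy package would only need to be installed by users who want the faster path, for example through an optional extra in the package metadata.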
Issue Analytics
- State:
- Created a year ago
- Comments:12 (5 by maintainers)
Top GitHub Comments
The unload option can be easily configured and used as described in the README.
The query execution itself is very fast when using the unload option, and the retrieval of results is also super fast. Please give it a try.
I will remove PyArrow from the required dependencies so that you can choose PyArrow or FastParquet. Until I release a supported version, please use versions earlier than 2.9.0.
@laughingman7743 currently there is only one version of botocore that you will have to force pin. So I don’t think your suggested solution will work.