Boto3 incompatible with python zip import

One of Python’s useful features is its ability to load modules from a .zip archive (PEP 273), allowing you to package up multiple dependencies into a single file.
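
For readers unfamiliar with the feature, here is a minimal sketch of zip importing with a pure-Python module. The names `greeter` and `deps.zip` are made up for illustration and do not come from the issue.

```python
import os
import sys
import tempfile
import zipfile

# Build a throwaway archive holding one pure-Python module.
# "greeter" and "deps.zip" are hypothetical names used only in this sketch.
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "deps.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("greeter.py", "def greet(name):\n    return 'hello ' + name\n")

# Putting the archive itself on sys.path lets the import system read from it.
sys.path.insert(0, archive)
import greeter

message = greeter.greet("zip")
print(message)
# The module's __file__ points inside the archive, not at a real file on disk.
print(greeter.__file__)
```

This works because the module is plain source code. Packages that also ship data files and read them by path are where trouble can start.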

Boto breaks when trying to import it from a .zip, throwing:

  File "C:\code sandbox\boto.zip\boto3\session.py", line 263, in client
  File "C:\code sandbox\boto.zip\botocore\session.py", line 799, in create_client
  File "C:\code sandbox\boto.zip\botocore\session.py", line 668, in _get_internal_component
  File "C:\code sandbox\boto.zip\botocore\session.py", line 870, in get_component
  File "C:\code sandbox\boto.zip\botocore\session.py", line 150, in create_default_resolver
  File "C:\code sandbox\boto.zip\botocore\loaders.py", line 132, in _wrapper
  File "C:\code sandbox\boto.zip\botocore\loaders.py", line 424, in load_data
botocore.exceptions.DataNotFoundError: Unable to load data for: endpoints
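
A likely explanation, not confirmed in this thread, is that the data loader looks for its JSON files (such as the endpoints data) with ordinary filesystem calls, and those cannot see inside an archive. The sketch below reproduces that effect with a hypothetical package named `fakesdk`; it is not botocore code.

```python
import os
import pkgutil
import sys
import tempfile
import zipfile

# Hypothetical package with a bundled data file, zipped up like boto.zip.
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "bundle.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("fakesdk/__init__.py", "")
    zf.writestr("fakesdk/data/endpoints.json", '{"partitions": []}')

sys.path.insert(0, archive)
import fakesdk

# A path built from __file__ points *inside* the archive, so a plain
# filesystem check cannot find the data file.
data_path = os.path.join(os.path.dirname(fakesdk.__file__), "data", "endpoints.json")
found_on_disk = os.path.isfile(data_path)
print("os.path.isfile:", found_on_disk)

# Asking the package's loader for the bytes does work from a zip.
raw = pkgutil.get_data("fakesdk", "data/endpoints.json")
print("pkgutil.get_data:", raw)
```

Code that reads its data through the import system (for example `pkgutil.get_data`) keeps working from an archive, while code that relies on `os.path` and `open` does not.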

How to Reproduce:

  1. Create a .zip containing boto3 and botocore
  2. Create a .py file in the same directory as the zip (access keys removed for obvious reasons):

import sys

sys.path.insert(0, 'boto.zip')
import boto3

s3 = boto3.client('s3', aws_access_key_id='access_key', aws_secret_access_key='secret_key')

  3. Run the script

Tested on Python 3.6.7, boto3 1.9.39, botocore 1.12.39.

Issue Analytics

  • State: open
  • Created: 5 years ago
  • Reactions: 24
  • Comments: 20 (6 by maintainers)

Top GitHub Comments

LukeBolly commented, Dec 14, 2018 (16 reactions)

What are the odds of getting this implemented? It's preventing us from distributing boto3, which makes it very hard to provide a package that depends on it in PySpark.

alete89 commented, Dec 2, 2021 (0 reactions)

Found a workaround for this. You can pass Spark conf args to have Spark unzip the dependencies and include them in the path, something like this:

	--conf spark.yarn.dist.archives=s3://<bucket+path>/sparkapp.zip#deps \
	--conf spark.yarn.appMasterEnv.PYTHONPATH=deps \
	--conf spark.executorEnv.PYTHONPATH=deps \

Worked with EMR 6.2.0 and Python 3.7.9

Hey @dsonavane-rgare I’m trying this without success. Can you elaborate a bit more? This is how I was sending my file and deps (this throws boto3 not found because one of my zipped files uses boto3):

spark-submit --py-files s3://<bucket>/code/spark/dependencies.zip s3://<bucket>/code/spark/job.py args

This is what I’ve tried now, based on your example:

spark-submit --conf spark.yarn.dist.archives=s3://<bucket>/code/spark/dependencies.zip#deps --conf spark.yarn.appMasterEnv.PYTHONPATH=deps --conf spark.executorEnv.PYTHONPATH=deps s3://<bucket>/code/spark/job.py args

and this as well:

spark-submit --py-files s3://<bucket>/code/spark/dependencies.zip --conf spark.yarn.dist.archives=s3://<bucket>/code/spark/dependencies.zip#deps --conf spark.yarn.appMasterEnv.PYTHONPATH=deps --conf spark.executorEnv.PYTHONPATH=deps s3://<bucket>/code/spark/job.py 2021-12-01

Thanks
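
Outside Spark, the same idea as the quoted workaround (unpack the archive, then point Python at the extracted directory) can be sketched in plain Python. The helper name `add_unpacked_archive_to_path` and the `unpacked_demo` package are hypothetical, and the demo archive only stands in for a real dependency zip.

```python
import os
import sys
import tempfile
import zipfile


def add_unpacked_archive_to_path(archive_path, target_dir=None):
    """Extract archive_path and put the extracted directory first on sys.path.

    Hypothetical helper for illustration; it is not part of boto3 or Spark.
    """
    if target_dir is None:
        target_dir = tempfile.mkdtemp(prefix="unzipped_deps_")
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(target_dir)
    if target_dir not in sys.path:
        sys.path.insert(0, target_dir)
    return target_dir


# Demo with a throwaway archive standing in for the real dependency zip.
demo_dir = tempfile.mkdtemp()
demo_zip = os.path.join(demo_dir, "boto.zip")
with zipfile.ZipFile(demo_zip, "w") as zf:
    zf.writestr("unpacked_demo/__init__.py", "")
    zf.writestr("unpacked_demo/data/endpoints.json", "{}")

unpacked = add_unpacked_archive_to_path(demo_zip)
data_file = os.path.join(unpacked, "unpacked_demo", "data", "endpoints.json")
# After extraction the data file is a real file that path-based loaders can open.
print(os.path.isfile(data_file))
```

The trade-off is that this gives up the single-file convenience at run time and needs a writable directory on every machine that runs the code.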
