
Checking for file existence.


We use S3 to store generated assets and only regenerate an asset if it does not exist yet, so we need a way to check for existence.

How about a from smart_open import exists? It would be great if this library also offered an easy existence check, tailored to its multiple backends.

from smart_open import exists  # proposed API, analogous to os.path.exists
from smart_open import open

url = "s3://my_key:my_secret@my_server:my_port@my_bucket/some_file.mp4"
if not exists(url):
    data = long_running_procedure("some_file.mp4")  # generate assets
    with open(url, "wb") as f:
        f.write(data)
    print("Generated from scratch.")
else:
    print("Already uploaded.")

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 12

Top GitHub Comments

3 reactions
mpenkov commented, Apr 22, 2019

You don’t need to download it. Smart_open streams its data. If you open a stream, but don’t actually read from it, nothing gets downloaded.

Try:

import smart_open

try:
    # Opening the stream is enough; nothing is read, so nothing is downloaded.
    with smart_open.open('s3://bucket/key/does/not/exist.txt'):
        file_exists = True
except ValueError:
    file_exists = False

The ValueError gets raised here in case you want to do more precise checking.
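The try/except pattern above can be wrapped into a small reusable helper. The following is only a sketch, not part of smart_open: the name exists, the opener parameter and the missing_errors parameter are hypothetical. The opener is passed in so that smart_open.open (or any compatible function) can be used, and the exception types are configurable because the error raised for a missing key may differ between versions and backends.

```python
def exists(uri, opener, missing_errors=(ValueError,)):
    """Return True if opener(uri) succeeds, False if it raises a 'missing' error.

    opener is any callable that returns a context manager for the URI,
    for example smart_open.open. Nothing is read from the stream.
    """
    try:
        with opener(uri):
            return True
    except missing_errors:
        return False
```

With smart_open installed, a call would look like exists('s3://bucket/key.txt', smart_open.open). Errors not listed in missing_errors, such as permission or network failures, still propagate to the caller.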

1 reaction
luckydonald commented, Apr 22, 2019

Using boto3, which smart_open also uses internally, we list objects with Prefix=file_path to check for existence. This doesn't download the file, and we have the s3:ListBucket permission set anyway.

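import boto3

# proto, host, port, access_key and secret_key are assumed to be
# defined elsewhere (e.g. module-level configuration).
def exists(bucket_name, file_path):
    s3 = boto3.resource(
        's3',
        endpoint_url=f'{proto}://{host}:{port}',
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )
    # Bucket name we wanna use
    bucket = s3.Bucket(bucket_name)
    # List objects whose key starts with the path; compare keys exactly,
    # because a prefix match alone would also hit e.g. "some_file.mp4.bak".
    objs = bucket.objects.filter(Prefix=file_path)
    return any(obj.key == file_path for obj in objs)
# end def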

Source: Stack Overflow, which also includes other solutions that probably don't require s3:ListBucket.
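One commonly used alternative to listing is a HEAD request on the key. The sketch below is an assumption about what such a solution could look like, not code from the thread: key_exists is a hypothetical name, and client is expected to be a boto3 S3 client created by the caller (for example via boto3.client('s3', ...)). Note that without s3:ListBucket, S3 may answer a HEAD on a missing key with a 403 instead of a 404, in which case this function re-raises rather than returning False.

```python
def key_exists(client, bucket_name, key):
    """Check for a single object with a HEAD request (no listing, no download).

    client is assumed to be a boto3 S3 client; its exceptions.ClientError
    attribute is used so this snippet needs no direct botocore import.
    """
    try:
        client.head_object(Bucket=bucket_name, Key=key)
    except client.exceptions.ClientError as err:
        # A missing key is reported as error code "404" for HEAD requests.
        if err.response["Error"]["Code"] in ("404", "NoSuchKey", "NotFound"):
            return False
        raise  # permission problems etc. are not "does not exist"
    return True
```

Taking the client as a parameter keeps credentials and the endpoint configuration in one place and makes the check easy to exercise with a stub.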
