Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Threads calling S3 operations return RuntimeError (cannot schedule new futures after interpreter shutdown)

See original GitHub issue

Describe the bug
Basic S3 operations, such as downloading files from or uploading files to buckets, result in a RuntimeError when invoked from threads in a Python 3 application. No existing bug reports cover this, so this issue documents the error and requests a recommended workaround, if one is available.

Background
Python 3.8 introduced changes to how the concurrent.futures module handles executor requests; ostensibly, these prevent new tasks from being scheduled after the executor has received a shutdown signal. The changes caused (at least some) Boto3 versions after 1.17.53 to yield the following exception:

cannot schedule new futures after interpreter shutdown
Traceback (most recent call last):
  File \"<some_file_calling_an_s3_operation>.py\", line 277, in <method_calling_an_s3_operation>
    s3_client.download_file(bucket_name, file_key, file_destination)
  File \"/usr/local/lib/python3.9/site-packages/boto3/s3/inject.py\", line 170, in download_file
    return transfer.download_file(
  File \"/usr/local/lib/python3.9/site-packages/boto3/s3/transfer.py\", line 304, in download_file
    future = self._manager.download(
  File \"/usr/local/lib/python3.9/site-packages/s3transfer/manager.py\", line 369, in download
    return self._submit_transfer(
  File \"/usr/local/lib/python3.9/site-packages/s3transfer/manager.py\", line 500, in _submit_transfer
    self._submission_executor.submit(
  File \"/usr/local/lib/python3.9/site-packages/s3transfer/futures.py\", line 467, in submit
    future = ExecutorFuture(self._executor.submit(task))
  File \"/usr/local/lib/python3.9/concurrent/futures/thread.py\", line 163, in submit
    raise RuntimeError('cannot schedule new futures after '
RuntimeError: cannot schedule new futures after interpreter shutdown
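
For context on the mechanism behind this error (the following is an illustrative sketch, not part of the original report): on Python 3.9+, ThreadPoolExecutor.submit() refuses new work once interpreter shutdown has begun, and s3transfer submits every transfer to just such an executor (visible in the traceback above). The same RuntimeError can therefore be reproduced without Boto3 at all:

#!/usr/bin/python3
# Illustrative sketch (assumes Python 3.9+): a daemon thread keeps submitting
# work to a ThreadPoolExecutor while the main thread exits. Once interpreter
# shutdown begins, the next submit() raises
# "cannot schedule new futures after interpreter shutdown" (timing dependent).
import threading
import time
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)


def keep_submitting():
    while True:
        executor.submit(time.sleep, 0.01)  # raises once shutdown has begun
        time.sleep(0.01)


threading.Thread(target=keep_submitting, daemon=True).start()
time.sleep(0.1)  # main thread returns here while the daemon thread is still submitting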

This impacted Apache Airflow to the extent that their fix was to disable threading in S3 operations, and other related bug reports describe the same error appearing sporadically in similar scenarios.

This ticket seeks guidance from the Boto3 team on how best to deal with this issue. (NOTE: Recommendations online suggest reverting to Boto3 1.17.53 [see above]. Another potential solution is disabling threading in S3 operations using TransferConfig, as sketched below. A third is calling Thread.join() on the topmost thread, but that introduces waits and may not be readily possible, depending on the architecture.)
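
As a rough sketch of the TransferConfig workaround (the bucket, key, and destination names below are placeholders, not taken from the original report), threading inside s3transfer can be disabled per call:

#!/usr/bin/python3
# Sketch: disable s3transfer's internal thread pool so the transfer runs in
# the calling thread and never touches a ThreadPoolExecutor during shutdown.
import boto3
from boto3.s3.transfer import TransferConfig

s3_client = boto3.client('s3')
no_threads = TransferConfig(use_threads=False)

# Bucket, key, and destination are placeholders for illustration only.
s3_client.download_file('example-bucket', 'path/to/object', '/tmp/object',
                        Config=no_threads)

The trade-off is that large multipart transfers lose their internal concurrency.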

Steps to reproduce
This was reproduced with the following application setup:

  • Python 3.9.9
  • CentOS 7
  • botocore==1.20.112
  • boto3==1.17.112

Example Code:

#!/usr/bin/python3

import logging
from queue import Queue
import threading
import time

log = logging.getLogger(__name__)

DEFAULT_QUEUE_SIZE = 100  # placeholder; the original report does not give a value


def finalizer(some_queue):
    while True:  # loop to catch all items
        time.sleep(0.05)  # poor man's nice
        if not some_queue.empty():
            try:
                # application logic here
                method_that_performs_s3_operations()
                # application logic here
            except BaseException as be:
                log.exception(be)
    return


def processor(base_queue, some_queue):
    while True:  # loop to catch all items
        time.sleep(0.05)  # poor man's nice
        if not base_queue.empty():
            try:
                # application logic here
                method2_that_performs_s3_operations()
                add_to_some_queue()
                # application logic here
            except BaseException as be:
                log.exception(be)
    return


def collector(base_queue):
    while True:  # loop to catch all items
        time.sleep(0.05)  # poor man's nice
        if not base_queue.full():
            try:
                # application logic here
                add_to_base_queue()
                # application logic here
            except BaseException as be:
                log.exception(be)
    return


def main():
    base_queue = Queue(DEFAULT_QUEUE_SIZE)
    some_queue = Queue(DEFAULT_QUEUE_SIZE * 2)
    # define and run threads
    thread_collector = threading.Thread(target=collector, name='thread_collector',
                                        args=(base_queue,))
    thread_processor = threading.Thread(target=processor, name='thread_processor',
                                        args=(base_queue, some_queue))
    thread_finalizer = threading.Thread(target=finalizer, name='thread_finalizer',
                                        args=(some_queue,))
    # wait specific time to start processing threads
    time.sleep(30.0)
    thread_collector.start()
    thread_processor.start()
    thread_finalizer.start()
    return


if __name__ == '__main__':
    main()
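
As a sketch of the Thread.join() workaround mentioned above (bucket, key, and destination names are placeholders, not from the original report), the topmost thread can block until all S3 work has finished, so nothing is submitted once interpreter shutdown begins:

#!/usr/bin/python3
# Sketch of the Thread.join() workaround: joining keeps the main thread (and
# therefore the interpreter) alive until the worker thread's S3 transfer is
# done, so s3transfer never submits a future during interpreter shutdown.
import threading

import boto3


def worker():
    s3 = boto3.client('s3')
    # Placeholder transfer; names are illustrative only.
    s3.download_file('example-bucket', 'path/to/object', '/tmp/object')


def main():
    t = threading.Thread(target=worker, name='s3_worker')
    t.start()
    t.join()  # wait for the transfer before main() returns


if __name__ == '__main__':
    main()

As the report notes, the cost is that the main thread waits, which may not be acceptable in every architecture.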

Expected behavior
S3 operations should proceed successfully, downloading or uploading without any custom configuration, and exceptions relating to concurrency inside the S3 transfer code should not be thrown.

Debug logs
Full stack trace can be obtained by adding boto3.set_stream_logger('') to your code.
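
For reference, a minimal sketch of enabling that debug logging (placing it near the top of the script is an assumption, not from the original report):

#!/usr/bin/python3
# Sketch: turn on boto3/botocore debug logging to capture the full stack trace.
import logging

import boto3

# Passing '' attaches a DEBUG-level stream handler to the root logger, so
# botocore request/response and s3transfer logging becomes visible.
boto3.set_stream_logger('', level=logging.DEBUG)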

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Reactions: 3
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

2 reactions
tim-finnigan commented, Jan 19, 2022

Hi @jpl-jengelke, thanks for reaching out. I brought this up with the team and it is something that we’re looking into further. We will let you know when we have an update.

1 reaction
jpl-jengelke commented, Jun 10, 2022

@tim-finnigan Any update on this?

Read more comments on GitHub >

Top Results From Across the Web

Error "cannot schedule new futures after interpreter shutdown ...
Here is error message: cannot schedule new futures after interpreter shutdown; Place: script.py; Line: 49; This row links to s3. upload_file( ...
Read more >
concurrent.futures.thread — Cloud Custodian documentation
This is done to allow the interpreter # to exit when there are still idle ... raise RuntimeError('cannot schedule new futures after shutdown')...
Read more >
Error "cannot schedule new futures after interpreter shutdown ...
Coding example for the question Error "cannot schedule new futures after interpreter shutdown" with boto3 while working through treading.
Read more >
ThreadPoolExecutor with wait=True shuts down too early
Example ``` from concurrent.futures import ThreadPoolExecutor from ... has occurred: RuntimeError cannot schedule new futures after shutdown ...
Read more >
boto3 1.26.25 - PythonFix.com
... Threads calling S3 operations return RuntimeError (cannot schedule new futures after interpreter shutdown); HTML tags are showing in ...
Read more >
