Async code in aiobotocore calls synchronous functions of boto3

Describe the bug

While running a performance test (a simple loop, code attached below), we believe we found a problem: creating a session/resource in aioboto3 calls an async function in aiobotocore, which in turn calls a blocking function in boto3.

Code

The following code demonstrates the problem: the total duration grows with the number of sessions/resources we create, which suggests that the creation is not truly asynchronous.

import asyncio
import datetime
import aioboto3

async def create_session_and_resource():
    session = aioboto3.Session()
    async with session.resource('dynamodb', region_name='us-east-1',
                                endpoint_url='http://fakehost:8123'
                                ) as dynamo_resource:
        pass

async def main(N):
    tasks = [create_session_and_resource() for _ in range(N)]
    t1 = datetime.datetime.now()
    await asyncio.gather(*tasks)
    t2 = datetime.datetime.now()
    print(f"N={N} total duration= {(t2 - t1).total_seconds()}")


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main(2))
    loop.run_until_complete(main(10))
    loop.run_until_complete(main(50))

pip results

aioboto3       8.0.5
aiobotocore    1.0.4
boto3          1.12.32
botocore       1.15.32

Environment:

  • Python Version: 3.6.15 (similar behavior with Python 3.8)
  • OS name and version: Ubuntu 20.04.1

Additional info/context

We opened an issue with aioboto3 (https://github.com/terrycain/aioboto3/issues/254), but the assignee was right to ask us to bring it up here.

We believe the blocking happens because aiobotocore calls synchronous functions of boto3.

As an example, consider the coroutine _create_client (implemented in aiobotocore/session), which calls self._get_internal_component('endpoint_resolver'):

def _get_internal_component(self, name):
    # While this method may be called by botocore classes outside of the
    # Session, this method should **never** be used by a class that lives
    # outside of botocore.
    return self._internal_components.get_component(name)

This eventually reaches the get_component method in botocore.session, which executes the deferred (synchronous) factory function, as can be seen here:

class ComponentLocator(object):
    """Service locator for session components."""
    def __init__(self):
        self._components = {}
        self._deferred = {}

    def get_component(self, name):
        if name in self._deferred:
            factory = self._deferred[name]
            self._components[name] = factory()
            # Only delete the component from the deferred dict after
            # successfully creating the object from the factory as well as
            # injecting the instantiated value into the _components dict.
            del self._deferred[name]
        try:
            return self._components[name]
        except KeyError:
            raise ValueError("Unknown component: %s" % name)
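
To illustrate the effect described above, here is a minimal sketch. It does not use botocore: ComponentLocator below is a simplified stand-in for the class shown above, and slow_factory, new_locator, create_blocking, create_offloaded, timed and compare are hypothetical names invented for the illustration. The sleep stands in for whatever blocking work a real factory might do. It compares calling the synchronous factory directly inside a coroutine with handing the same call to the default thread pool via run_in_executor.

```python
import asyncio
import time


class ComponentLocator:
    """Simplified stand-in for the locator shown above (not botocore's class)."""

    def __init__(self):
        self._components = {}
        self._deferred = {}

    def lazy_register_component(self, name, factory):
        self._deferred[name] = factory

    def get_component(self, name):
        if name in self._deferred:
            # The factory runs synchronously in the caller's thread.
            self._components[name] = self._deferred[name]()
            del self._deferred[name]
        return self._components[name]


def slow_factory():
    # Hypothetical factory: the sleep stands in for blocking work such as file I/O.
    time.sleep(0.05)
    return object()


def new_locator():
    locator = ComponentLocator()
    locator.lazy_register_component("endpoint_resolver", slow_factory)
    return locator


async def create_blocking():
    # The sync factory runs inside the coroutine, so the event loop is stalled.
    return new_locator().get_component("endpoint_resolver")


async def create_offloaded():
    # The same sync call runs in the default thread pool; the loop stays free.
    loop = asyncio.get_event_loop()
    return await loop.run_in_executor(
        None, new_locator().get_component, "endpoint_resolver")


async def timed(coro_fn, n):
    start = time.perf_counter()
    await asyncio.gather(*(coro_fn() for _ in range(n)))
    return time.perf_counter() - start


def compare(n=8):
    """Return (blocking_seconds, offloaded_seconds) for n concurrent creations."""
    loop = asyncio.new_event_loop()
    try:
        blocking = loop.run_until_complete(timed(create_blocking, n))
        offloaded = loop.run_until_complete(timed(create_offloaded, n))
    finally:
        loop.close()
    return blocking, offloaded
```

Calling compare(8) should show the blocking variant taking roughly n times the factory delay, because the coroutines run one after another, while the offloaded variant overlaps the waits across worker threads. Whether offloading is appropriate for the real session code is a separate question that this sketch does not answer.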

We wonder whether it is possible to make both Session and resource creation fully asynchronous in order to eliminate the blocking. Our backend code creates many sessions/resources and we’re trying to optimize its performance.

Thank you in advance !!

Top GitHub Comments

thehesiod commented, May 19, 2022

and in either case can use your monkeypatch to speed up your use case 😃. Actually if this really does take a significant amount of time I’d like to use this monkeypatch at our company as well, or perhaps we could ship an optional set of speedups in aiobotocore. Please do report your test and findings here!

thehesiod commented, May 19, 2022

@zferentz I think it would be great to speed up botocore. I actually proposed some changes to how they parse timestamps years ago that sped up our scenario by 20%, and they have yet to merge my change, so unfortunately getting them to change things can be glacial. However, I suggest the following: given that list_available_services is doing a file listing, and one wouldn’t reasonably expect the service list to change during the lifetime of the app, I think this is one of the few instances where it would be worthwhile to cache the service list in a global. I would make a monkeypatch to replace that method with one that caches globally, then write a small Python test that shows the speed difference by running both versions 100 times or so. Then open a botocore bug to propose your solution. This is really a botocore issue and not an aiobotocore issue IMO. Let me know if you disagree.
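
The approach suggested in this comment could be sketched roughly as follows. To keep the example self-contained, it patches FakeLoader, a hypothetical stand-in class that lists subdirectories of a temporary folder. The helper names cache_globally, time_calls and demo, the directory names, and the "service-2" argument are likewise only illustrative. Applying the same helper to botocore's own loader class is an assumption that has not been verified here.

```python
import functools
import os
import tempfile
import time


def cache_globally(cls, method_name):
    """Replace cls.method_name with a version whose results are cached in a
    dict shared by all instances. Returns the original so it can be restored.
    Assumes positional, hashable arguments and that every instance would
    return the same result for the same arguments."""
    original = getattr(cls, method_name)
    cache = {}

    @functools.wraps(original)
    def cached(self, *args):
        if args not in cache:
            cache[args] = original(self, *args)
        return list(cache[args])  # copy so callers cannot mutate the cache

    setattr(cls, method_name, cached)
    return original


class FakeLoader:
    """Hypothetical stand-in for a loader that lists service directories."""

    def __init__(self, data_path):
        self.data_path = data_path

    def list_available_services(self, type_name):
        return sorted(
            name for name in os.listdir(self.data_path)
            if os.path.isdir(os.path.join(self.data_path, name)))


def time_calls(loader, repeats=100):
    start = time.perf_counter()
    for _ in range(repeats):
        loader.list_available_services("service-2")
    return time.perf_counter() - start


def demo():
    """Run both versions 100 times and return results and timings."""
    with tempfile.TemporaryDirectory() as root:
        for name in ("dynamodb", "s3", "sqs"):
            os.mkdir(os.path.join(root, name))
        loader = FakeLoader(root)
        uncached_result = loader.list_available_services("service-2")
        uncached_time = time_calls(loader)
        original = cache_globally(FakeLoader, "list_available_services")
        try:
            cached_result = loader.list_available_services("service-2")
            cached_time = time_calls(loader)
        finally:
            # Undo the monkeypatch so the demo leaves the class unchanged.
            FakeLoader.list_available_services = original
    return uncached_result, cached_result, uncached_time, cached_time
```

The cache deliberately ignores which instance made the call, which is what makes it global. That is only safe under the comment's premise that the service list does not change during the lifetime of the app.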
