Large parallel scans return empty
I ran into an issue today similar to https://github.com/boto/boto3/issues/2362#issue-590929269
When I attempt to run a parallel scan with a large number of total segments (say, 10x the total number of items), it returns an empty result. But when I scale it down to 1.5x the total number of items, it runs with no problem.
Is there a limit or a magic formula used to determine the max number of possible parallel scans? Please let me know, thanks!
Here is a log dump from a recent attempt.
2021-09-30 21:57:00,498 botocore.hooks [DEBUG] Event choose-service-name: calling handler <function handle_service_name_alias at 0x7f8c171f00e0>
2021-09-30 21:57:00,499 botocore.hooks [DEBUG] Event creating-client-class.dynamodb: calling handler <function add_generate_presigned_url at 0x7f8c1721cb90>
2021-09-30 21:57:00,499 botocore.hooks [DEBUG] Event creating-client-class.dynamodb: calling handler <function add_generate_presigned_url at 0x7f8c1721cb90>
2021-09-30 21:57:00,501 botocore.endpoint [DEBUG] Setting dynamodb timeout as (60, 60)
2021-09-30 21:57:00,501 botocore.endpoint [DEBUG] Setting dynamodb timeout as (60, 60)
2021-09-30 21:57:00,501 botocore.client [DEBUG] Registering retry handlers for service: dynamodb
2021-09-30 21:57:00,501 botocore.client [DEBUG] Registering retry handlers for service: dynamodb
2021-09-30 21:57:00,502 boto3.resources.factory [DEBUG] Loading dynamodb:dynamodb
2021-09-30 21:57:00,502 boto3.resources.factory [DEBUG] Loading dynamodb:dynamodb
2021-09-30 21:57:00,503 botocore.hooks [DEBUG] Event creating-resource-class.dynamodb.ServiceResource: calling handler <function lazy_call.<locals>._handler at 0x7f8c16d10f80>
2021-09-30 21:57:00,503 botocore.hooks [DEBUG] Event creating-resource-class.dynamodb.ServiceResource: calling handler <function lazy_call.<locals>._handler at 0x7f8c16d10f80>
2021-09-30 21:57:00,503 boto3.resources.factory [DEBUG] Loading dynamodb:Table
2021-09-30 21:57:00,503 boto3.resources.factory [DEBUG] Loading dynamodb:Table
2021-09-30 21:57:00,504 botocore.hooks [DEBUG] Event creating-resource-class.dynamodb.Table: calling handler <function lazy_call.<locals>._handler at 0x7f8c16d46050>
2021-09-30 21:57:00,504 botocore.hooks [DEBUG] Event creating-resource-class.dynamodb.Table: calling handler <function lazy_call.<locals>._handler at 0x7f8c16d46050>
2021-09-30 21:57:00,504 botocore.hooks [DEBUG] Event creating-resource-class.dynamodb.Table: calling handler <function lazy_call.<locals>._handler at 0x7f8c16d10f80>
2021-09-30 21:57:00,504 botocore.hooks [DEBUG] Event creating-resource-class.dynamodb.Table: calling handler <function lazy_call.<locals>._handler at 0x7f8c16d10f80>
2021-09-30 21:57:00,505 boto3.resources.action [DEBUG] Calling dynamodb:scan with {'TableName': 'some_table', 'Segment': 0, 'TotalSegments': 24576}
2021-09-30 21:57:00,505 boto3.resources.action [DEBUG] Calling dynamodb:scan with {'TableName': 'some_table', 'Segment': 0, 'TotalSegments': 24576}
2021-09-30 21:57:00,506 botocore.hooks [DEBUG] Event provide-client-params.dynamodb.Scan: calling handler <function copy_dynamodb_params at 0x7f8c16474440>
2021-09-30 21:57:00,506 botocore.hooks [DEBUG] Event provide-client-params.dynamodb.Scan: calling handler <function copy_dynamodb_params at 0x7f8c16474440>
2021-09-30 21:57:00,506 botocore.hooks [DEBUG] Event before-parameter-build.dynamodb.Scan: calling handler <bound method TransformationInjector.inject_condition_expressions of <boto3.dynamodb.transform.TransformationInjector object at 0x7f8c15f14a90>>
2021-09-30 21:57:00,506 botocore.hooks [DEBUG] Event before-parameter-build.dynamodb.Scan: calling handler <bound method TransformationInjector.inject_condition_expressions of <boto3.dynamodb.transform.TransformationInjector object at 0x7f8c15f14a90>>
2021-09-30 21:57:00,506 botocore.hooks [DEBUG] Event before-parameter-build.dynamodb.Scan: calling handler <bound method TransformationInjector.inject_attribute_value_input of <boto3.dynamodb.transform.TransformationInjector object at 0x7f8c15f14a90>>
2021-09-30 21:57:00,506 botocore.hooks [DEBUG] Event before-parameter-build.dynamodb.Scan: calling handler <bound method TransformationInjector.inject_attribute_value_input of <boto3.dynamodb.transform.TransformationInjector object at 0x7f8c15f14a90>>
2021-09-30 21:57:00,506 botocore.hooks [DEBUG] Event before-parameter-build.dynamodb.Scan: calling handler <function generate_idempotent_uuid at 0x7f8c171a3680>
2021-09-30 21:57:00,506 botocore.hooks [DEBUG] Event before-parameter-build.dynamodb.Scan: calling handler <function generate_idempotent_uuid at 0x7f8c171a3680>
2021-09-30 21:57:00,506 botocore.hooks [DEBUG] Event before-parameter-build.dynamodb.Scan: calling handler <function block_endpoint_discovery_required_operations at 0x7f8c1734c050>
2021-09-30 21:57:00,506 botocore.hooks [DEBUG] Event before-parameter-build.dynamodb.Scan: calling handler <function block_endpoint_discovery_required_operations at 0x7f8c1734c050>
2021-09-30 21:57:00,507 botocore.hooks [DEBUG] Event before-call.dynamodb.Scan: calling handler <function inject_api_version_header_if_needed at 0x7f8c171abef0>
2021-09-30 21:57:00,507 botocore.hooks [DEBUG] Event before-call.dynamodb.Scan: calling handler <function inject_api_version_header_if_needed at 0x7f8c171abef0>
2021-09-30 21:57:00,507 botocore.endpoint [DEBUG] Making request for OperationModel(name=Scan) with params: {'url_path': '/', 'query_string': '', 'method': 'POST', 'headers': {'X-Amz-Target': 'DynamoDB_20120810.Scan', 'Content-Type': 'application/x-amz-json-1.0', 'User-Agent': 'Boto3/1.17.66 Python/3.7.10 Linux/4.14.243-185.433.amzn2.x86_64 exec-env/CloudShell Botocore/1.20.66 Resource'}, 'body': b'{"TableName": "some_table", "Segment": 0, "TotalSegments": 24576}', 'url': 'https://dynamodb.us-east-1.amazonaws.com/', 'context': {'client_region': 'us-east-1', 'client_config': <botocore.config.Config object at 0x7f8c15f10150>, 'has_streaming_input': False, 'auth_type': None}}
2021-09-30 21:57:00,507 botocore.endpoint [DEBUG] Making request for OperationModel(name=Scan) with params: {'url_path': '/', 'query_string': '', 'method': 'POST', 'headers': {'X-Amz-Target': 'DynamoDB_20120810.Scan', 'Content-Type': 'application/x-amz-json-1.0', 'User-Agent': 'Boto3/1.17.66 Python/3.7.10 Linux/4.14.243-185.433.amzn2.x86_64 exec-env/CloudShell Botocore/1.20.66 Resource'}, 'body': b'{"TableName": "some_table", "Segment": 0, "TotalSegments": 24576}', 'url': 'https://dynamodb.us-east-1.amazonaws.com/', 'context': {'client_region': 'us-east-1', 'client_config': <botocore.config.Config object at 0x7f8c15f10150>, 'has_streaming_input': False, 'auth_type': None}}
2021-09-30 21:57:00,507 botocore.hooks [DEBUG] Event request-created.dynamodb.Scan: calling handler <bound method RequestSigner.handler of <botocore.signers.RequestSigner object at 0x7f8c15f1ebd0>>
2021-09-30 21:57:00,507 botocore.hooks [DEBUG] Event request-created.dynamodb.Scan: calling handler <bound method RequestSigner.handler of <botocore.signers.RequestSigner object at 0x7f8c15f1ebd0>>
2021-09-30 21:57:00,507 botocore.hooks [DEBUG] Event choose-signer.dynamodb.Scan: calling handler <function set_operation_specific_signer at 0x7f8c171a3560>
2021-09-30 21:57:00,507 botocore.hooks [DEBUG] Event choose-signer.dynamodb.Scan: calling handler <function set_operation_specific_signer at 0x7f8c171a3560>
2021-09-30 21:57:00,508 botocore.credentials [DEBUG] Credentials need to be refreshed.
2021-09-30 21:57:00,508 botocore.credentials [DEBUG] Credentials need to be refreshed.
2021-09-30 21:57:00,508 botocore.credentials [DEBUG] Credentials need to be refreshed.
2021-09-30 21:57:00,508 botocore.credentials [DEBUG] Credentials need to be refreshed.
2021-09-30 21:57:00,508 botocore.credentials [DEBUG] Credentials need to be refreshed.
2021-09-30 21:57:00,508 botocore.credentials [DEBUG] Credentials need to be refreshed.
2021-09-30 21:57:00,509 urllib3.connectionpool [DEBUG] Resetting dropped connection: localhost
2021-09-30 21:57:00,509 urllib3.connectionpool [DEBUG] Resetting dropped connection: localhost
2021-09-30 21:57:00,509 urllib3.connectionpool [DEBUG] http://localhost:1338 "GET /latest/meta-data/container/security-credentials HTTP/1.1" 200 1212
2021-09-30 21:57:00,509 urllib3.connectionpool [DEBUG] http://localhost:1338 "GET /latest/meta-data/container/security-credentials HTTP/1.1" 200 1212
2021-09-30 21:57:00,510 botocore.credentials [DEBUG] Retrieved credentials will expire at: 2021-09-30 22:05:37+00:00
2021-09-30 21:57:00,510 botocore.credentials [DEBUG] Retrieved credentials will expire at: 2021-09-30 22:05:37+00:00
2021-09-30 21:57:00,510 botocore.auth [DEBUG] Calculating signature using v4 auth.
2021-09-30 21:57:00,510 botocore.auth [DEBUG] Calculating signature using v4 auth.
2021-09-30 21:57:00,511 botocore.auth [DEBUG] CanonicalRequest:
POST
2021-09-30 21:57:00,511 botocore.endpoint [DEBUG] Sending http request: <AWSPreparedRequest stream_output=False, method=POST, url=https://dynamodb.us-east-1.amazonaws.com/, headers={'X-Amz-Target': b'DynamoDB_20120810.Scan', 'Content-Type': b'application/x-amz-json-1.0', 'User-Agent': b'Boto3/1.17.66 Python/3.7.10 Linux/4.14.243-185.433.amzn2.x86_64 exec-env/CloudShell Botocore/1.20.66 Resource', 'X-Amz-Date': b'20210930T215700Z', apitoken:washere, 'Authorization': b'AWS4-HMAC-SHA256 Credential=XXXXXXX, SignedHeaders=content-type;host;x-amz-date;x-amz-security-token;x-amz-target, Signature=XXXXXX', 'Content-Length': '64'}>
2021-09-30 21:57:00,511 botocore.endpoint [DEBUG] Sending http request: <AWSPreparedRequest stream_output=False, method=POST, url=https://dynamodb.us-east-1.amazonaws.com/, headers={'X-Amz-Target': b'DynamoDB_20120810.Scan', 'Content-Type': b'application/x-amz-json-1.0', 'User-Agent': b'Boto3/1.17.66 Python/3.7.10 Linux/4.14.243-185.433.amzn2.x86_64 exec-env/CloudShell Botocore/1.20.66 Resource', 'X-Amz-Date': b'20210930T215700Z', apitoken:washere, 'Authorization': b'AWS4-HMAC-SHA256 Credential=XXXXXXX SignedHeaders=content-type;host;x-amz-date;x-amz-security-token;x-amz-target, Signature=XXXXXX', 'Content-Length': '64'}>
2021-09-30 21:57:00,512 botocore.httpsession [DEBUG] Certificate path: /usr/local/lib/python3.7/site-packages/certifi/cacert.pem
2021-09-30 21:57:00,512 botocore.httpsession [DEBUG] Certificate path: /usr/local/lib/python3.7/site-packages/certifi/cacert.pem
2021-09-30 21:57:00,512 urllib3.connectionpool [DEBUG] Starting new HTTPS connection (1): dynamodb.us-east-1.amazonaws.com:443
2021-09-30 21:57:00,512 urllib3.connectionpool [DEBUG] Starting new HTTPS connection (1): dynamodb.us-east-1.amazonaws.com:443
2021-09-30 21:57:00,538 urllib3.connectionpool [DEBUG] https://dynamodb.us-east-1.amazonaws.com:443 "POST / HTTP/1.1" 200 39
2021-09-30 21:57:00,538 urllib3.connectionpool [DEBUG] https://dynamodb.us-east-1.amazonaws.com:443 "POST / HTTP/1.1" 200 39
2021-09-30 21:57:00,539 botocore.parsers [DEBUG] Response headers: {'Server': 'Server', 'Date': 'Thu, 30 Sep 2021 21:57:00 GMT', 'Content-Type': 'application/x-amz-json-1.0', 'Content-Length': '39', 'Connection': 'keep-alive', 'x-amzn-RequestId': 'PA8IM0TQ99VRN8FOGMLN3UQ25FVV4KQNSO5AEMVJF66Q9ASUAAJG', 'x-amz-crc32': '3413411624'}
2021-09-30 21:57:00,539 botocore.parsers [DEBUG] Response headers: {'Server': 'Server', 'Date': 'Thu, 30 Sep 2021 21:57:00 GMT', 'Content-Type': 'application/x-amz-json-1.0', 'Content-Length': '39', 'Connection': 'keep-alive', 'x-amzn-RequestId': 'PA8IM0TQ99VRN8FOGMLN3UQ25FVV4KQNSO5AEMVJF66Q9ASUAAJG', 'x-amz-crc32': '3413411624'}
2021-09-30 21:57:00,539 botocore.parsers [DEBUG] Response body:
b'{"Count":0,"Items":[],"ScannedCount":0}'
2021-09-30 21:57:00,539 botocore.parsers [DEBUG] Response body:
b'{"Count":0,"Items":[],"ScannedCount":0}'
2021-09-30 21:57:00,540 botocore.hooks [DEBUG] Event needs-retry.dynamodb.Scan: calling handler <botocore.retryhandler.RetryHandler object at 0x7f8c15f0d410>
2021-09-30 21:57:00,540 botocore.hooks [DEBUG] Event needs-retry.dynamodb.Scan: calling handler <botocore.retryhandler.RetryHandler object at 0x7f8c15f0d410>
2021-09-30 21:57:00,540 botocore.retryhandler [DEBUG] No retry needed.
2021-09-30 21:57:00,540 botocore.retryhandler [DEBUG] No retry needed.
2021-09-30 21:57:00,540 botocore.hooks [DEBUG] Event after-call.dynamodb.Scan: calling handler <bound method TransformationInjector.inject_attribute_value_output of <boto3.dynamodb.transform.TransformationInjector object at 0x7f8c15f14a90>>
2021-09-30 21:57:00,540 botocore.hooks [DEBUG] Event after-call.dynamodb.Scan: calling handler <bound method TransformationInjector.inject_attribute_value_output of <boto3.dynamodb.transform.TransformationInjector object at 0x7f8c15f14a90>>
2021-09-30 21:57:00,540 boto3.resources.action [DEBUG] Response: {'Items': [], 'Count': 0, 'ScannedCount': 0, 'ResponseMetadata': {'RequestId': 'PA8IM0TQ99VRN8FOGMLN3UQ25FVV4KQNSO5AEMVJF66Q9ASUAAJG', 'HTTPStatusCode': 200, 'HTTPHeaders': {'server': 'Server', 'date': 'Thu, 30 Sep 2021 21:57:00 GMT', 'content-type': 'application/x-amz-json-1.0', 'content-length': '39', 'connection': 'keep-alive', 'x-amzn-requestid': 'PA8IM0TQ99VRN8FOGMLN3UQ25FVV4KQNSO5AEMVJF66Q9ASUAAJG', 'x-amz-crc32': '3413411624'}, 'RetryAttempts': 0}}
2021-09-30 21:57:00,540 boto3.resources.action [DEBUG] Response: {'Items': [], 'Count': 0, 'ScannedCount': 0, 'ResponseMetadata': {'RequestId': 'PA8IM0TQ99VRN8FOGMLN3UQ25FVV4KQNSO5AEMVJF66Q9ASUAAJG', 'HTTPStatusCode': 200, 'HTTPHeaders': {'server': 'Server', 'date': 'Thu, 30 Sep 2021 21:57:00 GMT', 'content-type': 'application/x-amz-json-1.0', 'content-length': '39', 'connection': 'keep-alive', 'x-amzn-requestid': 'PA8IM0TQ99VRN8FOGMLN3UQ25FVV4KQNSO5AEMVJF66Q9ASUAAJG', 'x-amz-crc32': '3413411624'}, 'RetryAttempts': 0}}
[]
boto3: 1.17.66 python: 3.7.10
also seen on boto3: 1.17.69 python: 3.9.5
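For context, here is a minimal sketch of how a parallel scan is typically driven. The log above shows a request for Segment 0 only. When TotalSegments is much larger than the item count, any single segment may well hold no items, so an empty response for one segment does not by itself mean the table scan is empty. The DynamoDB Scan API reference lists 1,000,000 as the upper bound for TotalSegments. The helper names `scan_segment` and `scan_all_segments` are hypothetical (not part of boto3), and `table` is assumed to be any object with a boto3-style `scan(**kwargs)` method, such as `boto3.resource("dynamodb").Table("some_table")`.

```python
def scan_segment(table, segment, total_segments):
    """Scan one segment to completion, following LastEvaluatedKey pagination."""
    items = []
    kwargs = {"Segment": segment, "TotalSegments": total_segments}
    while True:
        response = table.scan(**kwargs)
        items.extend(response.get("Items", []))
        last_key = response.get("LastEvaluatedKey")
        if last_key is None:
            # No more pages in this segment (it may have been empty all along).
            return items
        kwargs["ExclusiveStartKey"] = last_key


def scan_all_segments(table, total_segments):
    """Collect items from every segment, 0 through total_segments - 1.

    This loop is sequential for clarity; in a real parallel scan each
    segment would be handed to its own worker.
    """
    items = []
    for segment in range(total_segments):
        items.extend(scan_segment(table, segment, total_segments))
    return items
```

Calling `scan_all_segments(table, 24576)` issues at least one request per segment, so a very large TotalSegments on a small table mostly produces empty responses like the one in the log.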
Top GitHub Comments
Awesome. Thanks @stobrien89! excited to see where this goes.
@stobrien89 Thanks for the update!
I tried creating a ratio (similar to your 1.9) on my end and found that the ratio method was inconsistent.