[BUG] Slow scan operations
Summary:
I’m using Dynamoose for my project, and overall it works well. However, scan operations are several times slower than the same operations using the AWS DocumentClient.
Code sample:
Schema
It’s the same with all my different schemas. Here is an example of the simplest one. None of them use Buffer type.
const userSchema = new dynamoose.Schema(
  {
    id: {
      type: String,
      required: true,
      hashKey: true,
    },
    phoneNumber: {
      type: String,
      index: {
        project: true,
        global: true,
        name: 'UserPhoneNumberIndex',
      },
    },
    name: String,
    status: String,
    email: String,
    // a few more fields with strings, numbers and booleans
  },
  { timestamps: true }
)
Model
const MyModel = dynamoose.model('myTable', userSchema, {
  create: false,
  waitForActive: false,
})
General
// dynamoose
MyModel.scan().all().exec()
// DocumentClient
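import * as AWS from 'aws-sdk'
import { PromiseResult } from 'aws-sdk/lib/request'
// `constants` and `getDocumentClient` are project helpers not shown here.

export const scanTable = async <T = unknown>(
  table: string,
  extraParams?: Partial<AWS.DynamoDB.DocumentClient.ScanInput>
): Promise<T[]> => {
  const params: AWS.DynamoDB.DocumentClient.ScanInput = {
    TableName: `${constants.ENV}-${table}`,
    ...extraParams,
  }
  const scanResults: unknown[] = []
  let response: PromiseResult<
    AWS.DynamoDB.DocumentClient.ScanOutput,
    AWS.AWSError
  >
  // Keep requesting pages until DynamoDB stops returning a LastEvaluatedKey.
  do {
    // eslint-disable-next-line no-await-in-loop
    response = await getDocumentClient().scan(params).promise()
    // Items can be undefined on a page, so fall back to an empty array.
    scanResults.push(...(response.Items ?? []))
    params.ExclusiveStartKey = response.LastEvaluatedKey
  } while (typeof response.LastEvaluatedKey !== 'undefined')
  return scanResults as T[]
}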
Current output and behavior (including stack trace):
Example 1: Scanning ~450 items that are ~1 KB each (according to the AWS DynamoDB console).
- AWS DocumentClient, scanning all: ~400 ms
- Dynamoose scan all: ~4500 ms
Example 2: Scanning ~23000 items that are ~230 bytes each.
- AWS DocumentClient, scanning all: ~6000 ms
- Dynamoose scan all: ~24000 ms
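For anyone trying to reproduce these numbers, here is a minimal sketch of a timing wrapper. `timeAsync` is a hypothetical helper, not part of Dynamoose or the AWS SDK, and `fakeScan` is a stand-in so the snippet runs without AWS access. In the Lambda you would pass the two calls from this report instead.

```typescript
// Hypothetical helper (not part of Dynamoose or the AWS SDK): runs an async
// function once and reports the wall-clock time it took in milliseconds.
async function timeAsync<T>(
  label: string,
  fn: () => Promise<T>
): Promise<{ label: string; ms: number; result: T }> {
  const start = Date.now()
  const result = await fn()
  const ms = Date.now() - start
  return { label, ms, result }
}

// Stand-in for a real scan so the sketch runs without AWS access.
// In the Lambda, replace it with the two calls from the report:
//   () => MyModel.scan().all().exec()
//   () => scanTable('myTable')
const fakeScan = async (): Promise<number[]> => [1, 2, 3]

async function main(): Promise<void> {
  const timing = await timeAsync('fake scan', fakeScan)
  console.log(`${timing.label}: ${timing.ms} ms, ${timing.result.length} items`)
}

main().catch(console.error)
```

Running each variant several times in the same warm Lambda container and comparing the medians should make the measurement less sensitive to cold starts.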
Expected output and behavior:
Somewhat similar performance
Environment:
Operating System: Amazon Linux
Operating System Version: 2
Node.js version (node -v): v14.x
NPM version (npm -v): 7.18.1
Dynamoose version: 2.8.1
Other information (if applicable):
AWS lambda using the Serverless framework.
Serverless version: 2.48.0
AWS SDK version: 2.952.0
AWS_NODEJS_CONNECTION_REUSE_ENABLED: 1
Other:
- [ x ] I have read through the Dynamoose documentation before posting this issue
- [ x ] I have searched through the GitHub issues (including closed issues) and pull requests to ensure this issue has not already been raised before
- [ x ] I have searched the internet and Stack Overflow to ensure this issue hasn’t been raised or answered before
- [ x ] I have tested the code provided and am confident it doesn’t work as intended
- [ x ] I have filled out all fields above
- [ x ] I am running the latest version of Dynamoose
Top GitHub Comments
We were experiencing the same performance issues with a large query request. After upgrading to the v3 alpha, performance is now on par with the native client.
Hi,
I’ve got the same problem using Dynamoose on AWS Lambda; on localhost (Ubuntu) it works perfectly.
After some investigation I have found what causes the issue: this line https://github.com/dynamoose/dynamoose/blob/a838224d031ba32db6f84d427600beda5ec765ed/lib/DocumentRetriever.ts#L59
Processing 150 records took 10s (sic!)
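One way to check in your own environment whether the time goes into fetching pages or into per-item work afterwards is to time the two phases separately. The sketch below is only an illustration under assumptions: `profileScan`, `Page`, `fetchPage` and `processItem` are hypothetical names, not Dynamoose internals, and the fake page source stands in for a real DynamoDB scan.

```typescript
// Hypothetical types and names for illustration; none of this is Dynamoose API.
interface Page<T> {
  items: T[]
  nextKey?: string
}

interface ScanProfile<T> {
  items: T[]
  fetchMs: number
  processMs: number
  pages: number
}

// Walks all pages and accumulates fetch time and per-item processing time
// in separate counters.
async function profileScan<Raw, Out>(
  fetchPage: (startKey?: string) => Promise<Page<Raw>>,
  processItem: (raw: Raw) => Promise<Out>
): Promise<ScanProfile<Out>> {
  const items: Out[] = []
  let fetchMs = 0
  let processMs = 0
  let pages = 0
  let startKey: string | undefined
  do {
    const t0 = Date.now()
    const page = await fetchPage(startKey)
    const t1 = Date.now()
    const processed = await Promise.all(page.items.map(processItem))
    const t2 = Date.now()
    fetchMs += t1 - t0
    processMs += t2 - t1
    items.push(...processed)
    pages += 1
    startKey = page.nextKey
  } while (startKey !== undefined)
  return { items, fetchMs, processMs, pages }
}

// Fake two-page source so the sketch runs without AWS access.
const fakeFetch = async (startKey?: string): Promise<Page<number>> =>
  startKey === undefined ? { items: [1, 2], nextKey: 'second' } : { items: [3] }

async function demo(): Promise<void> {
  const profile = await profileScan(fakeFetch, async (n) => n * 2)
  console.log(
    `pages=${profile.pages} fetch=${profile.fetchMs} ms process=${profile.processMs} ms`
  )
}

demo().catch(console.error)
```

If `processMs` dominates on Lambda but not on a local machine, that would be consistent with the per-item processing being the bottleneck rather than the network calls.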