MongoDB: Performance issue with nested `take` on many-to-one relationship
See original GitHub issueBug description
It seems like a nested take
(inside a include
) query doesn’t actually include a $limit
operator on the mongodb aggregation pipeline which i believe causes mongodb to return all document from a collection, this results in a really slow query time when the collection has lots of documents. So this is how it goes from my understanding:
- query is executed
- mongodb returns all documents matching the filter
- prisma client takes 3 documents from what mongodb returned
this means a large chunk of data has to be downloaded only to return 3 documents…
for example, the query included in the prisma information section takes around 1:33 minutes on my internet (db on mongodb atlas) whereas an equivalent query using findUnique and aggregateRaw took around 700ms:
const chat = await prisma.chat.findUnique({
where: {
id: CHAT_ID
}
});
const messages = await prisma.message.aggregateRaw({
pipeline: [
{ $match: { chatId: CHAT_ID }, },
{ $sort: { _id: 1, }, },
{ $limit: 3 },
{ $project: { _id: 1, chatId: 1, authorId: 1, content: 1, deleted: 1, createdAt: 1, }, },
]
});
chat.messages = messages;
How to reproduce
git clone https://github.com/itsarnob/prisma-nested-take-repro
- set db url in the .env file
- Run
yarn seed
to seed the database with a chat and 100k messages - Run
yarn start
to see the time differences between nested take and the equivalent query using findUnique & aggregateRaw
Expected behavior
- the query should include
$limit
operator in the aggregation pipeline so mongodb returns n documents instead of all documents in the collection. - should be quick to return results.
Prisma information
Schema
// This is your Prisma schema file,
// learn more about it in the docs: https://pris.ly/d/prisma-schema
generator client {
provider = "prisma-client-js"
}
datasource db {
provider = "mongodb"
url = env("DATABASE_URL")
}
model Chat {
id String @id @default(auto()) @map("_id") @db.ObjectId
name String?
chatType ChatType
createdAt DateTime @default(now())
recipients chatRecipient[]
messages Message[] @relation("messages")
lastMessageId String? @unique
@@map("chats")
}
model Message {
id String @id @default(auto()) @map("_id") @db.ObjectId
chat Chat @relation("messages", fields: [chatId], references: [id])
chatId String
authorId String
content String?
deleted Boolean?
createdAt DateTime @default(now())
@@map("messages")
}
type chatRecipient {
userId String @db.ObjectId
nickname String?
}
enum ChatType {
Direct
Group
}
Query
const result1 = await prisma.chat.findUnique({
where: {
id: CHAT_ID
},
include: {
messages: {
take: 3
}
}
})
aggregation query logs
prisma:query db.chats.aggregate([ { $match: { $expr: { $and: [ { $and: [ { $eq: [ "$_id", ObjectId("62ac11f944f9818e45e3889e"), ], }, { $or: [ { $ne: [ { $ifNull: [ "$_id", null, ], }, null, ], }, { $eq: [ "$_id", null, ], }, ], }, ], }, ], }, }, }, { $project: { _id: 1, name: 1, chatType: 1, createdAt: 1, recipients.userId: 1, recipients.nickname: 1, lastMessageId: 1, }, }, ])
prisma:query db.messages.aggregate([ { $match: { $expr: { $and: [ { $in: [ "$chatId", [ "62ac11f944f9818e45e3889e", ], ], }, { $or: [ { $ne: [ { $ifNull: [ "$chatId", null, ], }, null, ], }, { $eq: [ "$chatId", null, ], }, ], }, ], }, }, }, { $sort: { _id: 1, }, }, { $project: { _id: 1, chatId: 1, authorId: 1, content: 1, deleted: 1, createdAt: 1, }, }, ])
Environment & setup
- OS: Pop!_OS
- Database: MongoDB
- Node.js version: v16.15.0
Prisma Version
prisma : 3.15.2
@prisma/client : 3.15.2
Current platform : debian-openssl-3.0.x
Query Engine (Node-API) : libquery-engine 461d6a05159055555eb7dfb337c9fb271cbd4d7e (at node_modules/@prisma/engines/libquery_engine-debian-openssl-3.0.x.so.node)
Migration Engine : migration-engine-cli 461d6a05159055555eb7dfb337c9fb271cbd4d7e (at node_modules/@prisma/engines/migration-engine-debian-openssl-3.0.x)
Introspection Engine : introspection-core 461d6a05159055555eb7dfb337c9fb271cbd4d7e (at node_modules/@prisma/engines/introspection-engine-debian-openssl-3.0.x)
Format Binary : prisma-fmt 461d6a05159055555eb7dfb337c9fb271cbd4d7e (at node_modules/@prisma/engines/prisma-fmt-debian-openssl-3.0.x)
Default Engines Hash : 461d6a05159055555eb7dfb337c9fb271cbd4d7e
Studio : 0.462.0
Issue Analytics
- State:
- Created a year ago
- Reactions:2
- Comments:8 (3 by maintainers)
Top Results From Across the Web
Model One-to-Many Relationships with Embedded Documents
This page describes a data model that uses embedded documents to describe a one-to-many relationship between connected data. Embedding connected data in a ......
Read more >MongoDB One to Many Relationship - Stack Overflow
One to Many Relations. In this relationship, there is many, many entities or many entities that map to the one entity.
Read more >Spring Data MongoDB - Relation Modelling
In this blog post, we take a look at the different possibilities of linking documents with manual references and DBRefs when the need...
Read more >MongoDB One-to-Many Relationship tutorial with Mongoose ...
You will also know 3 criteria to choose Referencing or Embedding for improving application performance. Then we're gonna make some MongoDB One- ...
Read more >API with NestJS #44. Implementing relationships with MongoDB
We implement the one-to-many and many-to-one relationships when a document from the first collection can be linked to multiple documents from ...
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
@peacechen thanks for this! I expanded this into a full reproduction with comparisons between MongoDB and Postgres. Building on the schema and query you’ve posted above I found the following:
Note that the times I found were significantly faster (4sec vs 30sec) but using a local MongoDB database.
Thanks @andrewicarlson for the detailed analysis 🎉
We’re using Mongo Atlas so there’s more of a network penalty. Our models are more complex than the simplified example. There are more relationships to other collections not included. Glad that you’re able to repro the linear relationship with the number of records.