MongoDB: Performance issue with nested `take` on many-to-one relationship

Bug description

It seems that a nested take (inside an include) does not add a $limit stage to the MongoDB aggregation pipeline. I believe this causes MongoDB to return every document matching the filter, which makes the query very slow when the collection holds many documents. As far as I understand, this is what happens:

  1. The query is executed.
  2. MongoDB returns all documents matching the filter.
  3. Prisma Client takes 3 documents from what MongoDB returned.

This means a large amount of data has to be downloaded just to return 3 documents.
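The difference between truncating on the client and limiting on the server can be illustrated with a small in-memory model. This is only a sketch: it involves neither MongoDB nor Prisma, and the names `makeMessages`, `takeOnClient` and `limitOnServer` are hypothetical helpers invented for the illustration.

```typescript
// Illustrative in-memory model only; no MongoDB or Prisma is involved.
type Message = { id: number; chatId: string };

// Build a fake collection of `count` messages belonging to one chat.
function makeMessages(chatId: string, count: number): Message[] {
  return Array.from({ length: count }, (_, i) => ({ id: i + 1, chatId }));
}

// Client-side take: the "server" returns every match and the client slices.
function takeOnClient(collection: Message[], chatId: string, n: number) {
  const transferred = collection.filter((m) => m.chatId === chatId);
  return { transferred: transferred.length, result: transferred.slice(0, n) };
}

// Server-side limit: the "server" stops after n matches, like a $limit stage.
function limitOnServer(collection: Message[], chatId: string, n: number) {
  const transferred: Message[] = [];
  for (const m of collection) {
    if (transferred.length === n) break;
    if (m.chatId === chatId) transferred.push(m);
  }
  return { transferred: transferred.length, result: transferred };
}

const collection = makeMessages("chat-1", 100000);
const clientSide = takeOnClient(collection, "chat-1", 3);
const serverSide = limitOnServer(collection, "chat-1", 3);

// Both approaches yield the same 3 messages, but the amount "transferred" differs.
console.log(clientSide.transferred, serverSide.transferred); // prints: 100000 3
```

Both variants return the same three messages; only the number of documents that would cross the network changes, which matches the slowdown described above.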

For example, the query in the Prisma information section takes around 1 minute 33 seconds on my connection (database hosted on MongoDB Atlas), whereas an equivalent query using findUnique and aggregateRaw takes around 700 ms:

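// Fetch the chat on its own, without the nested include.
const chat = await prisma.chat.findUnique({
    where: {
        id: CHAT_ID
    }
});

// Fetch the first 3 messages with a raw pipeline so that $limit runs on the server.
const messages = await prisma.message.aggregateRaw({
    pipeline: [
        { $match: { chatId: CHAT_ID } },
        { $sort: { _id: 1 } },
        { $limit: 3 },
        { $project: { _id: 1, chatId: 1, authorId: 1, content: 1, deleted: 1, createdAt: 1 } },
    ]
});

// findUnique returns null when no chat matches, so guard before attaching.
if (chat) {
    chat.messages = messages;
}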
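If this workaround is needed in more than one place, the pipeline can be built by a small pure function. The following is a minimal sketch: `limitedMessagesPipeline` and the `Stage` type are hypothetical names, and the stages simply mirror the ones in the workaround above.

```typescript
// Hypothetical helper; the stage contents mirror the workaround in this issue.
type Stage = Record<string, unknown>;

function limitedMessagesPipeline(chatId: string, limit: number): Stage[] {
  // MongoDB's $limit stage expects a positive integer.
  if (!Number.isInteger(limit) || limit <= 0) {
    throw new RangeError("limit must be a positive integer");
  }
  return [
    { $match: { chatId } },
    { $sort: { _id: 1 } },
    { $limit: limit },
    { $project: { _id: 1, chatId: 1, authorId: 1, content: 1, deleted: 1, createdAt: 1 } },
  ];
}

const pipeline = limitedMessagesPipeline("62ac11f944f9818e45e3889e", 3);
console.log(JSON.stringify(pipeline[2])); // prints: {"$limit":3}
```

The returned array is meant to be passed as the `pipeline` option of `prisma.message.aggregateRaw`; depending on the generated Prisma types, a cast may be needed.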

How to reproduce

  1. git clone https://github.com/itsarnob/prisma-nested-take-repro
  2. Set the database URL in the .env file
  3. Run yarn seed to seed the database with a chat and 100k messages
  4. Run yarn start to see the time differences between nested take and the equivalent query using findUnique & aggregateRaw

Expected behavior

  • The query should include a $limit stage in the aggregation pipeline so that MongoDB returns n documents instead of every matching document.
  • The query should return results quickly.

Prisma information

Schema

// This is your Prisma schema file,
// learn more about it in the docs: https://pris.ly/d/prisma-schema

generator client {
  provider = "prisma-client-js"
}

datasource db {
  provider = "mongodb"
  url      = env("DATABASE_URL")
}

model Chat {

  id            String          @id @default(auto()) @map("_id") @db.ObjectId
  name          String?
  chatType      ChatType
  createdAt     DateTime        @default(now())
  recipients    chatRecipient[]
  messages      Message[]       @relation("messages")
  lastMessageId String?         @unique

  @@map("chats")
}

model Message {
  id        String   @id @default(auto()) @map("_id") @db.ObjectId
  chat      Chat     @relation("messages", fields: [chatId], references: [id])
  chatId    String
  authorId  String
  content   String?
  deleted   Boolean?
  createdAt DateTime @default(now())

  @@map("messages")
}

type chatRecipient {
  userId   String  @db.ObjectId
  nickname String?

}

enum ChatType {
  Direct
  Group
}

Query

const result1 = await prisma.chat.findUnique({
    where: {
        id: CHAT_ID
    },
    include: {
        messages: {
            take: 3
        }
    }
})

Aggregation query logs

 prisma:query db.chats.aggregate([ { $match: { $expr: { $and: [ { $and: [ { $eq: [ "$_id", ObjectId("62ac11f944f9818e45e3889e"), ], }, { $or: [ { $ne: [ { $ifNull: [ "$_id", null, ], }, null, ], }, { $eq: [ "$_id", null, ], }, ], }, ], }, ], }, }, }, { $project: { _id: 1, name: 1, chatType: 1, createdAt: 1, recipients.userId: 1, recipients.nickname: 1, lastMessageId: 1, }, }, ])
 prisma:query db.messages.aggregate([ { $match: { $expr: { $and: [ { $in: [ "$chatId", [ "62ac11f944f9818e45e3889e", ], ], }, { $or: [ { $ne: [ { $ifNull: [ "$chatId", null, ], }, null, ], }, { $eq: [ "$chatId", null, ], }, ], }, ], }, }, }, { $sort: { _id: 1, }, }, { $project: { _id: 1, chatId: 1, authorId: 1, content: 1, deleted: 1, createdAt: 1, }, }, ])
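Note that the second logged pipeline goes from $sort straight to $project, with no $limit in between. The sketch below shows the stage order I would expect instead. It is not how the Prisma query engine builds pipelines; `withLimitAfterSort` is a hypothetical helper, and the $match stage is simplified from the logged $expr.

```typescript
type Stage = Record<string, unknown>;

// Return a copy of `pipeline` with { $limit: n } placed right after the first
// $sort stage, or appended at the end if there is no $sort stage.
function withLimitAfterSort(pipeline: Stage[], n: number): Stage[] {
  const sortIndex = pipeline.findIndex((stage) => "$sort" in stage);
  const insertAt = sortIndex === -1 ? pipeline.length : sortIndex + 1;
  return [...pipeline.slice(0, insertAt), { $limit: n }, ...pipeline.slice(insertAt)];
}

// Same stage order as the logged db.messages.aggregate call.
const logged: Stage[] = [
  { $match: { chatId: "62ac11f944f9818e45e3889e" } }, // simplified from the logged $expr
  { $sort: { _id: 1 } },
  { $project: { _id: 1, chatId: 1, authorId: 1, content: 1, deleted: 1, createdAt: 1 } },
];

const expected = withLimitAfterSort(logged, 3);
console.log(expected.map((stage) => Object.keys(stage)[0]).join(" -> "));
// prints: $match -> $sort -> $limit -> $project
```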

Environment & setup

  • OS: Pop!_OS
  • Database: MongoDB
  • Node.js version: v16.15.0

Prisma Version

prisma                  : 3.15.2
@prisma/client          : 3.15.2
Current platform        : debian-openssl-3.0.x
Query Engine (Node-API) : libquery-engine 461d6a05159055555eb7dfb337c9fb271cbd4d7e (at node_modules/@prisma/engines/libquery_engine-debian-openssl-3.0.x.so.node)
Migration Engine        : migration-engine-cli 461d6a05159055555eb7dfb337c9fb271cbd4d7e (at node_modules/@prisma/engines/migration-engine-debian-openssl-3.0.x)
Introspection Engine    : introspection-core 461d6a05159055555eb7dfb337c9fb271cbd4d7e (at node_modules/@prisma/engines/introspection-engine-debian-openssl-3.0.x)
Format Binary           : prisma-fmt 461d6a05159055555eb7dfb337c9fb271cbd4d7e (at node_modules/@prisma/engines/prisma-fmt-debian-openssl-3.0.x)
Default Engines Hash    : 461d6a05159055555eb7dfb337c9fb271cbd4d7e
Studio                  : 0.462.0

Issue Analytics

  • State: open
  • Created a year ago
  • Reactions: 2
  • Comments: 8 (3 by maintainers)

Top GitHub Comments

2 reactions
andrewicarlson commented, Nov 4, 2022

@peacechen thanks for this! I expanded this into a full reproduction with comparisons between MongoDB and Postgres. Building on the schema and query you’ve posted above I found the following:

[Two screenshots: Screen Shot 2022-11-03 at 23 38 15 and Screen Shot 2022-11-03 at 23 38 21]

Note that the times I measured were significantly faster (4 s vs. 30 s), but I was using a local MongoDB database.

1 reaction
peacechen commented, Nov 4, 2022

Thanks @andrewicarlson for the detailed analysis 🎉

We’re using Mongo Atlas so there’s more of a network penalty. Our models are more complex than the simplified example. There are more relationships to other collections not included. Glad that you’re able to repro the linear relationship with the number of records.
