Prisma not applying limit when take specified with id cursor

Bug description

I was noticing that when paginating through one of my larger tables with prisma, the first page was always loading quickly but subsequent ones were taking many seconds. My queries look like this:

function getPaginationArgs(cursor: string | undefined) {
  return {
    take: 10,
    cursor: cursor ? { id: cursor } : undefined,
    skip: cursor ? 1 : undefined,
    orderBy: { createdAt: "desc" },
  } as const;
}

  const page1 = await prisma.page.findMany(getPaginationArgs(undefined));
  const cursor = page1[page1.length - 1].id;
  const page2 = await prisma.page.findMany(getPaginationArgs(cursor));

I went in and looked at the queries prisma was issuing:

First page:

SELECT "prisma_test_schema_1"."page"."id", "prisma_test_schema_1"."page"."created_at", "prisma_test_schema_1"."page"."url" FROM "prisma_test_schema_1"."page" WHERE 1=1 ORDER BY "prisma_test_schema_1"."page"."created_at" DESC LIMIT $1 OFFSET $2

Second Page:

SELECT "prisma_test_schema_1"."page"."id", "prisma_test_schema_1"."page"."created_at", "prisma_test_schema_1"."page"."url" FROM "prisma_test_schema_1"."page", (SELECT "prisma_test_schema_1"."page"."created_at" FROM "prisma_test_schema_1"."page" WHERE ("prisma_test_schema_1"."page"."id") = ($1)) AS "order_cmp" WHERE "prisma_test_schema_1"."page"."created_at" <= "order_cmp"."created_at" ORDER BY "prisma_test_schema_1"."page"."created_at" DESC OFFSET $2

The big difference here is that LIMIT is present in the first query but not the second.

How to reproduce

https://github.com/TLadd/prisma-unique-composite-key-query-bug/blob/master/README.md Follow instructions in README for a reproducible example. Basically just the above code snippet.

Expected behavior

I would expect the LIMIT to be applied to the second query as well.

Prisma information

generator client {
  provider      = "prisma-client-js"
  binaryTargets = ["native", "debian-openssl-1.1.x"]
}

datasource db {
  provider = "postgresql"
  url      = env("DATABASE_URL")
}

model Page {
  id              String     @id @default(dbgenerated())
  createdAt       DateTime   @default(now()) @map("created_at")
  url             String     @unique

  @@map("page")
}

Environment & setup

  • OS: Mac OS
  • Database: PostgreSQL
  • Node.js version: v12.16.2
  • Prisma version: 2.12.1

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 4
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

10 reactions
dpetrick commented, Dec 31, 2020

Short version: This is by design and not a bug. We have to fall back to an inefficient query because the query orders by createdAt, which is not a unique non-nullable (required) field. If you can, add a required unique field to the orderBy and you will see that the pagination becomes efficient again. Or use pure skip/take-based pagination without a cursor, because cursors are hard in SQL.
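
The two workarounds from the short version could be sketched as argument builders for findMany. This is a minimal sketch, assuming the Page model from the issue and a Prisma Client version that accepts an array for orderBy; the helper names cursorPageArgs and offsetPageArgs are hypothetical. Since id is the model's @id, it is required and unique, so it can serve as the tie-breaker in the ordering.

```typescript
// Hypothetical helpers; the result is meant to be passed to prisma.page.findMany(...).
const PAGE_SIZE = 10;

// Cursor pagination with a required unique field (id) added to the ordering.
// Assumption: the installed Prisma Client accepts an array for orderBy.
function cursorPageArgs(cursor?: string) {
  return {
    take: PAGE_SIZE,
    cursor: cursor ? { id: cursor } : undefined,
    skip: cursor ? 1 : undefined, // skip the cursor row itself
    orderBy: [{ createdAt: "desc" as const }, { id: "desc" as const }],
  };
}

// Pure skip/take pagination without a cursor.
function offsetPageArgs(pageIndex: number) {
  return {
    take: PAGE_SIZE,
    skip: pageIndex * PAGE_SIZE,
    orderBy: { createdAt: "desc" as const },
  };
}
```

Usage would look like prisma.page.findMany(cursorPageArgs(lastId)) for the cursor variant, or prisma.page.findMany(offsetPageArgs(2)) for the third page of the offset variant.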

Long version: Cursor-based pagination is hard in SQL and requires a few trade-offs for the abstraction that Prisma uses. Let me explain why. On the surface, one might think it's a fairly simple issue to solve - "a few > / >=, ... here and there and you got it". Unfortunately, this is not the case. The issues come with the interplay of different factors, such as linear vs non-linear data, whether the ordering is "unique", and how many order-by clauses you have. Let me illustrate the thought processes and issues with a few examples. Any ideas how to solve it differently are welcome.

Note: Regardless of the above, the first query is always fast because we don't have to pin a cursor. We can just apply the ordering and take the first x elements, resulting in a simple, efficient query.

Once we get to anything past the first page, we want to build a query that returns all the data that we want. A common approach in SQL is to use CURSOR .. FETCH. The tl;dr here is: the Prisma architecture and use cases don't match this approach, in addition to the complications of the transaction lifecycle that would be required to make this work. This leaves the option of building a query from scratch for Prisma - so let's do that.

Sample data (assuming non-linear random IDs for worst-case-ish illustration), ordered by colA ASC:

id    colA  colB
rng1  A     B
rng2  A     A
rng3  B     B
rng4  B     B
rng5  B     A
rng6  C     C

Assume we have a cursor at rng4 with ORDER BY colA ASC after the first page. We'd need a query that fetches rng4, rng5 and rng6 (Prisma returns the cursor by convention, but not really relevant for the complexity of the issue).

A naive first take like WHERE id >= "rng4" ORDER BY colA ASC LIMIT <PageSize> doesn't work, because we can't assume that id contains linear data, meaning there's actually no way to use id in the comparison in a meaningful way. For all we know, rng1 could come after rng4, lexicographically.

This means we have to rely on the rest of the row the cursor points to to determine a way to fetch the next records. We have colA = B and colB = B. We only have useful information about colA because it's in the ordering and it states it is ascending, so we can use that info to fetch rows after the cursor. For that matter, we need to fetch the value of colA of the row with cursor rng4 in a subquery so we can use it in the rest of our query (cursors always pin exactly one record):

SELECT ...
FROM
(
    SELECT `Table`.`colA`
    FROM   `Table`
    WHERE  `Table`.`id` = <rng4>
) AS subquery
...

Based on colA ASC we can make an assertion that the records following rng4 have to have colA >= subquery.colA (which is "B" in this example). A glaring issue with this query is that it still doesn't guarantee that returned records come after the specified cursor, because as seen in the example, rng3 also has colA = B, meaning the result set is now rng3 through rng6. To make things worse, we can't apply a limit now because we don't know how many records with identical colA value we have that come before the cursor, so we can't do something like LIMIT <user specified take> + <num identical order rows before cursor>.
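
To make the tie problem concrete, here is a small in-memory simulation of that comparison over the sample data. It is only an illustration of the reasoning, not Prisma's actual query; the function name rowsAtOrAfterCursorValue is hypothetical.

```typescript
type Row = { id: string; colA: string; colB: string };

// Sample data from the table above, already ordered by colA ASC.
const rows: Row[] = [
  { id: "rng1", colA: "A", colB: "B" },
  { id: "rng2", colA: "A", colB: "A" },
  { id: "rng3", colA: "B", colB: "B" },
  { id: "rng4", colA: "B", colB: "B" },
  { id: "rng5", colA: "B", colB: "A" },
  { id: "rng6", colA: "C", colB: "C" },
];

// Rough equivalent of:
//   WHERE colA >= (SELECT colA FROM Table WHERE id = <cursor>) ORDER BY colA ASC
function rowsAtOrAfterCursorValue(table: Row[], cursorId: string): Row[] {
  const cursorRow = table.find((r) => r.id === cursorId);
  if (!cursorRow) return [];
  // filter keeps the input order, and the input is already sorted by colA
  return table.filter((r) => r.colA >= cursorRow.colA);
}

const superset = rowsAtOrAfterCursorValue(rows, "rng4").map((r) => r.id);
// superset is ["rng3", "rng4", "rng5", "rng6"]:
// rng3 ties with the cursor on colA, so it is included even though it precedes rng4.
```

Applying a limit of 2 directly to this result would return rng3 and rng4 rather than a page starting at the cursor, which is why the limit cannot be pushed into the database here.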

Note 1: The end result is even more complicated (not all edge cases like multiple order bys or NULL values in the cursor row / order have been taken into account for brevity), but the above should suffice to illustrate the core issue.

Note 2: You can maybe come up with some fancy SQL to get a good heuristic going to reduce the fetch size to below all records after cursor; we didn't look into that yet.

This is where it actually stops - we can't achieve 100% accuracy here. All we can do is a best-effort to fetch a superset of the rows we want, which is admittedly bad because we effectively fetch all rows after and surrounding the cursor and post-process those in the query engine to reduce the record set to exactly the page the query requested.
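
The post-processing step could look roughly like the following. This is a sketch of the idea only; the query engine's real implementation is not shown in this thread, and trimToPage is a hypothetical name.

```typescript
// Hypothetical post-processing: cut an ordered superset down to the requested page.
function trimToPage<T extends { id: string }>(
  superset: T[],
  cursorId: string,
  skip: number,
  take: number
): T[] {
  // Locate the cursor row inside the superset returned by the database.
  const start = superset.findIndex((r) => r.id === cursorId);
  if (start === -1) return [];
  // Drop everything before the cursor, then apply skip and take in memory.
  return superset.slice(start + skip, start + skip + take);
}

// Superset from the example: everything with colA >= "B".
const fetched = [{ id: "rng3" }, { id: "rng4" }, { id: "rng5" }, { id: "rng6" }];

const withCursor = trimToPage(fetched, "rng4", 0, 2).map((r) => r.id); // ["rng4", "rng5"]
const afterCursor = trimToPage(fetched, "rng4", 1, 2).map((r) => r.id); // ["rng5", "rng6"]
```

The cost is visible in the shape of the function: every row in the superset has to be transferred and scanned before take is applied, no matter how small the page is.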

However, on the bright side, all of the above is basically solved with one thing: a non-nullable unique field in the orderBy. Any required unique field will do (or combination of required fields as specified in @@unique). The entire issue of records with identical order row (colA) vanishes because we know that it can only occur once, so we can simply say colA >= <unique value> and we can apply limits again directly in the database.
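
For contrast, here is the same kind of in-memory simulation with a unique ordering column. The column colU and the function pageFromCursor are hypothetical and only illustrate why the limit becomes safe; this is not the SQL Prisma generates.

```typescript
// colU is a hypothetical required unique column used for ordering.
type Row = { id: string; colU: string };

const rows: Row[] = [
  { id: "rng1", colU: "A" },
  { id: "rng2", colU: "B" },
  { id: "rng3", colU: "C" },
  { id: "rng4", colU: "D" },
  { id: "rng5", colU: "E" },
  { id: "rng6", colU: "F" },
];

// Rough equivalent of:
//   WHERE colU >= <cursor row's colU> ORDER BY colU ASC LIMIT <take>
function pageFromCursor(table: Row[], cursorId: string, take: number): Row[] {
  const cursorRow = table.find((r) => r.id === cursorId);
  if (!cursorRow) return [];
  return table
    .filter((r) => r.colU >= cursorRow.colU) // no ties possible: colU is unique
    .sort((a, b) => (a.colU < b.colU ? -1 : a.colU > b.colU ? 1 : 0))
    .slice(0, take); // the limit can be applied right away
}

const page = pageFromCursor(rows, "rng4", 2).map((r) => r.id); // ["rng4", "rng5"]
```

Because no other row can share the cursor's colU value, the first row of the filtered result is always the cursor itself, so nothing has to be fetched beyond the page.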

2 reactions
pantharshit00 commented, Dec 10, 2020

I can reproduce this. Thanks for the reproduction.
