Condition on joined table doesn't work with limit

What are you doing?

I have a Supplier model with associated Calendar models.

I want to fetch suppliers who either

  • have a calendar which is set to available
  • don’t have a calendar

I.e. exclude suppliers with calendars that are set to unavailable, but include all other suppliers.

Furthermore, I want to limit my results to the first 10 suppliers.

  1. Define models and associations:
const Supplier = Sequelize.define('suppliers', {
  uuid: { type: Sequelize.UUID, primaryKey: true },
});

const Calendar = Sequelize.define('suppliers_calendars', {
  uuid: { type: Sequelize.UUID, primaryKey: true },
  start_time: Sequelize.DATE,
  end_time: Sequelize.DATE,
  state: { type: Sequelize.ENUM, values: ['available', 'unavailable', null] },
});

Supplier.hasMany(Calendar, { as: 'calendars', foreignKey: 'supplier_id' });
  2. Generate query:
Supplier.findAll({
  include: [
    {
      model: Calendar,
      as: 'calendars',
      required: false,
      where: {
        start_time: { [Op.lte]: date },
        end_time: { [Op.gte]: date },
      },
    },
  ],
  where: {
    '$calendars.state$': {
      [Op.or]: [
        { [Op.in]: ['available'] },
        { [Op.eq]: null },
      ],
    },
  },
  limit: 10,
});

What do you expect to happen?

I expect the following SQL to be generated and run:

SELECT "suppliers".*, "calendars"."uuid" AS "calendars.uuid", "calendars"."end_time" AS "calendars.end_time", "calendars"."start_time" AS "calendars.start_time", "calendars"."supplier_id" AS "calendars.supplier_id", "calendars"."state" AS "calendars.state"
FROM (
  SELECT "suppliers"."uuid"
  FROM "suppliers" AS "suppliers"
  ORDER BY "suppliers"."uuid"
  LIMIT 10
) AS "suppliers"
LEFT OUTER JOIN "suppliers_calendars" AS "calendars" ON
  "suppliers"."uuid" = "calendars"."supplier_id"
  AND "calendars"."start_time" <= '2019-05-27 23:00:00.000 +00:00'
  AND "calendars"."end_time" >= '2019-05-27 23:00:00.000 +00:00'
WHERE (("calendars"."state" IN ('available') OR "calendars"."state" IS NULL))
ORDER BY "suppliers"."uuid";

What is actually happening?

The following SQL is generated:

SELECT "suppliers".*, "calendars"."uuid" AS "calendars.uuid", "calendars"."end_time" AS "calendars.end_time", "calendars"."start_time" AS "calendars.start_time", "calendars"."supplier_id" AS "calendars.supplier_id", "calendars"."state" AS "calendars.state"
FROM (
  SELECT "suppliers"."uuid"
  FROM "suppliers" AS "suppliers"
  WHERE (("calendars"."state" IN ('available') OR "calendars"."state" IS NULL))
  ORDER BY "suppliers"."uuid"
  LIMIT 10
) AS "suppliers"
LEFT OUTER JOIN "suppliers_calendars" AS "calendars" ON
  "suppliers"."uuid" = "calendars"."supplier_id"
  AND "calendars"."start_time" <= '2019-05-27 23:00:00.000 +00:00'
  AND "calendars"."end_time" >= '2019-05-27 23:00:00.000 +00:00'
ORDER BY "suppliers"."uuid";

Note that this is erroneous: the WHERE condition on calendars is placed inside the inner query, before the join has occurred, so it references a table that is not in scope there.

NOTE: removing the limit property from the findAll options produces the following correct SQL:

SELECT "suppliers"."uuid", "calendars"."uuid" AS "calendars.uuid", "calendars"."end_time" AS "calendars.end_time", "calendars"."start_time" AS "calendars.start_time", "calendars"."supplier_id" AS "calendars.supplier_id", "calendars"."state" AS "calendars.state"
FROM "suppliers" AS "suppliers"
LEFT OUTER JOIN "suppliers_calendars" AS "calendars" ON
  "suppliers"."uuid" = "calendars"."supplier_id"
  AND "calendars"."start_time" <= '2019-05-27 23:00:00.000 +00:00'
  AND "calendars"."end_time" >= '2019-05-27 23:00:00.000 +00:00'
WHERE (("calendars"."state" IN ('available') OR "calendars"."state" IS NULL))
ORDER BY "suppliers"."uuid"

So I suspect it is the limit logic which is producing the wrong subquery and not handling joins properly.
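
A workaround that is often suggested for this kind of failure is to turn off the subquery with the `subQuery: false` find option, so that the top-level WHERE sits next to the join. The sketch below is not taken from this issue and has not been verified against 5.8.6. `buildSupplierQuery` is a hypothetical helper name, and `Op` and `Calendar` are assumed to be the Sequelize operator map and the model defined above.

```javascript
// Sketch only: builds the options object from the report, plus subQuery: false.
// buildSupplierQuery is a hypothetical helper, not part of Sequelize.
function buildSupplierQuery({ Op, Calendar, date, limit = 10 }) {
  return {
    include: [
      {
        model: Calendar,
        as: 'calendars',
        required: false,
        where: {
          start_time: { [Op.lte]: date },
          end_time: { [Op.gte]: date },
        },
      },
    ],
    where: {
      '$calendars.state$': {
        [Op.or]: [
          { [Op.in]: ['available'] },
          { [Op.eq]: null },
        ],
      },
    },
    limit,
    // Assumption: asks Sequelize for one flat query instead of a
    // limited inner query followed by the join.
    subQuery: false,
  };
}

// Intended usage (requires a configured Sequelize instance):
// const { Op } = require('sequelize');
// Supplier.findAll(buildSupplierQuery({ Op, Calendar, date: new Date() }));
```

One caveat with a flat query: LIMIT then counts joined rows rather than suppliers, so a supplier with several matching calendars takes up several of the 10 slots.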

Environment

Dialect:

  • mysql
  • postgres
  • sqlite
  • mssql
  • any

Dialect library version: 6.1.0
Database version: 11.2
Sequelize version: 5.8.6
Node Version: 10.15.3
OS: macOS 10.14.4 (Mojave)

Tested with latest release:

  • No
  • Yes, specify that version: 5.8.6

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 1
  • Comments: 8 (3 by maintainers)

Top GitHub Comments

2 reactions
felamaslen commented, Jul 16, 2019

I think you’re right, and I’d hazard a guess that my PR doesn’t handle the above case properly.

1 reaction
LaurentVB commented, Jul 16, 2019

Hi @felamaslen

I’m experiencing the same issue, but I’m not sure your expected SQL query is correct. It first limits the suppliers to 10 results, then joins, then applies the condition on the joined table. If none of the 10 suppliers selected first happen to match the condition, your result set will be empty, even though other suppliers could have given valid results.

I think the valid query would be something like:

SELECT "suppliers".*, "calendars"."uuid" AS "calendars.uuid", "calendars"."end_time" AS "calendars.end_time", "calendars"."start_time" AS "calendars.start_time", "calendars"."supplier_id" AS "calendars.supplier_id", "calendars"."state" AS "calendars.state"
FROM (
  SELECT DISTINCT "suppliers".*
  FROM "suppliers" AS "suppliers"
  LEFT OUTER JOIN "suppliers_calendars" AS "calendars" ON
    "suppliers"."uuid" = "calendars"."supplier_id"
    AND "calendars"."start_time" <= '2019-05-27 23:00:00.000 +00:00'
    AND "calendars"."end_time" >= '2019-05-27 23:00:00.000 +00:00'
  WHERE (("calendars"."state" IN ('available') OR "calendars"."state" IS NULL))
  ORDER BY "suppliers"."uuid"
  LIMIT 10
) AS "suppliers"
LEFT OUTER JOIN "suppliers_calendars" AS "calendars" ON
  "suppliers"."uuid" = "calendars"."supplier_id"
  AND "calendars"."start_time" <= '2019-05-27 23:00:00.000 +00:00'
  AND "calendars"."end_time" >= '2019-05-27 23:00:00.000 +00:00';

where the limit, order (and offset), join conditions and where clauses on the joined table are added in the subquery to make sure the suppliers fetched match all conditions. A distinct keyword is added in the subquery as well to make sure suppliers who have multiple calendars are only considered once.
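
The difference between the two orderings can be illustrated without a database. The sketch below uses hypothetical in-memory data and helper names (`matches`, `limitThenFilter`, `filterThenLimit`), and leaves out the start_time/end_time window for brevity. It only models the order of operations, not the SQL that Sequelize generates.

```javascript
// Hypothetical data: supplier "a" sorts first but is unavailable.
const suppliers = [{ uuid: 'a' }, { uuid: 'b' }, { uuid: 'c' }];
const calendars = [
  { supplier_id: 'a', state: 'unavailable' },
  { supplier_id: 'b', state: 'available' },
  // supplier "c" has no calendar at all
];

// Mirrors: state IN ('available') OR state IS NULL over LEFT JOIN rows.
// A supplier without calendars yields one joined row whose state is NULL.
function matches(supplier) {
  const own = calendars.filter((c) => c.supplier_id === supplier.uuid);
  if (own.length === 0) return true;
  return own.some((c) => c.state === 'available' || c.state === null);
}

const byUuid = (x, y) => x.uuid.localeCompare(y.uuid);

// Limit in the inner query, filter after the join (the originally expected SQL).
function limitThenFilter(limit) {
  return [...suppliers].sort(byUuid).slice(0, limit).filter(matches);
}

// Join and filter inside the subquery, then limit (the query proposed above).
function filterThenLimit(limit) {
  return [...suppliers].sort(byUuid).filter(matches).slice(0, limit);
}

console.log(limitThenFilter(1).map((s) => s.uuid)); // []
console.log(filterThenLimit(1).map((s) => s.uuid)); // [ 'b' ]
```

With a limit of 1, limiting first picks supplier "a" and then filters it out, leaving nothing, while filtering first returns "b".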

wdyt? Does your submitted PR handle this properly?
