Representing a complex update/insert/delete CTE

Hi,

I wanted to ask whether the following complex CTE query, which comprises inserts, deletions, and updates, could be modeled with this library.

Suppose we have a post system with attachments. Post attachments are identified by ETags. An ETag can be associated with many posts, and a post can be associated with several ETags at once. A many-to-many relation table defines this relationship.

CREATE TYPE public."Storage" AS ENUM (
    'r2'
);

CREATE TABLE public.posts (
    id text NOT NULL,
    author_id text NOT NULL,
    content text NOT NULL,
    created_at timestamp(3) without time zone DEFAULT CURRENT_TIMESTAMP NOT NULL,
    updated_at timestamp(3) without time zone DEFAULT CURRENT_TIMESTAMP NOT NULL
);

CREATE TABLE public.etags_on_posts (
    post_id text NOT NULL,
    storage_id public."Storage" NOT NULL,
    etag_id text NOT NULL
);

CREATE TABLE public.etags (
    storage_id public."Storage" NOT NULL,
    id text NOT NULL,
    type text NOT NULL,
    num_posts_attached integer DEFAULT 0 NOT NULL,
    num_avatars_attached integer DEFAULT 0 NOT NULL
);

Suppose we want to update a post’s content and attachments given { postId: string, content: string, etags: { storageId: "r2", id: string, type: string }[] }.

Old entries in “etags_on_posts” should be removed, and their corresponding entries in “etags” should have “num_posts_attached” decremented by 1.
New entries in “etags_on_posts” should be added, and their corresponding entries in “etags” should have “num_posts_attached” incremented by 1. If no corresponding entries in “etags” exist, they are created, with the ETag’s type stored as well.
The post’s entry in “posts” should have its content updated.
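
To make the three rules above concrete, here is a minimal in-memory sketch of the same diff, independent of any database or query library. The names EtagInput, Link, AttachmentDiff, keyOf, and diffAttachments are hypothetical and exist only for illustration; the field names follow the input shape described above.

```typescript
// Hypothetical shapes; field names follow the input described in the question.
interface EtagInput { storageId: "r2"; id: string; type: string }
interface Link { storageId: "r2"; etagId: string }

interface AttachmentDiff {
    removed: Link[];                     // links to delete from etags_on_posts
    added: EtagInput[];                  // links to insert into etags_on_posts
    counterDeltas: Map<string, number>;  // change to num_posts_attached per ETag key
}

// Composite key matching the (storage_id, id) pair used in the tables.
const keyOf = (storageId: string, id: string): string => `${storageId}/${id}`;

function diffAttachments(current: Link[], desired: EtagInput[]): AttachmentDiff {
    const currentKeys = new Set(current.map(l => keyOf(l.storageId, l.etagId)));

    // Deduplicate the desired ETags by key so each one is counted once.
    const desiredByKey = new Map<string, EtagInput>();
    for (const e of desired) desiredByKey.set(keyOf(e.storageId, e.id), e);

    // Old links that are absent from the new input are removed.
    const removed = current.filter(l => !desiredByKey.has(keyOf(l.storageId, l.etagId)));
    // Input ETags that are not linked yet are added.
    const added = Array.from(desiredByKey.values())
        .filter(e => !currentKeys.has(keyOf(e.storageId, e.id)));

    const counterDeltas = new Map<string, number>();
    for (const l of removed) counterDeltas.set(keyOf(l.storageId, l.etagId), -1);
    for (const e of added) counterDeltas.set(keyOf(e.storageId, e.id), 1);

    return { removed, added, counterDeltas };
}
```

ETags that appear both before and after the update get no entry in counterDeltas, so their counters stay untouched.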

The following raw porsager/postgres code represents the query:

async updatePostOrComment(sql: Sql, id: string, content: string, etags: { type: string, etag: string }[]) {
    const input = etags.map(e => (["r2", e.etag, e.type]));

    return sql`
        with updated_post as (update posts set ${sql({ content, updatedAt: new Date() })} where id = ${id} returning *),
        new_posts as (select * from posts union select * from updated_post)
        ${etags.length > 0 ?
            sql`
            , input as (select storage_id::"Storage", id::text, type::text from (values ${sql(input)}) as input(storage_id, id, type))
            , removed_etags_on_post as (delete from etags_on_posts where post_id = (select id from updated_post) and (storage_id, etag_id) not in (select storage_id, id as etag_id from input) returning *)
            , etags_on_post_to_add as (select (select id from updated_post) as post_id, storage_id, id as etag_id from input where (storage_id, id) not in (select storage_id, etag_id from removed_etags_on_post))
            , added_etags_on_post as (insert into etags_on_posts select * from etags_on_post_to_add returning *)
            , update_added_etags as (
                insert into etags (storage_id, id, type) select * from input where (storage_id, id) in (select storage_id, etag_id as id from added_etags_on_post)
                on conflict (storage_id, id) do update set num_posts_attached = excluded.num_posts_attached + 1 returning *
            )
            , update_removed_etags as (
                update etags set num_posts_attached = num_posts_attached - 1
                where (storage_id, id) in (select storage_id, etag_id as id from removed_etags_on_post) returning *
            )
            , new_etags_on_posts as (select * from etags_on_posts union select * from added_etags_on_post except select * from removed_etags_on_post)
            , new_etags as (select * from etags union select * from update_added_etags union select * from update_removed_etags)`
            :
            sql``}
        
        ${this.submissions(sql, sql`updated_post`, { posts: sql`new_posts`, etagsOnPosts: etags.length > 0 ? sql`new_etags_on_posts` : sql`etags_on_posts`, etags: etags.length > 0 ? sql`new_etags` : sql`etags` })}`.then(([post]) => post);
},

… where this.submissions joins some additional fields and performs some extra transformations to updated_post.

If the query above is too complicated to model, could it at least be significantly simplified with this library?

Top GitHub Comments

juanluispaz commented, Dec 7, 2022

For now yes, because I don’t support that construction. Alternatively, if running everything in a single query matters to you for a very important performance reason (one you must be able to measure and confirm), just keep it as is, in raw SQL.

If you want to access the sql object used by PostgresQueryRunner, you can do this (I didn’t include it in the PostgresQueryRunner documentation, but I mention it in the PrismaQueryRunner documentation):

Accessing the porsager/postgres sql object from the connection

If you want to access the underlying porsager/postgres sql object (the transaction one when a transaction is open) from your connection object, you can define an accessor method in your connection class like:
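
As a rough illustration of the multi-statement alternative, the sketch below plans the same update as an ordered list of plain parameterized statements, meant to be run in order inside one transaction. It is not tied to any library. Statement, EtagInput, and planPostUpdate are hypothetical names, the SQL follows the prose rules from the question rather than the exact CTE, and it assumes a unique constraint on etags (storage_id, id), as the original ON CONFLICT clause does.

```typescript
// Hypothetical helper types for illustration only.
interface EtagInput { id: string; type: string }
interface Statement { text: string; params: unknown[] }

// Builds the statements to run, in order, inside a single transaction.
function planPostUpdate(postId: string, content: string, etags: EtagInput[]): Statement[] {
    const plan: Statement[] = [{
        text: `update posts set content = $2, updated_at = current_timestamp where id = $1`,
        params: [postId, content],
    }];
    // Like the original query, leave attachments untouched when no ETags are given.
    if (etags.length === 0) return plan;

    // Deduplicate by id: ON CONFLICT DO UPDATE cannot touch the same row twice.
    const unique = Array.from(new Map(etags.map((e): [string, EtagInput] => [e.id, e])).values());
    const ids = unique.map(e => e.id);
    const types = unique.map(e => e.type);

    // Links of this post that are absent from the new input.
    const stale = `post_id = $1 and not (storage_id = 'r2' and etag_id = any($2::text[]))`;
    // True when input row i is already linked to this post.
    const alreadyLinked = `select 1 from etags_on_posts l
        where l.post_id = $1 and l.storage_id = 'r2' and l.etag_id = i.id`;

    plan.push(
        { // 1. Decrement counters of ETags whose link is about to be removed.
            text: `update etags set num_posts_attached = num_posts_attached - 1
                where (storage_id, id) in (select storage_id, etag_id from etags_on_posts where ${stale})`,
            params: [postId, ids],
        },
        { // 2. Remove the stale links.
            text: `delete from etags_on_posts where ${stale}`,
            params: [postId, ids],
        },
        { // 3. Create or increment ETags for links that do not exist yet.
            text: `insert into etags (storage_id, id, type, num_posts_attached)
                select 'r2'::"Storage", i.id, i.type, 1
                from unnest($2::text[], $3::text[]) as i(id, type)
                where not exists (${alreadyLinked})
                on conflict (storage_id, id) do update
                set num_posts_attached = etags.num_posts_attached + 1`,
            params: [postId, ids, types],
        },
        { // 4. Add the new links (must run after step 3, which checks for missing links).
            text: `insert into etags_on_posts (post_id, storage_id, etag_id)
                select $1::text, 'r2'::"Storage", i.id
                from unnest($2::text[]) as i(id)
                where not exists (${alreadyLinked})`,
            params: [postId, ids],
        },
    );
    return plan;
}
```

The trade-off is the one described above: several round trips instead of one, in exchange for statements that are each simple enough to read, test, or express with a query builder.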

import type { Sql } from 'postgres'
import { PostgreSqlConnection } from 'ts-sql-query/connections/PostgreSqlConnection'

class DBConnection extends PostgreSqlConnection<'DBConnection'> {
    getSqlClient(): Sql {
        // Prefer the transaction's sql object when a transaction is open.
        const client = this.queryRunner.getCurrentNativeTransaction() || this.queryRunner.getNativeRunner();
        // Ideally I would do something like: if (client instanceof Sql) { return client; } else { throw new Error('...'); }
        // but I don't know how to perform this validation in porsager/postgres, so I just cast to any, unchecked, so that the function returns the proper type.
        return client as any;
    }
}

This way you will be able to keep raw SQL in some places.

Note: TransactionSql extends the Sql interface in porsager/postgres.
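
Since an instanceof check is not available, one possible substitute is a structural check. The sketch below is only an assumption about what such a guard could look like: SqlLike, NativeSource, isSqlLike, and getSqlClient are hypothetical stand-ins written without importing either library, and the guard relies on the sql object being a callable that exposes an unsafe method.

```typescript
// Hypothetical structural stand-in for the porsager/postgres Sql type.
interface SqlLike {
    (strings: TemplateStringsArray, ...values: unknown[]): unknown;
    unsafe(query: string, params?: unknown[]): unknown;
}

// Hypothetical stand-in for the two query runner methods used above.
interface NativeSource {
    getCurrentNativeTransaction(): unknown;
    getNativeRunner(): unknown;
}

// Structural check: a callable that also carries an `unsafe` method.
function isSqlLike(client: unknown): client is SqlLike {
    const candidate = client as { unsafe?: unknown } | null | undefined;
    return typeof client === "function" && typeof candidate?.unsafe === "function";
}

function getSqlClient(runner: NativeSource): SqlLike {
    // Prefer the transaction client when a transaction is open.
    const client = runner.getCurrentNativeTransaction() || runner.getNativeRunner();
    if (!isSqlLike(client)) {
        throw new Error("The native runner is not a porsager/postgres sql object");
    }
    return client;
}
```

A check like this only verifies shape, not identity, so it catches a misconfigured query runner but cannot prove the object really comes from porsager/postgres.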

juanluispaz commented, Dec 6, 2022

Oh, you can do it in several queries if you want, but that is not what I was trying to tell you.

In our case, we have very complex queries in a special part of the system that manages the accountability of a project. What we did there was split the query into several functions, each building one part of it, and then put them together in another function. It had a very interesting effect: some subqueries were reused in several places, which made the system even easier to understand and reason about.

Let me give an example using the case you posted in #63:

function buildFollowersCountSubquery(connection: DBConnection) { // I intentionally omit the return type, letting TS infer it
    return connection
        .subSelectUsing(accounts)
        .from(follows)
        .where(follows.followingId.equals(a.id))
        .selectOneColumn(connection.countAll())
        .forUseAsInlineQueryValue()  // At this point it is a value that you can use in another query
        .valueWhenNull(0);
}

function buildFollowingCountSubquery(connection: DBConnection) { // I intentionally omit the return type, letting TS infer it
    return connection
        .subSelectUsing(accounts)
        .from(follows)
        .where(follows.followerId.equals(a.id))
        .selectOneColumn(connection.countAll())
        .forUseAsInlineQueryValue()  // At this point it is a value that you can use in another query
        .valueWhenNull(0);
}

async function getAccountsInformation(connection: DBConnection) {
    const result = await connection
        .from(accounts)
        .select({
            id: a.id,
            numFollowers: buildFollowersCountSubquery(connection),
            numFollowing: buildFollowingCountSubquery(connection),
            // the other counts you need
        })
        .orderBy('id', 'asc')
        .executeSelectMany();

    return result;
}

I can go even further:

function buildAccountsInformationQuery(connection: DBConnection) { // I intentionally omit the return type, letting TS infer it
    return connection
        .from(accounts)
        .select({
            id: a.id,
            numFollowers: buildFollowersCountSubquery(connection),
            numFollowing: buildFollowingCountSubquery(connection),
            // the other counts you need
        })
        .orderBy('id', 'asc')
}

and then use it as:

async function getAccountsInformationForVerifiedUsers(connection: DBConnection) {
    const result = await buildAccountsInformationQuery(connection)
        .where(accounts.isVerified)
        .executeSelectMany();

    return result;
}

This strategy allowed us to deal with very complex queries in an easy and understandable way, without sacrificing the ability to execute everything in a single call to the database.