postgresql - to_tsquery docs and implementation detail


Migrated issue, originally created by jvanasco (@jvanasco)

I recently realized that some implementation details of to_tsquery are incompatible with the docs I drafted and with how SQLAlchemy integrates it.

Not sure how to handle this.

I based the docs/examples on existing tests and text, so the first bit of “Full Text Search” ( http://docs.sqlalchemy.org/en/latest/dialects/postgresql.html#full-text-search ) reads:

select([sometable.c.text.match("search string")])
SELECT text @@ to_tsquery('search string') FROM table

Well, if we put this into psql…

badsql=> select to_tsquery('search string') ;
ERROR:  syntax error in tsquery: "search string"

That’s because to_tsquery strictly enforces its input syntax. The text must either be tokenized and joined with acceptable operators:

select to_tsquery('cat & rat');

be quoted:

select to_tsquery('''search string''');

or be passed to the alternate function instead:

select plainto_tsquery('search string');

So, these are acceptable:

select to_tsquery('search & string');
select to_tsquery('''search string''');
select plainto_tsquery('search string');

but this is not:

select to_tsquery('search string');

I’m not sure of the best way to handle this nuance in the docs.

As far as the implementation detail goes…

This creates issues with Column.match, because it generates invalid SQL like this:

table.column @@ to_tsquery('search string')

rather than valid SQL like:

table.column @@ to_tsquery('''search string''')
table.column @@ plainto_tsquery('search string')

This is an annoying implementation detail, but the content/type of the string determines which function or format to search with.

I thought about just regexing this into submission, but the multitude of possible operators and edge cases suggests that the text would need to be parsed and tokenized instead. Otherwise, any text that happens to contain an operator will have it treated as a real operator, which will probably trigger a syntax error.
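As a rough illustration of the tokenize-and-join approach, here is a minimal sketch. The helper name `to_tsquery_and` is hypothetical, and treating every run of word characters as a token is a simplifying assumption (it splits on apostrophes and hyphens, for example); it is not what SQLAlchemy does.

```python
import re


def to_tsquery_and(text):
    """Build a to_tsquery-safe string by AND-ing the word tokens of `text`.

    Hypothetical helper: anything that is not a word character
    (operators, parentheses, quotes) is discarded, so stray '&', '|'
    or '!' in user input cannot act as tsquery operators.
    """
    tokens = re.findall(r"\w+", text)
    return " & ".join(tokens)


print(to_tsquery_and("search string"))      # search & string
print(to_tsquery_and("cat & rat | (dog)"))  # cat & rat & dog
```

Input with no word characters yields an empty string, so a caller would probably want to skip the search in that case.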

I think the easiest fix would be to replace match’s to_tsquery with plainto_tsquery.
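To make the idea concrete outside of SQLAlchemy, the sketch below builds the SQL fragment such a replacement would aim for, with the freeform text passed as a bound parameter. The function name `plain_match_clause` and the pyformat-style `%(name)s` placeholders are assumptions for illustration, and the column name is assumed to be a trusted identifier rather than user input.

```python
def plain_match_clause(column_name, search_text, config=None):
    """Return (sql_fragment, params) for a plainto_tsquery match.

    Hypothetical helper: `column_name` must be a trusted identifier,
    while `search_text` is freeform and is only ever sent as a
    bound parameter, never interpolated into the SQL.
    """
    params = {"query": search_text}
    if config is None:
        sql = f"{column_name} @@ plainto_tsquery(%(query)s)"
    else:
        # Optional text search configuration, e.g. 'english'
        params["config"] = config
        sql = f"{column_name} @@ plainto_tsquery(%(config)s, %(query)s)"
    return sql, params


sql, params = plain_match_clause("sometable.text", "search string")
print(sql)     # sometable.text @@ plainto_tsquery(%(query)s)
print(params)  # {'query': 'search string'}
```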

Issue Analytics

  • State: closed
  • Created 9 years ago
  • Reactions: 1
  • Comments: 29 (3 by maintainers)

Top GitHub Comments

2 reactions
sqlalchemy-bot commented, Nov 27, 2018

Joe Futrelle (@joefutrelle) wrote:

ELI5: PostgreSQL’s to_tsquery fails with a syntax error on typical freeform text queries (e.g., something a user typed into a form), and plainto_tsquery is provided for that case. match in SQLAlchemy uses to_tsquery, so there’s no obvious way in SQLAlchemy to perform a full-text search on a freeform text query.

To add to jvanasco’s workaround above, here’s an ORM-based workaround that should probably at least be documented:

import sqlalchemy

# `session` and `MyClass` are assumed to be an existing Session and mapped class
search_string = 'some freeform search string'
tq = sqlalchemy.func.plainto_tsquery('english', search_string)
q = session.query(MyClass).filter(MyClass.some_column.op('@@')(tq))

jvanasco, should I be passing ‘english’ to the ‘text’ function like you are? This works for me without doing that.

1 reaction
sqlalchemy-bot commented, Nov 27, 2018

jvanasco (@jvanasco) wrote:

Likely a 1.0

Wanted to bring this up for discussion; reach out to others who have been working on full text support.

Still wrapping my head around this, and what the proper flow should be. Perhaps no code change, but a lot of docs… But I’m not sure.

There are times when “to_tsquery” would be appropriate, and others when it’s not. I think the bulk of usage scenarios in the ORM would fall under a “plainto” match though.

Personally, I’ve replaced all my col1.match(col2) with col1.op("@@")(func.plainto_tsquery(col2))


