Observed flakiness when using the "fast_executemany" option with the "DataDirect PostgreSQL ODBC Driver"

Hi there,

We ran into a specific issue when using CrateDB with ODBC. We might have identified a flaw and wanted to share the outcome of our investigation with you.

Problem report

When connecting to CrateDB’s PostgreSQL interface using unixODBC through its pyodbc binding and enabling the fast_executemany option, the Progress DataDirect PostgreSQL ODBC Driver shows flaky communication and synchronization behavior.

Everything works well when using PostgreSQL as the server. Likewise, no problems occurred when using the vanilla psqlODBC PostgreSQL ODBC driver.
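For readers unfamiliar with the option, the following is a minimal sketch of how fast_executemany is typically enabled through pyodbc. It is not taken from the reproduction repository: the helper name bulk_insert, the table name and the DSN in the usage comment are hypothetical, and the connection is assumed to have been opened with autocommit=True.

```python
from typing import Iterable, Sequence


def bulk_insert(connection, table: str, rows: Iterable[Sequence]) -> int:
    """Insert rows through a pyodbc-style connection with fast_executemany on.

    `connection` is expected to behave like a pyodbc connection opened with
    autocommit=True (an assumption of this sketch). The table name is
    interpolated as-is, so it must come from trusted code.
    """
    rows = list(rows)
    if not rows:
        return 0

    # One "?" placeholder per column of the first row.
    placeholders = ", ".join("?" for _ in rows[0])
    sql = f"INSERT INTO {table} VALUES ({placeholders})"

    cursor = connection.cursor()
    try:
        # pyodbc cursor attribute: bind all parameter rows as arrays and
        # send them in bulk instead of executing the statement row by row.
        cursor.fast_executemany = True
        cursor.executemany(sql, rows)
    finally:
        cursor.close()
    return len(rows)


# Hypothetical usage (DSN and table name are placeholders):
#   import pyodbc
#   connection = pyodbc.connect("DSN=cratedb-datadirect", autocommit=True)
#   bulk_insert(connection, "testdrive", [(1, "foo"), (2, "bar")])
```

With the option left at its default of False, pyodbc executes the statement once per parameter row, which is the mode in which the flaky behavior was not reported here.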

Details

The software versions we are using are:

  • CrateDB 4.5.1
  • PostgreSQL 13.2
  • Docker 20.10.5
  • Python 3.9.5
  • pyodbc 4.0.30
  • Progress DataDirect Connect ODBC PostgreSQL Wire Protocol Driver 7.1.6
  • psqlODBC - ODBC driver for PostgreSQL 11.00.0000

Reproduction

The whole setup for investigating this issue is wrapped into a repository [1] in order to make reproduction effortless. Its README document also outlines the observations in more detail. The setup must be run on Linux or an equivalent emulated environment because it only includes ODBC drivers for Linux.

A visual representation of the flaky behavior in the specific scenario is attached in the form of a screenshot capturing the outcome of the test suite.

Thoughts

Maybe this is related to what we investigated on behalf of https://github.com/brianc/node-postgres/issues/2454 and https://github.com/brianc/node-postgres/issues/2455 and mitigated with https://github.com/crate/crate/pull/10979. However, that is really just a wild guess.

With kind regards, Andreas.

[1] https://github.com/amotl/cratedb-datadirect-odbc

/cc @hammerhead, @proddata, @jayeff


[Image: screenshot of the test suite outcome]

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

2 reactions
seut commented, Jun 16, 2021

We have investigated this issue. It seems that the Progress DataDirect ODBC driver expects the response to a Bind/Execute/Close/Sync sequence (logically: binding values to a prepared statement) to fit into one data frame, so that a flush only happens after the Sync->ReadyForQuery outbound message. CrateDB, however, flushes after every Execute->CommandComplete message to ensure correct outbound ordering of the messages, because internally CrateDB runs operations asynchronously.

To support this driver’s expectation, we’d have to ensure that a flush only happens after a Sync->ReadyForQuery message, but every approach to ensure that would lead to significant overhead (e.g. thread context switches) that all other clients would suffer from. On the other hand, we don’t think that this behaviour/expectation is correct, as a flush can always happen implicitly when a buffer reaches a certain size threshold at any of multiple layers. As far as we know, there is no strict rule about when a flush is allowed and when it is not.
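To illustrate why a client should not depend on flush boundaries, here is a minimal sketch of a reader for PostgreSQL wire protocol backend messages that buffers incoming bytes and only emits complete messages. It is neither CrateDB's nor the DataDirect driver's code; the names MessageReader and frame are hypothetical. It relies only on the protocol's framing, in which each backend message is a one-byte type followed by a four-byte big-endian length that counts itself but not the type byte.

```python
import struct
from typing import List, Tuple


def frame(msg_type: bytes, payload: bytes = b"") -> bytes:
    """Build one backend message: type byte, Int32 length, payload."""
    return msg_type + struct.pack("!i", len(payload) + 4) + payload


class MessageReader:
    """Reassembles messages from chunks that may be split at any byte."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> List[Tuple[bytes, bytes]]:
        """Add received bytes and return every message completed so far."""
        self._buffer.extend(chunk)
        messages = []
        # A header needs 5 bytes: 1 type byte plus 4 length bytes.
        while len(self._buffer) >= 5:
            (length,) = struct.unpack_from("!i", self._buffer, 1)
            end = 1 + length  # the length field excludes the type byte
            if len(self._buffer) < end:
                break  # message incomplete, wait for the next chunk
            msg_type = bytes(self._buffer[0:1])
            payload = bytes(self._buffer[5:end])
            messages.append((msg_type, payload))
            del self._buffer[:end]
        return messages


# Response to Bind/Execute/Close/Sync: BindComplete ('2'),
# CommandComplete ('C'), CloseComplete ('3'), ReadyForQuery ('Z').
response = (
    frame(b"2")
    + frame(b"C", b"INSERT 0 1\x00")
    + frame(b"3")
    + frame(b"Z", b"I")
)

# Delivered in a single flush.
single_flush = MessageReader().feed(response)

# Delivered with an extra flush right after CommandComplete.
split_at = len(frame(b"2")) + len(frame(b"C", b"INSERT 0 1\x00"))
reader = MessageReader()
two_flushes = reader.feed(response[:split_at]) + reader.feed(response[split_at:])
```

Both delivery patterns yield the same four messages, so a reader built this way treats ReadyForQuery as the end of the cycle regardless of how many flushes the server performed along the way.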

We have now contacted the vendor of this proprietary driver to notify them about possibly incorrect behaviour of their driver.

0 reactions
mfussenegger commented, Oct 5, 2021

Closing this here. As far as we can tell, this is a problem on the driver side. Solving it on the CrateDB side would require a workaround that would introduce a performance penalty affecting other drivers as well.
