Observed flakiness when using the "fast_executemany" option with the "DataDirect PostgreSQL ODBC Driver"

Hi there,

We ran into a specific issue when using CrateDB with ODBC. We might have identified a flaw and would like to share the outcome of our investigation with you.

Problem report

When connecting to CrateDB’s PostgreSQL interface using unixODBC through its pyodbc binding and enabling the fast_executemany option, the Progress DataDirect PostgreSQL ODBC Driver shows flaky communication and synchronization behavior.

Everything works well when using PostgreSQL. Likewise, no problems occurred when using the vanilla psqlODBC - PostgreSQL ODBC driver.
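For readers unfamiliar with the option, the following is a minimal sketch of how fast_executemany is typically enabled on a pyodbc cursor. It is not taken from the reproduction repository: the table name "testdrive", its columns, and the DSN "cratedb" are hypothetical placeholders, and the DSN is assumed to be configured in odbc.ini to use the driver under test.

```python
def insert_batch(connection, rows, fast=True):
    """Insert rows with executemany(), optionally using pyodbc's fast path."""
    cursor = connection.cursor()
    # pyodbc-specific cursor attribute: when True, all parameter sets are
    # bound as one array and sent together instead of row by row.
    cursor.fast_executemany = fast
    # Hypothetical table and columns, for illustration only.
    cursor.executemany("INSERT INTO testdrive (id, name) VALUES (?, ?)", rows)
    return len(rows)


def main():
    # Imported here so insert_batch() stays usable without pyodbc installed.
    import pyodbc

    # "DSN=cratedb" is a placeholder for a data source defined in odbc.ini.
    connection = pyodbc.connect("DSN=cratedb", autocommit=True)
    insert_batch(connection, [(1, "foo"), (2, "bar")], fast=True)
    connection.close()
```

Calling main() against a configured data source exercises the code path described above; passing fast=False falls back to pyodbc's default row-by-row behaviour, which is a convenient way to compare the two modes.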

Details

The software versions we are using are:

CrateDB 4.5.1
PostgreSQL 13.2
Docker 20.10.5
Python 3.9.5
pyodbc 4.0.30
Progress DataDirect Connect ODBC PostgreSQL Wire Protocol Driver 7.1.6
psqlODBC - ODBC driver for PostgreSQL 11.00.0000

Reproduction

The whole setup for investigating this issue is wrapped into a repository [1] in order to make reproduction effortless. Its README document also outlines the observations in more detail. The setup must be run on Linux or in an emulated Linux environment, because it only includes ODBC drivers for Linux.

A visual representation of the flaky behavior in the specific scenario is attached in the form of a screenshot capturing the outcome of the test suite.

Thoughts

Maybe this is related to what we investigated on behalf of https://github.com/brianc/node-postgres/issues/2454 and https://github.com/brianc/node-postgres/issues/2455 and mitigated with https://github.com/crate/crate/pull/10979. However, that is really just a wild guess.

With kind regards, Andreas.

[1] https://github.com/amotl/cratedb-datadirect-odbc

/cc @hammerhead, @proddata, @jayeff

Top GitHub Comments

seut commented, Jun 16, 2021

We have investigated this issue, and it seems that the Progress DataDirect ODBC driver expects the response to a Bind/Execute/Close/Sync sequence (logically: binding values to a prepared statement) to fit into one data frame, so that a flush only happens after the outbound Sync->ReadyForQuery message. CrateDB, however, flushes after every Execute->CommandComplete message to ensure correct outbound ordering of the messages, because internally CrateDB runs operations asynchronously.

To support this driver’s expectation, we would have to ensure that a flush only happens after a Sync->ReadyForQuery message, but every approach to ensure that would lead to significant overhead (e.g. thread context switches) that all other clients would suffer from. On the other hand, we don’t think that this behaviour/expectation is correct, as a flush can always happen implicitly when a buffer reaches a size threshold at any of multiple layers. As far as we know, there is no strict rule about when a flush is allowed and when it is not.
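To illustrate the point about flushes, here is a minimal sketch of a client-side reader that does not depend on where the server flushes. It is neither CrateDB nor DataDirect code; the class and function names are made up for this example. It relies only on the PostgreSQL wire format for backend messages: a one-byte type, followed by a four-byte big-endian length that counts itself, followed by the payload. The reader buffers whatever bytes arrive and stops at ReadyForQuery ("Z"), regardless of how the stream was split into chunks.

```python
import struct

READY_FOR_QUERY = b"Z"


class BackendMessageBuffer:
    """Reassembles PostgreSQL backend messages from arbitrarily split chunks."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, chunk):
        """Add received bytes and return every message that is now complete."""
        self._buffer.extend(chunk)
        messages = []
        # A message header is 1 type byte plus a 4-byte length field.
        while len(self._buffer) >= 5:
            msg_type = bytes(self._buffer[:1])
            (length,) = struct.unpack("!I", self._buffer[1:5])
            end = 1 + length  # the length field counts itself, not the type byte
            if len(self._buffer) < end:
                break  # message is incomplete, wait for more bytes
            payload = bytes(self._buffer[5:end])
            del self._buffer[:end]
            messages.append((msg_type, payload))
        return messages


def frame(msg_type, payload=b""):
    """Encode one backend message (helper for building test streams)."""
    return msg_type + struct.pack("!I", len(payload) + 4) + payload


def read_until_ready(chunks):
    """Consume chunks until ReadyForQuery and return the message types seen."""
    parser = BackendMessageBuffer()
    seen = []
    for chunk in chunks:
        for msg_type, _payload in parser.feed(chunk):
            seen.append(msg_type)
            if msg_type == READY_FOR_QUERY:
                return seen
    return seen
```

With a stream made of BindComplete ("2"), CommandComplete ("C"), CloseComplete ("3") and ReadyForQuery ("Z"), read_until_ready yields the same message sequence whether the bytes arrive in one chunk or in several, which is the behaviour a flush after each CommandComplete requires from a client.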

We have now contacted the vendor of this proprietary driver to notify them about possibly incorrect behaviour of their driver.

mfussenegger commented, Oct 5, 2021

Closing this here. As far as we can tell, this is a problem on the driver side. Solving it on the CrateDB side would require a workaround that would introduce a performance penalty affecting other drivers as well.
