[HELP] 0.3.2 pre-release for public testing

Background

v0.3.2 was planned as a minor release months ago, but it has turned into a complete rewrite, mainly for two reasons:

  1. decoupling (see #570 for details)

    • the Java client is async and lightweight
    • the JDBC driver is built on top of the Java client
  2. switching the data format to RowBinary to fix issues and improve performance

    Benchmark results…

    0.3.2-test1...
    • clickhouse-grpc-jdbc and clickhouse-http-jdbc are the new JDBC drivers (0.3.2) using the RowBinary data format
    • clickhouse-jdbc is the old JDBC driver (0.3.1-patch) based on TabSeparated
    • clickhouse-native-jdbc is ClickHouse-Native-JDBC 2.6.0

    Benchmark settings: thread=1, sampleSize=100000, fetchSize=10000, mode=throughput (ops/s).

    [image]
    0.3.2-test3...

    Unlike the previous round of testing, the ClickHouse container is re-created a few minutes before benchmarking each driver.

    • Single thread
      • Comparison [image]
        Note: HttpClient is async (it uses more than one thread at runtime); gRPC uses gzip (why?), which is slower than lz4.
      • VM utilization [image]
        Note: on the client side, the new driver consumes less memory and CPU than the others, but CPU usage on the server side is higher (due to the overhead of the HTTP protocol?).
    • 4 threads
      • Comparison [image]
      • VM utilization [image]

    0.3.2...

    [image]

    Query performance is similar to that shown in 0.3.2-test3, so this time we focus only on insertion.

    [image]

    Note: gRPC does not support LZ4 compression, so GZIP is used in the test.

    • Single thread [image]
    • 4 threads [image]

0.3.2-test1, 0.3.2-test2, and 0.3.2-test3 are pre-releases for public testing.

Downloads

Maven dependency:

<dependency>
    <!-- will stop using group id "ru.yandex.clickhouse" starting from 0.4.0  -->
    <groupId>com.clickhouse</groupId>
    <!-- or clickhouse-grpc-client to use gRPC client  -->
    <artifactId>clickhouse-http-client</artifactId>
    <version>0.3.2-test3</version>
</dependency>

To download JDBC drivers:

| Package | Size | Legacy | New | HTTP | gRPC | Remark |
|---|---|---|---|---|---|---|
| clickhouse-jdbc-0.3.2-all.jar | 18.6MB | Y | Y | Y | Y | Both old and new JDBC drivers (besides netty, okhttp is included as well) |
| clickhouse-jdbc-0.3.2-http.jar | 756KB | N | Y | Y | N | New JDBC driver with only http support |
| clickhouse-jdbc-0.3.2-grpc.jar | 17.3MB | N | Y | N | Y | New JDBC driver with only grpc support (only netty, okhttp is excluded) |
| clickhouse-jdbc-0.3.2-shaded.jar | 2.8MB | Y | Y | Y | N | Both old and new JDBC drivers |

Note: the first two are recommended. gRPC support is experimental, so http is the safer choice.

Known Issues

  • the new driver (com.clickhouse.jdbc.ClickHouseDriver) does not work with ClickHouse versions before 21.3
  • java.io.IOException: HTTP/1.1 header parser received no bytes when using JDK 11+ and http_connection_provider is set to HTTP_CLIENT
  • RESOURCE_EXHAUSTED: Compressed gRPC message exceeds maximum size - increase max_inbound_message_size to resolve
  • select 1 format JSON works in http but not grpc, because grpc client is not aware of response format
  • insert into table values(?, ?) is slow in batch mode - try insert into table select c2,c3 from input('c1 String, c2 UInt8, c3 Nullable(UInt32)') instead
  • use_time_zone and use_server_time_zone_for_dates properties do not work
  • no tables or indexes show up under jdbc(*) databases
  • roaringbitmap is not included in the shaded jar
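
The input() workaround for slow batch inserts listed above could be used from JDBC roughly as follows. This is a minimal sketch, not code from the driver: the class InputInsertSketch, its methods, and the sample rows are hypothetical, and it assumes the driver binds one parameter per column declared in the input() structure, in declaration order. Only the SQL text and the column structure come from the issue.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper, for illustration only.
class InputInsertSketch {
    // Builds: insert into <table> select <columns> from input('<structure>')
    static String buildInsert(String table, String selectColumns, String structure) {
        return "insert into " + table + " select " + selectColumns
                + " from input('" + structure + "')";
    }

    // Adds every row to one JDBC batch and sends it in a single executeBatch() call.
    // Assumption: parameter i corresponds to the i-th column of the input() structure.
    static void insertBatch(Connection conn, String sql, Object[][] rows) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (Object[] row : rows) {
                for (int i = 0; i < row.length; i++) {
                    ps.setObject(i + 1, row[i]);
                }
                ps.addBatch();
            }
            ps.executeBatch();
        }
    }

    public static void main(String[] args) throws SQLException {
        String sql = buildInsert("table", "c2,c3", "c1 String, c2 UInt8, c3 Nullable(UInt32)");
        System.out.println(sql);
        // Pass a JDBC URL as the first argument to run against a real server.
        if (args.length > 0) {
            try (Connection conn = DriverManager.getConnection(args[0])) {
                insertBatch(conn, sql, new Object[][] {{"a", 1, null}, {"b", 2, 3}});
            }
        }
    }
}
```

Without arguments the program only prints the statement, so it can be tried without a server or driver on the classpath.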

Key Changes

  • Java client and JDBC driver are now in different modules, along with JPMS support
  • Switched the data format from TabSeparated to RowBinary
  • Support more data types including Date32, Geo types, and mixed use of nested types
  • JDBC connection URL now supports abbreviation, protocol and optional port
    • jdbc:ch://localhost is the same as jdbc:clickhouse:http://localhost:8123
    • jdbc:ch:grpc://localhost/db is the same as jdbc:clickhouse:grpc://localhost:9100/db
  • New JDBC driver class is com.clickhouse.jdbc.ClickHouseDriver (ru.yandex.clickhouse.ClickHouseDriver will be removed starting from 0.4.0)
  • JDBC connection properties are simplified
    • use custom_http_headers and custom_http_params for customization - won’t work for grpc client
    • jdbcCompliant (defaults to true) to support fake transaction and standard synchronous UPDATE and DELETE statements
    • typeMappings to customize type mapping (e.g. DateTime=java.lang.String,DateTime32=java.lang.String)
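
To make the URL equivalences above concrete, here is a small sketch that expands an abbreviated URL into its long form. It is not the driver's parser: the class JdbcUrlSketch and its expand method are hypothetical, and it only encodes what the two examples state (jdbc:ch is short for jdbc:clickhouse, the protocol defaults to http, and the default ports are 8123 for http and 9100 for grpc). Query strings and IPv6 hosts are deliberately not handled.

```java
// Hypothetical helper, for illustration only; not part of the driver.
class JdbcUrlSketch {
    static String expand(String url) {
        String rest;
        if (url.startsWith("jdbc:ch:")) {
            rest = url.substring("jdbc:ch:".length());
        } else if (url.startsWith("jdbc:clickhouse:")) {
            rest = url.substring("jdbc:clickhouse:".length());
        } else {
            throw new IllegalArgumentException("Not a ClickHouse JDBC URL: " + url);
        }

        // The protocol is optional; http is the default in the examples.
        String protocol = "http";
        if (!rest.startsWith("//")) {
            int sep = rest.indexOf("://");
            if (sep <= 0) {
                throw new IllegalArgumentException("Malformed URL: " + url);
            }
            protocol = rest.substring(0, sep);
            rest = rest.substring(sep + 1); // keep the leading "//"
        }

        int defaultPort;
        if ("http".equals(protocol)) {
            defaultPort = 8123;
        } else if ("grpc".equals(protocol)) {
            defaultPort = 9100;
        } else {
            throw new IllegalArgumentException("Protocol not covered by this sketch: " + protocol);
        }

        // Split "host[:port][/db]" into authority and path.
        String hostAndPath = rest.substring(2);
        int slash = hostAndPath.indexOf('/');
        String authority = slash < 0 ? hostAndPath : hostAndPath.substring(0, slash);
        String path = slash < 0 ? "" : hostAndPath.substring(slash);
        if (authority.indexOf(':') < 0) {
            authority = authority + ":" + defaultPort;
        }
        return "jdbc:clickhouse:" + protocol + "://" + authority + path;
    }

    public static void main(String[] args) {
        System.out.println(expand("jdbc:ch://localhost"));
        System.out.println(expand("jdbc:ch:grpc://localhost/db"));
    }
}
```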

Some more details can be found at #736, #747, #769, and #777.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 2
  • Comments: 17 (4 by maintainers)

Top GitHub Comments

1 reaction
zhicwu commented, Dec 6, 2021

Thanks @dynaxis, these are all good points.

It seems that the Java client is currently relying on thread pools underneath to implement its async API, I mean, instead of being fully async at its core. Is it tentative or intended?

Unfortunately this is intended: the JDBC driver is built on top of the client, so we want as few dependencies as possible, and we still need to support JDK 8. I hope we can find some middle ground, such as a compact library that serves the very basic functions for both the JDBC and R2DBC drivers.

For instance, if I want to make a few requests to ClickHouse in the context of an incoming HTTP call, then what am I supposed to reuse across the requests to a ClickHouse node? ClickHouseClient? ClickHouseRequest?

Yes, you can reuse ClickHouseRequest. Each time you call its execute()/send() method, it creates a sealed copy for the execution, which is similar to a copy-on-write data structure for thread safety. ClickHouseClient, on the other hand, is responsible for handling protocol-specific details such as how to execute a request and get the response. Taking http as an example, depending on whether the concrete http connection (e.g. HttpURLConnection) is reusable, it may create a new connection for each request or simply reuse the same one.
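
The "sealed copy" idea can be illustrated with a small stand-alone sketch. This is not the actual ClickHouseRequest implementation: RequestSketch, Sealed, and seal() are hypothetical names, and the sketch only shows the general pattern of snapshotting a mutable request at execution time so that later changes do not affect an execution already in flight.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a reusable, mutable request object.
class RequestSketch {
    private String query = "";
    private final Map<String, String> settings = new HashMap<>();

    RequestSketch query(String sql) {
        this.query = sql;
        return this;
    }

    RequestSketch set(String key, String value) {
        settings.put(key, value);
        return this;
    }

    // Immutable snapshot of the request, taken when an execution starts.
    static final class Sealed {
        final String query;
        final Map<String, String> settings;

        Sealed(String query, Map<String, String> settings) {
            this.query = query;
            // Copy first, then wrap, so later changes to the source map are not visible.
            this.settings = Collections.unmodifiableMap(new HashMap<>(settings));
        }
    }

    // In this sketch, an execute()/send() call would start by taking such a snapshot.
    Sealed seal() {
        return new Sealed(query, settings);
    }

    public static void main(String[] args) {
        RequestSketch request = new RequestSketch().query("select 1").set("max_threads", "2");
        Sealed first = request.seal();
        request.query("select 2"); // reuse the same request for another call
        System.out.println(first.query);
        System.out.println(request.seal().query);
    }
}
```

The mutable RequestSketch itself is not synchronized here; the point is only that each snapshot is safe to hand to another thread.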

1 reaction
zhicwu commented, Dec 2, 2021

This is one of the crucial parts of the usage:

  • Are clients simply calling getString?

  • Are clients calling different methods depending on the type of the column reported? If so, which methods do they use?

    • getObject(int)
    • getObject(int, Class<?>) – which class?
    • getTime(int)
    • getDate(int)
    • getTimestamp(int)

It looks like a combination: getObject() first, then converting LocalDateTime (timestamp without time zone) or OffsetDateTime (timestamp with time zone) to a string. DBeaver, on the other hand, has a display issue; I submitted dbeaver/dbeaver#14772 to track its status.
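
The read pattern described here (call getObject(), then turn the temporal value into text) might look like the sketch below on the tool side. The class TemporalDisplaySketch, its display method, and the chosen output patterns are assumptions for illustration; they are not taken from any particular client tool or from the driver.

```java
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical display helper: converts a value returned by ResultSet.getObject() to text.
class TemporalDisplaySketch {
    private static final DateTimeFormatter LOCAL =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
    // "xxx" appends the offset as +HH:MM.
    private static final DateTimeFormatter OFFSET =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ssxxx");

    static String display(Object value) {
        if (value == null) {
            return "NULL";
        }
        if (value instanceof LocalDateTime) { // timestamp without time zone
            return LOCAL.format((LocalDateTime) value);
        }
        if (value instanceof OffsetDateTime) { // timestamp with time zone
            return OFFSET.format((OffsetDateTime) value);
        }
        return String.valueOf(value);
    }

    public static void main(String[] args) {
        System.out.println(display(LocalDateTime.of(2021, 12, 2, 10, 30, 0)));
        System.out.println(display(
                OffsetDateTime.of(2021, 12, 2, 10, 30, 0, 0, ZoneOffset.ofHours(8))));
    }
}
```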
