Query performance
CrateDB version: 1.0.5
JVM version:
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
OS version / environment description:
Linux world-db-21 3.13.0-110-generic #157-Ubuntu SMP Mon Feb 20 11:54:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Distributor ID: Ubuntu
Description: Ubuntu 14.04.5 LTS
Release: 14.04
Codename: trusty
Problem description:
Running a query against a smaller subset of data ingested post-1.0.5 is 10x slower than querying a pre-1.0.5 subset that is roughly twice as large.
First, the total row counts of the data in question. The issue pertains to these two ingestid datasets.
doc=> select count(*)
from ourdata
where ingestid = '3a9869fb2963cee218e296db79d8b612';
count(*)
-----------
375092997
(1 row)
Time: 1.986 ms
doc=> select count(*)
from ourdata
where ingestid = '9b4cbbe4dd1e0d26debb375328c74354';
count(*)
-----------
867072963
(1 row)
Time: 2.007 ms
Of those ☝️ two ingestid values, 9b4... was ingested pre-1.0.5, while 3a9... was ingested post-1.0.5. And those are respectable query times (I picked the best of 5 consecutive queries).
The problem comes when I add to the WHERE clause:
doc=> select count(*)
from ourdata
where ingestid = '3a9869fb2963cee218e296db79d8b612'
and data['t_deviceid'] = '1aab41025c6c62cc20320e65510a464cc9d85c057b38faa8e5c3409f2bcef673'
and data['t_lastseen'] != '\N';
count(*)
----------
2733
(1 row)
Time: 306.863 ms
doc=> select count(*)
from ourdata
where ingestid = '9b4cbbe4dd1e0d26debb375328c74354'
and data['t_deviceid'] = '46fe779f2cb3d2926b6a4fee98afb7b5db719faab321181bc429f8f133d3d06a'
and data['t_lastseen'] != '\N';
count(*)
----------
5478
(1 row)
Time: 29.191 ms
The 3a9... ingestid is 10x slower than 9b4..., even though 9b4... has 2x more total rows.
doc=> select count(*)
from ourdata
where ingestid = '3a9869fb2963cee218e296db79d8b612'
and data['t_lastseen'] != '\N';
count(*)
-----------
230985658
(1 row)
Time: 1332.693 ms
doc=> select count(*)
from ourdata
where ingestid = '9b4cbbe4dd1e0d26debb375328c74354'
and data['t_lastseen'] != '\N';
count(*)
-----------
395407210
(1 row)
Time: 435.651 ms
Ok, it seems related to the != condition. For 3a9..., 61% of the rows match, vs. 45% for 9b4... Even so, querying 61% of 375M rows should be faster than querying 45% of 870M rows, right?
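One workaround that might be worth testing (my assumption, not verified on 1.0.5): if t_lastseen were ingested as a real NULL instead of the literal '\N' sentinel string, the filter could be written as IS NOT NULL rather than a negated equality, which Lucene-backed stores can typically answer from field-existence structures:
-- hypothetical alternative, assuming NULLs instead of '\N' at ingest time
select count(*)
from ourdata
where ingestid = '3a9869fb2963cee218e296db79d8b612'
and data['t_lastseen'] is not null;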
The ourdata table's composite primary key includes an md5 and is relatively evenly distributed amongst our five nodes and 32 shards.
ourdata 22 p STARTED 59542887 27.8gb 10.0.111.22 Roteck
ourdata 30 p STARTED 59558618 27.8gb 10.0.111.21 Hohgant
ourdata 21 p STARTED 59548595 27.8gb 10.0.111.23 Grande Rochère
ourdata 17 p STARTED 59551730 27.8gb 10.0.111.22 Roteck
ourdata 20 p STARTED 59572835 27.8gb 10.0.111.21 Hohgant
ourdata 9 p STARTED 59557794 27.8gb 10.0.111.25 Setsas
ourdata 1 p STARTED 59565419 27.8gb 10.0.111.23 Grande Rochère
ourdata 29 p STARTED 59561467 27.8gb 10.0.111.25 Setsas
ourdata 19 p STARTED 59560201 27.8gb 10.0.111.25 Setsas
ourdata 3 p STARTED 59554320 27.8gb 10.0.111.24 Moléson
ourdata 12 p STARTED 59540949 27.8gb 10.0.111.22 Roteck
ourdata 16 p STARTED 59555583 27.9gb 10.0.111.23 Grande Rochère
ourdata 5 p STARTED 59557970 27.8gb 10.0.111.21 Hohgant
ourdata 31 p STARTED 59551067 27.8gb 10.0.111.23 Grande Rochère
ourdata 2 p STARTED 59564977 27.8gb 10.0.111.22 Roteck
ourdata 24 p STARTED 59552692 27.8gb 10.0.111.25 Setsas
ourdata 6 p STARTED 59554328 27.9gb 10.0.111.23 Grande Rochère
ourdata 27 p STARTED 59565296 27.8gb 10.0.111.22 Roteck
ourdata 26 p STARTED 59552178 27.7gb 10.0.111.23 Grande Rochère
ourdata 11 p STARTED 59551670 27.8gb 10.0.111.23 Grande Rochère
ourdata 10 p STARTED 59561895 27.8gb 10.0.111.21 Hohgant
ourdata 18 p STARTED 59551674 27.8gb 10.0.111.24 Moléson
ourdata 4 p STARTED 59559826 27.8gb 10.0.111.25 Setsas
ourdata 14 p STARTED 59551315 27.8gb 10.0.111.25 Setsas
ourdata 8 p STARTED 59559626 27.8gb 10.0.111.24 Moléson
ourdata 15 p STARTED 59558203 27.8gb 10.0.111.21 Hohgant
ourdata 25 p STARTED 59564711 27.8gb 10.0.111.21 Hohgant
ourdata 28 p STARTED 59562530 27.8gb 10.0.111.24 Moléson
ourdata 13 p STARTED 59557420 27.8gb 10.0.111.24 Moléson
ourdata 7 p STARTED 59559316 27.7gb 10.0.111.22 Roteck
ourdata 23 p STARTED 59581922 27.8gb 10.0.111.24 Moléson
ourdata 0 p STARTED 59562572 27.8gb 10.0.111.21 Hohgant
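For reference, a similar per-shard breakdown can be produced in SQL via CrateDB's sys.shards table. A sketch, assuming the column names documented for sys.shards:
-- per-shard doc counts and sizes for the primary shards of ourdata
select id, num_docs, size, state, node['name']
from sys.shards
where table_name = 'ourdata' and "primary" = true
order by id;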
The table definition is mostly:
SHOW CREATE TABLE doc.ourdata
-----------------------------------------------------
CREATE TABLE IF NOT EXISTS "doc"."ourdata" (
"bucket" STRING,
"cell" STRING,
"cell_" STRING INDEX USING FULLTEXT WITH (
analyzer = 'simple'
),
"data" OBJECT (DYNAMIC) AS (
"i_accuracy" LONG,
"i_devicetime" LONG,
"i_tzoffset" LONG,
"t_deviceid" STRING,
"t_filename_" STRING,
"t_ip" STRING,
"t_lastseen" STRING
),
"ingestid" STRING,
"rowid" STRING,
"shape" GEO_SHAPE INDEX USING GEOHASH WITH (
distance_error_pct = 0.025,
precision = '10.0m'
),
PRIMARY KEY ("cell", "bucket", "rowid")
)
CLUSTERED INTO 32 SHARDS
WITH (
"blocks.metadata" = false,
"blocks.read" = false,
"blocks.read_only" = false,
"blocks.write" = false,
column_policy = 'dynamic',
number_of_replicas = '0',
"recovery.initial_shards" = 'quorum',
refresh_interval = 0,
"routing.allocation.enable" = 'all',
"routing.allocation.total_shards_per_node" = -1,
"translog.disable_flush" = false,
"translog.flush_threshold_ops" = 2147483647,
"translog.flush_threshold_period" = 1800000,
"translog.flush_threshold_size" = 209715200,
"translog.interval" = 5000,
"translog.sync_interval" = 5000,
"unassigned.node_left.delayed_timeout" = 60000,
"warmer.enabled" = true
)
Re: primary key, each ingestid is 1:1 with a bucket, and cell is empty. So, in this table, the primary key depends almost entirely on the rowid, which is an md5 of lots of row-level data plus a nanosecond timestamp (so the md5 is effectively a random key).
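Since routing on an effectively random key should spread rows evenly, a quick sanity check is an aggregate over sys.shards (again a sketch): the per-shard doc counts should sit close together, as the listing above shows.
select min(num_docs), max(num_docs), avg(num_docs)
from sys.shards
where table_name = 'ourdata';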
Top GitHub Comments
@nicerobot After investigation we found out that we can do some performance optimization for the != part of the query, but only if the column involved is created with the NOT NULL constraint. The PR that does it (https://github.com/crate/crate/pull/5289) is merged to master and will be available with the next feature release. Thanks again for your detailed report!
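A minimal sketch of what that implies (my illustration, untested; doc.example and its columns are hypothetical): the column has to carry the NOT NULL constraint at creation time for the optimized != evaluation to apply.
CREATE TABLE doc.example (
  "id" STRING PRIMARY KEY,
  "t_lastseen" STRING NOT NULL -- the NOT NULL constraint is what enables the != optimization
);
-- with the constraint in place, this negated-equality filter can be optimized:
SELECT count(*) FROM doc.example WHERE "t_lastseen" != '\N';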
Okay, I had hoped it would be related to query caching or something. But it seems like != is the problem. As your first query shows, 9b4... has 867072963 matches, more than twice as many as 3a9..., so for 9b4... the != part of the query has to do more work. We'll take a closer look at it.