UPDATE query crashes node (java.lang.StackOverflowError: null)

CrateDB version: 2.1.6, 2.3.2

JVM version: openjdk version "1.8.0_151", OpenJDK Runtime Environment (build 1.8.0_151-8u151-b12-0ubuntu0.16.04.2-b12), OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)

OS version / environment description: Ubuntu 16.04.2 LTS (GNU/Linux 4.4.0-47-generic x86_64)

Problem description:

UPDATE query crashes node.

[2018-02-16T23:00:01,485][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [s1] fatal error in thread [elasticsearch[s1][bulk][T#2]], exiting
java.lang.StackOverflowError: null
	at java.util.HashMap.hash(HashMap.java:339) ~[?:1.8.0_151]
	at java.util.HashMap.get(HashMap.java:557) ~[?:1.8.0_151]
	at java.util.Collections$UnmodifiableMap.get(Collections.java:1454) ~[?:1.8.0_151]
	at io.crate.operation.scalar.DateTruncFunction.intervalAsUnit(DateTruncFunction.java:192) ~[crate-app-2.1.6.jar:2.1.6]
	at io.crate.operation.scalar.DateTruncFunction.rounding(DateTruncFunction.java:170) ~[crate-app-2.1.6.jar:2.1.6]
	at io.crate.operation.scalar.DateTruncFunction.compile(DateTruncFunction.java:116) ~[crate-app-2.1.6.jar:2.1.6]
	at io.crate.operation.BaseImplementationSymbolVisitor.visitFunction(BaseImplementationSymbolVisitor.java:53) ~[crate-app-2.1.6.jar:2.1.6]
	at io.crate.operation.BaseImplementationSymbolVisitor.visitFunction(BaseImplementationSymbolVisitor.java:39) ~[crate-app-2.1.6.jar:2.1.6]
	at io.crate.analyze.symbol.Function.accept(Function.java:62) ~[crate-app-2.1.6.jar:2.1.6]
	at io.crate.analyze.symbol.SymbolVisitor.process(SymbolVisitor.java:32) ~[crate-app-2.1.6.jar:2.1.6]
	at io.crate.operation.InputFactory$Context.add(InputFactory.java:166) ~[crate-app-2.1.6.jar:2.1.6]
	at io.crate.executor.transport.TransportShardUpsertAction.resolveSymbols(TransportShardUpsertAction.java:319) ~[crate-app-2.1.6.jar:2.1.6]
	at io.crate.executor.transport.TransportShardUpsertAction.processGeneratedColumns(TransportShardUpsertAction.java:573) ~[crate-app-2.1.6.jar:2.1.6]
	at io.crate.executor.transport.TransportShardUpsertAction.prepareUpdate(TransportShardUpsertAction.java:381) ~[crate-app-2.1.6.jar:2.1.6]
	at io.crate.executor.transport.TransportShardUpsertAction.indexItem(TransportShardUpsertAction.java:240) ~[crate-app-2.1.6.jar:2.1.6]
	at io.crate.executor.transport.TransportShardUpsertAction.indexItem(TransportShardUpsertAction.java:261) ~[crate-app-2.1.6.jar:2.1.6]
        [repeated 100+ times]

Steps to reproduce:

SCHEMA:

CREATE TABLE IF NOT EXISTS "myschema"."notification" (
   "company_id" STRING GENERATED ALWAYS AS substr("group_id", 1, 6),
   "created_at" TIMESTAMP NOT NULL,
   "deleted_at" TIMESTAMP,
   "details" OBJECT (DYNAMIC) AS (
      "coordinates" GEO_POINT,
      "device_sn" STRING,
      "document_id" STRING,
      "geofence_id" STRING,
      "group_id" STRING,
      "limit" LONG,
      "seal_id" STRING,
      "speed" LONG,
      "user_id" STRING,
      "vehicle_id" STRING
   ),
   "group_id" STRING,
   "id" STRING NOT NULL,
   "month" TIMESTAMP GENERATED ALWAYS AS date_trunc('month', "created_at"),
   "persistent" BOOLEAN NOT NULL,
   "status" STRING NOT NULL,
   "timestamp" TIMESTAMP,
   "type" STRING NOT NULL,
   "updated_at" TIMESTAMP NOT NULL
)
CLUSTERED BY ("id") INTO 6 SHARDS
PARTITIONED BY ("month")
WITH (
   "allocation.max_retries" = 5,
   "blocks.metadata" = false,
   "blocks.read" = false,
   "blocks.read_only" = false,
   "blocks.write" = false,
   column_policy = 'dynamic',
   "mapping.total_fields.limit" = 1000,
   number_of_replicas = '1',
   "recovery.initial_shards" = 'quorum',
   refresh_interval = 1000,
   "routing.allocation.enable" = 'all',
   "routing.allocation.total_shards_per_node" = -1,
   "translog.durability" = 'REQUEST',
   "translog.flush_threshold_size" = 536870912,
   "translog.sync_interval" = 5000,
   "unassigned.node_left.delayed_timeout" = 60000,
   "warmer.enabled" = true,
   "write.wait_for_active_shards" = 'all'
)

QUERY

update "myschema"."notification"
set "deleted_at" = CURRENT_TIMESTAMP
where "timestamp" < 1518904800000 and "persistent" = false and "deleted_at" is null;

NOTE: The query only fails if there are rows in the table that match the WHERE conditions.

NOTE 2: Transforming the UPDATE into a DELETE query does not crash the node; it works as expected.
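
The stack trace above passes through processGeneratedColumns and DateTruncFunction, which suggests the generated "month" column is recomputed while the UPDATE is applied. As a rough illustration of the value that date_trunc('month', "created_at") produces, here is a minimal Java sketch. It assumes truncation in UTC, and the class and method names are hypothetical; it is not CrateDB's implementation.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;

/** Illustration only: mimics date_trunc('month', ts) under the assumption of UTC. */
final class MonthTruncSketch {

    /** Truncates epoch milliseconds to the first instant of the same month in UTC. */
    static long truncateToMonthUtc(long epochMillis) {
        ZonedDateTime utc = Instant.ofEpochMilli(epochMillis).atZone(ZoneOffset.UTC);
        return utc.withDayOfMonth(1)
                  .truncatedTo(ChronoUnit.DAYS)
                  .toInstant()
                  .toEpochMilli();
    }

    public static void main(String[] args) {
        // The literal from the WHERE clause above, used here only as sample input.
        long sample = 1518904800000L;
        // prints 2018-02-01T00:00:00Z
        System.out.println(Instant.ofEpochMilli(truncateToMonthUtc(sample)));
    }
}
```

Every row whose "created_at" falls in the same calendar month maps to the same value, which is what lets the column serve as the partition key.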

Issue Analytics

Created 6 years ago
Comments: 17 (11 by maintainers)

Top GitHub Comments

seut commented, Feb 27, 2018

@rps-v Great to hear that this solves your issue. That said, this should not happen even when a node crashes with in-flight inserts. I’m closing this issue for now, but I will also try to investigate how documents could get into this persistent version state. Thanks for reporting.

matriv commented, Feb 21, 2018

@rps-v The document that causes the issue is the one with _id: AWGOgCvq3g58eQ8H13w2. As a quick workaround, take a backup of this document by running the following and saving the output:

select * from "tracknamic"."notification" where _id='AWGOgCvq3g58eQ8H13w2';

Then delete it:

delete from "tracknamic"."notification" where _id='AWGOgCvq3g58eQ8H13w2';

And insert it again:

insert into "tracknamic"."notification" (....) values(....)

This document ended up with _version=-4 which causes the stackoverflow…
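
The trace in the report shows indexItem frames calling indexItem again and again, and the comment above attributes the crash to a document with _version=-4. One plausible reading is that a retry implemented as a recursive call never terminates when the failure is permanent. The following Java sketch illustrates that general pattern only. Every class and method name in it is hypothetical, the failure condition is an assumption, and it is not CrateDB's actual code.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration of unbounded recursive retry; not CrateDB code. */
final class RetrySketch {

    /** Stand-in for a version-conflict failure (hypothetical type). */
    static final class VersionConflict extends RuntimeException {
        VersionConflict() {
            // Disable stack-trace capture to keep the demo cheap on deep stacks.
            super("version conflict", null, false, false);
        }
    }

    /** Assumed behavior: a write against a negative stored version fails on every attempt. */
    static void write(long storedVersion) {
        if (storedVersion < 0) {
            throw new VersionConflict();
        }
    }

    /** Retries by calling itself, so every failed attempt adds one stack frame. */
    static void indexItemRecursive(long storedVersion, AtomicInteger attempts) {
        attempts.incrementAndGet();
        try {
            write(storedVersion);
        } catch (VersionConflict e) {
            indexItemRecursive(storedVersion, attempts); // no retry limit
        }
    }

    /** Retries in a bounded loop, so a permanent failure surfaces as an exception. */
    static int indexItemBounded(long storedVersion, int maxRetries) {
        for (int attempt = 1; attempt <= maxRetries; attempt++) {
            try {
                write(storedVersion);
                return attempt; // number of attempts that were needed
            } catch (VersionConflict e) {
                // fall through and try again
            }
        }
        throw new IllegalStateException("gave up after " + maxRetries + " attempts");
    }

    /** Returns true if the recursive variant exhausts the stack for this version. */
    static boolean overflows(long storedVersion) {
        try {
            indexItemRecursive(storedVersion, new AtomicInteger());
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("version 1 overflows:  " + overflows(1));
        System.out.println("version -4 overflows: " + overflows(-4));
    }
}
```

In the bounded variant the same permanent failure ends in an ordinary exception after a fixed number of attempts, which a caller can handle without bringing down the thread.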
