UPDATE query crashes node (java.lang.StackOverflowError: null)

CrateDB version: 2.1.6, 2.3.2

JVM version:
openjdk version "1.8.0_151"
OpenJDK Runtime Environment (build 1.8.0_151-8u151-b12-0ubuntu0.16.04.2-b12)
OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)

OS version / environment description: Ubuntu 16.04.2 LTS (GNU/Linux 4.4.0-47-generic x86_64)

Problem description:

UPDATE query crashes node.

[2018-02-16T23:00:01,485][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [s1] fatal error in thread [elasticsearch[s1][bulk][T#2]], exiting
java.lang.StackOverflowError: null
	at java.util.HashMap.hash(HashMap.java:339) ~[?:1.8.0_151]
	at java.util.HashMap.get(HashMap.java:557) ~[?:1.8.0_151]
	at java.util.Collections$UnmodifiableMap.get(Collections.java:1454) ~[?:1.8.0_151]
	at io.crate.operation.scalar.DateTruncFunction.intervalAsUnit(DateTruncFunction.java:192) ~[crate-app-2.1.6.jar:2.1.6]
	at io.crate.operation.scalar.DateTruncFunction.rounding(DateTruncFunction.java:170) ~[crate-app-2.1.6.jar:2.1.6]
	at io.crate.operation.scalar.DateTruncFunction.compile(DateTruncFunction.java:116) ~[crate-app-2.1.6.jar:2.1.6]
	at io.crate.operation.BaseImplementationSymbolVisitor.visitFunction(BaseImplementationSymbolVisitor.java:53) ~[crate-app-2.1.6.jar:2.1.6]
	at io.crate.operation.BaseImplementationSymbolVisitor.visitFunction(BaseImplementationSymbolVisitor.java:39) ~[crate-app-2.1.6.jar:2.1.6]
	at io.crate.analyze.symbol.Function.accept(Function.java:62) ~[crate-app-2.1.6.jar:2.1.6]
	at io.crate.analyze.symbol.SymbolVisitor.process(SymbolVisitor.java:32) ~[crate-app-2.1.6.jar:2.1.6]
	at io.crate.operation.InputFactory$Context.add(InputFactory.java:166) ~[crate-app-2.1.6.jar:2.1.6]
	at io.crate.executor.transport.TransportShardUpsertAction.resolveSymbols(TransportShardUpsertAction.java:319) ~[crate-app-2.1.6.jar:2.1.6]
	at io.crate.executor.transport.TransportShardUpsertAction.processGeneratedColumns(TransportShardUpsertAction.java:573) ~[crate-app-2.1.6.jar:2.1.6]
	at io.crate.executor.transport.TransportShardUpsertAction.prepareUpdate(TransportShardUpsertAction.java:381) ~[crate-app-2.1.6.jar:2.1.6]
	at io.crate.executor.transport.TransportShardUpsertAction.indexItem(TransportShardUpsertAction.java:240) ~[crate-app-2.1.6.jar:2.1.6]
	at io.crate.executor.transport.TransportShardUpsertAction.indexItem(TransportShardUpsertAction.java:261) ~[crate-app-2.1.6.jar:2.1.6]
        [repeated 100+ times]

Steps to reproduce:

SCHEMA:

CREATE TABLE IF NOT EXISTS "myschema"."notification" (
   "company_id" STRING GENERATED ALWAYS AS substr("group_id", 1, 6),
   "created_at" TIMESTAMP NOT NULL,
   "deleted_at" TIMESTAMP,
   "details" OBJECT (DYNAMIC) AS (
      "coordinates" GEO_POINT,
      "device_sn" STRING,
      "document_id" STRING,
      "geofence_id" STRING,
      "group_id" STRING,
      "limit" LONG,
      "seal_id" STRING,
      "speed" LONG,
      "user_id" STRING,
      "vehicle_id" STRING
   ),
   "group_id" STRING,
   "id" STRING NOT NULL,
   "month" TIMESTAMP GENERATED ALWAYS AS date_trunc('month', "created_at"),
   "persistent" BOOLEAN NOT NULL,
   "status" STRING NOT NULL,
   "timestamp" TIMESTAMP,
   "type" STRING NOT NULL,
   "updated_at" TIMESTAMP NOT NULL
)
CLUSTERED BY ("id") INTO 6 SHARDS
PARTITIONED BY ("month")
WITH (
   "allocation.max_retries" = 5,
   "blocks.metadata" = false,
   "blocks.read" = false,
   "blocks.read_only" = false,
   "blocks.write" = false,
   column_policy = 'dynamic',
   "mapping.total_fields.limit" = 1000,
   number_of_replicas = '1',
   "recovery.initial_shards" = 'quorum',
   refresh_interval = 1000,
   "routing.allocation.enable" = 'all',
   "routing.allocation.total_shards_per_node" = -1,
   "translog.durability" = 'REQUEST',
   "translog.flush_threshold_size" = 536870912,
   "translog.sync_interval" = 5000,
   "unassigned.node_left.delayed_timeout" = 60000,
   "warmer.enabled" = true,
   "write.wait_for_active_shards" = 'all'
)

QUERY:

update "myschema"."notification"
set "deleted_at" = CURRENT_TIMESTAMP
where "timestamp" < 1518904800000 and "persistent" = false and "deleted_at" is null;

NOTE: The query only fails if the table contains rows that match the WHERE conditions.

NOTE 2: Rewriting the UPDATE as a DELETE query does not crash the node; it works as expected.

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 17 (11 by maintainers)

Top GitHub Comments

1 reaction
seut commented, Feb 27, 2018

@rps-v Great to hear that this solves your issue. That said, this should not happen, even when a node crashes with in-flight inserts. I'm closing this issue for now, but I will also try to investigate how documents could end up in this persistent version state. Thanks for reporting.

1 reaction
matriv commented, Feb 21, 2018

@rps-v The document that causes the issue is the one with _id: AWGOgCvq3g58eQ8H13w2. As a quick workaround, take a backup of this document by running the following query and saving the output:

select * from "tracknamic"."notification" where _id='AWGOgCvq3g58eQ8H13w2';

Then delete it:

delete from "tracknamic"."notification" where _id='AWGOgCvq3g58eQ8H13w2';

And insert it again:

insert into "tracknamic"."notification" (....) values(....)

This document ended up with _version=-4 which causes the stackoverflow…
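
The stack trace above shows indexItem calling itself repeatedly until the stack is exhausted. One plausible reading, which is an assumption and not confirmed in this thread, is that a retry path recurses without ever making progress when the stored version is invalid. The sketch below illustrates that general failure mode and a bounded alternative. It is not CrateDB's code: RetrySketch, tryWrite, recursiveRetry, boundedRetry and the "version must be >= 1" rule are all hypothetical names and rules chosen for illustration.

```java
public class RetrySketch {
    // Hypothetical stand-in for a document whose stored version is stuck at an invalid value.
    static final long STUCK_VERSION = -4L;

    // Hypothetical rule: a write only succeeds when the stored version is valid (>= 1).
    static boolean tryWrite(long storedVersion) {
        return storedVersion >= 1;
    }

    // Unbounded recursive retry: every failed attempt adds another stack frame.
    static void recursiveRetry(long storedVersion) {
        if (!tryWrite(storedVersion)) {
            recursiveRetry(storedVersion); // never terminates when the version stays invalid
        }
    }

    // Bounded alternative: loop instead of recursing, and give up after maxRetries retries.
    static boolean boundedRetry(long storedVersion, int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (tryWrite(storedVersion)) {
                return true;
            }
        }
        return false;
    }

    // Runs the recursive retry and reports whether it exhausted the stack.
    static boolean overflows(long storedVersion) {
        try {
            recursiveRetry(storedVersion);
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("recursive retry overflowed: " + overflows(STUCK_VERSION));
        System.out.println("bounded retry succeeded: " + boundedRetry(STUCK_VERSION, 3));
    }
}
```

In this sketch the bounded variant reports failure to the caller instead of taking down the thread, which is why deleting and re-inserting the affected document (so that it gets a valid version again) would make the retry terminate.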
