[SUPPORT] Concurrent write (OCC) on distinct partitions random errors

Hudi 0.9.0, Spark 3.1

To experiment with OCC, I set up these local tools:

local hive metastore
pyspark script
run concurrently with xargs

Sometimes it works as expected (mostly with 2 concurrent processes). But with 4 processes I randomly get one of these stack traces:

Type 1 error:

 : org.apache.hudi.exception.HoodieLockException: Unable to acquire lock, lock object LockResponse(lockid:255, state:WAITING)
 at org.apache.hudi.client.transaction.lock.LockManager.lock(LockManager.java:82)
 at org.apache.hudi.client.transaction.TransactionManager.beginTransaction(TransactionManager.java:64)

Type 2 error:

 : org.apache.hudi.exception.HoodieUpsertException: Failed to upsert for commit time 20210921153357
 at org.apache.hudi.table.action.commit.AbstractWriteHelper.write(AbstractWriteHelper.java:62)
 at org.apache.hudi.table.action.commit.SparkUpsertCommitActionExecutor.execute(SparkUpsertCommitActionExecutor.java:46)
 Caused by: java.lang.IllegalArgumentException

Type 3 error:

 /tmp/test_hudi_pyspark_local/.hoodie/20210921151138.commit.requested
 at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.createImmutableFileInPath(HoodieActiveTimeline.java:544)
 at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.createFileInMetaPath(HoodieActiveTimeline.java:505)
 Caused by: org.apache.hadoop.fs.FileAlreadyExistsException: File already exists: file:/tmp/test_hudi_pyspark_local/.hoodie/20210921151138.commit.requested
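
Because the three errors show up at random, it can help to tally which one each failed run hit. The following is only a minimal sketch: the helper names `classify_failure` and `tally_failures` are hypothetical, and the matching relies solely on the exception class names that appear in the traces quoted above. A real log may contain more than one of them, in which case the first match in the table wins.

```python
from collections import Counter

# Exception class names copied from the three stack traces quoted above.
# The short labels on the left are made up for this sketch.
SIGNATURES = {
    "lock": "HoodieLockException",              # Type 1
    "file_exists": "FileAlreadyExistsException",  # Type 3
    "upsert": "HoodieUpsertException",          # Type 2
}


def classify_failure(log_text):
    """Return the label of the first signature found in log_text, or None."""
    for label, needle in SIGNATURES.items():
        if needle in log_text:
            return label
    return None


def tally_failures(logs):
    """Count how many captured logs fall into each error type."""
    return Counter(classify_failure(text) for text in logs)
```

Feeding it the captured stderr of each concurrent run gives a quick view of how often each error type occurs for a given number of processes.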

Steps to reproduce:

Python script:

## The idea is to write each run into its own random partition.
## Runs start after a small random delay, added to investigate why errors hit the same commit timestamp,
## but the delay is not actually needed.
## The final COUNT should be (NB+1) * 10, where NB is the number of concurrent Spark jobs.

from pyspark.sql import SparkSession

import pyspark
from numpy import random
from time import sleep

sleeptime = random.uniform(2, 5)
print("sleeping for:", sleeptime, "seconds")
sleep(sleeptime)
conf = pyspark.SparkConf()
spark_conf = [
    (
        "spark.jars.packages",
        "org.apache.hudi:hudi-spark3-bundle_2.12:0.9.0,org.apache.spark:spark-avro_2.12:3.1.2",
    ),
    ("spark.serializer", "org.apache.spark.serializer.KryoSerializer"),
    ("spark.hadoop.hive.metastore.uris", "thrift://localhost:9083"),
    ("spark.hadoop.javax.jdo.option.ConnectionUserName", "hive"),
    ("spark.hadoop.javax.jdo.option.ConnectionPassword", "hive"),
    ("spark.hadoop.hive.server2.thrift.url", "jdbc:hive2://localhost:10000"),
]
conf.setAll(spark_conf)
spark = (
    SparkSession.builder.appName("test-hudi-hive-sync")
    .config(conf=conf)
    .enableHiveSupport()
    .getOrCreate()
)
sc = spark.sparkContext

# Create a table
sc.setLogLevel("ERROR")
dataGen = sc._jvm.org.apache.hudi.QuickstartUtils.DataGenerator()
inserts = sc._jvm.org.apache.hudi.QuickstartUtils.convertToStringList(
    dataGen.generateInserts(10)
)
from pyspark.sql.functions import expr

df = (
    spark.read.json(spark.sparkContext.parallelize(inserts, 10))
    # One partition per run!
    .withColumn("part", expr(f"'foo{sleeptime}'"))
    .withColumn("id", expr("row_number() over(partition by 1 order by 1)"))
)


databaseName = "default"
tableName = "test_hudi_pyspark_local"
basePath = f"/tmp/{tableName}"

hudi_options = {
    "hoodie.table.name": tableName,
    "hoodie.datasource.write.recordkey.field": "uuid",
    "hoodie.datasource.write.partitionpath.field": "part",
    "hoodie.datasource.write.table.name": tableName,
    "hoodie.datasource.write.operation": "upsert",
    "hoodie.datasource.write.precombine.field": "ts",
    "hoodie.upsert.shuffle.parallelism": 2,
    "hoodie.insert.shuffle.parallelism": 2,
    # For hive sync metastore
    "hoodie.datasource.hive_sync.database": databaseName,
    "hoodie.datasource.hive_sync.table": tableName,
    "hoodie.datasource.hive_sync.mode": "jdbc",
    "hoodie.datasource.hive_sync.enable": "true",
    "hoodie.datasource.hive_sync.partition_fields": "part",
    "hoodie.datasource.hive_sync.partition_extractor_class": "org.apache.hudi.hive.MultiPartKeysValueExtractor",
    # For concurrency write locks with hive metastore
    "hoodie.write.concurrency.mode": "optimistic_concurrency_control",
    "hoodie.cleaner.policy.failed.writes": "LAZY",
    "hoodie.write.lock.provider": "org.apache.hudi.hive.HiveMetastoreBasedLockProvider",
    "hoodie.write.lock.hivemetastore.database": databaseName,
    "hoodie.write.lock.hivemetastore.table": tableName,
    "hoodie.write.lock.wait_time_ms": "12000",
    "hoodie.write.lock.num_retries": "4",
    "hoodie.embed.timeline.server": "false",
    "hoodie.datasource.write.commitmeta.key.prefix": "deltastreamer.checkpoint.key",
}

(df.write.format("hudi").options(**hudi_options).mode("append").save(basePath))
print(
    "@@@@@@@@@@@@@@@@ COUNT={} @@@@@@@@@@@@@@@@@@".format(
        spark.read.format("hudi").load(basePath).count()
    )
)
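
The COUNT check mentioned in the script's header comment can be written down explicitly. This is a small sketch, not part of the original script: `expected_count` and `missing_rows` are hypothetical helper names, and the 10 rows per run come from `generateInserts(10)` above. Note that a run which finishes before the others may legitimately print a lower count, so only the count printed by the last run to finish is meaningful.

```python
def expected_count(nb, rows_per_run=10):
    """Rows expected after one initial run plus nb concurrent runs."""
    return (nb + 1) * rows_per_run


def missing_rows(actual_count, nb, rows_per_run=10):
    """How many rows are missing compared with the expected total."""
    return expected_count(nb, rows_per_run) - actual_count
```

For example, with `NB=4` the table should end up with 50 rows; a final count of 40 would mean one concurrent writer's commit did not land.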

Bash script:

#!/usr/bin/env bash
NB=$1
rm -rf /tmp/test_hudi_pyspark_local/
python3 concurrent.py
seq 1 $NB  | xargs -n 1 -P $NB python3 concurrent.py

Run it:

./concurrent.sh 4
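
For anyone who prefers to collect exit codes per process, the `xargs -P` launcher can be approximated in Python. This is a rough sketch under assumptions: `run_concurrently` and `reproduce` are hypothetical names, the script is assumed to be saved as `concurrent.py` in the current directory, and unlike the bash script it does not delete `/tmp/test_hudi_pyspark_local/` first.

```python
import subprocess
import sys


def run_concurrently(cmd, nb):
    """Start nb copies of cmd at the same time and return their exit codes."""
    procs = [subprocess.Popen(cmd) for _ in range(nb)]
    return [proc.wait() for proc in procs]


def reproduce(nb, script="concurrent.py"):
    """One sequential run to create the table, then nb concurrent runs.

    Returns the number of concurrent runs that exited with a non-zero code.
    """
    cmd = [sys.executable, script]
    subprocess.run(cmd, check=True)  # initial run, as in the bash script
    codes = run_concurrently(cmd, nb)
    return sum(1 for code in codes if code != 0)
```

Calling `reproduce(4)` mirrors `./concurrent.sh 4` apart from the cleanup step, and the return value shows how many of the four writers failed.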

Top GitHub Comments

parisni commented, Oct 13, 2021

@jdattani AFAIK, only Spark SQL features are broken on 3.1

nsivabalan commented, Mar 19, 2022

@parisni: I will go ahead and close this issue. If you hit any more issues, feel free to create a new issue with details. Thanks!
