Spark SQL MERGE INTO fails with Error: Error running query: java.lang.IllegalArgumentException: Can't zip RDDs with unequal numbers of partitions: List(1, 0) (state=,code=0)

CREATE TABLE IF NOT EXISTS cdp.test_merge_001 (
  offline_channel STRING COMMENT '_pk',
  unique_key STRING COMMENT '_ck',
  open_id STRING COMMENT '',
  mobile STRING COMMENT '_ck',
  hobby STRING COMMENT '',
  activity_time STRING COMMENT ''
)
USING iceberg;

CREATE TABLE IF NOT EXISTS cdp.test_merge_002 (
  offline_channel STRING COMMENT '_pk',
  unique_key STRING COMMENT '_ck',
  open_id STRING COMMENT '',
  mobile STRING COMMENT '_ck',
  hobby STRING COMMENT '',
  activity_time STRING COMMENT ''
)
USING iceberg;

Contents of table test_merge_002: [image]

Contents of table test_merge_001: [image]

Running the following SQL produces the error: [image]

MERGE INTO cdp.test_merge_002 tt1
USING (SELECT * FROM cdp.test_merge_001) tt2
ON (tt1.unique_key = tt2.unique_key AND tt1.mobile = tt2.mobile)
WHEN MATCHED THEN UPDATE SET
  tt1.offline_channel = tt2.offline_channel,
  tt1.unique_key = tt2.unique_key,
  tt1.open_id = tt2.open_id,
  tt1.mobile = tt2.mobile,
  tt1.hobby = tt2.hobby,
  tt1.activity_time = tt2.activity_time
WHEN NOT MATCHED THEN INSERT *
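
For readers unfamiliar with the message: a zip over partitioned data pairs up partition i of one side with partition i of the other, so both sides must have the same number of partitions. The trailing List(1, 0) reads as the partition counts of the two inputs, one partition on one side and none on the other. The sketch below is a plain-Python model of that check, not Spark's or Iceberg's code; the function name zip_partitioned and the list-of-lists representation are assumptions made for illustration.

```python
from typing import Any, List, Tuple

# A "partitioned dataset" is modelled as a list of partitions,
# and each partition as a list of elements (illustrative only).
Partitioned = List[List[Any]]


def zip_partitioned(left: Partitioned, right: Partitioned) -> List[List[Tuple[Any, Any]]]:
    """Pair elements partition by partition, in the spirit of an RDD zip."""
    counts = [len(left), len(right)]
    # The zip is only defined when both sides have the same number of partitions.
    if len(set(counts)) != 1:
        raise ValueError(
            f"Can't zip RDDs with unequal numbers of partitions: {counts}"
        )
    zipped = []
    for left_part, right_part in zip(left, right):
        # Within each pair of partitions the element counts must match too.
        if len(left_part) != len(right_part):
            raise ValueError("partitions differ in element count")
        zipped.append(list(zip(left_part, right_part)))
    return zipped


if __name__ == "__main__":
    # One side has a single partition, the other has none: the (1, 0) case.
    try:
        zip_partitioned([["row"]], [])
    except ValueError as err:
        print(err)
```

In this model an input with zero partitions can never be zipped against a non-empty one, which matches the shape of the reported failure. Why one side ends up with zero partitions in the MERGE INTO plan is not something the sketch explains.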

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 21 (15 by maintainers)

Top GitHub Comments

KarlManong commented, Jul 15, 2021

How were you running on that build? My first guess would be that that version wasn’t actually present at runtime

@RussellSpitzer I rebuilt the table with exactly the same statement (using Trino), and everything worked well. The only difference is that the old table has some data.

Logs from the failed run: s-bigdata-402-5918f407-84557c7aa8cf66dc-driver-spark-kubernetes-driver-log.txt

Logs from the successful run: s-bigdata-402-21809172-57074b7aa8d5269c-driver-spark-kubernetes-driver-log.txt

The SQL: create table.txt, merge.txt

I ran the failing SQL on spark-thriftserver, and it worked. Maybe the old application had a problem.


Top Results From Across the Web

[GitHub] [iceberg] KarlManong commented on issue #2533: spark ...
... sql MERGE INTO There is an error Error: Error running query: java.lang.IllegalArgumentException: Can't zip RDDs with unequal numbers of partitions: ...

Can't zip RDDs with unequal numbers of partitions: List(2, 1 ...
It is a bug in AQE, clearly, for the version of Spark you are running. Set AQE out. zip works with RDD partitions...

Can't zip RDDs with unequal numbers of partitions ... - Re
(See it here - http://pastebin.dqd.cz/RAhm/) After I've increased spark.sql.autoBroadcastJoinThreshold to 300000 from 100000 it went through ...

Solving 5 Mysterious Spark Errors | by yhoztak - Medium
This error usually happens when two dataframes, and you apply udf on some columns to transfer, aggregate, rejoining to add as new fields...

MNIST example cannot run because of RDD.zip() #100 - GitHub
The problem is that the zip operation assumes that the number of partitions AND the number of elements within each partition will be...
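
Two of the results above mention configuration changes: turning off AQE (adaptive query execution) and raising spark.sql.autoBroadcastJoinThreshold from 100000 to 300000. As a rough sketch, the helper below renders those two settings as Spark SQL SET statements that could be run in the same session before the MERGE. The setting keys are standard Spark SQL options; the helper name as_set_statements and the dictionary are hypothetical. Nothing in the issue confirms that either change resolves this particular failure, so treat them as things to try rather than a fix.

```python
from typing import Dict, List

# Values taken from the search-result snippets above; whether they help
# with this Iceberg MERGE INTO failure is an untested assumption.
CANDIDATE_SETTINGS: Dict[str, str] = {
    "spark.sql.adaptive.enabled": "false",            # "Set AQE out"
    "spark.sql.autoBroadcastJoinThreshold": "300000",  # raised from 100000
}


def as_set_statements(settings: Dict[str, str]) -> List[str]:
    """Render each setting as a Spark SQL `SET key=value` statement."""
    return [f"SET {key}={value}" for key, value in settings.items()]


if __name__ == "__main__":
    for statement in as_set_statements(CANDIDATE_SETTINGS):
        print(statement + ";")
```

Applying one setting at a time makes it easier to tell which change, if any, affects the outcome.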