[SUPPORT] UPDATE command does not work on Spark SQL
I tried to use Spark SQL to update rows in my table, but I receive the error below:
183073 [Thread-3] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.stats.jdbc.timeout does not exist
183075 [Thread-3] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.stats.retries.wait does not exist
184478 [Thread-3] WARN org.apache.hadoop.hive.metastore.ObjectStore - Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 2.3.0
184478 [Thread-3] WARN org.apache.hadoop.hive.metastore.ObjectStore - setMetaStoreSchemaVersion called but recording version is disabled: version = 2.3.0, comment = Set by MetaStore UNKNOWN@172.17.0.2
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/opt/spark/python/pyspark/sql/session.py", line 723, in sql
return DataFrame(self._jsparkSession.sql(sqlQuery), self._wrapped)
File "/opt/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in __call__
File "/opt/spark/python/pyspark/sql/utils.py", line 111, in deco
return f(*a, **kw)
File "/opt/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o27.sql.
: java.lang.UnsupportedOperationException: UPDATE TABLE is not supported temporarily.
at org.apache.spark.sql.execution.SparkStrategies$BasicOperators$.apply(SparkStrategies.scala:716)
at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$1(QueryPlanner.scala:63)
at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:484)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:490)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:489)
at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
at org.apache.spark.sql.execution.SparkStrategies.plan(SparkStrategies.scala:67)
at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$3(QueryPlanner.scala:78)
To Reproduce
I saved a dataframe in Hudi format and loaded it into a Hudi table:
spark.sql('create table events using hudi options (primaryKey = "id", preCombinedField = "updated_at", type ="cow") location "/tmp/data/delta/events"')
Then I tried to update a row:
spark.sql('update events set name = "eita" where id = 244603')
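The "UPDATE TABLE is not supported temporarily" message comes from Spark's own planner, which suggests the statement never reached Hudi. One possible cause is a session started without Hudi's SQL extension. Below is a minimal sketch of a session configured the way the Hudi quick-start describes. It assumes pyspark is installed and a Hudi Spark bundle matching the Spark version is already on the classpath; the names `HUDI_SQL_CONF` and `build_hudi_session` are made up for illustration.

```python
# Sketch only: the two settings the Hudi quick-start lists for Spark SQL DML.
HUDI_SQL_CONF = {
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.sql.extensions": "org.apache.spark.sql.hudi.HoodieSparkSessionExtension",
}


def build_hudi_session(app_name="hudi-update-check"):
    """Return a SparkSession with the Hudi SQL extension enabled.

    Assumption: the Hudi Spark bundle jar is already available to Spark
    (for example via --jars or --packages); this function does not add it.
    """
    # Imported lazily so the configuration can be inspected without Spark.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in HUDI_SQL_CONF.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

`spark.sql.extensions` is a static setting, so it only takes effect when the session is first created. In a pyspark shell that already has a `spark` object, the equivalent `--conf` flags would need to be passed at launch instead.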
Environment Description
- Hudi version : 0.9.0
- Spark version : 3.1.2
- Storage (HDFS/S3/GCS…) : Local
- Running on Docker? (yes/no) : yes
My setup: https://github.com/jasondavindev/delta-lake-dms-cdc/blob/main/apps/hudi.py
Issue Analytics
- Created 2 years ago
- Comments: 5 (2 by maintainers)
Top GitHub Comments
@jasondavindev you'd need to use Java 8 for Hudi. Please follow the README for build instructions.
@xushiyan Thanks! I built the image, but when I try to write a dataframe, I receive the error
I found an issue related to this error, but it was a compatibility issue (0.4.x version). You can see my application here: https://github.com/jasondavindev/delta-lake-dms-cdc/blob/main/apps/hudi_update.py
Using version 0.9.0, the write succeeded, but the update did not.
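Since plain writes work in this setup, one alternative to SQL `UPDATE` is to upsert the changed rows through the DataFrame writer. This is a hedged sketch, not something confirmed in this thread. The option keys are Hudi datasource write options; the helper names `hudi_upsert_options` and `upsert_rows` are hypothetical, and partitioning or key-generator settings are left out and may be required for a given table.

```python
def hudi_upsert_options(table_name, record_key, precombine_field):
    """Build writer options for an upsert into an existing Hudi table."""
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.write.operation": "upsert",
    }


def upsert_rows(df, path, options):
    """Write df in append mode; with operation=upsert, Hudi replaces rows
    whose record key already exists and inserts the rest."""
    df.write.format("hudi").options(**options).mode("append").save(path)
```

For the table in this issue, the call might look like `upsert_rows(changed_df, "/tmp/data/delta/events", hudi_upsert_options("events", "id", "updated_at"))`, where `changed_df` is a dataframe holding the new version of row 244603.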