[SUPPORT] How to create a hudi table without suffix in snapshot read mode using SparkSQL
@YannByron Hello,
https://issues.apache.org/jira/browse/HUDI-4487 — this fix requires creating the rt/ro tables manually, but it also forbids using OPTIONS(hoodie.query.as.ro.table = 'false') in Spark SQL when creating a table.
How can I now create a Hudi table without a suffix that reads in snapshot mode using Spark SQL? We just want to use a Hudi table like an RDBMS table: querying the table should read all the data, without suffixed table names (which increase users' learning costs and make SQL usage confusing).
CREATE TABLE IF NOT EXISTS `default`.`hudi_test_snapshot_mode` (
`id` INT
,`name` STRING
,`age` INT
,`sync_time` TIMESTAMP
) USING HUDI
OPTIONS(
`hoodie.query.as.ro.table` = 'false'
)
TBLPROPERTIES (
type = 'mor'
,primaryKey = 'id'
,preCombineField = 'sync_time'
,`hoodie.compaction.payload.class` = 'org.apache.hudi.common.model.OverwriteWithLatestAvroPayload'
,`hoodie.datasource.write.hive_style_partitioning` = 'false'
,`hoodie.table.keygenerator.class` = 'org.apache.hudi.keygen.NonpartitionedKeyGenerator'
,`hoodie.index.type` = 'GLOBAL_BLOOM'
)
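As a point of comparison, snapshot reads on a MOR table can be requested explicitly through the datasource read path rather than a suffixed table name. This is a hedged sketch using the documented `hoodie.datasource.query.type` option; the base path below is hypothetical and must match where the table above was actually written:

```
// Sketch (Scala, spark-shell): read the MOR table in snapshot mode
// directly, without referencing a _rt suffixed table.
// Assumption: the HDFS path is illustrative only.
val df = spark.read.format("hudi")
  .option("hoodie.datasource.query.type", "snapshot")
  .load("hdfs:///warehouse/default/hudi_test_snapshot_mode")

// Expose it to SQL under the plain, unsuffixed name the users expect.
df.createOrReplaceTempView("hudi_test_snapshot_mode_view")
spark.sql("SELECT id, name, age, sync_time FROM hudi_test_snapshot_mode_view").show()
```

This sidesteps the table-name issue for ad-hoc reads, but it does not change how the Hive-synced table names are resolved, which is the core of the question.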
- Hudi version: 0.12.1
- Spark version: 3.1.3
- Hive version: 3.1.0
- Hadoop version: 3.1.1
- Storage (HDFS/S3/GCS…): HDFS
- Running on Docker? (yes/no): no
Stacktrace
Exception in thread "main" org.apache.spark.sql.AnalysisException: Creating ro/rt table need the existence of the base table.
at org.apache.spark.sql.hudi.command.CreateHoodieTableCommand.run(CreateHoodieTableCommand.scala:74)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:228)
at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3700)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3698)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:228)
at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:99)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:96)
at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:618)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:613)
Top GitHub Comments
I found another way to solve my problem: create the table according to the 0.12 rules, and then use
ALTER TABLE … SET TBLPROPERTIES
to get the RT-table behavior. https://github.com/apache/hudi/issues/7322
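The workaround above can be sketched as follows. This is an assumption-laden sketch, not a confirmed recipe: the property name mirrors the one rejected in the OPTIONS clause, and the exact steps should be verified against the linked issue and your Hudi version:

```
-- Sketch of the workaround (assumptions: the table was first created
-- without the forbidden OPTIONS clause, and flipping this property
-- afterwards is accepted, per the approach described in issue #7322):
ALTER TABLE `default`.`hudi_test_snapshot_mode`
SET TBLPROPERTIES ('hoodie.query.as.ro.table' = 'false');
```

The idea is to separate table creation from read-mode configuration, so the CREATE TABLE path never sees the option that triggers the "Creating ro/rt table need the existence of the base table" check.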