Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

[SUPPORT] Use hudi-java-client to create hudi table incorrect

See original GitHub issue

Tips before filing an issue

  • Have you gone through our FAQs?

  • Join the mailing list to engage in conversations and get faster support at

  • If you have triaged this as a bug, then file an issue directly.

Describe the problem you faced

Using hudi-java-client to create a hudi table fails.

To Reproduce

Steps to reproduce the behavior:

  1. Write the code that creates the hudi table:

import java.util.Properties;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hudi.client.HoodieJavaWriteClient;
import org.apache.hudi.client.common.HoodieJavaEngineContext;
import org.apache.hudi.common.model.HoodieAvroPayload;
import org.apache.hudi.common.model.HoodieTableType;
import org.apache.hudi.common.table.HoodieTableMetaClient;
import org.apache.hudi.config.HoodieCompactionConfig;
import org.apache.hudi.config.HoodieIndexConfig;
import org.apache.hudi.config.HoodieWriteConfig;
import org.apache.hudi.examples.common.HoodieExampleDataGenerator;
import org.apache.hudi.index.HoodieIndex;
import org.apache.hudi.table.HoodieJavaTable;
import org.apache.hudi.table.HoodieTable;

public static void main(String[] args) throws Exception {

        String tablePath = "/home/work/hudi_catalog1";
        String tableName = "hudi_clyang_table";

        // test data generator
        HoodieExampleDataGenerator<HoodieAvroPayload> dataGen = new HoodieExampleDataGenerator<>();
        // init hadoop conf
        Configuration hadoopConf = new Configuration();
        hadoopConf.addResource(new Path("core-site.xml"));
        hadoopConf.addResource(new Path("hdfs-site.xml"));
        System.setProperty("HADOOP_USER_NAME", "work");
        // init properties
        Properties properties = new Properties();

        // initialize the table if its base path does not exist yet
        Path path = new Path(tablePath);
        FileSystem fs = FileSystem.newInstance(hadoopConf);

        HoodieTableMetaClient hoodieTableMetaClient = null;
        if (!fs.exists(path)) {
            hoodieTableMetaClient = HoodieTableMetaClient.initTableAndGetMetaClient(hadoopConf, tablePath, properties);
        }

        // create the write client config
        HoodieWriteConfig hudiWriteConf = HoodieWriteConfig.newBuilder()
                .withPath(tablePath)
                // data schema
                .withSchema(HoodieExampleDataGenerator.TRIP_EXAMPLE_SCHEMA)
                // insert/upsert parallelism
                .withParallelism(2, 2)
                // delete parallelism
                .withDeleteParallelism(2)
                // hudi table index type: in-memory
                .withIndexConfig(HoodieIndexConfig.newBuilder().withIndexType(HoodieIndex.IndexType.INMEMORY).build())
                // compaction / commit archival
                .withCompactionConfig(HoodieCompactionConfig.newBuilder().archiveCommitsWith(20, 30).build())
                .forTable(tableName)
                .build();

        // get the hudi write client
        HoodieJavaWriteClient<HoodieAvroPayload> client =
                new HoodieJavaWriteClient<>(new HoodieJavaEngineContext(hadoopConf), hudiWriteConf);

        properties = HoodieTableMetaClient.withPropertyBuilder()
                .setTableType(HoodieTableType.COPY_ON_WRITE)
                .setTableName(tableName)
                .setPayloadClassName(HoodieAvroPayload.class.getName())
                .build();
        HoodieTableMetaClient hoodieTableMetaClient1 =
                HoodieTableMetaClient.initTableAndGetMetaClient(hadoopConf, tablePath, properties);
        HoodieTable table = HoodieJavaTable.create(hudiWriteConf, new HoodieJavaEngineContext(hadoopConf), hoodieTableMetaClient1);
}
  2. Execution succeeds with no error. Log as follows:
0    [main] WARN  org.apache.hadoop.util.NativeCodeLoader  - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
707  [main] INFO  org.apache.hudi.client.embedded.EmbeddedTimelineService  - Starting Timeline service !!
707  [main] WARN  org.apache.hudi.client.embedded.EmbeddedTimelineService  - Unable to find driver bind address from spark config
717  [main] INFO  org.apache.hudi.common.table.view.FileSystemViewManager  - Creating View Manager with storage type :MEMORY
718  [main] INFO  org.apache.hudi.common.table.view.FileSystemViewManager  - Creating in-memory based Table View
728  [main] INFO  - Logging initialized @1483ms to
943  [main] INFO  io.javalin.Javalin  - 
           __                      __ _
          / /____ _ _   __ ____ _ / /(_)____
     __  / // __ `/| | / // __ `// // // __ \
    / /_/ // /_/ / | |/ // /_/ // // // / / /
    \____/ \__,_/  |___/ \__,_//_//_//_/ /_/
944  [main] INFO  io.javalin.Javalin  - Starting Javalin ...
1072 [main] INFO  io.javalin.Javalin  - Listening on http://localhost:57856/
1072 [main] INFO  io.javalin.Javalin  - Javalin started in 131ms \o/
1072 [main] INFO  org.apache.hudi.timeline.service.TimelineService  - Starting Timeline server on port :57856
1073 [main] INFO  org.apache.hudi.client.embedded.EmbeddedTimelineService  - Started embedded timeline server at
1079 [main] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Initializing /home/work/hudi_catalog1 as hoodie table /home/work/hudi_catalog1
2109 [main] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Loading HoodieTableMetaClient from /home/work/hudi_catalog1
2161 [main] INFO  org.apache.hudi.common.table.HoodieTableConfig  - Loading table properties from /home/work/hudi_catalog1/.hoodie/
2327 [main] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/work/hudi_catalog1
2327 [main] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Finished initializing Table of type COPY_ON_WRITE from /home/work/hudi_catalog1
2331 [main] INFO  org.apache.hudi.common.table.view.FileSystemViewManager  - Creating View Manager with storage type :REMOTE_FIRST
2331 [main] INFO  org.apache.hudi.common.table.view.FileSystemViewManager  - Creating remote first table view
2332 [main] INFO  org.apache.hudi.client.AbstractHoodieClient  - Stopping Timeline service !!
2332 [main] INFO  org.apache.hudi.client.embedded.EmbeddedTimelineService  - Closing Timeline server
2332 [main] INFO  org.apache.hudi.timeline.service.TimelineService  - Closing Timeline Service
2332 [main] INFO  io.javalin.Javalin  - Stopping Javalin ...
2342 [main] INFO  io.javalin.Javalin  - Javalin has stopped
2342 [main] INFO  org.apache.hudi.timeline.service.TimelineService  - Closed Timeline Service
2342 [main] INFO  org.apache.hudi.client.embedded.EmbeddedTimelineService  - Closed Timeline server
  3. Check the result in HDFS (screenshot attached in the original issue).
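A quick sanity check after `initTableAndGetMetaClient` is that the `.hoodie/` metadata directory (containing `hoodie.properties`) exists under the table base path. A minimal sketch of that check, using `java.nio` on the local filesystem for illustration — on HDFS you would do the equivalent with `org.apache.hadoop.fs.FileSystem.exists`; the temp directory below is just a stand-in for the real table path:

```java
import java.nio.file.Files;
import java.nio.file.Path;

public class HudiLayoutCheck {

    // Returns true if the base path looks like an initialized hudi table,
    // i.e. it contains a .hoodie/hoodie.properties file.
    static boolean looksLikeHudiTable(Path basePath) {
        return Files.isRegularFile(basePath.resolve(".hoodie").resolve("hoodie.properties"));
    }

    public static void main(String[] args) throws Exception {
        // Simulate an initialized table layout in a temp dir
        // (stand-in for /home/work/hudi_catalog1 on HDFS)
        Path base = Files.createTempDirectory("hudi_catalog_demo");
        Files.createDirectories(base.resolve(".hoodie"));
        Files.createFile(base.resolve(".hoodie").resolve("hoodie.properties"));

        System.out.println(looksLikeHudiTable(base));  // true for an initialized table
    }
}
```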

Expected behavior

The hudi table should be created correctly.

Environment Description

  • JDK version : 1.8

  • Hudi version : 0.9.0
  • Spark version : 3.0.2
  • Hive version : 2.3.6
  • Hadoop version : 2.7
  • Storage (HDFS/S3/GCS…) : HDFS
  • Running on Docker? (yes/no) : no


Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

James601232 commented, Nov 14, 2021

@xushiyan yes, the code worked fine; it was just an incorrect path when using Spark to read the data. This issue is already solved. Thank you so much.
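For anyone hitting the same symptom: a scheme-less base path like /home/work/hudi_catalog1 is resolved against whatever default filesystem (fs.defaultFS) the reading process sees, so a Spark job launched without the writer's core-site.xml will look on the local filesystem instead of HDFS. A minimal sketch of the resolution behavior, using plain java.net.URI for illustration (the namenode address below is hypothetical):

```java
import java.net.URI;

public class PathResolutionDemo {
    public static void main(String[] args) {
        String basePath = "/home/work/hudi_catalog1";  // scheme-less, as in the issue

        // Writer side: fs.defaultFS from core-site.xml (hypothetical namenode address)
        URI hdfsDefault = URI.create("hdfs://namenode:8020/");
        // Reader side without that config: local filesystem default
        URI localDefault = URI.create("file:///");

        System.out.println(hdfsDefault.resolve(basePath));   // hdfs://namenode:8020/home/work/hudi_catalog1
        System.out.println(localDefault.resolve(basePath));  // file:///home/work/hudi_catalog1
    }
}
```

Passing a fully-qualified hdfs:// path to both the writer and the Spark reader (or making sure both sides load the same core-site.xml) removes the ambiguity.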

xushiyan commented, Nov 13, 2021

but no data insert into table

You mean no parquet file was written at all?

94684 [main] INFO - Merging new data into oldPath /home/work/hudi_catalog4/2020/01/03/158308ae-8ac1-474c-85bc-6f0c4f88243d-0_0-0-0_20211111203048.parquet, as newPath /home/work/hudi_catalog4/2020/01/03/158308ae-8ac1-474c-85bc-6f0c4f88243d-0_0-0-0_20211112071954.parquet

The log shows some files being written. Can you also examine the contents of .hoodie/ and look at the commit files? The logs show that data was committed. client.insert(writeRecords, newCommitTime) returns a list of WriteStatus objects; can you debug that? Basically you're running the client example code, and I don't see why it wouldn't work. Can you debug the application yourself first?
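The WriteStatus check suggested above can be sketched as follows. Since the real Hudi classes aren't on the classpath here, a minimal stand-in WriteStatus with only the fields relevant to the check is used; the real org.apache.hudi.client.WriteStatus exposes hasErrors() and error counters in a similar way:

```java
import java.util.Arrays;
import java.util.List;

public class WriteStatusCheckDemo {

    // Minimal stand-in for org.apache.hudi.client.WriteStatus (illustration only)
    static class WriteStatus {
        final String fileId;
        final long totalErrorRecords;

        WriteStatus(String fileId, long totalErrorRecords) {
            this.fileId = fileId;
            this.totalErrorRecords = totalErrorRecords;
        }

        boolean hasErrors() { return totalErrorRecords > 0; }
    }

    // After client.insert(writeRecords, newCommitTime), count statuses reporting errors
    static long countFailed(List<WriteStatus> statuses) {
        return statuses.stream().filter(WriteStatus::hasErrors).count();
    }

    public static void main(String[] args) {
        List<WriteStatus> statuses = Arrays.asList(
                new WriteStatus("file-group-0", 0),
                new WriteStatus("file-group-1", 0));
        System.out.println(countFailed(statuses));  // 0 means every file group wrote cleanly
    }
}
```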
