[SUPPORT] How to auto-sync "add column" with Flink?

To Reproduce

Steps to reproduce the behavior:

  1. Create a mysql-cdc table A_cdc with Flink SQL to capture the MySQL table A
  2. Create a Hudi table B_hudi with Flink SQL to write data to the Hive table B
  3. Execute the Flink SQL statement: insert into B_hudi select * from A_cdc
  4. Add a column to the MySQL table A
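
The steps above can be sketched in Flink SQL roughly as follows. This is a minimal sketch, not the reporter's actual job: the columns (id, name), host names, credentials, database name, and storage path are all hypothetical placeholders, and the Hive sync options assume a Hive metastore is reachable.

```sql
-- Step 1: CDC source over MySQL table A.
-- Columns id and name are hypothetical; the issue does not list them.
CREATE TABLE A_cdc (
  id   BIGINT,
  name STRING,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector'     = 'mysql-cdc',
  'hostname'      = 'localhost',     -- placeholder
  'port'          = '3306',
  'username'      = 'flink_user',    -- placeholder
  'password'      = 'flink_pw',      -- placeholder
  'database-name' = 'mydb',          -- placeholder
  'table-name'    = 'A'
);

-- Step 2: Hudi sink that is synced to Hive table B.
CREATE TABLE B_hudi (
  id   BIGINT,
  name STRING,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector'                = 'hudi',
  'path'                     = 'hdfs:///tmp/hudi/B',       -- placeholder
  'table.type'               = 'MERGE_ON_READ',
  'hive_sync.enable'         = 'true',
  'hive_sync.mode'           = 'hms',
  'hive_sync.metastore.uris' = 'thrift://localhost:9083',  -- placeholder
  'hive_sync.db'             = 'default',
  'hive_sync.table'          = 'B'
);

-- Step 3: the continuous sync job.
INSERT INTO B_hudi SELECT * FROM A_cdc;
```

Both tables declare their columns explicitly, so the schema of the running job is fixed when the INSERT statement is planned. A column added to A in step 4 is not part of the declared schema of A_cdc, which is consistent with the behavior reported here.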

It seems that if you want to add columns, you must redefine the schema and restart the task.
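
The redefine-and-restart workaround could look like the sketch below, run after stopping the job. It assumes that A_cdc and B_hudi from steps 1 and 2 are still registered in the catalog and that the new MySQL column is a hypothetical email column. The table names A_cdc_v2 and B_hudi_v2 are also made up for illustration. Whether the existing Hudi table accepts the widened schema depends on the Hudi version and is not confirmed anywhere in this thread.

```sql
-- Redefine the source with the extra column. LIKE copies the existing
-- columns, primary key, and connector options of A_cdc and appends the
-- column declared here.
CREATE TABLE A_cdc_v2 (
  email STRING   -- hypothetical column added to MySQL table A
) LIKE A_cdc;

-- Redefine the sink the same way so both schemas line up again.
CREATE TABLE B_hudi_v2 (
  email STRING
) LIKE B_hudi;

-- Restart the sync as a new job over the widened schema.
INSERT INTO B_hudi_v2 SELECT * FROM A_cdc_v2;
```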

How can the “add column” event be synced without restarting the job?

Environment Description

  • Hudi version : 0.10.0-SNAPSHOT

  • Spark version :

  • Hive version :

  • Hadoop version :

  • Storage (HDFS/S3/GCS…) :

  • Running on Docker? (yes/no) : no

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

2 reactions
lsyldliu commented, Dec 4, 2021

IMO, this requires Flink itself to support schema evolution, which the current community version of Flink does not. You could consider Aliyun's enterprise Flink, which does support it.

0 reactions
codope commented, Apr 20, 2022

Closing this issue. As per @danny0405, this is hard to do without making changes to the Flink engine in the community version of Flink. Please log an issue with Flink.
