[hivewriter] with partition result with empty data
My flinkx config with kafkareader and hivewriter looks like this:
{
  "job": {
    "content": [
      {
        "reader": {
          "parameter": {
            "topic": "flinkx_test",
            "mode": "earliest-offset",
            "codec": "json",
            "consumerSettings": {
              "bootstrap.servers": "demo:9092"
            },
            "column": [
              {
                "name": "distinct_id",
                "type": "string"
              },
              {
                "name": "event_id",
                "type": "string"
              },
              {
                "name": "event",
                "type": "string"
              },
              {
                "name": "properties_push_content",
                "type": "string"
              }
            ]
          },
          "name": "kafkareader"
        },
        "writer": {
          "parameter": {
            "jdbcUrl": "jdbc:hive2://ip:10000/demo;principal=hive/_HOST@demo.COM",
            "username": "demo",
            "fileType": "parquet",
            "writeMode": "append",
            "compress": "SNAPPY",
            "charsetName": "UTF-8",
            "maxFileSize": 134217728,
            "tablesColumn": "{\"flinkx_test\":[{\"key\":\"distinct_id\",\"type\":\"string\"},{\"key\":\"event_id\",\"type\":\"string\"},{\"key\":\"event\",\"type\":\"string\"},{\"key\":\"type\",\"type\":\"string\"},{\"key\":\"time\",\"type\":\"string\"},{\"key\":\"properties_push_content\",\"type\":\"string\"},{\"key\":\"properties_push_status\",\"type\":\"string\"}]}",
            "partition": "pt_mi",
            "partitionType": "MINUTE",
            "defaultFS": "hdfs://nameHAservice",
            "hadoopConfig": {}
          },
          "name": "hivewriter"
        }
      }
    ],
    "setting": {
      "restore": {
        "isRestore": false,
        "isStream": false
      },
      "speed": {
        "readerChannel": 1,
        "writerChannel": 1
      }
    }
  }
}
The table and partition paths are created in HDFS, but the data paths are empty; no parquet files exist:
spark-sql> dfs -du -h /user/hive/warehouse/demo.db/flinkx_test;
0 /user/hive/warehouse/demo.db/flinkx_test/pt_mi=202105251620
0 /user/hive/warehouse/demo.db/flinkx_test/pt_mi=202105251621
0 /user/hive/warehouse/demo.db/flinkx_test/pt_mi=202105251622
spark-sql>
So why is the data not written to the path? With this kafkareader I can write to mysql, but it does not work with hivewriter.
Any idea would be appreciated!
Issue Analytics
- Created 2 years ago
- Comments: 5 (5 by maintainers)
Top GitHub Comments
Configure a checkpoint (CK) interval; data is flushed when a checkpoint fires.
- The semantic setting controls the commit: with exactly-once it commits in 2PC (two-phase commit) mode; with at-least-once it commits directly to the db.
- jdbc: with exactly-once, con.setAutoCommit(false) is enabled, writes are batched with stmt.addBatch, and the transaction is finished with tx.commit / tx.rollback.
- hdfs: the parquet/orc writer.write goes to memory first (files under the .data directory); when the checkpoint completes (ckComplete), the .data/xx.parquet files are copied to the actual data directory.
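The flush-on-checkpoint behavior for hdfs described in the comment above can be sketched as a two-phase file commit: stage files under a hidden .data directory, and move them into the real partition directory only when the checkpoint completes. This is a minimal illustration using plain java.nio.file rather than the Hadoop FileSystem API, and the method names (writeBlock, onCheckpointComplete) are hypothetical, not flinkx's actual API:

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.stream.Stream;

// Sketch of a two-phase file commit, as used by checkpoint-driven writers:
// data is first written under a hidden ".data" directory and only moved into
// the real partition directory when a checkpoint completes. If checkpoints
// never fire, the partition directory stays empty.
public class TwoPhaseFileWriter {
    private final Path partitionDir; // e.g. .../flinkx_test/pt_mi=202105251620
    private final Path pendingDir;   // partitionDir/.data

    public TwoPhaseFileWriter(Path partitionDir) throws IOException {
        this.partitionDir = partitionDir;
        this.pendingDir = partitionDir.resolve(".data");
        Files.createDirectories(pendingDir);
    }

    // Hypothetical write step: buffer a block of rows into a pending file.
    public void writeBlock(String fileName, byte[] bytes) throws IOException {
        Files.write(pendingDir.resolve(fileName), bytes);
    }

    // Hypothetical checkpoint-complete hook: publish pending files.
    public void onCheckpointComplete() throws IOException {
        try (Stream<Path> files = Files.list(pendingDir)) {
            for (Path p : (Iterable<Path>) files::iterator) {
                Files.move(p, partitionDir.resolve(p.getFileName()),
                        StandardCopyOption.ATOMIC_MOVE);
            }
        }
    }
}
```

Until onCheckpointComplete runs, listing the partition directory shows no data files, which matches the empty pt_mi partitions in the question: with isStream and isRestore both false, no checkpoint ever fires, so the staged files are never promoted.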