Iceberg-specific files (manifest files, manifest list file, metadata file) are not created; only the Flink-specific manifest file is created
See original GitHub issue. [Update on the issue, Jan 12, 2021: an updated description which describes the problem more accurately]
Hello, Iceberg community,
I’m facing an issue with metadata files in S3 for an Iceberg table. After the job has been running for a while and has been suspended via savepoint and restarted a few times, Flink only creates Flink-specific manifest files; the Iceberg-specific files are not created. Only at the next suspension of the job does Flink create the Iceberg-specific files and remove the Flink-specific manifest files.
The Flink job consists of a single Kafka source, a mapper, and a FlinkSink.
Iceberg Flink connector version 0.10.0 is used.
The Flink job ingests data into S3, with a checkpointing interval of 1 hour.
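For context, a minimal sketch of the job shape described above (Kafka source, mapper, FlinkSink with hourly checkpoints). The table location, Kafka topic and properties, and the single-column schema are placeholders, and the exact FlinkSink builder calls may differ slightly between connector versions:

```java
import java.util.Properties;

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.table.data.GenericRowData;
import org.apache.flink.table.data.RowData;
import org.apache.flink.table.data.StringData;
import org.apache.iceberg.Table;
import org.apache.iceberg.flink.TableLoader;
import org.apache.iceberg.flink.sink.FlinkSink;

public class IcebergIngestJob {
  public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    // Checkpoint every hour; the Iceberg commit (metadata file + manifest list) is expected on each completed checkpoint.
    env.enableCheckpointing(60 * 60 * 1000L);

    Properties kafkaProps = new Properties();
    kafkaProps.setProperty("bootstrap.servers", "kafka:9092"); // placeholder
    kafkaProps.setProperty("group.id", "iceberg-ingest");      // placeholder

    DataStream<String> source = env.addSource(
        new FlinkKafkaConsumer<>("events", new SimpleStringSchema(), kafkaProps));

    // Mapper: wrap each Kafka message into a single-column RowData record.
    DataStream<RowData> rows = source.map(new MapFunction<String, RowData>() {
      @Override
      public RowData map(String value) {
        return GenericRowData.of(StringData.fromString(value));
      }
    });

    // Load the Iceberg table (here a Hadoop-style table location on S3) and attach the sink.
    TableLoader tableLoader = TableLoader.fromHadoopTable("s3://bucket/warehouse/db/events"); // placeholder
    tableLoader.open();
    Table table = tableLoader.loadTable();

    FlinkSink.forRowData(rows)
        .table(table)
        .tableLoader(tableLoader)
        .build();

    env.execute("Kafka -> Iceberg ingest");
  }
}
```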
Normal scenario (expected behavior): every hour, 3 files are created in the metadata folder:
- Metadata file: 00001-180a83f3-c229-4ed5-a9dc-2f9c235f6d52.metadata.json
- Manifest list file: snap-5807373598091371828-1-87p0b872-3f55-9b79-8cee-b1d354f2c378.avro
- Manifest file: 39d0b872-4f56-4b79-8cee-c0a354f2c575-m0.avro
After a few successful checkpoints, the job is suspended via savepoint and started again; from then on, only a single file is created in the metadata folder every hour:
- Manifest file : 4c806ffdb03c41e09337b90f18781570-00000-0-466-00001.avro
The metadata file and manifest list file are NOT created or updated.
This causes new data (partitions) to be unavailable until the metadata file and manifest list file are created.
The next time the Flink job is suspended (via savepoint), it creates the Iceberg-specific files for all missing checkpoints (so the already-ingested but missing partitions become visible in Iceberg), removes the Flink-specific manifest files, and then shuts down.
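To make the gap concrete, a small sketch (assuming a Hadoop-style table location on S3; the path is a placeholder) that lists the table's snapshots together with the checkpoint id the Flink committer records in each snapshot summary. During the affected period no new snapshots show up, even though Flink keeps writing its own manifest files every hour:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.iceberg.Snapshot;
import org.apache.iceberg.Table;
import org.apache.iceberg.hadoop.HadoopTables;

public class InspectCommits {
  public static void main(String[] args) {
    // Load the table directly from its location; for a catalog-backed table use the catalog API instead.
    Table table = new HadoopTables(new Configuration())
        .load("s3://bucket/warehouse/db/events"); // placeholder

    for (Snapshot snapshot : table.snapshots()) {
      // The Flink committer records the last committed checkpoint id in the snapshot summary.
      System.out.printf("snapshot=%d committed-at=%d flink.max-committed-checkpoint-id=%s%n",
          snapshot.snapshotId(),
          snapshot.timestampMillis(),
          snapshot.summary().get("flink.max-committed-checkpoint-id"));
    }
  }
}
```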
============================ [BELOW is an outdated old description, please ignore]
Hello, Iceberg community,
I’m facing an issue with metadata files in S3 for the Iceberg table when bucket versioning is enabled for the S3 bucket.
Iceberg Flink connector version 0.10.0 is used.
The Flink job ingests data into S3, with a checkpointing interval of 1 hour.
For an S3 bucket with versioning disabled: every hour, 3 files are created in the metadata folder:
- Metadata file: 00001-180a83f3-c229-4ed5-a9dc-2f9c235f6d52.metadata.json
- Manifest list file: snap-5807373598091371828-1-87p0b872-3f55-9b79-8cee-b1d354f2c378.avro
- Manifest file: 39d0b872-4f56-4b79-8cee-c0a354f2c575-m0.avro
For an S3 bucket with versioning enabled: every hour, only a single file is created in the metadata folder:
- Manifest file : 4c806ffdb03c41e09337b90f18781570-00000-0-466-00001.avro
The metadata file and manifest list file are NOT created or updated.
This causes new data (partitions) to be unavailable until the metadata file and manifest list file are created.
If I restart the Flink job, the new metadata and manifest list files are created, and the missing partitions become visible again.
A few questions:
- Did anyone face a similar issue with S3 bucket versioning?
- Why are metadata files and manifest list files not created/updated with every checkpoint?
- How can S3 bucket versioning impact the manifest file version ("-m0.avro" suffix for V1 vs ".avro" suffix for V2)?
Thank you.
Anyone facing the same issue: to bypass it until this bug is fixed on the Flink side, you do not need to fix corrupted metadata files where flink.max-committed-checkpoint-id is set to Long.MAX_VALUE. Just follow this workflow for stateful upgrades going forward - this flow works as expected and you do not get a corrupted metadata file anymore:
./bin/flink savepoint ${JOB_ID} /tmp/flink-savepoints
After this flow, your flink.max-committed-checkpoint-id will be set to the correct checkpointId.
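As a quick sanity check after following the flow above, the checkpoint id the committer recorded can be read back from the latest snapshot summary. This is only a sketch; the table location is a placeholder and assumes a Hadoop-style table:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.iceberg.Snapshot;
import org.apache.iceberg.Table;
import org.apache.iceberg.hadoop.HadoopTables;

public class CheckCommittedCheckpointId {
  public static void main(String[] args) {
    Table table = new HadoopTables(new Configuration())
        .load("s3://bucket/warehouse/db/events"); // placeholder

    // Read the checkpoint id recorded by the Flink committer on the latest snapshot.
    Snapshot current = table.currentSnapshot();
    String committedId = current.summary().get("flink.max-committed-checkpoint-id");
    if (String.valueOf(Long.MAX_VALUE).equals(committedId)) {
      System.out.println("Latest snapshot still carries Long.MAX_VALUE");
    } else {
      System.out.println("Last committed checkpoint id: " + committedId);
    }
  }
}
```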
Hi @elkhand, I have reported your investigation and my speculation in https://issues.apache.org/jira/browse/FLINK-21132. Hopefully, there will be a response in the next few days.