[SUPPORT] `show fsview latest` throwing IllegalStateException...pending compactions for merge_on_read table
Describe the problem you faced
When we run `show fsview latest --partitionPath dt=2022-06-01` from the hudi-cli on a MERGE_ON_READ table, we get the following exception:
java.lang.IllegalStateException: Hudi File Id (HoodieFileGroupId{partitionPath='dt=2022-06-15', fileId='797bf5d2-f24e-4645-a337-c6978dc95a9f-0'}) has more than 1 pending compactions.
We tried a bunch of flags, but all of them threw the same exception:
show fsview latest --partitionPath dt=2022-06-01 --readOptimizedOnly true
show fsview latest --partitionPath dt=2022-06-01 --includeInflight true
show fsview latest --partitionPath dt=2022-06-01 --excludeCompaction true
show fsview latest --partitionPath dt=2022-06-01 --merge true
show fsview latest --partitionPath dt=2022-06-01 --readOptimizedOnly true --includeInflight true --excludeCompaction true --merge true
To Reproduce
Run `show fsview latest --partitionPath dt=2022-06-01` on a MERGE_ON_READ table where some compaction is pending.
Expected behavior
We expect to receive the list of all files (base and delta files, or at least the base files) for the given partition.
Environment Description
- Hudi version : 0.5.0-incubating
- Spark version : 2.4.4
- Hive version : 3.1.2
- Hadoop version : 3.2.1
- Storage (HDFS/S3/GCS…) : S3
- Running on Docker? (yes/no) : no
Additional context
We are trying to check whether we are keeping a lot of stale files in S3, and to measure the ratio of stale files to all files (by number or by size). Any pointer to an alternative approach would also be helpful.
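One rough way to estimate that ratio without the CLI is to work from a plain listing of the partition. The sketch below is only an approximation under loud assumptions: it assumes base files are named `<fileId>_<writeToken>_<instantTime>.parquet` (the layout visible in the stack trace), it treats every base file except the newest one per file ID as "stale", and it ignores log files, the timeline, and whatever the cleaner is configured to retain. The function name `stale_ratio` and the `(name, size)` input shape are hypothetical, not part of Hudi.

```python
import re
from collections import defaultdict

# Assumed base file layout, taken from the names in the stack trace:
#   <fileId>_<writeToken>_<instantTime>.parquet
BASE_FILE = re.compile(r"^(?P<file_id>.+)_(?P<token>[^_]+)_(?P<instant>\d+)\.parquet$")


def stale_ratio(files):
    """files: iterable of (name, size_in_bytes) for one partition.

    Returns (stale_count / total_count, stale_bytes / total_bytes) over base
    files, where "stale" means "not the newest base file of its file ID".
    This is an assumption of the sketch, not Hudi's own definition.
    """
    groups = defaultdict(list)
    for name, size in files:
        match = BASE_FILE.match(name)
        if match:  # log files and anything else are skipped in this sketch
            groups[match["file_id"]].append((match["instant"], size))

    total_count = total_bytes = stale_count = stale_bytes = 0
    for versions in groups.values():
        # Instant times are fixed-width digit strings, so string order is time order.
        versions.sort()
        for _, size in versions[:-1]:
            stale_count += 1
            stale_bytes += size
        total_count += len(versions)
        total_bytes += sum(size for _, size in versions)

    if total_count == 0:
        return 0.0, 0.0
    return stale_count / total_count, (stale_bytes / total_bytes if total_bytes else 0.0)
```

The listing itself can come from any tool that can enumerate the partition's objects with their sizes. Files kept on purpose by the cleaner's retention settings will be counted as stale here, so the result is an upper bound on that definition rather than a measure of garbage.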
Stacktrace
Command failed java.lang.IllegalStateException: Hudi File Id (HoodieFileGroupId{partitionPath='dt=2022-06-15', fileId='797bf5d2-f24e-4645-a337-c6978dc95a9f-0'}) has more than 1 pending compactions. Instants: (20220615172826,{"baseInstantTime": "20220615161228", "deltaFilePaths": [".797bf5d2-f24e-4645-a337-c6978dc95a9f-0_20220615161228.log.1_632-33030-18862910"], "dataFilePath": "797bf5d2-f24e-4645-a337-c6978dc95a9f-0_4-32989-18834108_20220615161228.parquet", "fileId": "797bf5d2-f24e-4645-a337-c6978dc95a9f-0", "partitionPath": "dt=2022-06-15", "metrics": {"TOTAL_LOG_FILES": 1.0, "TOTAL_IO_READ_MB": 216.0, "TOTAL_LOG_FILES_SIZE": 1773005.0, "TOTAL_IO_WRITE_MB": 214.0, "TOTAL_IO_MB": 430.0}}), (20220615161228,{"baseInstantTime": "20220615144442", "deltaFilePaths": [".797bf5d2-f24e-4645-a337-c6978dc95a9f-0_20220615144442.log.1_701-32977-18829784"], "dataFilePath": "797bf5d2-f24e-4645-a337-c6978dc95a9f-0_1342-32926-18795310_20220615144442.parquet", "fileId": "797bf5d2-f24e-4645-a337-c6978dc95a9f-0", "partitionPath": "dt=2022-06-15", "metrics": {"TOTAL_LOG_FILES": 1.0, "TOTAL_IO_READ_MB": 216.0, "TOTAL_LOG_FILES_SIZE": 1869763.0, "TOTAL_IO_WRITE_MB": 214.0, "TOTAL_IO_MB": 430.0}})
Hudi File Id (HoodieFileGroupId{partitionPath='dt=2022-06-19', fileId='644d4e16-7e2d-4939-8fe0-f787f3e03240-0'}) has more than 1 pending compactions. Instants: (20220620113713,{"baseInstantTime": "20220620105044", "deltaFilePaths": [".644d4e16-7e2d-4939-8fe0-f787f3e03240-0_20220620105044.log.1_364-480-318680"], "dataFilePath": "644d4e16-7e2d-4939-8fe0-f787f3e03240-0_6-443-289906_20220620105044.parquet", "fileId": "644d4e16-7e2d-4939-8fe0-f787f3e03240-0", "partitionPath": "dt=2022-06-19", "metrics": {"TOTAL_LOG_FILES": 1.0, "TOTAL_IO_READ_MB": 222.0, "TOTAL_LOG_FILES_SIZE": 1609727.0, "TOTAL_IO_WRITE_MB": 220.0, "TOTAL_IO_MB": 442.0}}), (20220620105044,{"baseInstantTime": "20220620100020", "deltaFilePaths": [".644d4e16-7e2d-4939-8fe0-f787f3e03240-0_20220620100020.log.1_256-431-286367"], "dataFilePath": "644d4e16-7e2d-4939-8fe0-f787f3e03240-0_66-394-258097_20220620100020.parquet", "fileId": "644d4e16-7e2d-4939-8fe0-f787f3e03240-0", "partitionPath": "dt=2022-06-19", "metrics": {"TOTAL_LOG_FILES": 1.0, "TOTAL_IO_READ_MB": 221.0, "TOTAL_LOG_FILES_SIZE": 621263.0, "TOTAL_IO_WRITE_MB": 220.0, "TOTAL_IO_MB": 441.0}})
java.lang.IllegalStateException: Hudi File Id (HoodieFileGroupId{partitionPath='dt=2022-06-19', fileId='644d4e16-7e2d-4939-8fe0-f787f3e03240-0'}) has more than 1 pending compactions. Instants: (20220620113713,{"baseInstantTime": "20220620105044", "deltaFilePaths": [".644d4e16-7e2d-4939-8fe0-f787f3e03240-0_20220620105044.log.1_364-480-318680"], "dataFilePath": "644d4e16-7e2d-4939-8fe0-f787f3e03240-0_6-443-289906_20220620105044.parquet", "fileId": "644d4e16-7e2d-4939-8fe0-f787f3e03240-0", "partitionPath": "dt=2022-06-19", "metrics": {"TOTAL_LOG_FILES": 1.0, "TOTAL_IO_READ_MB": 222.0, "TOTAL_LOG_FILES_SIZE": 1609727.0, "TOTAL_IO_WRITE_MB": 220.0, "TOTAL_IO_MB": 442.0}}), (20220620105044,{"baseInstantTime": "20220620100020", "deltaFilePaths": [".644d4e16-7e2d-4939-8fe0-f787f3e03240-0_20220620100020.log.1_256-431-286367"], "dataFilePath": "644d4e16-7e2d-4939-8fe0-f787f3e03240-0_66-394-258097_20220620100020.parquet", "fileId": "644d4e16-7e2d-4939-8fe0-f787f3e03240-0", "partitionPath": "dt=2022-06-19", "metrics": {"TOTAL_LOG_FILES": 1.0, "TOTAL_IO_READ_MB": 221.0, "TOTAL_LOG_FILES_SIZE": 621263.0, "TOTAL_IO_WRITE_MB": 220.0, "TOTAL_IO_MB": 441.0}})
at org.apache.hudi.common.util.CompactionUtils.lambda$getAllPendingCompactionOperations$5(CompactionUtils.java:161)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:272)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
at org.apache.hudi.common.util.CompactionUtils.getAllPendingCompactionOperations(CompactionUtils.java:149)
at org.apache.hudi.common.table.view.AbstractTableFileSystemView.init(AbstractTableFileSystemView.java:95)
at org.apache.hudi.common.table.view.HoodieTableFileSystemView.init(HoodieTableFileSystemView.java:87)
at org.apache.hudi.common.table.view.HoodieTableFileSystemView.<init>(HoodieTableFileSystemView.java:81)
at org.apache.hudi.common.table.view.HoodieTableFileSystemView.<init>(HoodieTableFileSystemView.java:72)
at org.apache.hudi.common.table.view.HoodieTableFileSystemView.<init>(HoodieTableFileSystemView.java:110)
at org.apache.hudi.cli.commands.FileSystemViewCommand.buildFileSystemView(FileSystemViewCommand.java:256)
at org.apache.hudi.cli.commands.FileSystemViewCommand.showLatestFileSlices(FileSystemViewCommand.java:130)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:216)
at org.springframework.shell.core.SimpleExecutionStrategy.invoke(SimpleExecutionStrategy.java:68)
at org.springframework.shell.core.SimpleExecutionStrategy.execute(SimpleExecutionStrategy.java:59)
at org.springframework.shell.core.AbstractShell.executeCommand(AbstractShell.java:134)
at org.springframework.shell.core.JLineShell.promptLoop(JLineShell.java:533)
at org.springframework.shell.core.JLineShell.run(JLineShell.java:179)
at java.lang.Thread.run(Thread.java:748)
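Two things stand out in the trace. The file groups it names are in dt=2022-06-15 and dt=2022-06-19, not in the requested dt=2022-06-01, and the exception is raised from `CompactionUtils.getAllPendingCompactionOperations` while `HoodieTableFileSystemView` is being initialized, which would explain why none of the command flags made a difference. The condition in the message can be illustrated with a small sketch: the same file group appears in more than one pending compaction plan. This is not Hudi's implementation; the function name and the input shape (a mapping from compaction instant to the file groups in its plan) are assumptions made for illustration.

```python
from collections import defaultdict


def find_conflicting_file_groups(pending_plans):
    """pending_plans: dict of compaction instant -> list of (partition_path, file_id).

    Returns {(partition_path, file_id): [instants...]} for every file group
    that shows up in more than one pending compaction plan.
    """
    seen = defaultdict(list)
    for instant, operations in pending_plans.items():
        for partition_path, file_id in operations:
            seen[(partition_path, file_id)].append(instant)
    return {group: sorted(instants) for group, instants in seen.items() if len(instants) > 1}


# Values copied from the first exception message above.
plans = {
    "20220615172826": [("dt=2022-06-15", "797bf5d2-f24e-4645-a337-c6978dc95a9f-0")],
    "20220615161228": [("dt=2022-06-15", "797bf5d2-f24e-4645-a337-c6978dc95a9f-0")],
}
conflicts = find_conflicting_file_groups(plans)
```

Here `conflicts` holds the single file group from the message together with both instants, which is the state the CLI refuses to build a view from.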
Top GitHub Comments
I guess @minihippo is asking you to list the “.hoodie” folder and post the output here. Make sure the result is sorted by file modification time.
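A minimal sketch of such a listing, assuming the “.hoodie” folder is reachable as a local or mounted path. For the S3-backed table in this issue, the object store's own listing tool would be needed instead. The helper name `list_by_mod_time` is hypothetical.

```python
import os


def list_by_mod_time(directory):
    """Return (name, mtime) pairs for the entries of `directory`, oldest first."""
    with os.scandir(directory) as entries:
        listing = [(entry.name, entry.stat().st_mtime) for entry in entries]
    return sorted(listing, key=lambda item: item[1])
```

Calling `list_by_mod_time("/path/to/table/.hoodie")` (path is a placeholder) gives the timeline files in the order they were last written, which makes it easier to see how the two pending compactions were scheduled relative to each other.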
@amit-ranjan-de Hudi version 0.5.0-incubating is pretty ancient. Could you give 0.12.1 a try and see whether the problem goes away?