Hello,
I have Iceberg tables on a Cloudian S3 bucket connected to Dremio. After I execute a VACUUM statement via a Dremio SQL query, I can still see the vXX.metadata.json files on my file system, and they keep growing in size. All other files (.avro, snapshot, and data files) are correctly deleted according to the timestamp I provided.
Is there a reason for this behavior or an option to delete these files too?
E.g.:
VACUUM TABLE tablepath
EXPIRE SNAPSHOTS older_than '2024-09-20 00:00:00.000'
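For reference, a sketch of the two expiry criteria Dremio's VACUUM TABLE supports (tablepath is a placeholder for the actual table path; adjust the values to your retention policy):

```sql
-- Expire snapshots older than a given timestamp (as above)
VACUUM TABLE tablepath
EXPIRE SNAPSHOTS older_than '2024-09-20 00:00:00.000';

-- Alternatively, keep only the N most recent snapshots
VACUUM TABLE tablepath
EXPIRE SNAPSHOTS retain_last 10;
```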
Thanks in advance,
Sebastian
@Sebastian Only the metadata.json files for expired snapshots would be cleared. What is the timestamp on the metadata.json file you think should be deleted?
Hi Balaji,
the timestamp of the metadata.json files I would expect to be deleted is older than the one provided in the query. The same occurs if I provide a retain_last: all files except the metadata.json files are correctly deleted up to the provided amount / timestamp, and only the metadata.json files remain, as in the screenshot below.
criteria: older_than '2024-07-31 11:37:00.000'
@Sebastian
The ExpireSnapshots query doesn't clean up the metadata.json files, because they are not part of the old snapshots. There are two ways you can clean up old metadata.json files:
- Set the Iceberg table property write.metadata.delete-after-commit.enabled. Iceberg will then remove old metadata files whenever their number exceeds a given limit (100 by default). (More reading.)
- Use Spark's RemoveOrphanFiles query, which can help vacuum those orphaned metadata files.
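A sketch of both options (the table and catalog names are placeholders; the property keys are standard Iceberg table properties, and remove_orphan_files is Iceberg's Spark procedure):

```sql
-- Option 1: let Iceberg delete old metadata.json files on commit
ALTER TABLE db.tbl SET TBLPROPERTIES (
  'write.metadata.delete-after-commit.enabled' = 'true',
  'write.metadata.previous-versions-max' = '20'  -- keep at most 20 previous versions
);

-- Option 2 (Spark SQL): remove files no longer referenced by the table metadata
CALL my_catalog.system.remove_orphan_files(
  table => 'db.tbl',
  older_than => TIMESTAMP '2024-09-20 00:00:00.000'
);
```

Note that option 1 only takes effect on subsequent commits to the table, so already-accumulated metadata.json files may need a one-time cleanup via option 2.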
Hello Balaji,
thanks for your answer.
How is the removal triggered in option 1 once I have set the table properties?
@Sebastian Are you saying you have set the table property and it is not working as expected?
If the file deletion does not need to be triggered manually, then yes: I set the properties as above and still have standalone metadata.json files, which exceed the limit of 20 specified in the properties.
@Sebastian How many days do those 20 metadata.json files span? Can you add a listing?
Hello Balaji,
the active snapshots span about two months. I ran the last OPTIMIZE & VACUUM statements at the beginning of October, resulting in the inactive metadata.json files below.
Any chance you can send the output of the below commands from Spark SQL?
show create table <table_name>
describe formatted table <table_name>
I am only using Dremio SQL to work with Iceberg tables, here are the results from running these queries in Dremio (my table is a copy of sys.recent_jobs that I am using to test Iceberg functions).
show create table TEST."ICEBERG_PARTITION"."iceberg_test_jobs";

path | sql_definition
[TEST.ICEBERG_PARTITION.iceberg_test_jobs] | SELECT * FROM TEST.ICEBERG_PARTITION.iceberg_test_jobs
describe table TEST."ICEBERG_PARTITION"."iceberg_test_jobs";

COLUMN_NAME | DATA_TYPE | IS_NULLABLE | NUMERIC_PRECISION | NUMERIC_SCALE
job_id | CHARACTER VARYING | YES | |
status | CHARACTER VARYING | YES | |
query_type | CHARACTER VARYING | YES | |
user_name | CHARACTER VARYING | YES | |
queried_datasets | CHARACTER VARYING | YES | |
scanned_datasets | CHARACTER VARYING | YES | |
attempt_count | INTEGER | YES | 32 | 0
submitted_ts | TIMESTAMP | YES | |
attempt_started_ts | TIMESTAMP | YES | |
metadata_retrieval_ts | TIMESTAMP | YES | |
planning_start_ts | TIMESTAMP | YES | |
query_enqueued_ts | TIMESTAMP | YES | |
engine_start_ts | TIMESTAMP | YES | |
execution_planning_ts | TIMESTAMP | YES | |
execution_start_ts | TIMESTAMP | YES | |
final_state_ts | TIMESTAMP | YES | |
submitted_epoch_millis | BIGINT | YES | 64 | 0
attempt_started_epoch_millis | BIGINT | YES | 64 | 0
metadata_retrieval_epoch_millis | BIGINT | YES | 64 | 0
planning_start_epoch_millis | BIGINT | YES | 64 | 0
query_enqueued_epoch_millis | BIGINT | YES | 64 | 0
engine_start_epoch_millis | BIGINT | YES | 64 | 0
execution_planning_epoch_millis | BIGINT | YES | 64 | 0
execution_start_epoch_millis | BIGINT | YES | 64 | 0
final_state_epoch_millis | BIGINT | YES | 64 | 0
planner_estimated_cost | DOUBLE | YES | 53 |
rows_scanned | BIGINT | YES | 64 | 0
bytes_scanned | BIGINT | YES | 64 | 0
rows_returned | BIGINT | YES | 64 | 0
bytes_returned | BIGINT | YES | 64 | 0
accelerated | BOOLEAN | YES | |
queue_name | CHARACTER VARYING | YES | |
engine | CHARACTER VARYING | YES | |
error_msg | CHARACTER VARYING | YES | |
query | CHARACTER VARYING | YES | |
is_profile_incomplete | BOOLEAN | YES | |
execution_allocated_bytes | BIGINT | YES | 64 | 0
execution_cpu_time_millis | BIGINT | YES | 64 | 0

(The EXTENDED_PROPERTIES, MASKING_POLICY, and SORT_ORDER_PRIORITY columns are empty for every row and are omitted here.)
@balaji.ramaswamy We updated to 25.1.1, and now I get the opposite result when running OPTIMIZE and VACUUM.
The metadata.json files are now correctly deleted, but the snapshot and data files remain in the S3 storage.
In the screenshot below, the last valid snapshots after VACUUM are v38 (left) and v41 (right). I changed nothing in the code or table properties we discussed earlier. Has anything changed with the new Dremio version?
And is there an option to remove legacy / outdated / invalid Iceberg files from the storage using Dremio SQL (like Spark's "removeOrphanFiles")?
@Sebastian There is a MIN_SNAPSHOTS_TO_KEEP = 5 setting.
AFAIK, the metadata.json files will not get cleaned by the VACUUM command in the new version; let me check whether there is a change in behavior.