Hi Team,
I have an iceberg table called table_ingestion_log which gets updated frequently.
I have it in my Minio Storage and I observe that it’s size grows rapidly.
I tried using OPTIMIZE TABLE
once a day, but then it is not working. It’s size still keeps increasing.
My only recourse is to RECREATE table once in a while, so that it will lose all metadata (just keeping the size small)
Below shows the size of the table a few minutes after recreation. You can see that it’s the metadata size increases and a lot of small files get created (which I don’t want)
minio@minio-prod:~$ du -sh /mnt/data/finance/ctrl/table_ingestion_log/*
12K /mnt/data/finance/ctrl/table_ingestion_log/18a32882-9839-1f6c-6e8c-15d9ffe1fd00
12K /mnt/data/finance/ctrl/table_ingestion_log/18a32884-6f8b-ae6a-1ff3-ed11077c5500
12K /mnt/data/finance/ctrl/table_ingestion_log/18a32888-6fe1-a694-d415-f2460ffea700
12K /mnt/data/finance/ctrl/table_ingestion_log/18a32889-c044-9d76-bf6d-0cc0aecbcf00
12K /mnt/data/finance/ctrl/table_ingestion_log/18a3288e-1a56-bf84-aa60-f4e96d4cb500
12K /mnt/data/finance/ctrl/table_ingestion_log/18a32891-4fe8-5585-1f54-50117d57aa00
12K /mnt/data/finance/ctrl/table_ingestion_log/18a32895-cb96-eb8d-1b2e-4f0a586bad00
12K /mnt/data/finance/ctrl/table_ingestion_log/18a32897-4e83-b94a-8167-327fc3d8bc00
12K /mnt/data/finance/ctrl/table_ingestion_log/18a3289a-8081-1902-925f-056f87eac100
12K /mnt/data/finance/ctrl/table_ingestion_log/18a3289c-c5e4-7fb6-6ee9-3f8fee5a1300
12K /mnt/data/finance/ctrl/table_ingestion_log/18a328a0-ae1b-937d-9ae3-d8edc05f3f00
12K /mnt/data/finance/ctrl/table_ingestion_log/18a328a3-4b17-e18a-6621-b58fdd4c6b00
12K /mnt/data/finance/ctrl/table_ingestion_log/18a328a7-2e7d-498d-93b8-2195b9b7cf00
12K /mnt/data/finance/ctrl/table_ingestion_log/18a328aa-44c4-34d5-a823-b9db057d4700
12K /mnt/data/finance/ctrl/table_ingestion_log/18a328ad-dbfe-8fed-cb07-1a3d1be4ec00
12K /mnt/data/finance/ctrl/table_ingestion_log/18a328af-665c-61fa-46ca-d55a5ec64a00
12K /mnt/data/finance/ctrl/table_ingestion_log/18a328b1-56ee-d7f2-4cfa-20d039a03300
12K /mnt/data/finance/ctrl/table_ingestion_log/18a328b3-14e1-c4f4-3948-6eae52b7d800
12K /mnt/data/finance/ctrl/table_ingestion_log/18a3293e-722b-c337-b63a-fd06b51c1200
12K /mnt/data/finance/ctrl/table_ingestion_log/18a32947-4d5c-a043-6911-afa02399cd00
12K /mnt/data/finance/ctrl/table_ingestion_log/18a329aa-12eb-37e4-57ed-476f1b37b800
12K /mnt/data/finance/ctrl/table_ingestion_log/18a329ab-444b-0d7e-0bda-6f849acd9400
12K /mnt/data/finance/ctrl/table_ingestion_log/18a329ad-aa6f-5d81-219c-24f127a6a800
12K /mnt/data/finance/ctrl/table_ingestion_log/18a329af-4d35-4f13-7a9b-7589d8990d00
12K /mnt/data/finance/ctrl/table_ingestion_log/18a329b1-a5e5-d742-3a54-92ce40ec5300
12K /mnt/data/finance/ctrl/table_ingestion_log/18a329b2-909d-d1e0-40b5-4f2880ce2900
12K /mnt/data/finance/ctrl/table_ingestion_log/18a329b7-55cc-d3db-7396-d95a19940200
12K /mnt/data/finance/ctrl/table_ingestion_log/18a329bb-771c-6a86-4590-ea461976b400
12K /mnt/data/finance/ctrl/table_ingestion_log/18a329bf-c289-684a-8e91-7a587b49ad00
12K /mnt/data/finance/ctrl/table_ingestion_log/18a329c2-6845-9bd0-fd88-b79b5a816800
12K /mnt/data/finance/ctrl/table_ingestion_log/18a329c4-0302-9da2-b862-153c0c7d9b00
12K /mnt/data/finance/ctrl/table_ingestion_log/18a329c5-c685-492b-e040-75c60ca76d00
12K /mnt/data/finance/ctrl/table_ingestion_log/18a329cb-80f6-cee1-b615-233cd09e3700
12K /mnt/data/finance/ctrl/table_ingestion_log/18a329cc-271c-adaa-8561-d2bdef301700
12K /mnt/data/finance/ctrl/table_ingestion_log/18a329cd-568b-d942-f1c9-542b429d7300
12K /mnt/data/finance/ctrl/table_ingestion_log/18a329d0-690c-6709-659f-9c3bf1e61300
12K /mnt/data/finance/ctrl/table_ingestion_log/18a329d1-8c97-97c8-e835-39ec9b567d00
12K /mnt/data/finance/ctrl/table_ingestion_log/18a329d6-a736-6e1c-b88e-60b24917dd00
12K /mnt/data/finance/ctrl/table_ingestion_log/18a329d9-4c40-5e6d-0e4b-506842150c00
12K /mnt/data/finance/ctrl/table_ingestion_log/18a329e5-60b4-295e-548b-c0dfe4436c00
12K /mnt/data/finance/ctrl/table_ingestion_log/18a329f9-e348-fdbc-7151-0fb6b011fc00
12K /mnt/data/finance/ctrl/table_ingestion_log/18a329fa-d646-0475-670a-57fddad2ad00
12K /mnt/data/finance/ctrl/table_ingestion_log/18a32ade-905b-5242-a0d6-bbf974aefd00
12K /mnt/data/finance/ctrl/table_ingestion_log/18a32ade-b50e-5cff-2a0a-ff1362ef1600
12K /mnt/data/finance/ctrl/table_ingestion_log/18a32ae1-452a-fadd-74d7-3778bebbf300
12K /mnt/data/finance/ctrl/table_ingestion_log/18a32ae2-7f77-f80b-7e04-a610b1552400
12K /mnt/data/finance/ctrl/table_ingestion_log/18a32ae4-4a5e-9db2-14fb-64054f826300
12K /mnt/data/finance/ctrl/table_ingestion_log/18a32ae4-e4e2-dc69-1775-e548ddb37000
12K /mnt/data/finance/ctrl/table_ingestion_log/18a32ae7-169f-0c1b-770b-bbdbfc406500
12K /mnt/data/finance/ctrl/table_ingestion_log/18a32ae8-7102-ef9c-9b6a-955ea6d69d00
12K /mnt/data/finance/ctrl/table_ingestion_log/18a32aea-77b8-5057-627e-74fb1800bc00
12K /mnt/data/finance/ctrl/table_ingestion_log/18a32aea-a4a9-8792-974c-0b6c9f7ef100
12K /mnt/data/finance/ctrl/table_ingestion_log/18a32aed-1e70-479d-222a-3f15b649d000
12K /mnt/data/finance/ctrl/table_ingestion_log/18a32aee-79a2-930a-5b68-3d2b06956300
12K /mnt/data/finance/ctrl/table_ingestion_log/18a32aef-aa94-d1a2-1224-95c072ca4b00
12K /mnt/data/finance/ctrl/table_ingestion_log/18a32af1-2dbf-4468-09fc-7fdbed1e0700
12K /mnt/data/finance/ctrl/table_ingestion_log/18a32af3-3f16-ff30-a04a-936dfe365b00
12K /mnt/data/finance/ctrl/table_ingestion_log/18a32af3-daf1-901d-0469-491b5ce5e400
12K /mnt/data/finance/ctrl/table_ingestion_log/18a32b77-9dea-e2c1-46a2-b4ee245cb700
12K /mnt/data/finance/ctrl/table_ingestion_log/18a32b81-78de-3fa3-9169-90bf3c762c00
100K /mnt/data/finance/ctrl/table_ingestion_log/18a32bd6-e86d-d4b3-b3bb-d360fd637c00
1.7M /mnt/data/finance/ctrl/table_ingestion_log/metadata
and this is my table properties
SHOW TBLPROPERTIES Minio.finance.ctrl.table_ingestion_log;
I have read through this page (OPTIMIZE TABLE | Dremio Documentation) and tried all these commands.
ALTER TABLE Minio.finance.ctrl.table_ingestion_log SET TBLPROPERTIES (
'history.expire.min-snapshots-to-keep' = '0',
'history.expire.max-snapshot-age-ms' = '0',
'write.metadata.delete-after-commit.enabled' = 'true',
'write.metadata.previous-versions-max' = '0'
);
VACUUM TABLE {table_name} EXPIRE SNAPSHOTS OLDER_THAN '{current_time_str}' retain_last 1 ;
OPTIMIZE TABLE Minio.finance.ctrl.table_ingestion_log REWRITE MANIFESTS ;
OPTIMIZE TABLE Minio.finance.ctrl.table_ingestion_log REWRITE DATA (MIN_FILE_SIZE_MB=100, MAX_FILE_SIZE_MB=1000, TARGET_FILE_SIZE_MB=512);
I set ‘{current_time_str}’ to the current time to expire everything, but then the metadata still grows.
WHat have I done wrong?