I'm seeing some strange behavior.
I have an Iceberg table in AWS Glue (created in Glue from Athena).
This table is updated every hour from Athena (DELETE + INSERT).
Table Optimization is also enabled in AWS Glue (it runs Optimize periodically).
My problem is that in Athena
select * from "table$files"
outputs 181 rows.
But when I run this command from Dremio:
select * from TABLE(table_files('"..table'))
it outputs 970 rows.
The table was created recently, so it seems that this number (970) keeps growing.
When I run OPTIMIZE from Athena, the number of files returned by the Athena select * from "table$files"
query decreases, but the number returned by Dremio does not.
The table_manifests and table_snapshots outputs are the same in Athena and Dremio.
Can someone explain this behavior?
Could it be that Dremio shows files from all metadata files, while Athena shows only those from the latest one? (That would still be strange, because I ran VACUUM and the number of files reported by Dremio did not decrease.)
One note that may be important: the DELETEs are issued from Athena and use MOR (merge-on-read) mode, since copy-on-write is not supported by Athena, while Dremio doesn’t support MOR delete mode. Maybe this has some side effect.
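
To make the "all metadata versus latest" guess concrete, here is a toy sketch in plain Python. It is only a model of my hypothesis, not real Iceberg, Athena, or Dremio code: each snapshot is represented as a set of made-up file paths, and the function names `live_files` and `all_referenced_files` are hypothetical. It shows how a listing of the latest snapshot can shrink after a compaction while a listing across all snapshots keeps growing.

```python
# Toy model only: a snapshot is just the set of data-file paths it references.
# File names and function names are invented for illustration.

def live_files(snapshots):
    """Files referenced by the latest snapshot only."""
    return set(snapshots[-1]) if snapshots else set()


def all_referenced_files(snapshots):
    """Files referenced by any snapshot still present in the metadata."""
    files = set()
    for snapshot in snapshots:
        files |= snapshot
    return files


snapshots = [
    {"a.parquet", "b.parquet"},               # initial insert
    {"a.parquet", "b.parquet", "c.parquet"},  # hourly insert adds one file
    {"abc-compacted.parquet"},                # compaction rewrites everything into one file
]

print(len(live_files(snapshots)))            # 1
print(len(all_referenced_files(snapshots)))  # 4
```

If this guess were right, I would expect the larger number to drop once old snapshots are expired, which is not what I see after VACUUM.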