Hi @balaji.ramaswamy, we ran a new test that reproduces the same behavior.
I'm attaching more information that may help: the output of the queries below, which we ran before and after the vacuum, plus the vacuum query profile.
select * from TABLE( table_history( 'lake.prod.posts' ) );
select * from TABLE( table_snapshot( 'lake.prod.posts' ) );
select * from TABLE( table_files( 'lake.prod.posts' ) );
Before vacuum: 233 GB on disk, more than 4030 table files
After vacuum: 209 GB on disk
select SUM(file_size_in_bytes) from TABLE( table_files( 'lake.prod.posts' ) );
-- returns 2151709359
As you can see, after the vacuum we still have 209 GB on disk. That does not correspond to the real data, and it does not even match the sum of file_size_in_bytes, which is only about 2.15 GB.
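For scale, here is a quick back-of-the-envelope comparison of the two numbers above. It is only a sketch: it assumes the 209G on-disk figure is in GiB (as a tool like `du -h` would report), which is an assumption, and the variable names are just for illustration.

```python
# Rough comparison of the size reported by table_files with the size on disk.
# Assumption: the on-disk figure (209G) is in GiB; this is not confirmed.
GIB = 1024 ** 3

metadata_bytes = 2_151_709_359   # SUM(file_size_in_bytes) from table_files
disk_after_gib = 209             # on-disk size after vacuum

metadata_gib = metadata_bytes / GIB
ratio = disk_after_gib / metadata_gib

print(f"table_files total: {metadata_gib:.2f} GiB")
print(f"on disk after vacuum: {disk_after_gib} GiB")
print(f"disk usage is about {ratio:.0f}x the size of the live data files")
```

Under that assumption, the folder holds on the order of a hundred times more data than the files the table currently references.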
I've also attached the folder hierarchy before (test3.txt) and after the vacuum (test3_after.txt).
vacuum3.zip (3,8 MB)
Thanks for your help. I think this is a good case for optimizing Dremio.
I'm available for anything you need, including access to our environment and data, or a remote session if necessary.