Huge Reflection Total footprint

We have a reflection over a VDS that is backed by an AWS Glue table.
The reflection is incremental (the reflection partition column, with a TRUNCATE transform, matches the Glue table partition column).
The Reflection tab shows Footprint: 594.96 MB (226.55 GB), i.e. Total Footprint = 226 GB.
The total size of the Glue table's underlying files is 2.7 GB, and the VDS filters away about 60% of the records.

Reflection is updated every hour.
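For scale, a quick back-of-the-envelope check on the numbers above (binary units assumed, i.e. 1 GB = 1024 MB):

```python
# Back-of-the-envelope check on the reported sizes (binary units assumed).
current_mb = 594.96    # current footprint from the Reflection tab
total_gb = 226.55      # total footprint from the Reflection tab
source_gb = 2.7        # size of the Glue table's underlying files

total_vs_current = total_gb * 1024 / current_mb
total_vs_source = total_gb / source_gb

print(f"total footprint is ~{total_vs_current:.0f}x the current footprint")
print(f"total footprint is ~{total_vs_source:.0f}x the raw source data")
```

Even allowing for several retained materializations, a roughly 390x gap between total and current footprint goes far beyond the "more than one materialization" explanation quoted below.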

From Dremio wiki:

**Total Footprint**

Shows the current size, in kilobytes, of all of the existing materializations of the Reflection. More than one materialization of a Reflection can exist at the same time, so that refreshes do not interrupt running queries that are being satisfied by the Reflection.

I guess this is not the case here. Why is the total footprint so huge? (I suspect it keeps growing constantly over time.) How can we control its size? Is there a way to do the cleanup?

Build Info:

  • Build: 25.1.6-202501021803480419-127757a4
  • Edition: AWS Edition (activated)

@vladislav-stolyarov Look at the reflection refresh job and use the first UUID, which is the reflection id. Under the location configured in dremio.conf under dist:/// there will be a folder called accelerator; under that, the first UUID is a folder, and under that the second UUID is the materialization id. Are you able to do a du -sh * from the reflection folder?
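The layout described above can be sketched locally. This is a mock of the accelerator directory structure (the materialization UUID and the Parquet file are invented stand-ins), with a small Python equivalent of `du -sh *` per materialization folder:

```python
# Sketch: measure per-materialization footprint the way `du -sh *` would,
# against a mock accelerator layout (the second UUID below is made up).
import os
import tempfile

base = os.path.join(tempfile.mkdtemp(), "accelerator")
reflection_id = "8d999388-2870-4fe1-a656-01e62ff23264"       # first UUID: reflection id
materialization_id = "11111111-2222-3333-4444-555555555555"  # second UUID: hypothetical
mat_dir = os.path.join(base, reflection_id, materialization_id + "_0")
os.makedirs(mat_dir)
with open(os.path.join(mat_dir, "part-0.parquet"), "wb") as f:
    f.write(b"\0" * 64 * 1024)  # stand-in for a real Parquet data file

def dir_size(path):
    """Total bytes of all files under `path`, like `du -sb`."""
    return sum(
        os.path.getsize(os.path.join(root, name))
        for root, _, files in os.walk(path)
        for name in files
    )

# One line per materialization folder, like `du -sh *` in the reflection folder:
for sub in sorted(os.listdir(os.path.join(base, reflection_id))):
    print(sub, dir_size(os.path.join(base, reflection_id, sub)))
```

On a real deployment you would run the equivalent against the dist:/// accelerator path itself rather than a local mock.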

In my dremio.conf I have a slightly different schema. I guess you meant this one?
paths.accelerator = "dremioS3:///dremio-me-...../dremio/accelerator"

If I go to the folder and search for the subfolder matching the reflection_id, I am able to find it.
Its space is: total number of objects 33,152, total size 74.4 GB.

For the experiment I took another VDS that is not incremental, unlike the problematic one above, and found:

  1. The total footprint is about 6x the current footprint, which is normal.
  2. On S3 it has 7 subfolders; 6 of them have names matching a materialization_id (from the materializations table for this reflection_id). Only one folder is orphaned.
  3. The subfolder names are in the format {materialization_id}_0

In contrast, my problematic reflection's folder has one subfolder containing thousands of subfolders. Interestingly, none of the materialization_ids is used in those subfolder names.
So I guess a different storage pattern applies to incremental reflections.
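The orphan check from point 2 above can be sketched as a set comparison between the S3 subfolder names (stripping the `_0` suffix) and the ids from the materializations table; all ids below are invented examples:

```python
# Sketch: flag accelerator subfolders with no matching materialization id.
# All UUIDs here are invented placeholders, not real ids from the system.
known_materializations = {
    "aaaa0001-0000-0000-0000-000000000001",
    "aaaa0002-0000-0000-0000-000000000002",
}
s3_subfolders = [
    "aaaa0001-0000-0000-0000-000000000001_0",
    "aaaa0002-0000-0000-0000-000000000002_0",
    "bbbb9999-0000-0000-0000-000000000009_0",  # orphan: no matching id
]

orphans = [
    name for name in s3_subfolders
    if name.rsplit("_", 1)[0] not in known_materializations
]
print(orphans)
```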

Reflection details from reflections table:

Spoiler

```json
{
  "reflection_id": "8d999388-2870-4fe1-a656-01e62ff23264",
  "reflection_name": "Raw Reflection",
  "type": "RAW",
  "created_at": "2025-05-26 10:10:40.950",
  "updated_at": "2025-06-10 09:51:04.752",
  "status": "CAN_ACCELERATE",
  "dataset_id": "a3bd83f6-00d8-4aa7-98d9-3ee6d0e6fcdb",
  "dataset_name": "xxx.",
  "dataset_type": "VIRTUAL_DATASET",
  "sort_columns": "",
  "partition_columns": "TRUNCATE(1,xxx)",
  "distribution_columns": "",
  "dimensions": "",
  "measures": "",
  "display_columns": "xxx",
  "external_reflection": "",
  "num_failures": 0,
  "last_failure_message": "",
  "last_failure_stack": "",
  "arrow_cache": false,
  "refresh_status": "SCHEDULED",
  "acceleration_status": "AVAILABLE",
  "record_count": 16139062,
  "current_footprint_bytes": 624331469,
  "total_footprint_bytes": 248875651699,
  "last_refresh_duration_millis": 43040,
  "last_refresh_from_table": "2025-09-18 11:03:43.477",
  "refresh_method": "INCREMENTAL",
  "available_until": "3025-01-19 11:03:43.477",
  "considered_count": 4496,
  "matched_count": 4496,
  "accelerated_count": 4496
}
```

Hi @vladislav-stolyarov, this is a known bug that is specific to INCREMENTAL reflections. Reflections track stats for each refresh job, and those stats are not added up correctly when there is compaction between the refreshes. There's a sys.refreshes table with more details about these refreshes.
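As an illustration only, here is a toy model (with invented numbers) of the accounting described above: if the total footprint is computed by summing the bytes written by every refresh job rather than measuring what is currently live on disk, then compaction between refreshes makes the counter grow without bound while the actual data stays small:

```python
# Toy model of the stat bug (all numbers invented): each hourly incremental
# refresh rewrites/compacts data so only ~600 MB is live at any time, but a
# naive total that sums per-refresh write stats keeps accumulating.
refreshes = 400                      # roughly hourly refreshes over a couple of weeks
bytes_per_refresh = 600 * 1024**2    # ~600 MB written per refresh

live_bytes = 0       # what is actually on disk after compaction
reported_total = 0   # naive running sum of per-refresh stats
for _ in range(refreshes):
    live_bytes = bytes_per_refresh       # compaction keeps one live copy...
    reported_total += bytes_per_refresh  # ...but the summed stat never shrinks

print(f"live: {live_bytes / 1024**2:.0f} MB, "
      f"reported total: {reported_total / 1024**3:.0f} GB")
```

The per-refresh entries feeding such a counter can be inspected via the sys.refreshes table mentioned above.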

It should be fixed after Dremio 26.1.

How can we upgrade to 26.x? We are on AWS, and it seems the latest available version there is 25.2.20.

You probably know that AWSE will be deprecated on 1/31/2026 (AWS Edition (Deprecated) | Dremio Documentation), so there is no 26.x version for AWSE.

To migrate from AWSE to 26, first take a backup of your KV store, which has all the sources, tables, views, jobs, reflections, etc.

Using Helm, deploy Dremio 26 into a Kubernetes cluster.

Then, restore the KV store backup using the admin CLI: