What is stored in /opt/dremio/data/db/catalog

I’m noticing that files in /opt/dremio/data/db/catalog are taking up the majority of the space on the /opt/dremio/data volume, and we are approaching 100% usage.

What is /opt/dremio/data/db/catalog used for?

That is where Dremio stores all persisted state (for example, job details). You can run bin/dremio-admin clean to print some statistics that may help you understand what is using up the space.

You can also read more about how to clean things up here.

Running bin/dremio-admin clean requires stopping Dremio. This is a bit of an issue, as k8s will rebuild the node when the readinessProbe fails.

Is there any way to get stats without stopping Dremio?

Correction: It’s not the readinessProbe that’s restarting the container. I think it’s because Dremio is started in the foreground (via bin/dremio start-fg), so killing the Dremio process also kills the container.
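
For a rough, file-level view that does not require stopping Dremio, plain du can at least show which entries under the catalog directory are largest. This is only a sketch: catalog_usage is a hypothetical helper name, the default path is the one from this thread, and the output is on-disk size per entry, not the per-store statistics that bin/dremio-admin clean prints.

```shell
# catalog_usage DIR
# Print the size in KiB of each entry directly under DIR, largest first
# (top 20 only). DIR defaults to the catalog path mentioned in this thread.
# Hypothetical helper, not part of Dremio.
catalog_usage() {
    dir="${1:-/opt/dremio/data/db/catalog}"
    # du -sk: one summary line per entry, in KiB
    # sort -rn: numeric sort, biggest first
    du -sk "$dir"/* | sort -rn | head -n 20
}

# Example (run inside the Dremio container):
#   catalog_usage /opt/dremio/data/db/catalog
```

On Kubernetes the same du command could presumably be run inside the pod with kubectl exec, using your own pod name. Since it only reads file sizes, it should not interfere with the running Dremio process.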