Storage running full under /cm/fs/

I am running Dremio OSS 24 in Docker and have mounted a volume for local storage as well as a MinIO bucket:

dremio.conf

paths: {
  # the local path for dremio to store data.
  local: ${DREMIO_HOME}"/data"

  # https://docs.dremio.com/software/deployment/dist-store-config/
  # the distributed path Dremio data including job results, downloads, uploads, etc
  #dist: "pdfs://"${paths.local}"/pdfs"
  #dist: "hdfs://<NAMENODE_HOST>:8020/path"}
  dist: "dremioS3:///dremio/"
}

Now I can see that the path /var/lib/docker/volumes/iod_dremio_app/_data/cm/fs, which contains many numbered subdirectories, is ~93 GB in size. I have some reflections enabled, but the web interface tells me that they are between 1 and 200 MB each, so I am not sure why so much storage is needed.

I tried cleaning up with dremio-admin clean, but could not reduce the space used.
Here is the output of one of the dremio-admin clean commands I ran:

dremio-admin-log.zip (1,6 KB)

[root@toolx01 fs]# ls -laht
total 4.9M
drwxr-xr-x. 131 root root 4.0K Oct  5 14:49 .
drwxr-xr-x.   2 root root    6 Oct  5 14:49 boostedSubDir
drwxr-xr-x.   2 root root  16K Oct  5 13:57 000042
drwxr-xr-x.   2 root root  16K Oct  5 13:57 000050
drwxr-xr-x.   2 root root  16K Oct  5 13:57 000093
drwxr-xr-x.   2 root root  16K Oct  5 13:57 000031
drwxr-xr-x.   2 root root  16K Oct  5 13:56 000023
drwxr-xr-x.   2 root root  16K Oct  5 13:56 000018
drwxr-xr-x.   2 root root  16K Oct  5 13:56 000045
drwxr-xr-x.   2 root root  16K Oct  5 13:56 000086
drwxr-xr-x.   2 root root  16K Oct  5 13:56 000038
drwxr-xr-x.   2 root root  16K Oct  5 13:56 000094
drwxr-xr-x.   2 root root  16K Oct  5 13:56 000044
drwxr-xr-x.   2 root root  16K Oct  5 13:56 000043
drwxr-xr-x.   2 root root  16K Oct  5 13:56 000105
drwxr-xr-x.   2 root root  16K Oct  5 13:56 000079
drwxr-xr-x.   2 root root  16K Oct  5 13:56 000036
drwxr-xr-x.   2 root root  16K Oct  5 13:56 000017
drwxr-xr-x.   2 root root  16K Oct  5 12:01 000104
drwxr-xr-x.   2 root root  80K Oct  3 16:08 000049
drwxr-xr-x.   2 root root  80K Oct  3 16:08 000124
drwxr-xr-x.   2 root root  80K Oct  3 16:08 000110
drwxr-xr-x.   2 root root  80K Oct  3 16:08 000028
drwxr-xr-x.   2 root root  80K Oct  3 16:08 000115
drwxr-xr-x.   2 root root  80K Oct  3 16:08 000111
drwxr-xr-x.   2 root root  84K Oct  3 16:08 000022
drwxr-xr-x.   2 root root  80K Oct  3 16:08 000034
.......
many more
.......
drwxr-xr-x.   2 root root 8.0K Aug 21 16:00 000116
drwxr-xr-x.   2 root root 8.0K Aug 21 16:00 000118
drwxr-xr-x.   2 root root 8.0K Aug 21 16:00 000112
drwxr-xr-x.   2 root root 8.0K Aug 21 16:00 000107
drwxr-xr-x.   2 root root  16K Aug  7 12:00 000060
drwxr-xr-x.   2 root root  16K Aug  7 12:00 000055
drwxr-xr-x.   2 root root  16K Aug  7 12:00 000058
drwxr-xr-x.   2 root root  16K Aug  7 12:00 000056
drwxr-xr-x.   2 root root  16K Aug  7 12:00 000051
drwxr-xr-x.   2 root root  16K Aug  7 12:00 000065
drwxr-xr-x.   2 root root  16K Aug  7 12:00 000053
drwxr-xr-x.   2 root root  16K Aug  7 12:00 000066
drwxr-xr-x.   2 root root  16K Aug  7 12:00 000061
drwxr-xr-x.   2 root root  16K Aug  7 12:00 000063
drwxr-xr-x.   2 root root  16K Aug  7 12:00 000054
drwxr-xr-x.   2 root root  16K Aug  7 12:00 000067
drwxr-xr-x.   2 root root  16K Aug  7 12:00 000062
drwxr-xr-x.   2 root root  16K Aug  7 12:00 000064
drwxr-xr-x.   2 root root  16K Aug  7 12:00 000052
drwxr-xr-x.   2 root root  16K Aug  7 12:00 000057
drwxr-xr-x.   2 root root  16K Aug  7 12:00 000059
drwxr-xr-x.   2 root root 8.0K Jul 20 08:00 000037
drwxr-xr-x.   2 root root 8.0K Jul 20 08:00 000041
drwxr-xr-x.   2 root root 8.0K Jul 11 14:47 000026
drwxr-xr-x.   2 root root 8.0K Jul 11 14:47 000025
drwxr-xr-x.   2 root root 8.0K Jul 11 12:00 000027
drwxr-xr-x.   2 root root 8.0K Jul 11 12:00 000024

There are 132 folders, each between 200 MB and 2.5 GB in size.

What is causing this large storage need? What can I do to minimize it?
Thanks!
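To see which of the numbered folders account for the space, a per-directory total helps more than the `ls` listing above, which only shows the size of the directory entries themselves. The following is a minimal sketch, not a Dremio tool; the function names are made up for illustration, and it reports apparent file sizes, which can differ from what `du` shows on disk.

```python
import os

def dir_size_bytes(path):
    """Sum the apparent sizes of all regular files below path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            try:
                if not os.path.islink(file_path):
                    total += os.path.getsize(file_path)
            except OSError:
                # A cache file may vanish while we walk the tree; ignore it.
                pass
    return total

def subdir_sizes(parent):
    """Return (name, size_in_bytes) for each direct subdirectory, largest first."""
    sizes = []
    with os.scandir(parent) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes.append((entry.name, dir_size_bytes(entry.path)))
    return sorted(sizes, key=lambda item: item[1], reverse=True)

def report(parent):
    """Format the per-directory totals as tab-separated lines in MiB."""
    return [f"{name}\t{size / 1024**2:.1f} MiB" for name, size in subdir_sizes(parent)]
```

For the setup described in this post, calling `print("\n".join(report("/var/lib/docker/volumes/iod_dremio_app/_data/cm/fs")))` on the Docker host would list the numbered folders sorted by size.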

@tha These look like the C3 cache files for both data and reflections. One level up, do you see a folder called “cm”, and under “cm” two folders, “db” and “fs”?

So I turned off caching by setting reflection.cloud.cache.enabled to false two weeks ago. Still, Dremio did not evict the old and now unused cache files. Should I delete them manually?

If yes, should I just delete db and fs folders?

I also read that Dremio uses 70 percent of the total available disk space for the specified database and file system mount paths. Can this number be adapted?
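To put the 70 percent figure in concrete terms, here is a rough arithmetic sketch. It assumes the percentage is applied to the total capacity of the filesystem holding the cache path; this thread does not say whether Dremio measures against total or free space, so treat that as an assumption. The function names are hypothetical.

```python
import shutil

def cache_quota_bytes(total_bytes, pct=70):
    """Bytes the cache may occupy if capped at pct percent of total_bytes."""
    if not 0 <= pct <= 100:
        raise ValueError("pct must be between 0 and 100")
    return total_bytes * pct // 100

def quota_for_mount(path, pct=70):
    """Apply the percentage to the capacity of the filesystem that holds path.

    Assumption: the quota is relative to total capacity, not free space.
    """
    return cache_quota_bytes(shutil.disk_usage(path).total, pct)
```

Under that assumption, a hypothetical 200 GiB volume would allow the cache to grow to 140 GiB before any eviction is needed, regardless of how small the individual reflections are.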

@tha Yes, you can clean up old files manually. The 70% limit can be changed; the documentation link above should explain how.
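Before deleting anything by hand, it may help to list which numbered folders have not changed for a while. The sketch below only reports candidates and deletes nothing. It relies on directory modification times, which change when files are added to or removed from a folder, not when cached blocks are read, so an old timestamp does not prove a folder is unused. The function name and the age threshold are illustrative, and whether Dremio must be stopped before removing files is not covered in this thread.

```python
import os
import time

def stale_subdirs(parent, max_age_days, now=None):
    """Return names of direct subdirectories whose mtime is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    stale = []
    with os.scandir(parent) as entries:
        for entry in entries:
            if not entry.is_dir(follow_symlinks=False):
                continue
            if entry.stat(follow_symlinks=False).st_mtime < cutoff:
                stale.append(entry.name)
    return sorted(stale)
```

For example, `stale_subdirs("/var/lib/docker/volumes/iod_dremio_app/_data/cm/fs", 14)` would return the folders untouched for two weeks, which matches the period since caching was disabled in the post above.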

Hello @balaji.ramaswamy,
I'm working with the community edition of Dremio.
I see that the cm/fs/ folder is occupying a lot of space after adding S3 storage and running a few queries.
47G cm/
5.6G db/
6.4G pdfs/
55K security/
87K spill/
50K zk/

Any suggestions for optimizing or removing older files to manage the storage efficiently?

@Hunter Those are your Cloud Cache files, which let Dremio read blocks from local disk rather than going to your dist store. It means your users are querying Dremio more :)

Is it possible to increase disk size? Good problem to have

Hi @balaji.ramaswamy,
are you saying that the storage is filling up because I'm using the local UI to query?

I was mostly using the UI to test performance…

If I use a JDBC client to read data from Dremio, then I won't have this issue, right?