RocksDB Regularly Filling Up

We have a Dremio setup on AWS. We run a lot of API requests against Dremio using Arrow Flight and create a ton of virtual and physical datasets on top of CSV and Parquet files.

The error we get when the disk fills up is:
exception: org.rocksdb.RocksDBException: While appending to file: /var/lib/dremio/db/catalog/002260.log: No space left on device

I have added screenshots below.

  1. Here is the version of Dremio running on AWS

  2. Here is what the system disk allocation looked like when the db was full

  3. Here is the tail of the LOG file in ${DREMIO_HOME}/db/catalog

I've also set these parameters in support:

So far, the only effective way I have found to fix RocksDB filling up is to restart the Dremio deployment via CloudFormation; that clears the RocksDB files without affecting the data.

Please advise: is there an automated way to prevent RocksDB from filling up and blocking all functionality and logins?

Thanks!

@amar_ikigai During a restart, RocksDB flushes the WAL and removes files that are no longer needed for recovery. What is the total size of the disk?

Can you run the API below and send the zip file it creates? This can be done while Dremio is up and running.

curl --location --request GET 'localhost:9047/apiv2/kvstore/report?store=none' \
--header 'Authorization: _dremioirr8hj6qnpc3tfr3omiqvev51c' > kvstore_summary.zip
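For anyone following along, the Authorization value above is a Dremio session token. Assuming the standard login endpoint, a token can be obtained with something like this (user name and password are placeholders):

curl --location --request POST 'localhost:9047/apiv2/login' \
--header 'Content-Type: application/json' \
--data '{"userName": "your_admin_user", "password": "your_password"}'

The returned JSON contains a token field; prefix it with _dremio and pass it in the Authorization header as in the command above.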

Hi @balaji.ramaswamy,
We have two different Dremio instances:
1 for development (the one shown here) with a 50 GB disk
1 for production with a 150 GB disk

The development instance has tons of .log files that keep eating up all the disk space. We are worried the same will happen on production, which would be a huge hit for our customers, so we want an automated and safe way to clear these RocksDB files within the catalog folder without downtime for our Dremio instances.

Here is the KVstore summary
kvstore_summary 2.zip (2.5 KB)

@amar_ikigai From your report, none of the internal stores are using any significant space. On the instance where you took this report, what is the disk usage now?

Hi @balaji.ramaswamy,

The disk space is filling up pretty quickly. Here are some summaries:

# df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs         16G     0   16G   0% /dev
tmpfs            16G     0   16G   0% /dev/shm
tmpfs            16G  508K   16G   1% /run
tmpfs            16G     0   16G   0% /sys/fs/cgroup
/dev/nvme1n1     50G   28G   20G  59% /mnt/c1
# du -sch *
80M	cm
28G	db
104K	etc
16K	lost+found
36K	results
12K	s3Backup
12K	spilling
28G	total
# du -sch *
232K	blob
28G	catalog
232K	metadata
24M	search
28G	total
# du -sch *.log
74M	004125.log
73M	004128.log
...
73M	005264.log
73M	005268.log
73M	005270.log
71M	005272.log
26G	total

Looking through the live LOG files as well, there is some information in the DB Stats section:

** DB Stats **
Uptime(secs): 531096.8 total, 3338.4 interval
Cumulative writes: 8890K writes, 8890K keys, 8797K commit groups, 1.0 writes per commit group, ingest: 25.80 GB, 0.05 MB/s
Cumulative WAL: 8890K writes, 0 syncs, 8890516.00 writes per sync, written: 25.80 GB, 0.05 MB/s
Cumulative stall: 00:00:0.000 H:M:S, 0.0 percent
Interval writes: 13K writes, 13K keys, 13K commit groups, 1.0 writes per commit group, ingest: 72.51 MB, 0.02 MB/s
Interval WAL: 13K writes, 0 syncs, 13915.00 writes per sync, written: 0.07 MB, 0.02 MB/s
Interval stall: 00:00:0.000 H:M:S, 0.0 percent

Here we can see that the cumulative WAL has grown to roughly 26 GB, which doesn't make a lot of sense to me.

I did some digging into the RocksDB OPTIONS ini files that govern the RocksDB settings on the Dremio coordinator node.

A few options jump out to me:
max_total_wal_size=0
WAL_size_limit_MB=0
db_write_buffer_size=0
max_log_file_size=0

Based on these, it seems the RocksDB WAL can grow to an arbitrarily large size, and the .log files can keep accumulating before their contents are flushed into SST files.
Reference: https://github.com/facebook/rocksdb/blob/23af6786a997d3592e8a68f1a8d9e0699a6eae36/include/rocksdb/options.h#L621
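If it is useful to anyone, the generated OPTIONS file can be inspected read-only with standard tools. This is just a sketch assuming the catalog store lives at /var/lib/dremio/db/catalog as in the error above, not a suggestion to edit the file:

# find the most recent RocksDB OPTIONS file for the catalog store
ls -t /var/lib/dremio/db/catalog/OPTIONS-* | head -1
# check the WAL- and flush-related limits discussed above
grep -E 'max_total_wal_size|WAL_size_limit_MB|db_write_buffer_size|max_log_file_size' \
  "$(ls -t /var/lib/dremio/db/catalog/OPTIONS-* | head -1)"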

Main Question: Is there a safe way to modify these parameters and trigger a WAL flush of the .log files through Dremio settings, or is this something I would have to tinker with manually?

Secondary Question: **Is there a hidden support key that we could configure to stop the WAL from growing bigger than n GB?**

@amar_ikigai Changes to RocksDB settings are not tested.

hi!

what was the solution you went with eventually? I am facing the same issue! It would be awesome if you could share your approach briefly!

thanks,
kyle

@kyleahn Does a restart clear the files? If so, that tells us they are just files needed for recovery in case Dremio goes down unexpectedly.

I would also like to understand where the space is used. What is the total disk space where the "db" folder is created? Does this mount also have the Dremio cloud cache (C3) files configured?

Are you able to run the below API and send us the output?

curl --location --request GET 'localhost:9047/apiv2/kvstore/report?store=none' \
--header 'Authorization: _dremioirr8hj6qnpc3tfr3omiqvev51c' > kvstore_summary.zip

I was able to set jobs.max.age_in_days and results.max_age_in_days to a lower value, and that resolved the issue. But I do still see the disk space getting filled up quite aggressively. Currently the master node has 32 GB, and we use Dremio only with Glue Iceberg tables with external reflections. Should I still expect 32 GB to be filled up every day when the query volume isn't high (fewer than 500 queries a day)?

kvstore_summary.zip (2.0 KB)

@kyleahn I am asking about the disk size. Are you saying free space is 32 GB? Profiles and jobs are using about 70 GB, and you need some space to store recovery files, so I would allocate a 150 GB to 200 GB disk. Now that you have reduced the jobs/profiles retention, it will come down. You can also force an offline cleanup of jobs by shutting down the coordinator and then running dremio-admin clean -j 7; this will clean all jobs/profiles older than 7 days (a rough sketch of the sequence follows the stats below).

jobs
	basic rocks store stats
		* Estimated Number of Keys: 8215060
		* Estimated Live Data Size: 3459823720
		* Total SST files size: 4091536398
		* Pending Compaction Bytes: 0
		* Estimated Blob Count: 0
		* Estimated Blob Bytes: 0


profiles
	basic rocks store stats
		* Estimated Number of Keys: 7977327
		* Estimated Live Data Size: 71978302888
		* Total SST files size: 72334638944
		* Pending Compaction Bytes: 0
		* Estimated Blob Count: 0
		* Estimated Blob Bytes: 0
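For reference, a rough sketch of the offline cleanup sequence described above, assuming a standard install with DREMIO_HOME pointing at the Dremio directory (run on the coordinator during a maintenance window):

# stop the coordinator so the KVStore is not in use
${DREMIO_HOME}/bin/dremio stop
# purge jobs and profiles older than 7 days from the KVStore
${DREMIO_HOME}/bin/dremio-admin clean -j 7
# start the coordinator again
${DREMIO_HOME}/bin/dremio start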