GKE Master disk usage

Hi,
We have deployed Dremio 4.5 on GKE.
1 master node (100 GB disk), 3 executor pods (100 GB disk)

We are getting a "No space left on device" error on the master pod.
98 GB is used by the catalog.

Can you tell me how to debug what is taking up so much disk space?

Hello,
Dremio stores various things under ./data, e.g. a Lucene index (in db/search) for searching in the “Jobs” UI.
catalog/ contains the actual profiles for previous job executions. If you run many queries and have a long retention period configured, it can take up quite a lot of disk space.

You can do two things to reduce the disk consumption:
Bring the cluster into administration mode and run the "dremio-admin clean" task (instructions are in the Helm chart) – this will incur cluster downtime.
Or: set jobs.max.age_in_days in Admin > Support > Support Keys to a smaller value than the default (30 days, I think). Dremio will then do a nightly cleanup (default: 01:00 in the morning; support key: job.cleanup.start_at_hour).

Best, Tim

Thank you Tim.

Can you share the other support settings?
I could not find the jobs.max.age_in_days setting in the docs.

Hi @unni, it is not documented, but it should still appear if you enter it into the Support key field as described in the link.


Hi, @unni
Big portions of Dremio are open-source. I found the config keys in “ExecConstants.java” in the GitHub project. Here’s the link for the current 4.7 release: https://github.com/dremio/dremio-oss/blob/d255abfabad2c9122e1cdf030ea6bbe8f9b7ce50/sabot/kernel/src/main/java/com/dremio/exec/ExecConstants.java
Since some keys might have been added in Dremio versions more recent than yours, you might need to switch to an older version of ExecConstants.java via the git history.

(I'm not a Dremio engineer, so please re-confirm with Dremio that it is actually okay to change some of the keys you'll find in ExecConstants. Some of this stuff is pretty low-level, and you might break things when tweaking the settings.)

Best regards, Tim

@tid Thank you Tim. Will check out the code.

@unni

This is what you do:

  • Shut down the executors
  • Shut down the coordinator
  • cd <DREMIO_HOME>/bin
  • Run ./dremio-admin clean
  • Save the output to a file
  • Start the coordinator
  • Start the executors
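
The steps above might look like this in practice (a sketch only: the path and the report filename are illustrative, and on Kubernetes you would scale the executor and coordinator workloads down first):

```shell
# Run on the coordinator host while the cluster is stopped.
# <DREMIO_HOME> is typically /opt/dremio in the Helm-based deployment.
cd /opt/dremio/bin
# Capture the report so it can be shared for analysis:
./dremio-admin clean 2>&1 | tee /tmp/dremio-admin-clean-report.txt
```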

Send us the saved file and we can say exactly where the space is occupied.

Mostly it will be in:

  • Metadata splits or metadata multi-splits, which can be cleaned offline using "clean -o"
  • As @tid said, jobs and profiles. We keep them for 30 days; you can set the parameter via the support key or do an offline "clean -j n", where n is the number of days to keep
  • "clean -i" does a reindex
  • "clean -c" compacts the database
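
Hedged examples of the variants above (verify the exact flags with ./dremio-admin clean --help on your Dremio version; the retention value is illustrative):

```shell
./dremio-admin clean -o      # delete orphaned metadata splits
./dremio-admin clean -j 30   # delete jobs and profiles older than 30 days
./dremio-admin clean -i      # reindex the store
./dremio-admin clean -c      # compact the (RocksDB) store
```
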

We ran ./dremio-admin clean on our cluster; please find the attached file.

Currently 95 GB of the disk space is used by the master.
I noticed that the profiles section is taking up most of the space.

Please let us know what this means and how to reclaim the space.

dremio-admin-clean-command-output.txt.zip (1.1 KB)

@unni

All your 90 GB is in jobs and profiles. Do you have verbose profiles on? If not, how many days of profiles is this? As said above, you can delete jobs older than n days; documentation below.

jobs
basic rocks store stats
* Estimated Number of Keys: 2056631
* Estimated Live Data Size: 4365202377
* Total SST files size: 5280083846
* Pending Compaction Bytes: 179204667
* Estimated Blob Count: 0
* Estimated Blob Bytes: 0
Index Stats
* live records: 2059900
* deleted records: 270045

profiles
basic rocks store stats
* Estimated Number of Keys: 3834547
* Estimated Live Data Size: 87999247546
* Total SST files size: 93266898549
* Pending Compaction Bytes: 364795392
* Estimated Blob Count: 0
* Estimated Blob Bytes: 0
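
As a rough sanity check on the numbers above (the awk one-liner is just illustrative), dividing the estimated live data size by the estimated key count gives the average size of a stored profile:

```shell
# Average bytes per stored profile:
# Estimated Live Data Size / Estimated Number of Keys
awk 'BEGIN { printf "%d bytes per profile\n", 87999247546 / 3834547 }'
# prints: 22949 bytes per profile
```

So each profile averages roughly 23 KB, and the volume comes from the sheer number of profiles retained.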

The jobs.max.age_in_days key is set to 2.
We are running close to 150,000 SQL queries within a span of 8 hours. Could this lead to GC issues?
The master pod restarts after 10 hours because the master is unable to connect to ZooKeeper.
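
For context, that workload is a sustained load of roughly 5 queries per second on the coordinator (simple arithmetic, shown here with an illustrative awk one-liner):

```shell
# 150,000 queries over 8 hours, expressed as queries per second
awk 'BEGIN { printf "%.1f queries/sec\n", 150000 / (8 * 3600) }'
# prints: 5.2 queries/sec
```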

@unni

What version of Dremio is this? Clicking on the Jobs page might cause a full GC since you have so many jobs. Also, if the SQL statements are very big, that can add to the issue. Here are a few things you can do:

  • Make sure verbose profiles are not on
  • Add the options below to dremio-master.yaml under the DREMIO_JAVA_EXTRA_OPTS section and restart the pods:
    -Xloggc:/opt/dremio/data
    -XX:+UseGCLogFileRotation
    -XX:NumberOfGCLogFiles=5
    -XX:GCLogFileSize=4000k
    -XX:+PrintClassHistogramBeforeFullGC
    -XX:+PrintClassHistogramAfterFullGC
    -XX:+HeapDumpOnOutOfMemoryError
    -XX:HeapDumpPath=/opt/dremio/data
    -XX:+UseG1GC
    -XX:G1HeapRegionSize=32M
    -XX:MaxGCPauseMillis=500
    -XX:InitiatingHeapOccupancyPercent=25
    -XX:ErrorFile=/opt/dremio/data/hs_err_pid%p.log
  • Make sure you have a 16 GB heap on the coordinator
  • Once the problem happens, send us the GC logs

Thanks
Bali

We are using Dremio 4.5.
DREMIO_JAVA_EXTRA_OPTS is already set. The heap is set to 40 GB.

dremio@dremio-master-0:/opt/dremio$ ps -ef | grep dremio
dremio       1     0 99 07:16 ?        02:21:15 /usr/local/openjdk-8/bin/java -Djava.util.logging.config.class=org.slf4j.bridge.SLF4JBridgeHandler -Djava.library.path=/opt/dremio/lib -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Ddremio.plugins.path=/opt/dremio/plugins -Xmx40000m -XX:MaxDirectMemorySize=11000m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dremio -Dio.netty.maxDirectMemory=0 -DMAPR_IMPALA_RA_THROTTLE -DMAPR_MAX_RA_STREAMS=400 -Dzookeeper=zk-hs:2181 -Dservices.coordinator.master.embedded-zookeeper.enabled=false -Dservices.executor.enabled=false -Xloggc:/opt/dremio/data/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=4000k -XX:+PrintClassHistogramBeforeFullGC -XX:+PrintClassHistogramAfterFullGC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/opt/dremio/data -XX:+UseG1GC -XX:G1HeapRegionSize=32M -XX:MaxGCPauseMillis=500 -XX:InitiatingHeapOccupancyPercent=25 -XX:ErrorFile=/opt/dremio/data/hs_err_pid%p.log -cp /opt/dremio/conf:/opt/dremio/jars/*:/opt/dremio/jars/ext/*:/opt/dremio/jars/3rdparty/*:/usr/local/openjdk-8/lib/tools.jar com.dremio.dac.daemon.DremioDaemon

@balaji.ramaswamy Is there anything we can do from our side so that this issue does not occur?

@unni

Great!

Can we have the GC logs so that we can review them and see where the issue is?

Sharing the latest logs:
2020-11-25 10:43:56,374 [zk-curator-231] ERROR ROOT - Dremio is exiting. Node lost its master status.
Dremio is exiting. Node lost its master status.
172.38.2.223 - - [25/Nov/2020:10:43:56 +0000] “GET / HTTP/1.1” 200 2522 “-” “kube-probe/1.16”
2020-11-25 10:43:56,387 [Curator-ConnectionStateManager-0] INFO c.d.s.coordinator.zk.ZKClusterClient - ZK connection state changed to LOST
172.38.2.223 - - [25/Nov/2020:10:43:56 +0000] “GET / HTTP/1.1” 200 2522 “-” “kube-probe/1.16”
2020-11-25 10:43:56,419 [Curator-ConnectionStateManager-0] INFO c.d.s.coordinator.zk.ZKClusterClient - ZK connection state changed to RECONNECTED

gclogs_25Nov.zip (3.1 MB)

@unni

I see your heap is filled with planner and metadata information. Can we validate that you are not refreshing metadata very frequently? I see you are using Hive UDFs, and possibly a lot of expression-based queries?

How big is the OS RAM? Have you tried adding more scale-out coordinators (since 4.8)?

We refresh metadata before refreshing the reflections. We are not using any Hive UDFs, but the queries have expressions and joins.
Sharing a sample query:

WITH "_data1" AS
(SELECT "tx_1562674474561"."Month" AS "Month",
"tx_1562674474561"."Year" AS "Year",
"tx_1562674474561"."DayOfMonth" AS "DayOfMonth",
case
WHEN 100*count(distinct(tif_flag_1562674600281))/nullif(count(distinct(referencenumber_1561656334791)),0) = 0 THEN
null
ELSE 100*count(distinct(tif_flag_1562674600281))/nullif(count(distinct(referencenumber_1561656334791)),0)
END AS "__tif_orders_invoiced_in_te1562675768886"
FROM
(SELECT "tx_1562674474561".*,
"period".*
FROM
(SELECT "tif_flag_1562674600281",
tdsr_1558686830005,
CAST(the_date AS DATE) AS the_date,
referencenumber_1561656334791
FROM "projectid"."system"."tx_1562674474561" AS "tx_1562674474561"
WHERE ( "tx_1562674474561"."tdsr_1558686830005" IN ( 'RO1' ) )
AND ("tx_1562674474561"."the_date" <= '2020-12-31'
AND "tx_1562674474561"."the_date" >= '2019-01-01') ) AS "tx_1562674474561"
JOIN
(SELECT "DayOfMonth",
"yyyyMMdd",
"Year",
"Month",
TO_DATE("period"."the_date",
'YYYY-MM-DD') AS "period_the_date"
FROM "projectid"."system"."period" AS "period"
WHERE ("period"."the_date" <= '2020-12-31'
AND "the_date" >= '2019-01-01') ) AS "period"
ON "tx_1562674474561"."the_date" = "period".period_the_date ) AS "tx_1562674474561"
WHERE "tx_1562674474561"."tdsr_1558686830005" IN ( 'RO1' )
AND (("tx_1562674474561"."yyyyMMdd"
BETWEEN 20200101
AND 20201231)
OR ("tx_1562674474561"."yyyyMMdd"
BETWEEN 20190101
AND 20191231) )
GROUP BY "tx_1562674474561"."Month", "tx_1562674474561"."Year", "tx_1562674474561"."DayOfMonth" ) , "period" AS
(SELECT min(yyyymmdd) AS min_date,
max(yyyymmdd) AS max_date,
"period"."Month" AS "Month",
"period"."Year" AS "Year",
"period"."DayOfMonth" AS "DayOfMonth"
FROM "projectid"."system"."period" AS "period"
WHERE (("period"."yyyyMMdd"
BETWEEN 20200101
AND 20201231)
OR ("period"."yyyyMMdd"
BETWEEN 20190101
AND 20191231) )
GROUP BY "period"."Month", "period"."Year", "period"."DayOfMonth" )
SELECT *
FROM
(SELECT "__tif_orders_invoiced_in_te1562675768886",
"period"."Month" AS "Month",
"period"."Year" AS "Year",
"period"."DayOfMonth" AS "DayOfMonth",
min_date,
max_date
FROM "_data1" FULL OUTER
JOIN "period"
ON ( COALESCE("_data1"."Month")="period"."Month"
AND COALESCE("_data1"."Year")="period"."Year"
AND COALESCE("_data1"."DayOfMonth")="period"."DayOfMonth" ) )
WHERE ("__tif_orders_invoiced_in_te1562675768886" IS NOT NULL ) AND ( min_date IS NOT NULL
AND max_date IS NOT NULL )

The Dremio master pod has 15 cores and 59 GB of RAM. We are on Kubernetes with Dremio 4.5; we tried using multiple coordinators, but many of the queries timed out or the connection was lost.

@unni

Scale-out coordinators are a feature from Dremio 4.7 onward.

@balaji.ramaswamy
The coordinator pods and values were already there in the dremio-cloud-tools chart when we were using Dremio 4.5. Maybe 4.7 had already been released when we used the chart.
It had worked for queries with 4.5, but we will definitely work on upgrading to 4.9.

Do you have any other suggestions based on the sample query? Are there any options to make the planner less verbose, which could improve performance?

@unni

Send us the profile for that query; we are interested in seeing the logical plan and how big it is.