I have set up Dremio (latest Community Edition) on an EKS cluster with 3 executor pods on r5d.4xlarge nodes.
I successfully onboarded an AWS Catalog.
However, every query takes more than 10 minutes because of Metadata Retrieval, compounded by failed attempts.
Here’s an excerpt from a query’s 3rd attempt:
Job Summary
State: COMPLETED
Coordinator: dremio-master-0.dremio-cluster-pod.default.svc.cluster.local
Threads: 34
Command Pool Wait: 0ms
Total Query Time: 390,819ms

State Durations
Pending: 0ms
Metadata Retrieval: 389,873ms
Planning: 108ms
Engine Start:
Queued: 14ms
Execution Planning: 60ms
Starting: 10ms
Running: 754ms
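
To put the numbers above in proportion, here is a rough Python sketch that tallies the state durations from this attempt and works out how much of the total went to Metadata Retrieval. The values are copied from the profile above. The dictionary and variable names are only mine for illustration, and Engine Start is left out because the profile shows no value for it.

```python
# State durations in milliseconds, copied from the job profile above.
# "Engine Start" is omitted because the profile shows no value for it.
durations_ms = {
    "Pending": 0,
    "Metadata Retrieval": 389_873,
    "Planning": 108,
    "Queued": 14,
    "Execution Planning": 60,
    "Starting": 10,
    "Running": 754,
}

# Sum of all listed states, to compare against the reported Total Query Time.
total = sum(durations_ms.values())

# Fraction of the attempt spent in Metadata Retrieval.
share = durations_ms["Metadata Retrieval"] / total

print(f"Sum of states: {total} ms")
print(f"Metadata Retrieval share: {share:.2%}")
print(f"Everything else: {total - durations_ms['Metadata Retrieval']} ms")
```

The listed states add up to the reported Total Query Time, so essentially the whole attempt is metadata retrieval, and actual execution takes under a second.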
I've read other threads where users have complained of extremely slow S3 metadata refresh rates. Has this issue been resolved? If not, it seems to also affect AWS Glue.
I have also attempted to prevent metadata refresh by running:
ALTER PDS REFRESH METADATA AVOID PROMOTION
This did not seem to help.
Please advise.
kubectl get pods
NAME                READY   STATUS    RESTARTS   AGE
dremio-executor-0   1/1     Running   0          11h
dremio-executor-1   1/1     Running   0          11h
dremio-executor-2   1/1     Running   0          11h
dremio-master-0     1/1     Running   0          11h
zk-0                1/1     Running   0          11h
zk-1                1/1     Running   0          11h
zk-2                1/1     Running   0          11h