Hello everyone.
I’m facing this error in Dremio Cloud.
CONNECTION
b3e4344e-fdd5-4bda-a00e-3609c60afa9f.zip (69.6 KB)
ERROR: Exceeded timeout (30000) while waiting after sending work fragments to remote nodes. Sent 1 and only heard response back from 0 nodes
Node(s) that did not respond 10.13.5.198
Do you know what’s happening?
@Filipe.Souza The node at the listed IP did not respond to another Dremio node. This could be caused by an RPC taking too long or by a full GC.
If you look at the time taken for “Starting” under “State Durations” on the Query tab of the raw profile, you will see it was 30 seconds:
Starting:
30,007ms
Usually this should be less than 100 ms. This tells me the executors are very busy: even assigning the work fragment takes 30 s.
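As a rough illustration of that check, here is a minimal Python sketch that parses a duration string as it appears in the profile (for example `30,007ms`) and compares it with the two numbers mentioned in this thread: the roughly 100 ms that “Starting” normally takes and the 30000 ms timeout from the error message. The function names and the three labels are my own, not anything Dremio provides.

```python
import re


def parse_duration_ms(text: str) -> int:
    """Turn a profile duration such as '30,007ms' into integer milliseconds."""
    match = re.fullmatch(r"\s*([\d,]+)\s*ms\s*", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    return int(match.group(1).replace(",", ""))


def classify_starting(duration_ms: int,
                      healthy_ms: int = 100,
                      timeout_ms: int = 30000) -> str:
    """Label a 'Starting' duration (labels are hypothetical, for illustration).

    healthy_ms is the rule of thumb from this thread; timeout_ms is the
    value shown in the error message.
    """
    if duration_ms >= timeout_ms:
        return "timeout"  # fragment start hit the RPC timeout
    if duration_ms > healthy_ms:
        return "slow"     # above the usual range, executors may be busy
    return "ok"


starting = parse_duration_ms("30,007ms")
print(starting, classify_starting(starting))
```

With the value from this profile the duration lands at or above the timeout, which matches the error that was reported.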
Can you please check whether there was a long GC pause, or whether the executor log shows any WARN messages from the time this query ran?
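If you do have the executor log as a text file, a small Python sketch like the one below can narrow it down to WARN lines around the time the query ran. It assumes each log line starts with a timestamp in the form `YYYY-MM-DD HH:MM:SS,mmm`; that layout is an assumption on my part, so adjust `TIMESTAMP_FORMAT` to whatever your log actually uses. The function name and the 60-second window are also just illustrative.

```python
from datetime import datetime, timedelta

# Assumed timestamp layout at the start of each log line; change as needed.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TIMESTAMP_LENGTH = 23  # length of e.g. '2024-01-01 12:00:05,123'


def warn_lines_near(lines, query_time, window_seconds=60):
    """Return WARN lines whose timestamp is within the window of query_time."""
    window = timedelta(seconds=window_seconds)
    hits = []
    for line in lines:
        if "WARN" not in line:
            continue
        try:
            stamp = datetime.strptime(line[:TIMESTAMP_LENGTH], TIMESTAMP_FORMAT)
        except ValueError:
            # Line does not start with the assumed timestamp; skip it.
            continue
        if abs(stamp - query_time) <= window:
            hits.append(line.rstrip("\n"))
    return hits
```

You would call it with the lines of the log file and the query start time taken from the profile, then look through the returned lines for GC or RPC related warnings.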
Hi @balaji.ramaswamy
In AWS we cannot access the logs of the machines created by Dremio Cloud; we can only check resource usage, as shown below.
Is the GC configuration set as a parameter on these machines?
If so, is there any way to make this change?
All machines are managed exclusively by Dremio itself.