We have a cluster of 6 workers and 1 coordinator.
Both engines run against the same 100 GB TPC-DS sample data on Hive + ORC. No buffer on Trino and no reflections on Dremio.
Dremio's response time is 172 s and Trino's is 51 s.
We found that Dremio performed much worse than Trino. Here is the q64 profile:
068dbcdb-2128-4701-b351-1f8639ed16e4.zip (2.8 MB)
One query even ran for more than 2 hours with no response:
a9d9a4c3-c74c-449d-a536-108d61b80f7d.zip (125.0 KB)
That's sad. Dremio's performance blogs claim it is 3-4x faster than Presto/Trino. We are planning a comparison test between Trino and Dremio. Is it worth our time? Can you post more examples of where Dremio is slower than Trino?
I reviewed both profiles.
Job ID# 1f8a63c1-1272-ec2f-bbff-1182bfb7ac00 suffers from CPU wait time and some IO wait time. For CPU wait, do you know whether other queries were running concurrently and, if so, how many? For IO wait time, note that Dremio has C3 only for Parquet, not for ORC:
https://docs.dremio.com/deployment/cloud-cache-config.html
Job ID# 1f894a1e-d6c1-2d30-ff56-0bf04c3d4b00 looks like executor slave6 had issues. Can you please send us the server.log from that node covering the time the query ran (“2021-04-14 09:29:04” UTC)? You can grep for “took too long” in the server.log (it might be under the Dremio log/archive folder) and see whether another big reflection job or a scan job was part of the “took too long” message. The line would also print the query ID causing it.
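If plain grep is awkward because some of the archived logs are compressed, something like the following Python sketch could do the same search. It is only an illustration: the function names `find_slow_lines` and `scan_log` are made up here, the log path in the usage note is a placeholder, and it assumes each log line carries its timestamp as text so a date prefix such as "2021-04-14 09:2" can be matched as a substring.

```python
import gzip
from pathlib import Path
from typing import Iterable, List


def find_slow_lines(lines: Iterable[str],
                    needle: str = "took too long",
                    time_filter: str = "") -> List[str]:
    """Return lines containing `needle`, optionally restricted to lines
    that also contain `time_filter` (e.g. a date/hour prefix)."""
    hits = []
    for line in lines:
        if needle in line and (not time_filter or time_filter in line):
            hits.append(line.rstrip("\n"))
    return hits


def scan_log(path: Path, time_filter: str = "") -> List[str]:
    """Scan one log file; .gz files are read through gzip (assumption:
    archived logs may be gzip-compressed)."""
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8", errors="replace") as fh:
        return find_slow_lines(fh, time_filter=time_filter)
```

For example, `scan_log(Path("/path/to/dremio/log/server.log"), "2021-04-14 09:2")` would list the matching lines from around the time the query ran. Adjust the path to wherever the logs live on slave6.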