Hello,
We are using the AWS Community Edition (details are below). For queries that transfer 100K-1M records,
we are seeing extremely slow performance under concurrency (about 1-5 concurrent queries). I have attached a sample. We have turned on reflections and also reduced the number of Parquet files (to around 150). Are there any tuning options available?
@ben,
I have sent the profile attachments to support@dremio.com; please acknowledge when you receive them. Let me know if you need anything else.
Regards
#1 This is probably caused by the client closing the connection: the query was cancelled in the 6th minute. The query is also single-threaded because the row-count estimate on the Parquet scan seems incorrect. Can you please refresh the metadata and retry the query?
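If it is easier, the metadata refresh can also be triggered with SQL. A minimal sketch, assuming the dataset has been promoted; `"s3source"."mytable"` is a hypothetical placeholder for your dataset path:

```sql
-- Refresh Dremio's metadata for the dataset so the planner picks up
-- accurate row-count estimates for the Parquet scan.
-- "s3source"."mytable" is a placeholder; substitute your own dataset path.
ALTER TABLE "s3source"."mytable" REFRESH METADATA;
```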
#2 The query completes in 7.8s.
Close to 4.85s is spent in wait time on the Parquet scan. All reads are remote (see the operator metrics below the scan). Since this is an S3 source, enabling the C3 cache should help.
The external sort takes 1.8s; are we spilling to SSDs?
Hint: set planner.slice_target to a value lower than 40,000 and you will see more parallel threads, which will reduce the wait time. Do not set it too low, though, as that can over-parallelize small scans and cause overhead.
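For example, a minimal sketch of lowering the key via SQL, assuming your user has admin rights and that your edition allows setting support keys this way (the value 20,000 is only illustrative; tune it for your workload):

```sql
-- Lower the slice target so the planner splits the Parquet scan
-- across more threads. 20000 is an illustrative value, not a recommendation;
-- setting it too low over-parallelizes small scans.
ALTER SYSTEM SET "planner.slice_target" = 20000;
```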
@balaji.ramaswamy
Thanks for your response. For #1, this happens when there are 4 to 5 queries of a similar type running concurrently; if I run the same query from the console on its own, performance is reasonable (like below).
For #2, is the C3 cache used by the AWS edition automatically, or do we have to configure it manually?
We will try both suggestions and provide a response.
In the Dremio UI, go to Admin > Support, enter "planner.slice_target" as a support key, and click Show. Change it to a lower value so that you get at least 2 threads, and check whether the scan is parallelized.
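To confirm the current value before and after the change, you can also query the system options table; a small sketch, assuming sys.options is exposed in your edition:

```sql
-- Show the current setting of planner.slice_target, including whether
-- it has been changed from the default.
SELECT *
FROM sys.options
WHERE name = 'planner.slice_target';
```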