The Convert To Rel phase is taking 3,359 ms; the cluster has 10 or more spaces with 3 to 4 VDSs per space. A similar query on a different cluster with fewer spaces and VDSs completes in about 300 ms.
How do we reduce the time taken by the Convert To Rel phase?
In general, what are the best practices for reducing time spent in the planning phase?
@desi
I tried the cleanup, but it did not reduce the planning time.
@balaji.ramaswamy Sorry to tag you directly, but is there a document where I can learn about Convert To Rel and what affects its performance?
The planning time for the same query differs across clusters. The Kubernetes pod sizing is the same, but one cluster has more spaces and VDSs.
The query only uses two VDSs.
Basically, planning reads the db metadata. Check dremio.conf on the master node to see where the db folder is stored, and ensure access to it is fast from the cluster that is planning slowly.
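One way to sanity-check "access to the db folder is fast" is to time small synchronous writes to that directory, which roughly mimics the small-record access pattern of the metadata store. A minimal sketch (the db path is an assumption — point it at the db folder from your dremio.conf; a temp dir is used here only so it runs anywhere):

```python
import os
import tempfile
import time

def time_small_writes(directory, n=100, size=4096):
    """Time n small fsync'd writes to `directory`; return mean latency in ms."""
    payload = os.urandom(size)
    path = os.path.join(directory, "db_latency_probe.tmp")
    start = time.perf_counter()
    for _ in range(n):
        with open(path, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # force the write to storage, like a db commit
    elapsed = time.perf_counter() - start
    os.remove(path)
    return elapsed / n * 1000.0

# Replace with the db folder path from dremio.conf on each master node.
db_dir = tempfile.mkdtemp()
print("mean fsync'd 4 KiB write latency: %.3f ms" % time_small_writes(db_dir))
```

Running this on both masters and comparing the numbers should show quickly whether the slow cluster's db folder sits on slower storage.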
If you have two clusters, each with its own master node, check that each has a unique db folder endpoint rather than a shared one. If it is shared, one of the coordinators will automatically become a standby, because the master locks the db folder database.
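Comparing the `paths.local` setting from the two masters' dremio.conf files makes the shared-vs-unique check mechanical. A rough sketch, assuming a naive regex is good enough for your conf files (the snippets below are hypothetical; read the real files from each node):

```python
import re

def local_db_path(conf_text):
    """Extract the paths.local setting from dremio.conf-style text (naive regex)."""
    m = re.search(r'local:\s*"([^"]+)"', conf_text)
    return m.group(1) if m else None

# Hypothetical conf snippets; substitute the contents of each master's dremio.conf.
conf_a = 'paths: { local: "/var/lib/dremio" }'
conf_b = 'paths: { local: "/mnt/shared/dremio" }'

a, b = local_db_path(conf_a), local_db_path(conf_b)
# If both masters resolve to the same db folder, one coordinator will
# block on the db lock and silently become a standby.
print("shared db folder!" if a == b else "unique db folders: OK")
```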
Either way, the problem is in the environment. As a test, stop the cluster whose planning is fast, copy its two config files (dremio.conf and dremio-env) to the master node of the slow cluster, start it, and test. Run only SELECT-type operations, do not create anything, and make sure no one else is connecting to your cluster.
@Rakesh_Malugu
Please find the profiles below.
The cluster with Dremio 4.0.5 takes less than a second to plan a simple select query.
Cluster with Dremio 4.6.1 takes 2.4 seconds for planning.
The dataset is a single file in S3. dremio-4.6.1_profile.zip (16.3 KB) dremio-4.0.5_profile.zip (11.6 KB)
@balaji.ramaswamy
On this cluster the validation time has dropped to 1 ms, but the same dataset on another cluster always shows a validation time greater than 3 seconds; I cannot share that cluster's profile.
We are creating a VDS on top of this PDS and querying the VDS after enabling a reflection.
Can you suggest how we can get a consistent planning time? Any information about the factors that affect planning time would be helpful.
The time you saw in validation was metadata. We usually cache metadata via the source's background refresh job, but if the table is new (queried for the first time) or its metadata has expired, we fetch it at query run time; that is why you did not see the high planning time on the second run. Regarding your second cluster, check whether that time is again metadata; from the second run onwards it should be fine. It is also a good idea to check the metadata settings on your source to make sure your expiry interval is greater than your refresh interval.
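The expiry-versus-refresh rule above can be expressed as a one-line check. A small sketch with hypothetical interval values — read the real ones from the source's metadata settings in the Dremio UI:

```python
from datetime import timedelta

def metadata_settings_ok(refresh, expire):
    """Cached metadata is served only while unexpired, so the expiry window
    must outlast the refresh period; otherwise queries regularly pay the
    metadata-fetch cost during planning."""
    return expire > refresh

# Hypothetical values for illustration.
print(metadata_settings_ok(timedelta(hours=1), timedelta(hours=3)))  # healthy
print(metadata_settings_ok(timedelta(hours=3), timedelta(hours=1)))  # metadata expires before it is refreshed
```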
We have filed an internal ticket to address the issue; meanwhile, please use the fully qualified name, e.g. "s3-cuddle-data"."dev-backup".cuddle_bauer."merged-uk"."merged.csv"