How to control the execution order for two customized plugins


We have two REST API data sources, e.g. A and B.

We build two plugins to connect them, e.g. PluginA and PluginB.

When we run a join (select A.*, B.* from A, B where A.join_key = B.join_key), it logically fetches all results from both A and B and then performs a hash join on join_key.

We now want to optimize this: execute PluginA first to get the join_key values, then send those values to PluginB.
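To make the idea concrete, here is a minimal sketch in Python of the two strategies. It is only an illustration, not Dremio plugin code: the in-memory lists `SOURCE_A` and `SOURCE_B`, the helpers `fetch_a`, `fetch_b` and `hash_join`, and the column names `a_val` and `b_val` are all hypothetical stand-ins for the REST calls our plugins make.

```python
# Hypothetical in-memory stand-ins for the two REST sources; a real plugin
# would issue HTTP requests instead.
SOURCE_A = [
    {"join_key": 1, "a_val": "x"},
    {"join_key": 2, "a_val": "y"},
]
SOURCE_B = [
    {"join_key": 1, "b_val": "p"},
    {"join_key": 2, "b_val": "q"},
    {"join_key": 3, "b_val": "r"},
    {"join_key": 4, "b_val": "s"},
]


def fetch_a():
    """Return every row from source A."""
    return list(SOURCE_A)


def fetch_b(keys=None):
    """Return rows from source B, optionally restricted to the given join keys."""
    if keys is None:
        return list(SOURCE_B)
    return [row for row in SOURCE_B if row["join_key"] in keys]


def hash_join(left, right):
    """Join two row lists on join_key using a hash table built on the left side."""
    table = {}
    for row in left:
        table.setdefault(row["join_key"], []).append(row)
    joined = []
    for right_row in right:
        for left_row in table.get(right_row["join_key"], []):
            joined.append({**left_row, **right_row})
    return joined


def join_fetch_everything():
    """Current behaviour: pull both sources in full, then hash join."""
    a_rows = fetch_a()
    b_rows = fetch_b()
    return hash_join(a_rows, b_rows), len(b_rows)


def join_with_key_pushdown():
    """Desired behaviour: read A first, then ask B only for A's join keys."""
    a_rows = fetch_a()
    keys = {row["join_key"] for row in a_rows}
    b_rows = fetch_b(keys)
    return hash_join(a_rows, b_rows), len(b_rows)
```

Both functions return the same joined rows, but the second one only pulls from B the rows whose join_key also appears in A. That only works if A is always read before B, which is why the execution order matters to us.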

However, we found that the execution order of PluginA and PluginB is not consistent (it seems to execute the plugin with more fields first). We want it to always run PluginA first. Could anyone advise how to control the execution order?

I tried changing the cost for each plugin so that Dremio would change the execution order, but it does not seem to work. In the screenshot, the costs are different but the plan shows the same execution order.

@popejune What are the two sources? When you built the two plugins, were they for any standard data source?

They are not standard ones. They are REST APIs of a sort. We built the plugins ourselves since Dremio does not support them.

@popejune Just for testing, try disabling planner.enable_join_optimization and see if the join order is maintained as written in the SQL. This is a global setting, so it may affect other queries.

Thanks @balaji.ramaswamy

It’s working. Let me do more testing.

@popejune This may affect reordering of joins, so testing would be great