Dynamic Partition Handling for Event-Date Queries in Dremio

Hi team,

I have a table called cust_rec_data that contains columns like customer_number, event_date, recharge_amount, recharge_channel, and part_idx (which represents the month extracted from event_date in MM format). The event_date column is indexed. Since the table holds a large amount of data for the last 12 months, we use part_idx for partitioning. When integrating Dremio with BI tools like Superset, the user wants to view the sum of recharge amounts by recharge_channel for a specific range between different event_date values. They don’t expect to manually select part_idx because they’ve already selected the date range based on event_date. Can we manipulate the query in Dremio to automatically add the relevant part_idx values based on the selected date range, without user input, so that the query runs much faster?

Thanks in Advance

Hello Dremio Team,

Any update on this query?

Thanks in advance.

@JoiceJacob What is the source? If RDBMS, Dremio will just push down query written. You can create a view and add column you need in the filter. I am not entirey following your question.

If the backend is Iceberg table format there is a concept called partition transform which can help

Kindly let me know the source and if you have a job profile and we can start there

Hi @balaji.ramaswamy,

It’s MySQL 8.x, and column in VDS can show an example? In my case event_date will be change based on user input from BI tool.

Thanks.

@AjayBabuM For RDBMS, Dremio does a push down of the query written, can you provide a job profile of the query you are running on Dremio so we can see what is getting pushed down?

Hi @balaji.ramaswamy

Please find the query profile. Here, when the BI tool executes the query towards Dremio,

expecting dremio should convert it to,

So that BI tool no need to select part_idx(dervied from event_date) again since already selected event_date. Actually here the table size was huge if we use part_idx it will execute more fast.

12926816-d159-47cc-99a3-61a46db068f3.zip (15.4 KB)

Thanks,
Ajay Babu Maguluri

@AjayBabuM Dremio is unaware of the indexes created in MySQL. It simply pushes down all the work to the server side (MySQL) and does not change the filter condition to add columns that have indexes. The optimization it can do is if you join two MySQL tables with a filter condition, it can push the entire join with the filter into MySQL