Issue with Incremental Reflection Not Filtering by Last_Modified When Using Dynamic Date in VDS

Hi team,

I am facing an issue with incremental reflections in Dremio. Here’s the scenario:

  1. Setup:
  • I enabled an incremental reflection refresh every 5 minutes on a table, using the Last_Modified column (a TIMESTAMP column) as the incremental update column.

  • I created a VDS that selects specific columns from the table and applies a dynamic CURRENT_DATE filter on the Event_Date column:

SELECT camp_id, Msisdn, sumit_time, devlivery_time
FROM mysql16.EDA.“dremio_incremental_test007”
WHERE Event_Date = TO_CHAR(CURRENT_DATE(), ‘YYYY-MM-DD’)

  1. Issue:
  • After every 5-minute reflection refresh, I noticed that only the Event_Date filter is applied in the JDBC query, but the Last_Modified column is not being used for filtering.

Example JDBC query:

SELECT camp_id, Msisdn, sumit_time, devlivery_time
FROM (SELECT dremio_incremental_test005.camp_id, dremio_incremental_test005.Msisdn, dremio_incremental_test005.Event_Date, dremio_incremental_test005.sumit_time, dremio_incremental_test005.devlivery_time
FROM EDA.dremio_incremental_test005) AS dremio_incremental_test005
WHERE Event_Date = ‘2025-01-02’

  1. Observation with Static Date:
  • When I replaced the dynamic date filter with a static date, as shown below, the reflection job filters on both the Last_Modified and Event_Date columns:

SELECT camp_id, Msisdn, sumit_time, devlivery_time
FROM mysql16.EDA.“dremio_incremental_test007”
WHERE Event_Date = ‘2025-01-02’

Example JDBC query:

SELECT camp_id, Msisdn, sumit_time, devlivery_time, Last_Modified AS $_dremio_$_update_$
FROM (SELECT dremio_incremental_test006.camp_id, dremio_incremental_test006.Msisdn, dremio_incremental_test006.Event_Date, dremio_incremental_test006.sumit_time, dremio_incremental_test006.devlivery_time, dremio_incremental_test006.Last_Modified
FROM EDA.dremio_incremental_test006) AS dremio_incremental_test006
WHERE Last_Modified > ‘2025-01-02 20:56:51.000’ AND Event_Date = ‘2025-01-02’

4.Further Tests:

  • I tried using different dynamic date formats (CURRENT_DATE() directly and through a function returning the current date).

  • I also tested by setting the ID column as the incremental update column instead of Last_Modified.

  • In all these tests, when using dynamic date formats, the Last_Modified column is not considered for filtering in the reflection query.

Why is the Last_Modified column not being used for filtering when a dynamic date filter is applied on the Event_Date column in the VDS? How can I resolve this behavior to ensure both columns are filtered correctly with dynamic dates?

Reflection Job Profile Query of VDS with Dyamic Date
143e7537-254e-408a-bb28-e914fe072528.zip (14.7 KB)
Reflection Job Profile Query of VDS with Static Date
cbad55e8-8ab2-4948-a67f-d577ce2d603c.zip (14.7 KB)

Thanks in advance for your insights!

Hi @JoiceJacob Thanks for including the profiles. The query profile contains this message in the planning tab:

Refresh Decision (2 ms)

Full Refresh. 
Cannot do incremental update because the reflection has dynamic function(s): CURRENT_DATE

We don’t support incremental refreshes when there are dynamic functions in the reflection’s SQL. There are situations this can produce wrong results when old data is supposed to be removed by the dynamic filter condition.

You can workaround this by building the reflection on a view without the dynamic filter and then building another access layer view on top of the first view that includes the dynamic filter.

Hopefully, you are partitioning by date because with 5 minute refreshes, you are accumulating many small files which would impact read performance.

FYI, if your incremental reflection was based on all Iceberg tables, Dremio will automatically perform table maintenance and basically eliminate the small files problem.

Hi @Benny_Chow

In my use case, I want to refresh only the latest reflection based on the current date. Since my table contains a large volume of data, I need to refresh the reflection for the current date only, rather than performing a full table refresh. How can this be achieved in Dremio?

Thanks in Advance

Hi @Benny_Chow

Could you please provide an update on the information mentioned above?

Thanks in Advance.

Hi Joice, did you try out my suggestion? The refresh should only process new data and not do a full refresh. Can you provide query profiles if its not working for you?