How to get “Identify new records using the field” to appear?

gypsysunny · October 14, 2019, 2:30am

Could you please give a link or some steps on how to use a modified date as the “Identify new records using the field” value? Currently I am not able to find “Identify new records using the field” in dataset settings -> reflection refresh. How do I make it appear? Do I have to add a new column to the dataset/PDS to hold the modified date?

Thanks!

balaji.ramaswamy · October 14, 2019, 5:25am

@gypsysunny

I think you are asking about the field to define for incremental refreshes.

If it is a file-system-based source, then there is only one choice: “Incremental update based on new files”. If it is a source like an RDBMS (Oracle/Postgres) or a NoSQL store like Mongo, then you can define a field, subject to the restrictions described here:

https://docs.dremio.com/acceleration/updating-reflections.html#full-and-incremental-refresh

Kindly let us know if you have any other questions

Thanks
@balaji.ramaswamy
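
To make the idea of identifying new records by a field more concrete, here is a minimal Python sketch of a watermark-style incremental refresh. It is only an illustration of the general technique, not Dremio's implementation: the class `IncrementalReflection`, its `refresh` method, and the sample rows are all hypothetical. It assumes the chosen field only ever increases (for example an auto-increment id or an append-only timestamp).

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class IncrementalReflection:
    """Toy stand-in for a reflection refreshed incrementally by one field."""
    key: str                                   # name of the increasing field
    rows: list = field(default_factory=list)   # materialized copy of the data
    watermark: Any = None                      # largest key value seen so far

    def refresh(self, source_rows):
        """Append only the source rows whose key exceeds the watermark."""
        new_rows = [
            dict(r) for r in source_rows
            if self.watermark is None or r[self.key] > self.watermark
        ]
        self.rows.extend(new_rows)
        if new_rows:
            self.watermark = max(r[self.key] for r in new_rows)
        return len(new_rows)


# Hypothetical source table with an increasing "id" column.
source = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]
refl = IncrementalReflection(key="id")
first = refl.refresh(source)       # first refresh copies both rows

source.append({"id": 3, "v": "c"})
source[0]["v"] = "changed"         # in-place update, key unchanged
second = refl.refresh(source)      # only the row with id 3 is new
```

In this sketch the in-place update to the row with id 1 is never picked up, because its key did not move past the watermark. That is why this kind of refresh suits append-only data.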

gypsysunny · October 14, 2019, 8:44am

Thanks a ton for your quick response. I have one more question: why can’t we change the reflection refresh policy in the dataset settings of a VDS? That is, no reflection refresh policy is offered in the dataset settings of a VDS, whereas there is one for a PDS. Why?

gypsysunny · October 14, 2019, 8:55am

I did a test and found that the VDS’s refresh policy follows the one set on the underlying PDS. Is that the rule you set up?

balaji.ramaswamy · October 14, 2019, 6:57pm

@gypsysunny

Currently we work on a bottom-up approach where we always refresh the PDS. To refresh a reflection on a VDS, simply refresh one of the PDSs referenced in the VDS definition using a REST API call.

Thanks
@balaji.ramaswamy
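
For readers who want to script that call, below is a rough Python sketch that builds the refresh request. It assumes the catalog refresh endpoint `POST /api/v3/catalog/{id}/refresh` and the `_dremio<token>` authorization header used by Dremio at the time of this thread; please check both against the docs for your version. The helper name `build_refresh_request`, the host, the token, and the dataset id are placeholders.

```python
import urllib.parse
import urllib.request


def build_refresh_request(base_url: str, token: str, dataset_id: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that refreshes a PDS by id."""
    # Encode the id so that any reserved characters survive in the URL path.
    encoded_id = urllib.parse.quote(dataset_id, safe="")
    url = f"{base_url.rstrip('/')}/api/v3/catalog/{encoded_id}/refresh"
    return urllib.request.Request(
        url,
        method="POST",
        headers={
            # Assumed header format: "_dremio" followed by the login token.
            "Authorization": f"_dremio{token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; substitute your own coordinator URL, token, and PDS id.
req = build_refresh_request(
    "http://localhost:9047",
    "TOKEN",
    "1a2b3c4d-0000-0000-0000-000000000000",
)
# To actually send it: urllib.request.urlopen(req)
```

The id must be that of the PDS itself, not of the VDS or of the reflection.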

gypsysunny · October 15, 2019, 1:23am

OK, I see. What we discussed is about raw reflections. Do aggregation reflections support an incremental refresh policy?
As I tested, if I choose “Incremental update based on new files” in Refresh Method and then click “Refresh Now” in Refresh Policy, the aggregation reflection can be used to accelerate the SQL: SELECT sum("Count") FROM testfile. Does that mean Dremio supports incremental refresh for aggregation reflections?

Thanks a million.

kprifogle · October 17, 2019, 6:21pm

Is there anything that would cause this not to work? I am adding new batches of files to an s3 source and the dependent reflection does not update no matter what I do. In fact the only way to get a different query result is if I don’t have any reflections defined at all.

kprifogle · October 17, 2019, 6:23pm

Edit: Manually hitting Reflect reflections makes it refresh, but that's the only thing that works. Hitting the dataset refresh API endpoint doesn’t work either.

doron · October 17, 2019, 6:39pm

Reflections will refresh every X hours, which can be configured per source under Reflection Refresh. As for doing the equivalent of the Refresh Now button through the REST API, are you sure you are using the correct id?

kprifogle · October 17, 2019, 8:08pm

Ok so I figured out the issue (repeating what you said mostly here @doron)

I was under the impression that incremental refresh updated the reflection whenever new files arrived, but it still refreshes hourly; incremental refresh just makes each refresh more efficient.
The refresh API is for the physical dataset (PDS). Its id was hard to get: I had to URL-encode the path and then tweak that some, but eventually found it. Once I had the id, the refresh worked, although it was very slow, up to 20-30 seconds for a simple dataset that consisted only of ids, under 10,000 of them.
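
In case it saves someone else the trial and error, here is a small Python sketch of the URL-encoding step. It assumes the catalog lookup endpoint `GET /api/v3/catalog/by-path/{path}`, with each path component encoded separately and joined by `/`, and that the JSON response carries the dataset `id`. The function name `catalog_by_path_url` and the source, folder, and file names are made up for illustration.

```python
import urllib.parse


def catalog_by_path_url(base_url: str, path_parts: list) -> str:
    """Build the by-path lookup URL, percent-encoding each path component."""
    # safe="" also encodes "/" inside a component, so a name containing a
    # slash is not mistaken for two separate path levels.
    encoded = "/".join(urllib.parse.quote(part, safe="") for part in path_parts)
    return f"{base_url.rstrip('/')}/api/v3/catalog/by-path/{encoded}"


# Hypothetical S3 source with a space in one folder name.
url = catalog_by_path_url(
    "http://localhost:9047",
    ["s3-source", "my bucket", "events.batch"],
)
print(url)
```

A GET on that URL (with the same authorization header as any other API call) should return the catalog entry, and its `id` is what the refresh call expects.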

One issue that I am noticing may be a big one. I was under the impression that when a reflection is stale, it simply affects the speed with which the query is run and not its accuracy, but I’m finding in testing that the query actually produces stale results when reflections are not up to date. Does this match your understanding?
