I have a reflection on a MySQL view (essentially a table to Dremio) called dw_data
The raw reflection is made up many display fields, however the ones in the query are id, data, signature and data_type_id. Additionally, this reflection is sorted by the field id and the “id” refresh policy is set to incremental update on the “id” field. Finally, the reflection is partitioned by the data_type_id field as we will use this in subsequent reflections and in our where clause for most queries against this reflection.
OK… That is the reflection. Now here comes the weirdness.
The reflection is utilized and the query is accelerated for all of the below.
SELECT MIN(id), MAX(id) FROM “production-data-analysis”.analysis.dw_data
This query looks to be good to go. It returns a 1 for the MIN and 171878598 for the MAX. The id’s are accurate based on the source data id values.
SELECT MIN(id), MAX(id) FROM “production-data-analysis”.analysis.dw_data WHERE data_type_id=6 OR data_type_id=7 OR data_type_id=8 OR data_type_id=32
Adding data_type_id (the partitioned field in the reflection), returns us inaccurate MIN and MAX values for the “id” field. 1527187755000 MIN and 1528386742000 MAX
SELECT * FROM “production-data-analysis”.analysis.dw_data WHERE data_type_id=6 OR data_type_id=7 OR data_type_id=8 OR data_type_id=32 LIMIT 1000
Selecting a subset of 1000 records provides us with accurate “id” values when using the data_type_id in the WHERE clause
SELECT * FROM “production-data-analysis”.analysis.dw_data WHERE data_type_id=6 OR data_type_id=7 OR data_type_id=8 OR data_type_id=32 ORDER BY id DESC LIMIT 1000
Adding the ORDER BY id DESC produces values that are incorrect an inline with query number 2 from above where the values are somewhere between the MIN and MAX from query 2 and repetitive. The id field is a unique column, auto incremented column on the data source in MySQL.
Why is query 2 and 4 returning unexpected results for a unique auto incremented “id” field that is sorted (by id) or grouped (to produce a min/max)?
Is the partition on the reflection the issue?
Is the sort on the reflection the issue?
Thank You for any insight into this. Let me know if you need query profiles to investigate the usage of the reflection.