VDS sending Full Table Scan to Oracle

Good morning.
My organization is in the process of moving our reporting from our production database into Dremio.

We created a VDS using the following query:

SELECT   
    to_char( T.CREATION_DATETIME, 'YYYY-MM-DD' ) "Activity Date",
    t.LOCATION_ID "Location",
          COUNT( * ) "Moves"
    FROM   prod.MYSCHEMA.MY_TABLE T                               
   WHERE   T.SERVICE_CODE = 'INGAT' AND
                   T.COMPLETE_DATETIME IS NOT NULL AND
                   T.CANCELLATION_DATETIME IS NULL AND
                   T.CREATION_DATETIME >=  TO_DATE ( '05-01-2021', 'MM-DD-YYYY') AND
                   T.CREATION_DATETIME < TO_DATE ( '06-01-2021', 'MM-DD-YYYY') 
                   and t.LOCATION_ID in ('AB')
GROUP BY   
          t.LOCATION_ID,
          to_char( T.CREATION_DATETIME, 'YYYY-MM-DD' )
order by 
    t.LOCATION_ID,
    to_char( T.CREATION_DATETIME, 'YYYY-MM-DD' )

We can run this query directly against Oracle in less than a second, but when we try to pull it into dremio it takes up to 12 minutes. This is the query Oracle receives from dremio for the above:

SELECT *
FROM (SELECT "Activity Date", CAST('AB' AS VARCHAR(4000)) "Location", COUNT(*) "Moves"
FROM (SELECT CAST('AB' AS VARCHAR(4000)) "Location", TO_CHAR("CREATION_DATETIME", 'YYYY-MM-DD') "Activity Date"
FROM 
(SELECT "MY_TABLE"."SERVICE_CODE", "MY_TABLE"."LOCATION_ID", "MY_TABLE"."CREATION_DATETIME", "MY_TABLE"."COMPLETE_DATETIME", "MY_TABLE"."CANCELLATION_DATETIME"
FROM "MYSCHEMA"."MY_TABLE") "MY_TABLE"
WHERE "SERVICE_CODE" = 'INGAT' AND "CANCELLATION_DATETIME" IS NULL AND
 "CREATION_DATETIME" >= TIMESTAMP '2021-05-01 00:00:00.000' AND
  "CREATION_DATETIME" < TIMESTAMP '2021-06-01 00:00:00.000' AND
   "LOCATION_ID" = 'AB' AND "COMPLETE_DATETIME" IS NOT NULL) "MY_TABLE"
GROUP BY "Location", "Activity Date") "MY_TABLE"
ORDER BY "Location", "Activity Date"

“MY_TABLE” is a large table with millions of records.The unfiltered subquery against MY_TABLE is causing a full table scan and tanking performance.

What is the solution to this? I thought dremio was supposed to push filters down to the source.

@hmarchman-jones The reason this is happening is because Dremio maps Oracle date to a timestamp as Oracle date can also store time portion. This lead to the above problem so in our upcoming release we allow users to define the mapping, so in your case if you map it to a DATE it should use the index, even now it is pushing down but the index is not getting used resulting in a FTS

Thank you for the response.
Are you referring to the table index in Oracle, or indexes in Dremio?
Is there a way to work around this by casting the Activity Date field to another datatype?

@hmarchman-jones Talking about the index on the date column inside Oracle, the upcoming version would allow you to map Oracle DATE to a Dremio DATE (instead of a TIMESTAMP) and hence pushdown into Oracle as a DATE and the index should be used

1 Like