Oracle Making Full table Scans in some cases

sgoldsmith · August 18, 2023, 8:56pm

I am attempting to use a list of Indexed ID’s stored in Dremio as a way to quickly lookup entries in a table with 200+ million rows. IOn oracle, the ID’s are stored as the Data Type “NUMBER”

When I run a query like

select * from oracle.big_table where big_table_id = 123456789

the query runs and I get the results back in less than a second.

Even if I run something like

select * from oracle.big_table where big_table_id in (List of 100 ID’s)

I get good performance.

But if I have a table in Dremio with the same 100 ID’s (dremio.table_ids) and try to run

select * from oracle.big_table where big_table_id in (select ID from dremio.table_ids)

It takes forever to run. Is there any way to make this work?

balaji.ramaswamy · August 22, 2023, 3:48am

@sgoldsmith Wondering if the push down to Oracle is causing a FTS, any chance you can send us the 3 job profiles?

sgoldsmith · August 28, 2023, 8:11pm

e0f4d974-a14b-43f1-aebc-eeae94fe16a1.zip (12.3 KB)
1efd4d45-f535-489b-9ca9-b0c06b9b8b8d.zip (10.4 KB)
9579f45a-3d5c-4d95-afed-fb6ff7f7323a.zip (10.2 KB)

I recreated the scenario, here are the different profiles.

balaji.ramaswamy · August 29, 2023, 4:00am

@sgoldsmith I see 3 profiles

Job ID# 1b1302e9-cc8c-32e5-6b73-31b0e6adca00 I see the filter gets pushed down and query completes in 0.24s
Job ID# 1b130214-c8d3-ae86-1ba2-db98095ed200 It is a IN FILTER rewritten as OR and filter does get push down and query completes in 0.26s
Job ID# 1b1301f3-8d23-d05d-f8c7-9d727c04b700 is a join between an Oracle Table and a PARQUET dataset, there is no push down possible in this and query completes in 1.3s

What is your specific concern?

Do you have the profile when it runs forever? Cancel after it runs for say 30m and send us the profile

sgoldsmith · August 29, 2023, 5:19pm

Here is the profile of the long run.
e6eacacd-183b-4533-95be-4692229cd409.zip (18.8 KB)

balaji.ramaswamy · August 31, 2023, 12:21pm

@sgoldsmith Looks like there is no FILTER in the query and it is selecting all records from Oracle, since the query is cancelled, not sure how many records in the Oracle table. How many records are there in HOLDINGDBO.POSITION_DETAIL? Since the inner query is from PARQUET, the result of the inner query cannot be pushed down into Oracle as they are 2 different sources in Oracle

sgoldsmith · August 31, 2023, 6:59pm

I cancelled the query after about an hour. The HOLDINGDBO.POSITION_DETAIL has around 4.3 billion records.

balaji.ramaswamy · September 5, 2023, 4:37pm

@sgoldsmith Currently pushing down results from another source’s results into a different source. Can we archive the table and then create a reflection on the live table?

Topic		Replies	Views
VDS sending Full Table Scan to Oracle	3	918	July 2, 2021
VDS using native SQL	11	3053	November 27, 2018
Dremio not pushing JDBC Predicates when UDF used?	0	1050	November 25, 2019
Is there a way to push down manually?	4	1541	August 3, 2020
Dremio slower than SQL Developer?	7	1779	May 15, 2020

Oracle Making Full table Scans in some cases

Related topics