Dremio not using partition filter

Hello we have a test case to reproduce, If I create a table with a partition hierachy for example partition by date and ID (bucket count) if we run a query using field ID Dremio not using partitionFilters and scan all iceberg files, but if I use identity partition in ID column Dremio use partitionFilters

Query Profile: (date and bucket partition)
4cfe2cca-dd79-4966-acd0-45c7030d732f.zip (12,1 KB)

Query Profile: (date and identity partition)
60b4f517-dbb4-4969-88d2-b9bd9a50c71d.zip (12,5 KB)

Test case:

DDL:

create table lake.staging.dacopan1
(
    ID                        varchar,
    MODIFIED_DATE             timestamp
) PARTITION BY (day(MODIFIED_DATE), bucket(30,ID))

test data:

insert into lake.staging.dacopan1  (ID, MODIFIED_DATE) VALUES (1, TO_TIMESTAMP('2024-05-16 02:45:00', 'YYYY-MM-DD HH24:MI:SS')),
(2, TO_TIMESTAMP('2024-05-17 02:45:00', 'YYYY-MM-DD HH24:MI:SS')),
(3, TO_TIMESTAMP('2024-05-20 02:45:00', 'YYYY-MM-DD HH24:MI:SS'));

insert into lake.staging.dacopan1  (ID, MODIFIED_DATE) VALUES (111, TO_TIMESTAMP('2024-05-16 02:45:00', 'YYYY-MM-DD HH24:MI:SS')),
(222, TO_TIMESTAMP('2024-05-17 02:45:00', 'YYYY-MM-DD HH24:MI:SS')),
(666, TO_TIMESTAMP('2024-05-20 02:45:00', 'YYYY-MM-DD HH24:MI:SS'));

Query

SELECT * from lake.staging.dacopan1 where ID= 222

please @Benny_Chow I would greatly appreciate your help

@dacopan interesting, I was just testing out bucket partition transform and did not hit this scenario, let me reproduce and get back to you

1 Like

@dacopan Try recreating ID column as INT/BIGINT, if you have to absolutely keep ID as STRING then can you ad singles quotes to your filter?

using varchar in insert and in select works well and partitionFilters=[partition_range_filter:=($0, ‘222’);] are present in plan,
thanks @balaji.ramaswamy but even I think this should be automatically managed cast by Dremio.

Thanks I’ll continue report other bugs in other posts

1 Like