Very slow query for glue datasource when "is not null" is used

Dremio 20.1

I have a Glue catalog datasource, which we'll call glueci, containing a table with ~3000 partitions. The partition being queried has 1 row in it (assume column1='abc' is the partition lookup). When 'is not null' is used and a specific column is selected, the query runs for a very long time (1-2 min); otherwise the response is <=1s. I tried to reproduce this with an S3 datasource and the same Parquet file, but that seems to work fine. Dropping and re-adding 'glueci' did not help.

-- * returns fine
SELECT *
FROM glueci.catalog1.table1
WHERE column1='abc' and column2=945737 and column3 is not null

-- not isnull() returns fine
SELECT column3 
FROM glueci.catalog1.table1
WHERE column1='abc' and column2=945737 and not isnull(column3)

-- no 3rd condition returns fine
SELECT column3 
FROM glueci.catalog1.table1
WHERE column1='abc' and column2=945737

-- this takes a long time
SELECT column3 
FROM glueci.catalog1.table1
WHERE column1='abc' and column2=945737 and column3 is not null

-- this takes a long time
SELECT column3 
FROM glueci.catalog1.table1
WHERE column1='abc' and column2=945737 and not(column3 is null)


@benw Would it be possible to provide the profiles for the two queries below?

SELECT *
FROM glueci.catalog1.table1
WHERE column1='abc' and column2=945737 and column3 is not null

SELECT column3
FROM glueci.catalog1.table1
WHERE column1='abc' and column2=945737 and column3 is not null
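
As a possible complement to the profiles, the plans for the fast and slow variants could also be compared side by side. This is only a sketch: it assumes Dremio's `EXPLAIN PLAN FOR` statement works against this Glue source, and that any difference between the two queries (for example in partition pruning or filter pushdown) would actually show up in the plan. Neither assumption has been confirmed in this thread.

```sql
-- Run each statement separately.

-- Plan for the fast variant (SELECT *)
EXPLAIN PLAN FOR
SELECT *
FROM glueci.catalog1.table1
WHERE column1='abc' and column2=945737 and column3 is not null

-- Plan for the slow variant (single column selected)
EXPLAIN PLAN FOR
SELECT column3
FROM glueci.catalog1.table1
WHERE column1='abc' and column2=945737 and column3 is not null
```

If the two plans differ in how the filter on column1 is applied to the scan, that would be a useful detail to include alongside the profiles.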