[Hive][Delta Lake] Cannot query with the = operator in Dremio 24.1.0

I connected a Hive 3 metastore source that stores Delta Lake tables, and I have a table users with id and name columns:

id name
1 Alex
2 Su

But when I execute the query SELECT * FROM datalakelocal.ducvo.users WHERE id = 1;

The result should be

id name
1 Alex

rather than

id name
1 Alex
1 Su

Is this a bug in Dremio, and how can I work around it?

This is the result of the query SELECT * FROM datalakelocal.ducvo.users;

@ducvo

Can you please send me the profiles of both queries, with and without the WHERE clause?

Thanks
Bali

@balaji.ramaswamy
The profile: SELECT * FROM datalakelocal.ducvo.users WHERE id = 1;
f7935e29-931a-4c4f-81c8-d54e604f2c3d.zip (13.3 KB)

The profile: SELECT * FROM datalakelocal.ducvo.users;
55db70bd-0f54-4140-836e-d00b0e66bafc.zip (13.0 KB)

@ducvo Can you please disable planner.query_plan_cache_enabled and retry both queries and send profiles?
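For anyone following along, here is a minimal sketch of how that might be done from the SQL editor. It assumes the support key can be changed with ALTER SYSTEM on your deployment and that you have admin rights; it may also be possible to toggle it from the support settings page in the UI instead.

```sql
-- Assumption: planner.query_plan_cache_enabled is settable as a system option.
-- Turn the query plan cache off before re-running the test queries.
ALTER SYSTEM SET "planner.query_plan_cache_enabled" = false;

-- Re-run both queries from the original post and download their profiles.
SELECT * FROM datalakelocal.ducvo.users WHERE id = 1;
SELECT * FROM datalakelocal.ducvo.users;

-- Turn the cache back on once the profiles have been collected.
ALTER SYSTEM SET "planner.query_plan_cache_enabled" = true;
```

Re-enabling the key at the end keeps the test from affecting other workloads on the same cluster.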

@balaji.ramaswamy
After I turn off the planner.query_plan_cache_enabled
The profile: SELECT * FROM datalakelocal.ducvo.users WHERE id = 1;
f1b148ed-2db1-4d2a-93f2-3ff89781f370.zip (13.6 KB)

The profile: SELECT * FROM datalakelocal.ducvo.users;
5f216385-ce63-423d-abc7-01e771cd6f1e.zip (13.3 KB)

Have you verified and checked the issue? @balaji.ramaswamy

@ducvo Does not look like a plan cache issue. Can you please try one more thing?

Run the below command and then run the query to see if that helps

ALTER PDS datalakelocal.ducvo.users FORGET METADATA

@balaji.ramaswamy
Despite running the ALTER PDS datalakelocal.ducvo.users FORGET METADATA command, it does not seem to be functioning as expected.

The profile: SELECT * FROM datalakelocal.ducvo.users WHERE id = 1;
aabfbecf-b3da-4263-8598-efc7a837ad2a.zip (13.6 KB)

Does it work with WHERE id IN (1)?

@dch No. It does not.


Profile:
a93bca0f-7f4f-4f29-b6dc-2844f8a3a680.zip (13.6 KB)

Thank you for reporting.
Something went sideways during our initial Hive+Delta rollout. Our engineers will take a look.

@dch I appreciate your help. Is this bug going to be fixed in the next release?

Yes, it absolutely will be!
In fact, it has already been addressed and is just waiting for the next release train.

@dch Can I know the date or month of the next OSS release to plan our project?

Hi @dch
I saw the fix for this bug in the 4.1.3 Release Notes (Enterprise Edition Only, July 2023):

  • The use of the WHERE clause in queries against Delta Lake tables in Hive sources is now supported.

@dch Can I know the date or month of the next OSS release to plan our project?

We prefer not to comment on future release dates. However, between you and me… if it’s not out in September, then something went wrong.

Thank you very much!

I’m facing the same issue with Delta tables stored on S3. I have been using Dremio with Delta for a couple of months, but this bug started all of a sudden. I am running version 24.1.0. One thing I’ve noticed is that the bug is gone when the table is formatted as Parquet.