[Hive][Delta Lake] Cannot query with the = operator in Dremio 24.1.0

@balaji.ramaswamy
The profile: SELECT * FROM datalakelocal.ducvo.users WHERE id = 1;
f7935e29-931a-4c4f-81c8-d54e604f2c3d.zip (13.3 KB)

The profile: SELECT * FROM datalakelocal.ducvo.users;
55db70bd-0f54-4140-836e-d00b0e66bafc.zip (13.0 KB)

@ducvo Can you please disable planner.query_plan_cache_enabled and retry both queries and send profiles?

@balaji.ramaswamy
After I turned off planner.query_plan_cache_enabled:
The profile: SELECT * FROM datalakelocal.ducvo.users WHERE id = 1;
f1b148ed-2db1-4d2a-93f2-3ff89781f370.zip (13.6 KB)

The profile: SELECT * FROM datalakelocal.ducvo.users;
5f216385-ce63-423d-abc7-01e771cd6f1e.zip (13.3 KB)

Have you verified and checked the issue? @balaji.ramaswamy

@ducvo Does not look like a plan cache issue. Can you please try one more thing?

Run the below command and then run the query to see if that helps

ALTER PDS datalakelocal.ducvo.users FORGET METADATA

@balaji.ramaswamy
Even after running the ALTER PDS datalakelocal.ducvo.users FORGET METADATA command, the query still does not work as expected.

The profile: SELECT * FROM datalakelocal.ducvo.users WHERE id = 1;
aabfbecf-b3da-4263-8598-efc7a837ad2a.zip (13.6 KB)

Does it work with WHERE id IN (1)?

@dch No. It does not.


Profile:
a93bca0f-7f4f-4f29-b6dc-2844f8a3a680.zip (13.6 KB)

Thank you for reporting.
Something went sideways during our initial Hive+Delta rollout. Our engineers will take a look.


@dch I appreciate your help. Is this bug going to be fixed in the next release?

Yes, it absolutely will be!
In fact, it is already addressed and is just waiting for the next release train.

@dch Can I know the date or month of the next OSS release to plan our project?

Hi @dch
I saw the fix for this bug in the 4.1.3 Release Notes (Enterprise Edition Only, July 2023):

  • The use of the WHERE clause in queries against Delta Lake tables in Hive sources is now supported.

@dch Can I know the date or month of the next OSS release to plan our project?

We prefer not to comment on future release dates. However, between you and me… if it’s not out in September, then something went wrong.


Thank you very much!

I’m facing the same issue with Delta tables stored on S3. I have been using Dremio with Delta for a couple of months, but this bug started all of a sudden. I am running version 24.1.0. One thing I’ve noticed is that the bug is gone when the table is formatted as Parquet.

I think this issue is fixed. I see that 24.2.0 has been released, but I cannot find Docker images for this version.

I created a new issue: Not found build or code for version 24.2.0

@ducvo, I was able to fix this issue over the weekend. Maybe this can help you: the issue was caused by pyarrow version 13.0, which was a dependency for deltalake to write data to S3. Our Dockerfile didn’t pin a specific version and hadn’t been updated in a while, so when we changed another library, the pyarrow package was updated as well. With pyarrow 8.0 everything went back to normal. I don’t know if it’s the same for you, because I wasn’t using Hive, just S3. Hope you solve your problem as well. Best regards
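
For anyone hitting the pyarrow variant of this, here is a minimal sketch of a startup check that flags an unexpected pyarrow major version before any Delta data gets written. The function names and the default of major version 8 are assumptions for illustration, based only on the versions mentioned above; the real fix is still to pin the version in the Dockerfile or requirements file.

```python
from importlib import metadata


def major_version(version: str) -> int:
    """Return the leading major component of a version string such as '8.0.0'."""
    return int(version.split(".")[0])


def is_pinned(installed: str, expected_major: int) -> bool:
    """True when the installed version has the expected major component."""
    return major_version(installed) == expected_major


def check_package(name: str = "pyarrow", expected_major: int = 8) -> str:
    """Describe whether the installed package matches the expected major version.

    The defaults (pyarrow, major version 8) are illustrative assumptions.
    """
    try:
        installed = metadata.version(name)
    except metadata.PackageNotFoundError:
        return f"{name} is not installed"
    status = "ok" if is_pinned(installed, expected_major) else "unexpected version"
    return f"{name} {installed}: {status}"


if __name__ == "__main__":
    # Run once at container startup, before the writer job begins.
    print(check_package())
```

Running this in the image that writes the Delta tables should make a silent upgrade (for example from 8.x to 13.x) visible immediately, rather than showing up later as a query problem.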
