This is a fairly critical bug.
Using Dremio 26.0.0 OSS, when I create an Iceberg table on top of AWS Glue:
- Dremio doesn’t have a timestamp with time zone type, so the date column is just a Timestamp
- The Parquet files created record a standard timestamp column:
optional int64 field_id=6 created_at (Timestamp(isAdjustedToUTC=false, timeUnit=microseconds, is_from_converted_type=false, force_set_converted_type=false));
- AWS Glue records a timestamp column (timestamptz is not supported)
- However… the Iceberg metadata JSON file uses a timestamptz column.
When trying to read the table using pyiceberg, it just looks at the JSON metadata and fails due to the type incompatibility between Timestamp and Timestamptz.
You need to make sure timestamp columns in Iceberg tables are properly recorded using the timestamp type.
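For anyone who wants to check their own tables, here is a minimal sketch (not from the original posts) that lists the timestamp column types recorded in an Iceberg table metadata JSON file. The sample document and the `timestamp_columns` helper are hypothetical; only the field id and column name come from the Parquet schema above. It looks at top-level primitive fields only and ignores nested structs, lists, and maps.

```python
import json

# Iceberg primitive type names that represent timestamps.
TIMESTAMP_TYPES = {"timestamp", "timestamptz", "timestamp_ns", "timestamptz_ns"}


def timestamp_columns(metadata: dict) -> dict:
    """Return {column name: Iceberg type} for top-level timestamp columns."""
    if "schemas" in metadata:
        # Format v2 keeps a list of schemas plus the id of the current one.
        current_id = metadata.get("current-schema-id", 0)
        schema = next(
            s for s in metadata["schemas"] if s.get("schema-id", 0) == current_id
        )
    else:
        # Format v1 keeps a single "schema" object.
        schema = metadata["schema"]
    return {
        field["name"]: field["type"]
        for field in schema["fields"]
        # Nested types are JSON objects, so only string types are primitives.
        if isinstance(field["type"], str) and field["type"] in TIMESTAMP_TYPES
    }


# Hypothetical, heavily trimmed metadata document for illustration only.
sample = json.loads("""
{
  "format-version": 2,
  "current-schema-id": 0,
  "schemas": [
    {
      "type": "struct",
      "schema-id": 0,
      "fields": [
        {"id": 1, "name": "id", "required": false, "type": "long"},
        {"id": 6, "name": "created_at", "required": false, "type": "timestamptz"}
      ]
    }
  ]
}
""")

print(timestamp_columns(sample))
```

With a real table you would load the current `*.metadata.json` file with `json.load` instead of the inline sample.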
@sheinbergon Are you able to send “CREATE TABLE DDL” from Glue?
@balaji.ramaswamy I’m not sure what you mean. I can provide the actual JSON metadata file of the table, showing the type is indeed timestamptz even though Dremio only uses timestamp. Would that help?
@sheinbergon I would like to reproduce the issue locally, so I thought I would get the table DDL and create the table locally.
Thank you for bringing this up and for sharing the details, @sheinbergon. You’re right - in Dremio OSS 26.0.0, Iceberg tables created with timestamp columns may be recorded in the metadata as timestamptz, even though Dremio only supports timestamp. As you noted, this can cause compatibility issues with external readers.
@Icaro_Seara Thank you for acknowledging this issue. Do you plan to fix this behavior? It’s a serious bug.
Also, is this bug present in Dremio Cloud?
Hi @sheinbergon, we actually released a fix for this issue in Dremio Software Enterprise Edition, Community Edition and OSS 26.0.5 on September 10.
That’s super, 10x for fixing this!
Hey folks, a pyiceberg user reported running into this issue in apache/iceberg-python issue #2663, "Cannot promote timestamp to timestamptz" error when loading Dremio created table. I see that it’s been resolved in version 26.0.5, and the user also confirmed this in the thread.
I dug into the fix on the Dremio side and found this change: dremio/dremio-oss@799ccbd (Release 26.0.5).
It looks like Dremio can potentially write the TIMESTAMPMILLI data type in Parquet with adjustToUtc=true. I don’t think this is in accordance with the Iceberg spec.
- Timestamp without timezone should always write Parquet with adjustToUtc=false
- Timestamp with timezone should always write Parquet with adjustToUtc=true
Wanted to follow up and double-check with you all. Let me know if I’m missing something here.
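The rule in the two bullets above can be written down as a small consistency check. This is only an illustrative sketch: the `is_consistent` helper and the lookup table are hypothetical names, and the check covers just the two types discussed in this thread.

```python
# Expected Parquet isAdjustedToUTC flag for each Iceberg timestamp type,
# following the two bullets above.
EXPECTED_ADJUSTED_TO_UTC = {
    "timestamp": False,    # timestamp without time zone
    "timestamptz": True,   # timestamp with time zone
}


def is_consistent(iceberg_type: str, parquet_is_adjusted_to_utc: bool) -> bool:
    """True if the Parquet flag matches what the Iceberg type calls for."""
    return EXPECTED_ADJUSTED_TO_UTC[iceberg_type] == parquet_is_adjusted_to_utc


# The combination from the first post: metadata says timestamptz,
# Parquet says isAdjustedToUTC=false.
print(is_consistent("timestamptz", False))  # False

# A plain timestamp column written with isAdjustedToUTC=false.
print(is_consistent("timestamp", False))  # True
```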
Hey Kevin, thanks for raising this and for looking into the change.
The core clarification is that Dremio’s TIMESTAMP type is not a “timestamp without timezone.” As documented, Dremio assumes all timestamps are UTC-normalized. Because of that, when we write an Iceberg table, Dremio maps TIMESTAMP to Iceberg timestamptz, which per the Iceberg spec is correctly written to Parquet with isAdjustedToUTC=true.
So the Parquet output you’re seeing is expected and consistent with the spec. The confusion is understandable if it looked like Dremio was using a no-timezone timestamp type, but that isn’t the case.
Hope this helps clarify!
Reference: Time Zone Support | Dremio Documentation
awesome, that makes sense. Thanks for double checking