Hive TimeStamp Data TimeZone problem

I have data from Hive with the timestamp type. When I query it, Dremio returns the values in the UTC timezone. How can I change this to another timezone?

Anyone on this subject? It seems really important…

I’ve the same issue using Looker connected to Dremio.

@balaji.ramaswamy @kelly any ideas?

Hi @zhicheng_Hou

Dremio always normalizes to UTC

Thanks
@balaji.ramaswamy

Hi Balaji,
What do you mean by “Dremio always normalizes to UTC”? How does Dremio determine the time zone of the table data in order to derive UTC?
For example, a query on Hive returns: 2019-02-02 02:33:59.650013
The same query on the same table in Dremio returns: 2019-02-02 07:33:59.000
Comparing the two values, the hour “2” has become “7” in Dremio. I would like to know how Dremio concluded that it has to add 5 hours to derive UTC.
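
To make the arithmetic concrete, here is a minimal Python sketch. It assumes, purely for illustration, that the Hive value was written as local time in a UTC-5 zone; nothing in this thread confirms that, and `assumed_source_zone` is a hypothetical name. Under that assumption, normalizing to UTC reproduces the 5-hour shift seen above.

```python
from datetime import datetime, timedelta, timezone

# Value as returned by Hive in the example above (no zone attached).
hive_value = datetime.fromisoformat("2019-02-02 02:33:59.650013")

# ASSUMPTION: the data was written as local time in a UTC-5 zone.
# The thread does not state the actual zone.
assumed_source_zone = timezone(timedelta(hours=-5))

# Attach the assumed zone, then normalize to UTC.
as_utc = hive_value.replace(tzinfo=assumed_source_zone).astimezone(timezone.utc)

# Print without the fractional part, to compare the hour field only.
print(as_utc.strftime("%Y-%m-%d %H:%M:%S"))
```

If the printed hour matches the Dremio result, the shift is at least consistent with a UTC-5 source zone, though this does not show where the conversion actually happens.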

Thanks
Dinesh

Hi @Dinesh,

How are you writing the timestamp data?
Are the files behind the Hive table Parquet?
If so, it’s possible that you are skipping the conversion to UTC when reading them in Hive. See the section on hive.parquet.timestamp.skip.conversion in the Hive docs: https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties

Hi Ben,
Thanks for the quick response.

  1. Our Hive tables are in ORC format.
  2. I went through the link you shared and have some follow-up questions:
    i) If hive.parquet.timestamp.skip.conversion is set to true, then when we query the Hive tables we see the raw data from the tables, with no implicit conversion applied.
    ii) When we query the Hive tables via Dremio, does the system automatically send a UTC conversion function to Hive whenever a timestamp column is part of the Dremio query?
    If so, on what basis does the system derive the UTC time? Is it based on the Dremio server's time zone?
  3. Along with the UTC conversion, the microseconds are truncated in Dremio: they become zeros. What is the reason for this?
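
To show what point 3 means in practice, here is a small Python sketch comparing two kinds of truncation on the Hive value quoted earlier. The helper names (`truncate_to_millis`, `truncate_to_seconds`, `format_millis`) are hypothetical and only illustrate the arithmetic; they say nothing about how Dremio handles the value internally.

```python
from datetime import datetime

# Hive value from the earlier example, with microsecond precision.
value = datetime.fromisoformat("2019-02-02 02:33:59.650013")

def truncate_to_millis(ts):
    # Keep the first three fractional digits, drop the rest.
    return ts.replace(microsecond=(ts.microsecond // 1000) * 1000)

def truncate_to_seconds(ts):
    # Drop the whole fractional part.
    return ts.replace(microsecond=0)

def format_millis(ts):
    # Render with exactly three fractional digits.
    return ts.strftime("%Y-%m-%d %H:%M:%S.") + f"{ts.microsecond // 1000:03d}"

print(format_millis(truncate_to_millis(value)))   # fraction kept as .650
print(format_millis(truncate_to_seconds(value)))  # fraction becomes .000
```

The Dremio result shown earlier ends in .000, which matches the second form: the entire fractional part is lost, not only the digits beyond milliseconds.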

Thanks & Regards
A.Dinesh

Dremio confirmed it's a bug and said they will fix it in an upcoming release.