Shifted times in Dremio vs Parquet in Spark

Hi,

I’m having a strange issue with times in Dremio.
I have a Parquet file produced by a Spark job. I always use UTC times. However, when I look at the content of the file via Dremio, the times are shifted by 2 hours.

See screenshot:
For example, in the first record, start_period and end_period are properly set to 2020-08-30 00:00:00.000 and 9999-12-31 23:59:59.999, but Dremio shows 2020-08-29 22:00:00.000 and 9999-12-31 22:59:59.999.

Any idea?

Thanks!

@geertschneider, Dremio should be “timezone-less”, in that it should read the timestamps as written in the file, without any timezone adjustment. What tool are you using to view the files, and what are you comparing Dremio against?

Hi Ben,

Well, the screenshot with the CLI is just plain spark-shell.
The only thing I do in Spark is 2 instructions:

  1. read parquet
  2. show the data

Indeed, I was expecting that Dremio would not do anything to the Parquet timestamps and would show the data as is.
If I remember correctly, Parquet stores timestamps as epoch values, and they are always in UTC.
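
To make the "epoch value in UTC" idea concrete, here is a minimal sketch in plain Python (no Spark or Parquet library involved). It only illustrates the arithmetic: the microsecond unit and the variable names are assumptions for the example, and the actual physical encoding depends on how the file was written.

```python
from datetime import datetime, timedelta, timezone

# Reference point for epoch values: 1970-01-01 00:00:00 UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# start_period of the first record, taken as a UTC instant.
start_period = datetime(2020, 8, 30, 0, 0, 0, tzinfo=timezone.utc)

# Encode: whole microseconds elapsed since the epoch (unit is an assumption).
epoch_micros = (start_period - EPOCH) // timedelta(microseconds=1)

# Decode: add the stored offset back onto the epoch.
decoded = EPOCH + timedelta(microseconds=epoch_micros)

print(epoch_micros)
print(decoded.isoformat())
```

The stored number carries no timezone of its own, so any shift has to come from how a wall-clock time is converted to that number on write, or back to text on read.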

Geert

Found it…
Apparently there is an option that applies when writing data through Spark:

spark.sql.session.timeZone

which I did not set explicitly to UTC, so Spark fell back to the local time zone of the JVM.
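
For anyone hitting the same thing, the effect can be sketched in plain Python without Spark. This is only a simulation under assumptions: the session time zone is taken to be a fixed UTC+2 offset (the thread only reports a 2-hour shift), and the helper names `to_stored_instant` and `render` are made up for the illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the Spark session ran at UTC+2; the thread does not name the zone.
SESSION_TZ = timezone(timedelta(hours=2))
FMT = "%Y-%m-%d %H:%M:%S.%f"


def to_stored_instant(wall_clock: str, session_tz: timezone) -> datetime:
    """Interpret a wall-clock string in the session zone and return the UTC instant."""
    naive = datetime.strptime(wall_clock, FMT)
    return naive.replace(tzinfo=session_tz).astimezone(timezone.utc)


def render(instant: datetime, tz: timezone) -> str:
    """Format an instant in the given zone with millisecond precision."""
    return instant.astimezone(tz).strftime(FMT)[:-3]


# Written with a UTC+2 session: midnight local time is stored as 22:00 UTC.
stored = to_stored_instant("2020-08-30 00:00:00.000", SESSION_TZ)
spark_view = render(stored, SESSION_TZ)     # rendered back in the session zone
dremio_view = render(stored, timezone.utc)  # rendered with no adjustment

# Written with the session zone set to UTC: both views agree.
stored_utc = to_stored_instant("2020-08-30 00:00:00.000", timezone.utc)
fixed_view = render(stored_utc, timezone.utc)

print(spark_view, "|", dremio_view, "|", fixed_view)
```

The second print column matches what Dremio showed for start_period. The end_period value shifting by only one hour would fit a zone that observes daylight saving time, which the fixed offset here does not model.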

Thanks for the input!

Geert