A Parquet file contains data that represents DATE values.
Server 4.9.3-202010281843560195-edc49b6d
Driver 4.9.1-202010230218060541-2e764ed0
When a query projects data using the Dremio JDBC driver, the data is being shifted.
It appears that Dremio is effectively treating the value as a timestamp and then shifting it based on the session time zone. As a result, values change which day they represent.
select rnum, cdt, cast ( cdt as varchar(10)) from dbcert.tdt
As shown, casting to varchar at the server returns the value as stored in the original data, whereas the value returned by the driver is shifted.
Meanwhile, the same test using the web console shows the expected values.
Data is in Parquet. As mentioned, see your Browser UX result for the same query.
The issue is specific to how the JDBC driver is managing the result set.
Meaning, applying the TZ of the machine where the driver is running is causing the data to shift. Depending on the value and the TZ, the day, month or year of the value could change.
I’m assuming that the Dremio server and the client are on separate machines in different time zones?
You can specify the time zone in the connection string by adding a TIME_ZONE parameter and setting it to the desired time zone, such as PST or UTC. This should let you resolve the problem.
This arises from the fact that JDBC has two overloads of getDate (and likewise getTime/getTimestamp): one that takes a Calendar and one that does not. When you specify a Calendar, the data is interpreted according to that Calendar's time zone. If you don’t specify one, the consensus seems to be that the driver uses the currently set JVM time zone. This is likely what’s causing the offset for you.
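To make the effect concrete, here is a minimal sketch in plain Java that does not talk to Dremio at all. It assumes (this is not confirmed by the thread) that the driver hands back the DATE as an instant at midnight UTC, and then shows how that same instant renders in two different time zones. The class name DateShiftDemo and its helper methods are hypothetical, made up for this illustration.

```java
import java.sql.Date;
import java.text.SimpleDateFormat;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.TimeZone;

public class DateShiftDemo {

    // Epoch milliseconds for midnight UTC on the given calendar date.
    // Assumption: this mimics how a DATE might arrive from the server.
    public static long utcMidnightMillis(LocalDate date) {
        return date.atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
    }

    // Render the instant as yyyy-MM-dd in the given time zone, which is
    // what happens implicitly when a java.sql.Date is interpreted in the
    // JVM default zone (or in the zone of a supplied Calendar).
    public static String render(long epochMillis, String zoneId) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd");
        fmt.setTimeZone(TimeZone.getTimeZone(zoneId));
        return fmt.format(new Date(epochMillis));
    }

    public static void main(String[] args) {
        long millis = utcMidnightMillis(LocalDate.of(2000, 1, 1));
        // Interpreted in UTC, the stored day is preserved.
        System.out.println("UTC:                 " + render(millis, "UTC"));
        // Interpreted west of UTC, the same instant falls on the previous
        // day, which here also changes the month and the year.
        System.out.println("America/Los_Angeles: " + render(millis, "America/Los_Angeles"));
    }
}
```

In application code, the equivalent experiment would be to call the standard JDBC overload ResultSet.getDate(int, Calendar) with a Calendar set to UTC and compare it with the no-Calendar call. Whether the Dremio driver honors the Calendar argument is not something this thread confirms.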