After upgrading from Apache Drill 1.9 to Apache Drill 1.11, the Parquet files I'm creating with a datetime field are no longer being read correctly by Dremio.
July 31st, 2017 is somehow being read as -11347-05-17 00:00:00.000. I tried using Apache Spark to read the same files, and it correctly returns July 31st, 2017.
Here's the Drill SQL snippet that takes a CSV file and creates a Parquet file.
create table dfs.tmp.`as_of=20170331`
as
select
  cast(invariant_id as char(9)) as invariant_id,
  to_date('20170731','yyyyMMdd') as as_of_dt,
  cast('DALYRATIO' as varchar) as analytic_type,
  cast(dalyratio as float) as analytic_value_float
from table(dfs.`/proj/r20170731.csv.gz`
  (type => 'text', fieldDelimiter => ',', extractHeader => true))
where dalyratio not in ('','#DIV/0')
This is what I see in Dremio when I open the Parquet files:
ESG000004  -11347-05-17 00:00:00.000  ANTICOMP_SCORE        5
ESG000004  -11347-05-17 00:00:00.000  BUSETH_FD_MGMT_SCORE  2.5
ESG000004  -11347-05-17 00:00:00.000  CLIM_CHG_THM_SCORE    6.3
ESG000004  -11347-05-17 00:00:00.000  COMP_ENV_SCORE        6.3
ESG000004  -11347-05-17 00:00:00.000  COMP_ESG_ADJ_SCORE    4.2
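One observation that may help narrow this down: the bad value does not look random. Below is a minimal Python sketch that reproduces it exactly, assuming the reader subtracts 2 × 2440588 days from the stored date. 2440588 is the Julian day number of the Unix epoch, and I believe twice that value is the shift involved in Drill's old Parquet date-corruption issue (DRILL-4203). The constant names and helper functions are my own, for illustration only, and are not taken from Dremio or Drill code.

```python
# Assumption: the reader shifts the stored date by twice the Julian day
# number of 1970-01-01. This is a guess based on the symptom, not confirmed.
JULIAN_DAY_OF_UNIX_EPOCH = 2440588
SUSPECTED_SHIFT_DAYS = 2 * JULIAN_DAY_OF_UNIX_EPOCH  # 4881176


def days_from_civil(y, m, d):
    """Days since 1970-01-01 for a proleptic Gregorian date."""
    if m <= 2:
        y -= 1
    era = y // 400                      # 400-year cycle index (floor division)
    yoe = y - era * 400                 # year within the cycle, 0..399
    mp = m - 3 if m > 2 else m + 9      # month index with March = 0
    doy = (153 * mp + 2) // 5 + d - 1   # day of year, counted from March 1
    doe = yoe * 365 + yoe // 4 - yoe // 100 + doy
    return era * 146097 + doe - 719468


def civil_from_days(z):
    """Inverse of days_from_civil; also works for years before year 1."""
    z += 719468
    era = z // 146097
    doe = z - era * 146097
    yoe = (doe - doe // 1460 + doe // 36524 - doe // 146096) // 365
    doy = doe - (365 * yoe + yoe // 4 - yoe // 100)
    mp = (5 * doy + 2) // 153
    d = doy - (153 * mp + 2) // 5 + 1
    m = mp + 3 if mp < 10 else mp - 9
    y = yoe + era * 400 + (1 if m <= 2 else 0)
    return y, m, d


def format_date(ymd):
    y, m, d = ymd
    return f"{y}-{m:02d}-{d:02d}"


stored = days_from_civil(2017, 7, 31)   # 17378 days since the epoch
shifted = stored - SUSPECTED_SHIFT_DAYS

print(format_date(civil_from_days(stored)))   # 2017-07-31
print(format_date(civil_from_days(shifted)))  # -11347-05-17
```

If that assumption holds, the file itself may be fine (Spark agrees with the expected date), and the reader may be applying a date correction to a file that does not need it. I have not confirmed this.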