HDFS connector interprets VARCHAR as VARBINARY

Hi,
When connecting to an HDFS source, I noticed that the VARCHAR columns are interpreted as VARBINARY.
This does not happen for columns with a date format, as you can see in the picture.
Another strange thing: the column dir0 has the correct type. It is the partition column of the Impala table.
This does not happen with the Hive connector.
All of this causes problems when casting these VARBINARY columns to VARCHAR.
Using API requests, I extracted the information about the PDS:
{
  "entityType" : "dataset",
  "id" : "8c096312-4df7-4f16-abf7-54c0b870038b",
  "type" : "PHYSICAL_DATASET",
  "path" : [ "HDFS", "test", "sample_20200110" ],
  "createdAt" : "2020-02-04T10:41:07.193Z",
  "tag" : "31",
  "format" : {
    "type" : "Parquet",
    "ctime" : 0,
    "isFolder" : true,
    "location" : "/test/sample_20200110",
    "autoCorrectCorruptDates" : true
  },
  "approximateStatisticsAllowed" : false,
  "fields" : [ {
    "name" : "xxx",
    "type" : {
      "name" : "VARBINARY"
    }
  }, {
    "name" : "scheda",
    "type" : {
      "name" : "VARBINARY"
    }
  }, {
    "name" : "ts",
    "type" : {
      "name" : "TIMESTAMP"
    }
  }, {
    "name" : "transaction_id",
    "type" : {
      "name" : "TIMESTAMP"
    }
  }, {
    "name" : "dir0",
    "type" : {
      "name" : "VARCHAR"
    }
  } ]
}
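
For reference, the affected columns can be picked out of a response like the one above with a short script. This is only a minimal sketch: it assumes the JSON has already been fetched and is available as a string, and the helper name fields_by_type is hypothetical, not part of any Dremio client.

```python
import json


def fields_by_type(dataset_json: str, type_name: str) -> list[str]:
    """Return the names of fields whose reported type equals type_name."""
    dataset = json.loads(dataset_json)
    return [
        field["name"]
        for field in dataset.get("fields", [])
        if field.get("type", {}).get("name") == type_name
    ]


# Trimmed copy of the response shown above (only the keys the sketch needs).
response = """
{
  "entityType": "dataset",
  "type": "PHYSICAL_DATASET",
  "fields": [
    {"name": "xxx", "type": {"name": "VARBINARY"}},
    {"name": "scheda", "type": {"name": "VARBINARY"}},
    {"name": "ts", "type": {"name": "TIMESTAMP"}},
    {"name": "transaction_id", "type": {"name": "TIMESTAMP"}},
    {"name": "dir0", "type": {"name": "VARCHAR"}}
  ]
}
"""

print(fields_by_type(response, "VARBINARY"))  # ['xxx', 'scheda']
```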

thanks!

@LucGth, what is the file type you are accessing? Parquet or ORC or something else?

Hi Ben, they are Parquet files.

You should have parquet-tools installed with your Hadoop distribution.

Try running: $ parquet-tools meta <path/to/sample_20200110> and attach the output here.

Hi Ben, thanks for your reply.

creator: impala version 2.12.0-cdh5.16.1 (build 4a3775ef6781301af81b23bca45a9faeca5e761d)

file schema: schema

xx: OPTIONAL BINARY R:0 D:1
yy: OPTIONAL BINARY R:0 D:1
zz: OPTIONAL INT96 R:0 D:1
kk: OPTIONAL INT96 R:0 D:1

row group 1: RC:1 TS:3039

xx: BINARY SNAPPY DO:4 FPO:33 SZ:57/53/0.93 VC:1 ENC:PLAIN_DICTIONARY,RLE
yy: BINARY SNAPPY DO:134 FPO:2978 SZ:2872/5957/2.07 VC:1 ENC:PLAIN_DICTIONARY,RLE
zz: INT96 SNAPPY DO:3059 FPO:3086 SZ:55/51/0.93 VC:1 ENC:PLAIN_DICTIONARY,RLE
kk: INT96 SNAPPY DO:3193 FPO:3220 SZ:55/51/0.93 VC:1 ENC:PLAIN_DICTIONARY,RLE

Is this normal, in your opinion?

thanks
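
For anyone checking their own files: in the schema above, the BINARY columns carry no logical-type annotation, which would be consistent with them being read as raw bytes (VARBINARY) rather than strings. The sketch below scans the schema section of parquet-tools meta output and lists such columns. It assumes that an annotated string column is printed with an extra O:UTF8 token (for example "xx: OPTIONAL BINARY O:UTF8 R:0 D:1"); the function name and the regular expression are illustrative, not part of parquet-tools.

```python
import re

# Expected shape of a schema line (assumed):
#   name: REPETITION PHYSICAL_TYPE [O:ANNOTATION] R:n D:n
SCHEMA_LINE = re.compile(
    r"^(?P<name>\S+):\s+(?P<rep>OPTIONAL|REQUIRED|REPEATED)\s+"
    r"(?P<ptype>[A-Z0-9_]+)(?:\s+O:(?P<annotation>\S+))?\s+R:\d+\s+D:\d+\s*$"
)


def unannotated_binary_columns(meta_output: str) -> list[str]:
    """Return BINARY columns that have no O:<annotation> token."""
    columns = []
    for line in meta_output.splitlines():
        match = SCHEMA_LINE.match(line.strip())
        if match is None:
            continue  # not a schema line (header, row-group detail, blank)
        if match.group("ptype") == "BINARY" and match.group("annotation") is None:
            columns.append(match.group("name"))
    return columns


# Schema section copied from the output posted above.
meta = """\
file schema: schema
xx: OPTIONAL BINARY R:0 D:1
yy: OPTIONAL BINARY R:0 D:1
zz: OPTIONAL INT96 R:0 D:1
kk: OPTIONAL INT96 R:0 D:1
"""

print(unannotated_binary_columns(meta))  # ['xx', 'yy']
```

Columns reported here are the ones that would need an explicit binary-to-string conversion on the query side, or a rewrite of the files with the string annotation present.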