We are not able to read Parquet data stored in S3-compatible storage.
We are able to read any other file format (CSV, Excel, …) stored on OVH OpenStack Swift storage.
We are using OVH OpenStack Swift with S3 compatibility mode enabled.
Here are the steps to reproduce the issue:
- In the Dremio UI, add a new Data Lake source, then choose Amazon S3.
- Provide the AWS access key and secret, then switch to the Advanced tab, enable Compatibility mode, add an extra property under Connection Properties (name: "fs.s3a.endpoint", value: "s3.bhs.cloud.ovh.net"), and save.
- Try setting the format for, and then reading, a simple CSV file located under the S3-compatible Data Lake source. Result: should work.
- Try setting the format for your Parquet file. Result: should work.
- After saving the Parquet format, try reading the formatted Parquet file. This is where the following error is printed:
I downloaded the Dremio source code from Git and tried to get more logs. The following log file provides more details.
dremio-error-read-parquet-s3.zip (3.6 KB)
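
For anyone who wants to test the same endpoint outside Dremio, here is a minimal Python sketch. It is only a diagnostic idea, not a confirmed cause: unlike CSV, a Parquet reader needs the footer at the end of the file, so the sketch requests the last 8 bytes with a ranged GET and checks them for the Parquet magic bytes. The `https://` prefix on the endpoint, the use of `boto3`, and the helper names `footer_length_from_tail` and `fetch_tail` are my assumptions and do not come from Dremio.

```python
import struct

PARQUET_MAGIC = b"PAR1"  # a plain (unencrypted) Parquet file ends with these 4 bytes


def footer_length_from_tail(tail: bytes) -> int:
    """Validate the last 8 bytes of a Parquet file and return the footer length.

    Layout of the tail: 4-byte little-endian footer length, then b"PAR1".
    """
    if len(tail) != 8 or tail[4:] != PARQUET_MAGIC:
        raise ValueError("tail does not end with the Parquet magic bytes")
    return struct.unpack("<I", tail[:4])[0]


def fetch_tail(bucket, key, access_key, secret_key,
               endpoint="https://s3.bhs.cloud.ovh.net"):
    """Fetch the last 8 bytes of an object with a ranged GET.

    Assumption: boto3 is installed and the endpoint accepts an https:// URL.
    """
    import boto3  # imported lazily so the helper above works without it

    s3 = boto3.client(
        "s3",
        endpoint_url=endpoint,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )
    # "bytes=-8" asks for the final 8 bytes of the object (a suffix range).
    resp = s3.get_object(Bucket=bucket, Key=key, Range="bytes=-8")
    return resp["Body"].read()
```

Calling `footer_length_from_tail(fetch_tail("my-bucket", "path/to/file.parquet", ACCESS_KEY, SECRET_KEY))` with your own bucket, key, and credentials (placeholders here) should return the footer size if the storage honours ranged reads. If it raises or returns more than 8 bytes, that would point at the storage side rather than Dremio.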