Dremio to read encrypted parquet files


We have a use case where we need to read encrypted Parquet files from HDFS, and we have a custom jar file that decrypts them.

We have put this custom jar file in the 3rdparty folder and would like to know how to get Dremio to use it.

Ankit Saxena

How do you deal with this in your existing environment? Do you have a customized ParquetReader, or a wrapped FSInputStream that decrypts the data first?

Hi yufeldman,

In our existing environment, we run the command shown below to connect to the source HDFS:

hdfs dfs -libjars /path/to/custom.jar -ls foundry://@/datasets/

We are unable to use the same approach in Dremio.

It is great that you can provide a custom jar to run “hadoop dfs -ls”. It looks like you have your own FileSystem implementation (probably a wrapper on top of HDFS) with the scheme “foundry”, so I assume that your FileSystem, and subsequently its FSInput/OutputStreams, do the decoding/encoding. Is this a correct assumption?

@yufeldman, yes, you got that right.

How can Dremio access this foundry file system?

Ankit S

Unfortunately, Dremio currently cannot use FS-based sources other than HDFS, MaprFS, LocalFS and S3.
What you could try, if you really use only your FS for all your needs, is to modify core-site.xml (put it under the Dremio conf directory) and specify your (foundry) FileSystem implementation for the “hdfs” scheme. It is a hacky way of doing it, and it limits you to using only the “foundry” FS. Along with it, you of course need your jar(s) in the 3rdparty directory.
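
As a rough sketch of that core-site.xml override, assuming the standard Hadoop convention that the property fs.<scheme>.impl maps a URI scheme to a FileSystem class: the class name com.example.foundry.FoundryFileSystem below is hypothetical and only stands in for whatever FileSystem implementation your jar actually provides. Your FileSystem may need further properties of its own that are not shown here.

```xml
<?xml version="1.0"?>
<!-- core-site.xml, placed under the Dremio conf directory -->
<configuration>
  <property>
    <!-- Override the FileSystem class used for hdfs:// URIs -->
    <name>fs.hdfs.impl</name>
    <!-- Hypothetical class name; replace with the FileSystem class from your custom jar -->
    <value>com.example.foundry.FoundryFileSystem</value>
  </property>
</configuration>
```

With this in place, anything addressed through the “hdfs” scheme would go through the custom FileSystem, which is why this approach rules out using plain HDFS alongside it.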