Hi Dremio Team,
We have a serious query performance issue with reflections stored on HDFS. In most cases, having no reflection at all turns out to be faster than using a reflection whose reflection store is on HDFS. Reflections on local storage are fine.
We ran several tests to compare the results.
Here is a sample comparison using the same dataset and the same query,
running on a cluster with 8 executor nodes. The PD (physical dataset) is a set of Parquet files sitting on HDFS.
with reflection on local storage: 2min 26sec profile: a320daf6-fe5b-403a-b316-8fd46bde2a5f.zip (916.6 KB)
with no reflections: 4min 5sec profile: 4639afa2-9a29-4de4-b213-b88c826e9580.zip (876.5 KB)
with reflection on HDFS storage: 24min 41sec profile: c00a84ad-03e9-4503-8c04-f4721388cd71.zip (919.8 KB)
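To put these numbers in perspective, here is a quick Python sketch that converts the three timings above to seconds and compares each one to the no-reflection run. The helper name `to_seconds` and the dictionary labels are just for illustration; only the timings come from our tests.

```python
def to_seconds(minutes, seconds):
    """Convert a 'Xmin Ysec' timing to total seconds."""
    return minutes * 60 + seconds

# Timings copied from the three runs listed above.
timings = {
    "reflection on local storage": to_seconds(2, 26),
    "no reflections": to_seconds(4, 5),
    "reflection on HDFS storage": to_seconds(24, 41),
}

# Use the run without reflections as the baseline.
baseline = timings["no reflections"]

for label, secs in timings.items():
    print(f"{label}: {secs} s, {secs / baseline:.2f}x the no-reflection time")
```

In other words, the HDFS-backed reflection takes roughly 6 times as long as running with no reflection, and roughly 10 times as long as the reflection on local storage.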
Thanks