Our data in Hive is organized with subdirectories under the table directory, but Dremio seems to assume the file structure is flat, i.e. that all the data lives directly under the table directory without any subfolders. When we query the non-flat tables, we get:
Caused by: java.io.IOException: Not a file: adl://home/datasets/subdir/subdir2/subdir3/subdir4/2018-03-14
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:329) ~[hadoop-mapreduce-client-core-2.8.0.jar:na]
at com.dremio.exec.store.hive.DatasetBuilder$HiveSplitsGenerator.runInner(DatasetBuilder.java:377) ~[dremio-hive-plugin-2.0.1-201804132205050000-10b1de0.jar:2.0.1-201804132205050000-10b1de0]
at com.dremio.exec.store.hive.DatasetBuilder.buildSplits(DatasetBuilder.java:444) ~[dremio-hive-plugin-2.0.1-201804132205050000-10b1de0.jar:2.0.1-201804132205050000-10b1de0]
at com.dremio.exec.store.hive.DatasetBuilder.buildIfNecessary(DatasetBuilder.java:285) ~[dremio-hive-plugin-2.0.1-201804132205050000-10b1de0.jar:2.0.1-201804132205050000-10b1de0]
at com.dremio.exec.store.hive.DatasetBuilder.getDataset(DatasetBuilder.java:204) ~[dremio-hive-plugin-2.0.1-201804132205050000-10b1de0.jar:2.0.1-201804132205050000-10b1de0]
at com.dremio.exec.catalog.DatasetManager.getTableFromPlugin(DatasetManager.java:297) [dremio-sabot-kernel-2.0.1-201804132205050000-10b1de0.jar:2.0.1-201804132205050000-10b1de0]
... 32 common frames omitted
Here /home/datasets/subdir/subdir2/subdir3/subdir4/ is the path for the table, and 2018-03-14 is a subfolder under it that holds the actual data.
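To make the layout concrete, here is a rough Python sketch of what I think is going on. It is only a simplified model of a flat versus a recursive input listing, not Hadoop's or Dremio's actual code. The function names, the `recursive` flag, and the `part-00000` file name are hypothetical and exist only for this illustration.

```python
import os


def list_input_files(table_dir, recursive=False):
    """Simplified model of listing a table's input files.

    With recursive=False, any subdirectory under the table directory is
    rejected, which mirrors the "Not a file" error in the trace above.
    With recursive=True, subdirectories are descended into instead.
    """
    files = []
    for name in sorted(os.listdir(table_dir)):
        path = os.path.join(table_dir, name)
        if os.path.isdir(path):
            if not recursive:
                raise IOError("Not a file: " + path)
            files.extend(list_input_files(path, recursive=True))
        else:
            files.append(path)
    return files


def make_example_table(root):
    """Build a table directory whose data sits one level down, in a date folder."""
    table_dir = os.path.join(root, "subdir4")
    date_dir = os.path.join(table_dir, "2018-03-14")
    os.makedirs(date_dir)
    data_file = os.path.join(date_dir, "part-00000")  # hypothetical file name
    with open(data_file, "w") as f:
        f.write("row\n")
    return table_dir, data_file
```

Calling `list_input_files(table_dir)` on a table built by `make_example_table` raises the "Not a file" error for the 2018-03-14 folder, while `list_input_files(table_dir, recursive=True)` returns the data file inside it. The second behavior is what we would need from Dremio for these tables.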