In my case, I want to save the current result dataset into Hive so I can run further queries on it without having to query it out again.
Thanks
Hive is only a query shell; the actual data is stored in HDFS. Here is what you can do:
#1 Add an HDFS source pointing to the namenode
#2 Enable CTAS on that source by checking “Enable exports into the source (CTAS and DROP)”, see attached screenshot below
#3 Run the query of your choice as a "CREATE TABLE hdfs.source_name.full_path AS" statement. This generates Parquet files with the output of the query in the HDFS path defined
#4 Create a Hive external table (via Beeline or the Hive shell) pointing to the parent folder containing these Parquet files
#5 Use Dremio to query the files back via the Hive source
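To make steps #3 and #4 more concrete, here is a rough sketch in Python that only builds the two SQL statements as strings: the CTAS to run in Dremio and the external table DDL to run in Beeline or the Hive shell. Every name in it (`hdfs_src`, the folder path, `my_result`, the columns, the namenode host and port, the `hive_src.db.orders` table) is a made-up placeholder, and the two helper functions are hypothetical, not part of Dremio or Hive. Adjust the column list to match what your query actually returns.

```python
# Sketch only: all source names, paths, tables and columns are hypothetical.

def dremio_ctas(source, path_parts, select_sql):
    """Build a Dremio CTAS statement targeting a path under an HDFS source."""
    # Dremio quotes identifiers with double quotes.
    target = ".".join('"{}"'.format(part) for part in [source, *path_parts])
    return "CREATE TABLE {} AS\n{}".format(target, select_sql)


def hive_external_table(table, columns, location):
    """Build Hive DDL for an external table over the Parquet folder from the CTAS."""
    cols = ",\n  ".join("`{}` {}".format(name, hive_type)
                        for name, hive_type in columns.items())
    return (
        "CREATE EXTERNAL TABLE {} (\n  {}\n)\n".format(table, cols)
        + "STORED AS PARQUET\n"
        + "LOCATION '{}'".format(location)
    )


# Step #3: run this statement in Dremio (assumed source and folder names).
ctas = dremio_ctas(
    "hdfs_src",
    ["user", "results", "my_result"],
    "SELECT id, name FROM hive_src.db.orders",
)

# Step #4: run this statement in Beeline or the Hive shell (assumed namenode URI).
ddl = hive_external_table(
    "my_result",
    {"id": "BIGINT", "name": "STRING"},
    "hdfs://namenode:8020/user/results/my_result",
)

print(ctas)
print(ddl)
```

The LOCATION should be the folder that Dremio wrote the Parquet files into, not an individual file.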
Thanks
@balaji.ramaswamy
Thanks a lot. Very helpful