Suppose we take one dataset from MySQL and one dataset from Hive and join them. Can we save the resulting dataset back to one of the sources, such as MySQL, Hive, or HDFS?
You simply have to save the SQL as a VDS (virtual dataset) under a space. Whenever you query the VDS, the join takes place automatically.
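As a rough sketch, saving the join as a VDS could look like the following. The source names (mysql_src, hive_src), the space name (myspace), and all table and column names are made up for illustration, and the exact DDL keyword (CREATE VDS versus CREATE VIEW) may depend on your Dremio version.

```sql
-- Hypothetical names throughout: mysql_src and hive_src are sources
-- already added to Dremio, and myspace is an existing space.
CREATE VDS myspace.customer_orders AS
SELECT c.customer_id,
       c.name,
       o.order_id,
       o.amount
FROM mysql_src.shop.customers AS c
JOIN hive_src.sales.orders AS o
  ON o.customer_id = c.customer_id
```

Querying myspace.customer_orders afterwards runs the join against both sources at query time. You can also get the same result from the UI by running the SELECT and saving it as a dataset in the space.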
Can we pull a virtual dataset and save it into a data source such as HDFS, Hive, or MySQL?
It should be a two-way process. If it is not, can you please clarify this point?
What is the reason you want to store it back on Hive or HDFS? Is it to be consumed by another tool? You can CTAS the query back to HDFS/S3/Azure Storage as Parquet, but not to Hive. The advantage of writing it back with CTAS is that the output is all Parquet, which will be faster to read. A better way of handling this would be to create a reflection on the VDS so that it refreshes automatically.
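For illustration, a CTAS that writes the VDS out to HDFS might look roughly like this. The names hdfs_src, exports, and myspace.customer_orders are hypothetical, and this assumes the target source allows CTAS for your user.

```sql
-- Hypothetical names: hdfs_src is an HDFS source configured in Dremio,
-- exports is a folder in that source, and myspace.customer_orders is the saved VDS.
-- The result is written as files under the target path in the source.
CREATE TABLE hdfs_src.exports.customer_orders AS
SELECT *
FROM myspace.customer_orders
```

Keep in mind that a CTAS is a one-time copy of the query result. It does not update when the MySQL or Hive data changes, so it has to be run again to pick up new data.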
There is a white paper on Reflections too.
"What is the reason you want to store it back on Hive or HDFS? Is it to be consumed by another tool?" Yes, it is to be consumed by another tool.
Does the Enterprise edition have this feature of storing the resulting dataset back in HDFS, Hive, etc.?
Both the Community and Enterprise editions have the ability to write back to HDFS using CTAS.
Neither the Community nor the Enterprise edition has the ability to write back to Hive.