How to create and refresh tables

snelaturu · February 22, 2021, 6:24am

I have a few questions about Dremio and am trying to compare it with Presto.

In Presto, we normally create tables in Hive and have Presto run queries against the metadata created in Hive. I’m trying to understand the equivalent in Dremio and have the questions below.

1. Does Dremio support metadata management, or do we need Hive as we do with Presto?
2. If Dremio supports metadata management, how do we create tables in Dremio? Using Docker, I am able to create a data lake, but I'm not sure how to create tables. Do we need to create PDSs and VDSs from the data lake?
3. Is this the only way to create tables (using PDS and VDS)?
4. I have data in ADLS partitioned by a specific column and want to create tables accordingly. I'm not sure how to create a table with a partition column in Dremio.
5. How do we refresh a Dremio table whenever a new partition is added in ADLS? In our use case, we add at least 100 partitions to our tables every day, and we want them reflected in the Dremio table in real time.

Thanks

balaji.ramaswamy · February 23, 2021, 4:49am

@snelaturu

#1 Yes, Dremio does its own metadata management:
http://docs.dremio.com/advanced-administration/metadata-caching.html
http://docs.dremio.com/sql-reference/sql-commands/datasets.html#refreshing-physical-dataset-metadata

#2 How to create tables in Dremio? That is not required. Just add the source, and the background metadata refresh will scan for new tables, new columns, new data types, newly added files, etc.

#3 If you want to create new tables based on existing tables, you can use CTAS (CREATE TABLE AS SELECT):
http://docs.dremio.com/sql-reference/sql-commands/tables.html
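
As a rough illustration of #3, here is a minimal Python sketch that only builds a CTAS statement as a string. The names are assumptions for the example, not from your setup: `$scratch` stands in for any CTAS-writable location, `adls.events` is a hypothetical promoted dataset, and `dir0` assumes the partition folder shows up as a directory column. Check the exact syntax against the linked page before running it.

```python
def quote_path(parts):
    """Join path components into a dotted, double-quoted identifier.

    Embedded double quotes are doubled (standard SQL escaping, assumed to apply here).
    """
    return ".".join('"' + p.replace('"', '""') + '"' for p in parts)


def build_ctas(target, source, where=None):
    """Return a CREATE TABLE ... AS SELECT statement as a string."""
    sql = f"CREATE TABLE {quote_path(target)} AS SELECT * FROM {quote_path(source)}"
    if where:
        sql += f" WHERE {where}"
    return sql


# Hypothetical names: "$scratch" as the target location, "adls"."events" as the source.
ctas_sql = build_ctas(["$scratch", "events_2021"], ["adls", "events"], "dir0 = '2021'")
print(ctas_sql)
```

The resulting string can be pasted into the SQL editor or submitted through whichever client you already use.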

#4 Simply add the Azure storage as a source in Dremio and promote the dataset at the folder above the partitions. Dremio will turn it into a table with the partitions defined.
http://docs.dremio.com/data-sources/azure-storage.html
http://docs.dremio.com/rest-api/catalog/post-catalog-id.html
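
If you would rather promote the folder through the REST API than the UI, the request could be prepared roughly like this. This is a sketch based on my reading of the catalog API page linked above, so treat the body fields as assumptions to verify there. The base URL, the `adls/container/events` path, and the Parquet format are all hypothetical, and the code only builds the URL and JSON body; it does not send anything or handle authentication.

```python
import json
from urllib.parse import quote


def build_promote_request(base_url, path, format_type="Parquet"):
    """Build the URL and JSON body for promoting a folder to a physical dataset."""
    # Assumed ID form for an unpromoted folder: "dremio:/" + path, URL-encoded.
    entity_id = quote("dremio:/" + "/".join(path), safe="")
    url = f"{base_url}/api/v3/catalog/{entity_id}"
    body = {
        "entityType": "dataset",
        "id": entity_id,
        "path": list(path),
        "type": "PHYSICAL_DATASET",
        "format": {"type": format_type},
    }
    return url, body


# Hypothetical coordinator address and source path; promote the folder above the partitions.
url, body = build_promote_request("http://localhost:9047", ["adls", "container", "events"])
print(url)
print(json.dumps(body, indent=2))
```

Pointing `path` at the parent folder rather than at an individual partition folder is what lets the partitions be picked up as part of one table.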

#5 Real-time refresh is coming later this year. Until then, you can shorten the background refresh interval so that it runs more often, or, once a new partition is added, refresh the metadata for just that dataset using SQL:
http://docs.dremio.com/sql-reference/sql-commands/datasets.html#refreshing-physical-dataset-metadata
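
For #5, the job that writes new partitions could trigger the refresh itself. Below is a minimal Python sketch that builds the `ALTER PDS ... REFRESH METADATA` statement and wraps it in a JSON body of the kind the SQL REST endpoint accepts. The dataset path is hypothetical, and sending the request (endpoint URL, auth token) is left out on purpose, since that depends on your deployment.

```python
import json


def quote_path(parts):
    """Join path components into a dotted, double-quoted identifier."""
    return ".".join('"' + p.replace('"', '""') + '"' for p in parts)


def build_refresh_payload(dataset_path):
    """Return a JSON-serializable body carrying a metadata refresh statement."""
    sql = f"ALTER PDS {quote_path(dataset_path)} REFRESH METADATA"
    return {"sql": sql}


# Hypothetical promoted dataset; call this right after a new partition lands in ADLS.
payload = build_refresh_payload(["adls", "container", "events"])
print(json.dumps(payload))
```

With around 100 new partitions a day, it is probably cheaper to batch them and refresh once per batch than to refresh after every single partition.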
