Can I assume that once I update a table/view structure (for example, a new view is created, a column is added, or a column's data type changes), the metadata table INFORMATION_SCHEMA.COLUMNS is updated promptly? For example, how long after a structure change to my view can I see the updated metadata? Thanks! Are there any restrictions on fetching data from this metadata table?
For example, is some table data missing from it? When is the metadata refreshed once a view's structure has changed? …
We want to use the data in this table for some further automation work. Thanks.
Hi @dolphinlei, one thing I learnt recently: tables do not appear in the information_schema tables until they are requested (by hand, or by creating a view on top of them).
Is this your concern?
No, I am just wondering: if something changes in a view's structure, how long does it take for its metadata to be refreshed and accessible in INFORMATION_SCHEMA.COLUMNS?
Are all view structure changes (columns) available in INFORMATION_SCHEMA.COLUMNS?
I do think so, but I have sometimes experienced a delay in the metadata refresh.
How much delay did you encounter?
I just found a link about Metadata Refresh Settings:
There is a setting for this. Why is it called “Caching Source Metadata”? Do we cache the source metadata in this table?
IMHO the information_schema tables are just views on top of a kind of cache structure. And yes, datasets' schemas can be prefetched (cached) to speed up queries (if not, the metadata has to be read at query time).
For non-file-system sources, the default interval at which we look for new tables is one hour. Open the source settings in Dremio, click on the Metadata tab, and see “Dataset Discovery”.
For newly added columns, look just below at “Dataset Details”.
Here is the full documentation on this
Once we discover a table via the light probe (“Dataset Discovery”), we add it to information_schema. This does not mean we collect its metadata (dataset details); we only do that for objects that have been queried at least once.
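To put a number on the refresh delay discussed above, one option is to poll the metadata table after a change and time how long the new column takes to show up. The following is only a minimal sketch: `wait_for_column` and `fetch_rows` are hypothetical names, `fetch_rows` stands in for whatever client call you use to run a SELECT against INFORMATION_SCHEMA.COLUMNS, and the row shape (schema, table, column, data type) is an assumption for illustration.

```python
import time
from typing import Callable, Iterable, Optional, Tuple

# Assumed row shape: (table_schema, table_name, column_name, data_type).
Row = Tuple[str, str, str, str]


def wait_for_column(
    fetch_rows: Callable[[], Iterable[Row]],  # hypothetical: runs your metadata query
    schema: str,
    table: str,
    column: str,
    timeout_s: float = 3600.0,
    interval_s: float = 30.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Poll until the column is visible.

    Returns the elapsed seconds when the column is first seen,
    or None if it is still missing once the timeout has passed.
    Names are compared exactly (case-sensitive).
    """
    start = clock()
    while True:
        for s, t, c, _data_type in fetch_rows():
            if (s, t, c) == (schema, table, column):
                return clock() - start
        if clock() - start >= timeout_s:
            return None
        sleep(interval_s)
```

The clock and sleep functions are injectable only so the loop can be exercised without real waiting; in normal use the defaults are enough. The measured value is accurate only to within one polling interval.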
Thanks for your reply.
For the “Metadata Caching” setting, is it only about caching the metadata tables (such as INFORMATION_SCHEMA.COLUMNS)?
Can I assume that even if we have not set up “Dataset Discovery” and “Dataset Details” (meaning we have no cache for the metadata tables), when we change or add Dremio VDS views, their metadata changes are still reflected in INFORMATION_SCHEMA.COLUMNS, and the only difference is that the metadata table is not cached? Is that correct?
Or does metadata caching, including “Dataset Discovery” and “Dataset Details”, determine the data availability of the metadata table? That is, will the metadata table show the exact, updated structure of a changed Dremio view/table only when both are configured?
Also, regarding your last point, do you mean that if a view's structure changes (columns added or deleted), or a new view is added or an existing view is deleted, its metadata is refreshed in the metadata table only after that view has been queried at least once? And if a view/table is changed or added but no one queries it, its metadata will not be shown in the metadata tables?
Actually, what I care about most is the data availability of the metadata table once a view/table structure changes, not so much the cache setup. The metadata table should not be large, so why do we need to cache it?
Currently I want to get the structure of views (VDS), not of PDS tables. If some of my views have columns added or deleted, or a view is added or deleted, I want to know when the updated view structure will show up in metadata tables such as INFORMATION_SCHEMA.COLUMNS.
Would INFORMATION_SCHEMA.views help?
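For the automation use case mentioned earlier in the thread, one possible approach is to take periodic snapshots of INFORMATION_SCHEMA.COLUMNS and diff them to see which columns were added, removed, or changed type. This is a rough sketch, not a Dremio feature: `index_columns` and `diff_columns` are hypothetical helpers, and each row is assumed to be a (schema, table, column, data type) tuple taken from your own query results.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed row shape: (table_schema, table_name, column_name, data_type).
Row = Tuple[str, str, str, str]
Key = Tuple[str, str, str]


def index_columns(rows: Iterable[Row]) -> Dict[Key, str]:
    """Map (schema, table, column) to its data type."""
    return {(s, t, c): d for s, t, c, d in rows}


def diff_columns(old_rows: Iterable[Row], new_rows: Iterable[Row]) -> Dict[str, List]:
    """Compare two snapshots and report structural changes."""
    old = index_columns(old_rows)
    new = index_columns(new_rows)
    return {
        # Columns present only in the newer snapshot.
        "added": sorted(k for k in new if k not in old),
        # Columns present only in the older snapshot.
        "removed": sorted(k for k in old if k not in new),
        # Columns in both snapshots whose data type differs: (key, old_type, new_type).
        "retyped": sorted(
            (k, old[k], new[k]) for k in old if k in new and old[k] != new[k]
        ),
    }


if __name__ == "__main__":
    before = [("space", "v", "id", "INTEGER"), ("space", "v", "name", "VARCHAR")]
    after = [("space", "v", "id", "BIGINT"), ("space", "v", "email", "VARCHAR")]
    print(diff_columns(before, after))
```

A new or deleted view shows up here as all of its columns being added or removed at once. The diff only reflects what the metadata table reports at snapshot time, so any refresh delay applies to the result as well.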