Discover newly added table

Hi, I wanted to ask how to shorten the time between creating a new table in the source and being able to query it in Dremio.

For example, the SF connector (ARP) does not allow defining the discovery refresh rate, and it takes quite a long time before a new table is available for querying.
Additionally, can the discovery be triggered manually?
Lastly, the best option would be for Dremio to discover the table whenever someone tries to query it and Dremio does not know about it yet.

Thanks

Hello @froxCZ

If you know the path of the newly created table, then the statement below will make the table visible in Dremio immediately:

Alter PDS {tablename with path} refresh metadata;

Thanks,
@Rakesh_Malugu

@Rakesh_Malugu
Hi! That would be awesome. I tried it, but I get a different error, and I don't understand why. The SF dataset definitely exists. I can run a query like select * from SF."METRICS3_STG".PUBLIC.TEST; fine, but this fails:
alter PDS SF refresh metadata;

VALIDATION ERROR: Unable to refresh dataset.

(com.dremio.connector.metadata.DatasetNotFoundException) null
com.dremio.exec.catalog.SourceMetadataManager.refreshDataset():570
com.dremio.exec.catalog.ManagedStoragePlugin.refreshDataset():756
com.dremio.exec.catalog.CatalogImpl.refreshDataset():606
com.dremio.exec.catalog.SourceAccessChecker.refreshDataset():231
com.dremio.exec.catalog.DelegatingCatalog.refreshDataset():209
com.dremio.exec.planner.sql.handlers.direct.RefreshTableHandler.toResult():59
com.dremio.exec.planner.sql.handlers.commands.DirectWriterCommand.plan():99
com.dremio.exec.work.foreman.AttemptManager.plan():427
com.dremio.exec.work.foreman.AttemptManager.lambda$run$1():330
com.dremio.service.commandpool.CommandWrapper.run():62
com.dremio.context.RequestContext.run():95
com.dremio.common.concurrent.ContextMigratingExecutorService.lambda$decorate$3():199
com.dremio.common.concurrent.ContextMigratingExecutorService$ComparableRunnable.run():180
java.util.concurrent.ThreadPoolExecutor.runWorker():1149
java.util.concurrent.ThreadPoolExecutor$Worker.run():624
java.lang.Thread.run():748

Would you know why?

Hello @froxCZ

I mean,

alter pds SF."METRICS3_STG".PUBLIC.TEST refresh metadata;

can you try this?
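The point here is that the statement needs the full table path, not just the source name. As a rough sketch (not anything Dremio ships), a small Python helper could build the statement from the path components. The function names are hypothetical, and quoting every component with double quotes, with embedded quotes doubled, is an assumption based on standard SQL identifier quoting.

```python
def quote_identifier(part: str) -> str:
    """Wrap one path component in double quotes, doubling any embedded quote.

    Assumption: standard SQL identifier quoting is accepted for every component.
    """
    return '"' + part.replace('"', '""') + '"'


def refresh_metadata_sql(path_parts: list[str]) -> str:
    """Build an ALTER PDS ... REFRESH METADATA statement for a full table path."""
    if len(path_parts) < 2:
        # A bare source name (for example just "SF") is not a table path.
        raise ValueError("expected a source name plus at least one more component")
    full_path = ".".join(quote_identifier(p) for p in path_parts)
    return f"ALTER PDS {full_path} REFRESH METADATA"


# Path components taken from the query earlier in this thread.
print(refresh_metadata_sql(["SF", "METRICS3_STG", "PUBLIC", "TEST"]))
```

Passing only `["SF"]` raises a `ValueError`, which mirrors the mistake of running the statement against the source alone.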

Thanks,
@Rakesh_Malugu

@Rakesh_Malugu I can run that on an already discovered table. But if I run it on a table that Dremio does not know about yet, it fails with the exception I posted.

Hi,

Datasets can be created over the REST API. Is that an option here?
Create the dataset when you create the table using the API.

Will try. But I found only a documented way to create a VDS, not a PDS:
https://docs.dremio.com/rest-api/catalog/dataset.html

Can you tell me how to create a PDS?

I use this for creating physical datasets:
https://docs.dremio.com/rest-api/catalog/post-catalog.html

I’m not sure what your underlying data source is; I have only created datasets on Parquet files.
An example request body for that API would look like this:
{"path": ["employee"], "entityType": "dataset", "type": "PHYSICAL_DATASET", "id": "employee", "format": {"type": "Parquet"}}
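For illustration only, here is a minimal Python sketch of assembling that request. It assumes the promotion endpoint has the form `/api/v3/catalog/{id}` with the entity id percent-encoded in the URL, which is my reading of the linked page; the base URL, the entity id, and the function name are hypothetical. It only builds the URL and body, it does not send anything, and it applies to file-based entities like the Parquet case above.

```python
import json
from urllib.parse import quote


def build_promote_request(base_url: str, entity_id: str, path: list[str],
                          format_type: str = "Parquet") -> tuple[str, str]:
    """Return (url, json_body) for promoting a file or folder to a physical dataset.

    Assumption: the endpoint is POST {base_url}/api/v3/catalog/{id}.
    """
    # The id is part of the URL, so percent-encode it, slashes included.
    url = f"{base_url.rstrip('/')}/api/v3/catalog/{quote(entity_id, safe='')}"
    body = {
        "entityType": "dataset",
        "id": entity_id,
        "path": path,
        "type": "PHYSICAL_DATASET",
        "format": {"type": format_type},
    }
    return url, json.dumps(body)


# Hypothetical host; id and path taken from the example body above.
url, body = build_promote_request("http://localhost:9047", "employee", ["employee"])
print(url)
print(body)
```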

Did not get that to work…

Physical Datasets can only be created by promoting other entities.

I'm not even sure it can work. It seems creating a PDS is designed only for Parquet, CSV, or other files, not for discovering tables from other DB sources.

@Rakesh_Malugu what else can we try?

Hello @froxCZ

The error means that the table was not found.

(com.dremio.connector.metadata.DatasetNotFoundException) null

You will get the same error when you try to run the query on a non-existing table.

If you already have a PDS in the same location as the newly created table, I would recommend copying that PDS path, changing the table name to the newly created one, and running the ALTER PDS command.

Thanks,
@Rakesh_Malugu

@Rakesh_Malugu
I’m not sure you understand my issue. When I create a new table in Snowflake, Dremio does not know about it. When I run the ALTER PDS statement, Dremio says that the table does not exist, even though it already exists in Snowflake. Dremio is not able to discover it until the periodic Discover names refresh has run :( How can I tell Dremio that table X exists in Snowflake and to register it as a PDS?