And have managed to create a physical dataset, with everything, including the Glue job, working fine. In the relevant S3 bucket, I can also see the data and metadata files. However, when I try to query that dataset, I get this error:
dataset at [/dataset/###.db.###] not found.
Can anyone help here? Is there anything I am doing wrong, or is there a step which I have missed?
@Ritik Are you trying to query via S3 or Glue? If S3, do you see it as a folder or a purple icon? Do you see the dataset under INFORMATION_SCHEMA."TABLES"?
I have created a new Data Lake source as a Glue Data Catalog and I have succeeded in creating the physical dataset, i.e. I can see the purple icon. However, I am not able to query the Iceberg dataset, as I get the "dataset not found" error.
To test further, I created a normal table in the Glue Data Catalog in my AWS account for a tab-separated flat file stored on S3. However, when I try to access that table through Dremio via the Glue Data Catalog source, I get a 403 Forbidden error.
Essentially, I am not able to access any dataset in the Glue Data Catalog source I created in Dremio, be it Iceberg or a normal file stored in S3 (both of which have tables in the Data Catalog in my AWS account).
@Ritik From your description, it seems the executors are not able to talk to your Glue service but the coordinator can. Please check whether the network path is open for the executors to connect to Glue.
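One rough way to check the network path is to run a small TCP probe on each executor node. The sketch below is only an illustration: the helper names glue_endpoint and can_connect are made up for this example, and it assumes the standard regional endpoint form glue.<region>.amazonaws.com on port 443. If your cluster uses a VPC endpoint, a proxy, or a non-standard partition, the host will differ. It only tests that a TCP connection can be opened; it does not test IAM permissions, so it will not by itself explain a 403.

```python
import socket


def glue_endpoint(region: str) -> str:
    # Assumption: the standard public regional Glue endpoint naming scheme.
    return f"glue.{region}.amazonaws.com"


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and opens the socket;
        # DNS failures, refusals and timeouts all surface as OSError.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, calling can_connect(glue_endpoint("us-east-1")) on the coordinator and on an executor (with your own region substituted) should return the same result on both. If it returns True on the coordinator and False on an executor, that points to a security group, route, or firewall rule blocking the executors.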