Problems with connection to AWS Glue Catalog Iceberg table

Hi,

I have gone through the steps here:

I managed to create a physical dataset, and everything, including the Glue job, works fine. I can also see the data and metadata files in the relevant S3 bucket. However, when I try to query that dataset, I get this error:

dataset at [/dataset/###.db.###] not found.

Can anyone help here?

Can you check that you have set the connection property hive.metastore.warehouse.dir to your S3 bucket location?

More information here: Dremio

Hi, thanks for your reply.

I have now done that by pointing the data source in Dremio to the S3 bucket location where the AWS Glue job created the Iceberg table and its data. However, I still get the same error:

dataset at [/dataset/###.db.###] not found.

My Dremio version is 21.3. Also, the AWS instance on which I am running Dremio is in Ireland, not in N. Virginia. Could these have anything to do with it?

Your VPC and S3 bucket need to be in the same AWS region. As long as all of your resources are in Ireland, you should be fine.
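
If it helps, here is a rough Python sketch for checking which region a bucket actually lives in. It assumes boto3 is installed and AWS credentials are configured; the function names are made up for illustration and the bucket name is whatever yours is. One thing to watch for: S3's GetBucketLocation returns an empty LocationConstraint for buckets in N. Virginia (us-east-1), and older Ireland buckets can report "EU" instead of eu-west-1.

```python
from typing import Optional


def bucket_region(location_constraint: Optional[str]) -> str:
    """Map the LocationConstraint from S3 GetBucketLocation to a region name."""
    if location_constraint is None:
        # S3 reports no constraint for buckets in N. Virginia.
        return "us-east-1"
    if location_constraint == "EU":
        # Legacy value that S3 can return for buckets in Ireland.
        return "eu-west-1"
    return location_constraint


def same_region(dremio_region: str, location_constraint: Optional[str]) -> bool:
    """True when the bucket is in the region where Dremio is deployed."""
    return dremio_region == bucket_region(location_constraint)


def lookup_bucket_region(bucket: str) -> str:
    """Ask S3 for the bucket's region (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the helpers above work without boto3

    response = boto3.client("s3").get_bucket_location(Bucket=bucket)
    return bucket_region(response.get("LocationConstraint"))
```

Calling same_region("eu-west-1", None) returns False, which is the situation where Dremio runs in Ireland and the bucket is in N. Virginia.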

Ah. So my Dremio instance is deployed in Ireland, but my S3 bucket and Glue job are in Virginia (I did it that way since the Iceberg connector for the Glue job works only in Virginia).

However, the issue seems to affect only Iceberg tables. I created other Glue databases in Virginia pointing to normal flat files in S3, and they work fine in Dremio.

Could the version have something to do with it? Or is there a configuration which I may have missed?
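
One way to narrow this down might be to look at what the Glue catalog actually stores for the table. As far as I know, Iceberg's Glue catalog records the table parameters table_type (set to ICEBERG) and metadata_location (the S3 path of the current metadata file). Whether Dremio 21.3 relies on exactly these parameters is an assumption on my part. The sketch below assumes boto3 and AWS credentials; the function names are hypothetical and the database and table names are placeholders for your own.

```python
from typing import Any, Dict, List


def iceberg_table_problems(table: Dict[str, Any]) -> List[str]:
    """Inspect the "Table" dict from Glue GetTable and list anything suspicious."""
    params = table.get("Parameters") or {}
    problems = []
    if params.get("table_type", "").upper() != "ICEBERG":
        problems.append("table_type parameter is not ICEBERG")
    location = params.get("metadata_location", "")
    if not location:
        problems.append("metadata_location parameter is missing")
    elif not location.startswith(("s3://", "s3a://")):
        problems.append("metadata_location is not an S3 path: " + location)
    return problems


def fetch_glue_table(database: str, name: str, region: str) -> Dict[str, Any]:
    """Fetch the table definition from Glue (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the check above works without boto3

    glue = boto3.client("glue", region_name=region)
    return glue.get_table(DatabaseName=database, Name=name)["Table"]
```

An empty list from iceberg_table_problems(fetch_glue_table(database, name, "us-east-1")) would suggest the catalog entry itself looks like a normal Iceberg table, so the region mismatch or the Dremio version would be the next things to check.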