Dremio connection to S3 - What does it do behind the scenes?

Hello,

I’m fairly new to both Dremio and AWS S3, and I’m trying to estimate my S3 costs, assuming I will access my S3 data from Tableau through the Dremio connector.

As a test, I loaded a dataset into my S3 bucket and connected to it from Tableau. When I checked my AWS cost tracker, I saw that several S3 PUTs and S3 GETs had been generated.

Can you give me some info on what Dremio does behind the scenes when connecting to S3, so I can make an estimate of S3 cost?

Thanks,

Paula

Hey @paulisDataViz, if you configured S3 as Dremio’s distributed storage location, Dremio will use this location to store:

  • Reflections
  • Job results for queries previewed/run via Dremio’s UI
  • User downloads
  • User uploads
  • CTAS tables

Hi @can, I have not configured S3 as Dremio’s distributed storage. I’m just connecting to the S3 bucket as a source.

Can you please elaborate on which actions in Dremio generate S3 PUTs and GETs?

We get the bucket names, bucket locations, file names, etc. Unless you configured S3 as distributed storage, I’d expect no PUTs. It would be good to understand what the PUTs look like. Could you also please share your dremio.conf file?
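
One way to see what the requests look like is to enable S3 server access logging on the bucket and count the logged operations. The sketch below is only a rough illustration, not anything Dremio provides: it assumes the standard space-delimited access log layout (operation in the seventh field), the sample log lines are made up, and the prices are example values per 1,000 requests that should be checked against current AWS pricing for your region. Note that AWS bills LIST requests in the same tier as PUT, COPY and POST, so listing activity may be what shows up under the "Puts" line item.

```python
import re
from collections import Counter

# Assumed example prices in USD per 1,000 requests; verify for your region.
PRICE_PER_1000 = {"tier1": 0.005, "tier2": 0.0004}

# Matches the leading fields of an S3 server access log line:
# owner, bucket, [time], remote IP, requester, request ID, operation, key.
LOG_PREFIX = re.compile(r'^\S+ \S+ \[[^\]]+\] \S+ \S+ \S+ (\S+) (\S+)')


def billing_tier(operation):
    """Map an operation such as REST.GET.OBJECT to an assumed request tier.

    This mapping is a simplification: PUT/POST/COPY and bucket listings are
    treated as tier1, DELETE as free (None), everything else as tier2.
    """
    parts = operation.split(".")
    verb = parts[1] if len(parts) > 1 else ""
    resource = parts[2] if len(parts) > 2 else ""
    if verb in ("PUT", "POST", "COPY"):
        return "tier1"
    if verb == "GET" and resource == "BUCKET":  # listing objects in a bucket
        return "tier1"
    if verb == "DELETE":
        return None
    return "tier2"


def count_operations(lines):
    """Count how often each operation appears in the given log lines."""
    counts = Counter()
    for line in lines:
        match = LOG_PREFIX.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def estimate_cost(counts):
    """Estimate request charges in USD from per-operation counts."""
    total = 0.0
    for operation, n in counts.items():
        tier = billing_tier(operation)
        if tier is not None:
            total += n / 1000 * PRICE_PER_1000[tier]
    return total


# Hypothetical log lines, for illustration only.
SAMPLE_LINES = [
    'ownerid example-bucket [06/Feb/2019:00:00:38 +0000] 192.0.2.3 requester '
    'REQ1 REST.GET.BUCKET - "GET /example-bucket?list-type=2 HTTP/1.1" 200 -',
    'ownerid example-bucket [06/Feb/2019:00:00:39 +0000] 192.0.2.3 requester '
    'REQ2 REST.GET.OBJECT data/a.parquet "GET /data/a.parquet HTTP/1.1" 200 -',
    'ownerid example-bucket [06/Feb/2019:00:00:40 +0000] 192.0.2.3 requester '
    'REQ3 REST.GET.OBJECT data/b.parquet "GET /data/b.parquet HTTP/1.1" 200 -',
]

counts = count_operations(SAMPLE_LINES)
for operation, n in sorted(counts.items()):
    print(operation, n, billing_tier(operation))
print("estimated request cost (USD):", estimate_cost(counts))
```

Running this against real log files from the logging target bucket would show whether the "PUT" charges come from actual object writes (REST.PUT.OBJECT) or from listing calls (REST.GET.BUCKET).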