Hey guys
What is the db folder used for? Why does it generate so much io?
We have a dremio env with the following rough folder sizes:
/opt/dremio/data/db/catalog: 156gb
/opt/dremio/data/db/search: 13gb
For the last 2 full days, our efs io charges are $227 and $222 USD. Over the last year, we have had spikes of $900 USD per day for efs io. I wasn’t keeping an eye on the charges at that time, so I don’t know what the dremio catalog sizes were at those times.
But for the last 2 full days, the total io traffic associated with the efs volume was 7.9tb and 7.7tb. The read portion of that io traffic is 7.7tb and 7.4tb respectively for those 2 days. This correlates well with the aws cost explorer findings that most of these io charges are for io reads and not io writes.
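To put those numbers in perspective, here is a rough back-of-the-envelope check as a minimal Python sketch. It assumes the first cost figure pairs with the first traffic figure, that 1 tb = 1000 gb, and that the whole daily charge is attributable to the traffic shown. The names used (DAYS, summarize, etc.) are made up for illustration.

```python
# Figures copied from the post above. Pairing day 1 cost with day 1 traffic
# is an assumption; the post does not state the order explicitly.
DAYS = [
    {"cost_usd": 227.0, "total_tb": 7.9, "read_tb": 7.7},
    {"cost_usd": 222.0, "total_tb": 7.7, "read_tb": 7.4},
]
CATALOG_GB = 156.0   # size of /opt/dremio/data/db/catalog from the post
GB_PER_TB = 1000.0   # assumption: decimal units


def summarize(day, catalog_gb=CATALOG_GB):
    """Derive a few ratios from one day's cost and traffic figures."""
    total_gb = day["total_tb"] * GB_PER_TB
    read_gb = day["read_tb"] * GB_PER_TB
    return {
        # fraction of all io that was reads
        "read_share": read_gb / total_gb,
        # implied blended price per gb of io
        "usd_per_gb": day["cost_usd"] / total_gb,
        # how many times the catalog folder's size was read that day
        "catalog_multiples_read": read_gb / catalog_gb,
    }


for day in DAYS:
    print(summarize(day))
```

On these figures the read share comes out at roughly 96 to 97 percent, the implied rate is a bit under $0.03 per gb, and the daily read volume is roughly 47 to 49 times the size of the catalog folder.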
What could be causing so much io?
Our paths.local folder is mapped to aws efs. If I understand it correctly, efs is some form of nfs which dremio supports. Our paths.dist folder is mapped to s3. So all the reflection data storage and access should be billed as part of the s3 charges and not efs.
At one stage, I thought the efs charges were related to dremio’s default setup of the spill folder being mapped to ${paths.local}/spill. In our desperate attempt to make these efs io charges go away, even though it’s not explicitly stated by dremio as being supported, I have remapped the spill folder to an ec2 instance storage volume. But this didn’t help. So spilling from larger queries is not the cause either.
One more thing to mention is the “reflection.cloud.cache.enabled” support key has been set to false to avoid efs io. Since it was a support key, I didn’t think a dremio restart was necessary. If a restart is required to effectuate this change, please let me know. Our efs io costs haven’t changed in any noticeable way.
This might be a separate topic, but why does dremio by default map the cloud cache folder to a folder under paths.local when dremio also recommends that paths.local be mapped to distributed storage like nfs? People running dremio in the cloud would most likely be using some form of metered nfs solution. This could have been another way to bankrupt us, but for the moment, it looks like that’s not what’s causing the high efs io.
Back to the main topic. In our current state, if I’m understanding everything correctly, there is nothing obviously io intensive mapped to the paths.local folder.
We do have a lot of reflections defined and they are all refreshing regularly on a schedule. Based on my understanding, data associated with those reflections is being saved to paths.dist, which is backed by s3. So why is the “/opt/dremio/data/db/catalog” folder so huge? As far as I can tell, all the efs io is related to access to files under that catalog folder.
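One way to narrow down what is taking up the space would be to list the largest entries directly under the catalog folder. The following is a minimal Python sketch, not a dremio tool. The function names are made up for illustration, and walking the tree issues file metadata requests against efs, which may themselves be metered.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all files under path, skipping files that vanish mid-walk."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                # File was removed or is unreadable; ignore it.
                pass
    return total


def largest_children(root, top_n=10):
    """Return up to top_n (size_bytes, name) pairs for entries directly under root, largest first."""
    sizes = []
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                size = dir_size_bytes(entry.path)
            else:
                size = entry.stat(follow_symlinks=False).st_size
            sizes.append((size, entry.name))
    sizes.sort(reverse=True)
    return sizes[:top_n]
```

Calling largest_children("/opt/dremio/data/db/catalog") on the node that mounts the efs volume would return the biggest files and subfolders, which could then be compared against the number of reflections and jobs.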
Are we doing something wrong, or are dremio reflections just a scam? When reflections are turned on, we can avoid hits on the backing data source by having the data cached in s3 based storage. But maintaining the reflection definitions in the catalog folder generates so much io that a metered nfs service will cost many times more than hits to the backing data source itself.
So is the only way for dremio to be cost effective in a cloud based deployment to run it in a single node arrangement, with the paths.local folder mapped to either ebs storage or instance storage?
Please advise us on how we can avoid bankruptcy. Thanks.