Hi Dremio Team,
Can we get some insights on cloud caching?
After enabling the cloud cache:
How do I correlate cached data with the physical data / queried data?
How do I know if a query is using the cache (from the profile)?
How do I manually clean up the cache db and files directories?
How do I correlate cached data with the physical data / queried data? - Do you want to compare them? If so, query the source data using an external tool and compare the results.
How do I know if a query is using the cache (from the profile)? - Yes, you can tell from the profile.
How do I manually clean up the cache db and files directories? - We control this by setting the percentage property, which governs deleting files from the cache.
Any update on @smora's question? I am interested in cleaning up some cache folders.
#1 Yes. If you run select * from the UI (using Run), that should cache all rows, so subsequent queries on that dataset should use the C3 cache. The files are under the configured folder, but they are not human-readable.
#2 In the query profile, expand the Parquet_Scan operator, scroll down to the Operator_metrics section, and scroll to the right. You will see “NUM_CACHE_HITS” and “NUM_CACHE_MISSES” (see attached screenshot).
#3 Currently this is not possible
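For #2, once you have read the two counters off the profile, a rough hit ratio can be worked out as below. This is only a sketch: the function name and the sample numbers are made up for illustration and are not part of Dremio, and the values are assumed to be copied by hand from the Operator_metrics section.

```python
def cache_hit_ratio(num_cache_hits: int, num_cache_misses: int) -> float:
    """Return the fraction of cache lookups that were hits (0.0 if there were no lookups)."""
    total = num_cache_hits + num_cache_misses
    if total == 0:
        # No lookups recorded, e.g. the scan did not go through the cache at all.
        return 0.0
    return num_cache_hits / total


# Hypothetical values copied by hand from the Parquet_Scan operator metrics.
ratio = cache_hit_ratio(num_cache_hits=90, num_cache_misses=10)
print(f"C3 hit ratio: {ratio:.0%}")
```

A ratio near zero on the first run and a high ratio on a repeat run of the same query would suggest that the second run was served from the cache.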
Queries run via the UI are truncated to roughly 1M records. Does Dremio still fetch all the data from the data lake and store it in C3, or does it only cache the Parquet files required for the ~1M records?
I have exactly the same question.
In addition, I would like to understand whether I can disable the cache for a single table.
Some of my tables have very fat columns that are never used in queries, only in the ETL flow.