Hi,
I have a wide table (>200 columns) with 30M rows.
I want to accelerate select distinct * from table.
So I have created a view with “select distinct * from table” and created a full Raw reflection on it.
My understanding is that this reflection would materialize the results of the view and it would execute instantly.
When I run select * from view limit 10, it still takes 16 seconds to finish. The job says that the acceleration was used.
Without acceleration it took >30s.
Can you please advise how to approach this?
5ed4488f-07bb-42b8-8024-6bb42005ef0f.zip (397.6 KB)
thanks
jaro
@jaroslav_marko SELECT DISTINCT * on a 445-column table is going to be expensive, as you only have a raw reflection. Aggregations are faster only with Agg reflections, but DISTINCT is not a measure that is currently available. You can use “Approximate count” if that works. Again, building the agg reflection on 445 columns would probably take time, but the actual query might be faster. What is your end goal? Are there any specific column(s) for which you are trying to find distinct values?
Hi @balaji.ramaswamy, sorry again for the late reply. I think the goal is to remove duplicates from the table. I was thinking that a Raw reflection creates a physical copy of the distinct * table, so running count(*) on this set should be quick. But it isn’t. Is my assumption correct?
thanks
Jaro
@jaroslav_marko A Raw reflection will just execute the SQL logic on the VDS it is built on.
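
The distinction raised in this thread, between a view that re-runs DISTINCT on every read and a physical, already-deduplicated copy, can be illustrated with a small sketch. It uses Python's built-in sqlite3 module rather than Dremio, and the table, view, and column names are made up for the example. It only shows the general idea and says nothing about how Dremio reflections behave internally.

```python
import sqlite3

# In-memory SQLite database standing in for the real source.
# Assumption: SQLite is used purely for illustration; this is not Dremio.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table containing duplicate rows.
cur.execute("CREATE TABLE wide_table (id INTEGER, name TEXT, amount REAL)")
rows = [
    (1, "a", 10.0),
    (1, "a", 10.0),  # duplicate
    (2, "b", 20.0),
    (3, "c", 30.0),
    (3, "c", 30.0),  # duplicate
]
cur.executemany("INSERT INTO wide_table VALUES (?, ?, ?)", rows)

# A view stores only the query text; DISTINCT is evaluated again on each read.
cur.execute("CREATE VIEW dedup_view AS SELECT DISTINCT * FROM wide_table")

# A physical copy pays the DISTINCT cost once, when the table is created.
cur.execute("CREATE TABLE dedup_copy AS SELECT DISTINCT * FROM wide_table")

total_count = cur.execute("SELECT COUNT(*) FROM wide_table").fetchone()[0]
view_count = cur.execute("SELECT COUNT(*) FROM dedup_view").fetchone()[0]
copy_count = cur.execute("SELECT COUNT(*) FROM dedup_copy").fetchone()[0]

print(total_count, view_count, copy_count)
conn.close()
```

Both counts agree, but only the query against the physical copy avoids repeating the deduplication work. Whether a given engine serves a query from such a copy depends on that engine's planner.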