Scalability in creating reflections

Reflection creation is not distributed across nodes. Is this behavior normal?

@alex_lopes Do you mean the materialization files are heavily concentrated on one or a couple of nodes, while the other nodes host few or none?

Is this PDFS or HDFS?

@Venugopal_Menda

I am referring to the time when the reflection is created. When I look at the execution plan, I never see two or more executors working on creating a reflection; it is always one node. I’m working with one coordinator node and four executors, deployed on Azure. Should this processing be distributed across the cluster?

@alex_lopes, Dremio will fully parallelize execution for many operations. Can you share a profile for one of the REFLECTION REFRESH jobs that you think isn’t using the full width of your cluster?

@ben
As you can see in the image, the reflection creation process is using only one node… I have four nodes…

@alex_lopes, can you attach the profile? I’d like to take a look.

profile_reflection.zip (15.7 KB)

Thanks @ben

@alex_lopes, because the source is a single table (no joins) in an RDBMS (Oracle, in this case), the read will be single threaded and the subsequent operations will be single threaded as well. If you were reading from files on a filesystem source, you would see parallelization.
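
To make the reasoning above concrete, here is a rough toy model in Python. It is not Dremio's planner or any Dremio API: `Source`, `splits`, and `scan_parallelism` are hypothetical names, and the count of 16 files is made up for illustration. The only assumption it encodes is the one stated above, that a single-table RDBMS read is one unit of work while a file-based source can be read in several independent pieces.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """Hypothetical description of a data source for this sketch."""
    name: str
    splits: int  # number of pieces that can be read independently


def scan_parallelism(source: Source, executors: int) -> int:
    """Return how many executors can read the source at the same time.

    Assumption for this sketch: one reader per executor, and a reader
    needs its own split, so parallelism is capped by both numbers.
    """
    return max(1, min(source.splits, executors))


# A single RDBMS table read is modeled as one split (single threaded).
oracle_table = Source("oracle_single_table", splits=1)
# A directory of files is modeled as one split per file (16 is invented).
parquet_dir = Source("parquet_directory", splits=16)

print(scan_parallelism(oracle_table, executors=4))
print(scan_parallelism(parquet_dir, executors=4))
```

With four executors, the modeled Oracle read stays on one node no matter how large the cluster is, while the modeled file source can occupy all four.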


@ben Makes sense! Thank you for the information!