What does the DISPLAY clause in ALTER DATASET ... CREATE RAW REFLECTION imply? What thinking needs to go into choosing between all columns and only a few?
For a column to be part of the reflection, it must be added as a display column. For example, suppose a table has 3 columns c1, c2, c3, the reflection is created with only c1 and c2, and a user then runs these 2 queries:
select c1, c2 from table - This query will be accelerated
select c1, c2, c3 from table - This query will not be accelerated
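As a rough sketch, the reflection in this example could be created with a statement like the one below. The dataset path "mysource"."myfolder"."mytable" and the reflection name raw_c1_c2 are placeholders made up for illustration, and the exact keyword may vary by Dremio version (newer releases document ALTER TABLE / ALTER VIEW rather than ALTER DATASET), so please check it against the SQL reference for your release.

```sql
-- Hypothetical dataset path and reflection name, for illustration only.
-- Only c1 and c2 are listed in DISPLAY, so c3 is not stored in the reflection.
ALTER DATASET "mysource"."myfolder"."mytable"
  CREATE RAW REFLECTION raw_c1_c2
  USING DISPLAY (c1, c2);
```

With only c1 and c2 in the DISPLAY list, any query that also needs c3 cannot be served from this reflection.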
Whether you need all columns depends on the query pattern. For the most part the answer is “No”, because selecting all columns 1. makes refresh times longer 2. uses more space on disk 3. makes queries that use the reflection scan more data. Create the reflection with as small a footprint as possible.
Is there an ALTER command to create a raw reflection on a PDS that selects all columns by default? - Currently no, we need to list all the columns needed as part of the SQL.
Is it a correct understanding that applying a raw reflection essentially converts the data to Parquet? - Yes, but this is optimized Parquet, in terms of Parquet row group size. As explained above, you can also create the reflection on a subset of columns and rows.
Below is a link to a white paper on reflection best practices