I would like to know if there is an easy way to stack up tables in a single long table
In short, I have a greenplum installation with many schemas, the schemas have identical structure, tables etc, only the data is different, I would like to be able to stack table1 in schema1 with table1 in schema2 etc
so that I can use a simple select to get the data from all those tables in one go, I could use a union but it adds complications and usually are a performance killers, so I would like to fine a better way
A bit like in the discussion Stacking files? but with tables rather than files
Are you trying to avoid the union on the source database tables? Assuming due to performance?
Then my only thought is to perform a union on the many raw reflections of the physical source queries that you setup. Then configure a reflection on that to increase performance of any queries against the combined records.
You can do a UNION ALL of the tables when creating a virtual dataset:
If you’d like, you can create a reflection on this virtual dataset, and queries will then be rewritten to
use the reflection instead of pushing down into Greenplum.
If your query patterns frequently make use of subsets of the underlying partitions, you could experiment with creating a reflection on the physical datasets, and then a union as described above, and Dremio will decide which combination of reflections to use for a given query.
Yes I’m trying to avoid unions on the source database tables, it impacts performances, I did not think about creating cascade reflections
My concern is that the original tables are large so the reflections would be large as well, but I will try your suggestions and I will let you know
Mark - reflections are highly compressed, so they should be relatively small compared to the source. In addition, they don’t have to include all rows/columns, so you can pick only what is needed. Finally, you can use aggregation reflections to keep only aggregated reflections of the source. This works when your queries are only aggregation queries, which is the case for much of BI.
Just some ideas, hope this helps.
I have been doing some testing and we can get some good compression
Thanks for the help