Filter not pushing down to Postgres on JOIN

Hello there, fellow Dremio users and experts!

I’ve been trying to get Dremio to push down filters to Postgres when joining with an Iceberg table to no avail. The Iceberg table is a stream of events, very basic, and I need to enrich it with data from some Postgres tables (that aren’t on Iceberg yet).
When joining them, Dremio is fetching the entire table from Postgres, even if just a few records of the table are needed for the join. The field I’m joining on PG have a corresponding index, so if I could manage to push down the filter, it would be super sweet. Here’s a simple query I crafted to makes the issue clear:

with some_events as (
    select *
    limit 10

from some_events
inner join postgres.profile.profile
    on profile.accountid = some_events.accountid;

Is there is any optimization (even ones that have to be switched on using ALTER SYSTEM) that can help me with this? Am I doing something wrong? I’ve tried switch places between iceberg and postgres tables (because of the runtime filtering article), but it didn’t help either.

I’m using Dremio 25.0 CE on Docker. I’ve attached my query profile. (20.4 KB)

Answering my own question for those who come here with a similar problem.

As of v25.0, according to the docs (Runtime Filtering | Dremio Documentation) the runtime filtering only works on Parquet and Iceberg sources, so that’s why I’m not getting any pushdown - because I’m using a PostgreSQL source.

@rcosta.esapiens I looked at the profile you have attached and do not see any filters in the SQL. You are right that runtime filters are only applicable to above mentioned file/table formats (plus Delta lake too) but if you have a filter on the postgres table, that should be pushed down to PG. Do you have an example of when the filter is not getting pushed down to PG?

I think that the problem was my incorrect understanding of how the pushdown would work in this case. I (wrongly) assumed that Dremio would only fetch from PG the records with accountid (the pk of the table) corresponding to the values from records in some_events instead of doing a full table scan and joining on memory.

This way, let’s say the 10 events had 10 distinct values of accountid. Dremio would issue a query to PG fetching only rows which had accountid belonging to the 10 events from the Iceberg table. But after some reading of the docs, I realized that it isn’t the case.

Thanks for reaching back!

@rcosta.esapiens Dynamic filtering also known as Runtime filtering is only for joins between PRAQUET, Iceberg, Delta tables