Which ports to open through firewalls?

Hi, I’ve got a question about connectivity to the underlying data sources in a distributed environment: do the coordinators need access to the ports of the underlying databases (e.g. 5432 for PostgreSQL), or do only the executors need this access?
We’ve opened the ZooKeeper port (2181) between clients and the coordinator, and we changed the firewall rules so that the coordinator can connect to all the executors on port 2181.
Which other ports need to be open? Is this configuration enough between the coordinator and the executors? Is there anything to take into account between coordinator, executors and the data sources?
There is at least one firewall between the coordinator and the executors.

This document covers all the ports.

The inter-node communication port needs to be open on all nodes, as that is how they communicate. And yes, all nodes need access to the sources: coordinator nodes fetch metadata from them, for example.
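
To see what a firewall actually lets through, a small TCP probe run from each coordinator and executor can help. The following is a minimal sketch, not a Dremio tool: the function names are made up for illustration, and it only tests whether a TCP connection can be opened, not whether the service behind the port works.

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    A refused, filtered (timed out) or unresolvable target returns False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and DNS resolution failures.
        return False


def check_targets(targets, timeout=3.0):
    """Probe every (host, port) pair and map each pair to True or False."""
    return {
        (host, port): port_is_reachable(host, port, timeout)
        for host, port in targets
    }
```

For example, calling `check_targets([("postgres.example.internal", 5432), ("executor-1.example.internal", 2181)])` on a coordinator returns a dictionary with one True/False entry per pair. The host names are placeholders; the ports are the ones mentioned in this thread.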

Hi, just for clarification: is it also required to have port 31010 open on the corporate network if we use ZooKeeper? The doc states that it is, but I’d like to have confirmation.

thx

If you plan to use ODBC or JDBC drivers, yes.


Hi. The documentation also states that all nodes need access to all sources. So I guess this means one cannot configure an executor node, or a set of executor nodes, to connect only to specific data sources.
Would this mean that ports between different networks and firewalls need to be opened specifically for Dremio?
E.g. one database is located behind a firewall in a certain network zone, and another one in a different zone. Executors and coordinators would then need ports opened through all of these firewalls…
This is certainly something we want to avoid.
So could you confirm that this is the case: for each Dremio cluster, every coordinator and executor node in that cluster needs access to every data source the cluster is meant to serve.
Is that correct?

@dbrys Correct, there is no way currently to limit nodes in a cluster to certain sources.
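
Because nodes cannot be pinned to particular sources, the firewall openings needed for the sources are every node paired with every source. A minimal sketch of enumerating that list, for instance to prepare a firewall change request; the function name and all host names in the usage note are hypothetical:

```python
from itertools import product


def required_source_rules(nodes, sources):
    """Return one (node, source_host, source_port) tuple per node/source pair.

    nodes   -- list of node host names (coordinators and executors alike)
    sources -- list of (host, port) tuples, one per data source
    """
    return [(node, host, port) for node, (host, port) in product(nodes, sources)]
```

With two nodes `["coordinator-1", "executor-1"]` and two sources `[("pg-zone-a", 5432), ("pg-zone-b", 5432)]`, this yields four rules, one for each combination.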