Accessing Dremio running behind App Gateway from Superset

I am trying to connect to Dremio from a BI tool called Superset. Dremio is running behind an Azure Application Gateway. When I try to access Dremio from Superset using the Arrow Flight driver:
dremio+flight://{username}:{password}@{host}:{port}/dremio
or the ODBC driver:
dremio://{username}:{password}@{host}:{port}/{database_name}/dremio?SSL=1
it throws a DatabaseTestConnectionFailedError.
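One thing worth ruling out with either URI form is a username or password containing characters such as @, : or /, which have to be percent-encoded inside a URI. A minimal sketch of building both strings follows; the function names, the example host, and the default ports (32010 for Flight, 31010 for ODBC) are illustrative assumptions, not part of either driver.

```python
from urllib.parse import quote


def _enc(value: str) -> str:
    # Percent-encode everything that is not URI-safe, including "/" and "@".
    return quote(value, safe="")


def flight_uri(username: str, password: str, host: str, port: int = 32010) -> str:
    """Build the Arrow Flight URI in the form quoted above (hypothetical helper)."""
    return f"dremio+flight://{_enc(username)}:{_enc(password)}@{host}:{port}/dremio"


def odbc_uri(username: str, password: str, host: str, database: str,
             port: int = 31010, ssl: bool = True) -> str:
    """Build the ODBC URI in the form quoted above (hypothetical helper)."""
    uri = f"dremio://{_enc(username)}:{_enc(password)}@{host}:{port}/{database}/dremio"
    # SSL=1 is appended only when requested, so it is easy to test with and without it.
    return uri + "?SSL=1" if ssl else uri
```

Passing ssl=False gives the same ODBC string without the SSL=1 switch, which makes it easy to compare the two failure modes.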

I have also tried bypassing the App Gateway and connecting directly to the Dremio Service (LoadBalancer IP) using the above connection strings from the Superset UI, but I still can't reach it.

I also tried running a Python sample (example.py from the dremio-hub/arrow-flight-client-examples repository on GitHub) from inside the Superset pod (both Dremio and Superset run in the same AKS cluster) using the App Gateway URL, and I still get a gRPC "unavailable" error.

However, the same script can reach Dremio when it uses the Dremio LoadBalancer IP.

Could you please help me understand whether I am missing anything?

@rymurr @balaji.ramaswamy Hi experts, I would appreciate your support here.

Where are you running the script from? Is 31010 open from Superset to Dremio? Can you check whether you can reach the port using telnet or nc?
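If telnet or nc is not installed in the pod, the same reachability check can be sketched with Python's standard library. The function name is an illustrative assumption, and the host you pass in would be your own Dremio or gateway address.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


# Example (placeholder host): port_is_open("dremio.example.com", 31010)
```

A successful connect only shows that something accepts TCP on that port; it says nothing about which protocol is spoken behind it.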

Hello Balaji,

Thanks for the response.
The script is run from inside the Superset pod. We use the Arrow Flight port 32010 to access Dremio, and telnet from the Superset pod to the App Gateway hostname on port 32010 works.

One more thing: as mentioned before, we can reach Dremio from the Superset pod when we bypass the App Gateway and access the Dremio AKS LoadBalancer Service IP directly. However, when we provide the same LoadBalancer IP in the connection string in the Superset UI, it doesn't work.

@balaji.ramaswamy - Investigating further, we found that App Gateway only handles HTTP/HTTPS, and since Dremio (exposed on 32010) uses a non-HTTP protocol, it cannot be accessed from behind an App Gateway. Could you please confirm whether this understanding is correct?

@lenoyjacob - Any input here would be really helpful.

@ramprasd89 Talking to the engineering teams, it looks like it needs HTTP/2. I have asked an engineer to follow up with you.

@ramprasd89 Your testing seems to show that your App Gateway does not support the HTTP2 protocol. The Arrow Flight connection uses gRPC, which requires HTTP2.

HTTP2 is the next generation of the HTTP protocol and has evolved significantly from the HTTP1.x specifications. HTTP2 is a binary protocol, whereas HTTP1.x is text-based. It is this binary framing that allows HTTP2 to support gRPC calls, while HTTP1.x cannot.

The JDBC/ODBC connection point on 31010 can be proxied with an HTTP1.x class of LoadBalancer, because it does not use gRPC the way the Arrow Flight connection point on 32010 does.

I suspect that the issue you are having with the JDBC/ODBC connector is different from the one caused by the lack of HTTP2 support for the Arrow Flight connector.

The only common factor could be TLS termination, which can be performed on the load balancer and would come into play if it is not properly set up. You can test this by removing the SSL=1 switch from your JDBC/ODBC connection string and seeing whether that changes the nature of the failure.

Without any stack traces, I am largely guessing about what is actually happening on the JDBC/ODBC endpoint, given that it operates quite differently from the Arrow Flight endpoint. The gRPC unavailable error for Flight is clearer: it is telling us that the client is not able to upgrade the HTTP1.x connection to an HTTP2 connection to allow gRPC to function.
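If the endpoint in front of Flight terminates TLS, one rough way to see whether it will speak HTTP2 at all is to check which protocol it selects during ALPN negotiation. The sketch below uses only Python's standard library; the function names are illustrative assumptions, and the check does not apply to a plaintext (non-TLS) Flight port, where no ALPN negotiation takes place.

```python
import socket
import ssl
from typing import Optional


def negotiated_alpn(host: str, port: int, timeout: float = 5.0) -> Optional[str]:
    """Open a TLS connection offering h2 and http/1.1; return the server's choice."""
    ctx = ssl.create_default_context()  # verifies the server certificate
    ctx.set_alpn_protocols(["h2", "http/1.1"])
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            # None means the server did not take part in ALPN at all.
            return tls.selected_alpn_protocol()


def can_carry_grpc(alpn: Optional[str]) -> bool:
    """gRPC over TLS needs the server to select HTTP2 ("h2")."""
    return alpn == "h2"


# Example (placeholder host): can_carry_grpc(negotiated_alpn("gateway.example.com", 32010))
```

A result of "http/1.1" or None from the gateway hostname, compared with "h2" when pointing at a TLS-enabled Flight endpoint directly, would be consistent with the gateway being the component that blocks gRPC.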

If your Application LoadBalancer cannot support HTTP2 and you have access to a Network LoadBalancer that operates at Layer 4, rather than an Application LoadBalancer that operates at Layer 7, you should have more success getting Arrow Flight to work through an external LoadBalancer.
