Unable to connect to Elastic Cloud Cluster

I can’t seem to connect to a cluster in the cloud via an admin user. Going to _cluster/health using the same URL and user does work, so it is not an issue with auth/auth.

2017-08-10 10:33:51,898 [metadata-refresh] INFO c.d.p.elastic.ElasticConnectionPool - User Error Occurred [ErrorId: 2039b9a6-b75b-4b06-95d2-2926e67a9031]
com.dremio.common.exceptions.UserException: Encountered a problem while executing com.dremio.plugins.elastic.ElasticActions$Health@5bc000b4. Cannot get cluster health information. Please make sure that the user has [cluster:monitor/health] privilege.
    at com.dremio.common.exceptions.UserException$Builder.build(UserException.java:622) ~[dremio-common-1.0.8-201707190805180330-27f36e1.jar:1.0.8-201707190805180330-27f36e1]
    at com.dremio.plugins.elastic.ElasticConnectionPool.addContextAndThrow(ElasticConnectionPool.java:324) [dremio-elasticsearch-plugin-1.0.8-201707190805180330-27f36e1.jar:1.0.8-201707190805180330-27f36e1]
    at com.dremio.plugins.elastic.ElasticConnectionPool.access$500(ElasticConnectionPool.java:80) [dremio-elasticsearch-plugin-1.0.8-201707190805180330-27f36e1.jar:1.0.8-201707190805180330-27f36e1]
    at com.dremio.plugins.elastic.ElasticConnectionPool$ElasticConnection.executeAndHandleResponseCode(ElasticConnectionPool.java:477) [dremio-elasticsearch-plugin-1.0.8-201707190805180330-27f36e1.jar:1.0.8-201707190805180330-27f36e1]
    at com.dremio.plugins.elastic.ElasticsearchStoragePlugin2.getState(ElasticsearchStoragePlugin2.java:223) [dremio-elasticsearch-plugin-1.0.8-201707190805180330-27f36e1.jar:1.0.8-201707190805180330-27f36e1]
    at com.dremio.exec.store.CatalogServiceImpl$NamespaceUpdateThread.refreshSourceStates(CatalogServiceImpl.java:505) [dremio-sabot-kernel-1.0.8-201707190805180330-27f36e1.jar:1.0.8-201707190805180330-27f36e1]
    at com.dremio.exec.store.CatalogServiceImpl$NamespaceUpdateThread.run(CatalogServiceImpl.java:546) [dremio-sabot-kernel-1.0.8-201707190805180330-27f36e1.jar:1.0.8-201707190805180330-27f36e1]
Caused by: javax.ws.rs.ProcessingException: java.net.ConnectException: Operation timed out
    at org.glassfish.jersey.client.internal.HttpUrlConnector.apply(HttpUrlConnector.java:287) ~[jersey-client-2.23.2.jar:na]
    at org.glassfish.jersey.client.ClientRuntime.invoke(ClientRuntime.java:252) ~[jersey-client-2.23.2.jar:na]
    at org.glassfish.jersey.client.JerseyInvocation$2.call(JerseyInvocation.java:701) ~[jersey-client-2.23.2.jar:na]
    at org.glassfish.jersey.internal.Errors.process(Errors.java:315) ~[jersey-common-2.23.2.jar:na]
    at org.glassfish.jersey.internal.Errors.process(Errors.java:297) ~[jersey-common-2.23.2.jar:na]
    at org.glassfish.jersey.internal.Errors.process(Errors.java:228) ~[jersey-common-2.23.2.jar:na]
    at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:444) ~[jersey-common-2.23.2.jar:na]
    at org.glassfish.jersey.client.JerseyInvocation.invoke(JerseyInvocation.java:697) ~[jersey-client-2.23.2.jar:na]
    at com.dremio.plugins.elastic.ElasticActions$Health.getResult(ElasticActions.java:151) ~[dremio-elasticsearch-plugin-1.0.8-201707190805180330-27f36e1.jar:1.0.8-201707190805180330-27f36e1]
    at com.dremio.plugins.elastic.ElasticConnectionPool$ElasticConnection.executeAndHandleResponseCode(ElasticConnectionPool.java:475) [dremio-elasticsearch-plugin-1.0.8-201707190805180330-27f36e1.jar:1.0.8-201707190805180330-27f36e1]
    ... 3 common frames omitted
Caused by: java.net.ConnectException: Operation timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method) ~[na:1.8.0_73]
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[na:1.8.0_73]
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[na:1.8.0_73]
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[na:1.8.0_73]
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[na:1.8.0_73]
    at java.net.Socket.connect(Socket.java:589) ~[na:1.8.0_73]
    at sun.net.NetworkClient.doConnect(NetworkClient.java:175) ~[na:1.8.0_73]
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:432) ~[na:1.8.0_73]
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:527) ~[na:1.8.0_73]
    at sun.net.www.http.HttpClient.<init>(HttpClient.java:211) ~[na:1.8.0_73]
    at sun.net.www.http.HttpClient.New(HttpClient.java:308) ~[na:1.8.0_73]
    at sun.net.www.http.HttpClient.New(HttpClient.java:326) ~[na:1.8.0_73]
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1169) ~[na:1.8.0_73]
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1105) ~[na:1.8.0_73]
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:999) ~[na:1.8.0_73]
    at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:933) ~[na:1.8.0_73]
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1513) ~[na:1.8.0_73]
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441) ~[na:1.8.0_73]
    at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480) ~[na:1.8.0_73]
    at org.glassfish.jersey.client.internal.HttpUrlConnector._apply(HttpUrlConnector.java:394) ~[jersey-client-2.23.2.jar:na]
    at org.glassfish.jersey.client.internal.HttpUrlConnector.apply(HttpUrlConnector.java:285) ~[jersey-client-2.23.2.jar:na]
    ... 12 common frames omitted
2017-08-10 10:33:51,900 [metadata-refresh] INFO c.d.exec.store.CatalogServiceImpl - Ignoring metadata load for source EC 5 since it is currently in a bad state.


Hi @loek,

This is a connection timeout error that occurs when Dremio tries to connect to the configured Elasticsearch server.
Please make sure that you can reach the Elasticsearch server and port from the Dremio server.
You can use a curl command like this: curl 'http://elastic-search-host:9200/?pretty'
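
If curl is not available on the Dremio host, a plain TCP check answers the same question. This is only a rough Python sketch: the function name `can_connect` is made up for illustration, and the host and port are the placeholders from the curl command above, so substitute your own.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connect timeouts, refused connections, and DNS lookup failures.
        return False


# Example call (placeholder host, replace with your own):
#   can_connect("elastic-search-host", 9200)
```

A False result here points at a network problem (firewall, routing, DNS) rather than at credentials.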

Thanks,
Benoy

Hi @benoy, quoting myself here:

Going to _cluster/health using the same URL and user does work, so it is not an issue with auth/auth.

So that’s from the same client + ES host + port + user + pass.

So you can connect from the Dremio host to the ES host independently of Dremio. Could you please check the ES logs to see if there is any further error information when Dremio makes the connection attempt?

There is no log message in Elastic Cloud indicating anything related to Dremio. Have you tested this on Elastic Cloud? @benoy

Hey @loek we were able to reproduce the issue you’ve run into with Elastic Cloud and will reach out once we have an update here.

So far, we’ve been mainly focusing our testing efforts around on-premise Elasticsearch deployments, since this happened to be the case for all current Elasticsearch/Dremio deployments we have.

I noticed this problem as well when connecting to a self-hosted cluster (on Azure, behind a load balancer) from Dremio installed on my local (Windows) machine.

The connection succeeds at first, after which Dremio gets a list of ES hosts. The hosts in this list are all local IP addresses (10.xxxxx), and I have the feeling that these are the addresses Dremio then tries to connect to (instead of always using the load balancer IP address I entered).

When I ran Dremio on an Elasticsearch node, it worked perfectly.


I believe Dremio attempts to connect to the http.publish_host address exposed by each of the ElasticSearch nodes - this needs to be a routable address since Dremio runs searches directly against the data nodes where the relevant shards are located for maximum parallelism and throughput.
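
One way to check this from the outside is to look at the HTTP addresses the nodes advertise and see whether they are private. The Python below is only a sketch: it assumes you have already fetched the JSON from the `_nodes/http` endpoint (for example with curl) and that each node entry carries an `http.publish_address` string of the form `ip:port` or `hostname/ip:port`. The function names are made up for this example.

```python
import ipaddress


def publish_addresses(nodes_info: dict) -> dict:
    """Map node name -> (host, port) from a `_nodes/http`-style response (assumed shape)."""
    result = {}
    for node_id, node in nodes_info.get("nodes", {}).items():
        address = node.get("http", {}).get("publish_address")
        if not address:
            continue
        # Some Elasticsearch versions report "hostname/ip:port"; keep the ip:port part.
        address = address.split("/")[-1]
        host, _, port = address.rpartition(":")
        # Strip the brackets used around IPv6 literals, e.g. "[::1]:9200".
        result[node.get("name", node_id)] = (host.strip("[]"), int(port))
    return result


def is_private(host: str) -> bool:
    """Return True if `host` is an IP literal in a private (non-routable) range."""
    try:
        return ipaddress.ip_address(host).is_private
    except ValueError:
        # Not an IP literal (a host name), so this check cannot tell.
        return False
```

If every advertised address is private while Dremio runs outside that network, a connect timeout like the one in the original post is what you would expect to see.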

I’ve got the same behavior here, emulating an ES cluster inside a docker machine. Dremio gets the list of nodes’ IPs from _cat/nodes API and tries to connect on those.

Reading the code, I can see the problem.

The plugin gets the information about the nodes from ES’s API as a NodesInfo object at ElasticConnectionPool.java:297. I think one workaround could be to skip this step, just as is done when we have a local address [ElasticsearchStoragePlugin2.java:127], and connect only to the address provided on the connection config screen.
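
To make the proposed workaround concrete, here is a rough Python sketch of the host-selection idea. It is not Dremio's code: the function name, parameters, and the `configured_only` flag are all hypothetical and only illustrate the logic.

```python
from typing import List


def hosts_to_query(configured: List[str], discovered: List[str], configured_only: bool) -> List[str]:
    """Pick which hosts a client should send requests to (hypothetical helper)."""
    if configured_only or not discovered:
        # Stay on the addresses the user entered, e.g. a load balancer.
        return list(configured)
    # Otherwise talk to the node addresses reported by the cluster.
    return list(discovered)
```

With `configured_only` set, private node addresses behind a load balancer are never contacted, at the cost of the per-node parallelism mentioned earlier in the thread.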

Thanks Allan, we are working on a fix for this. We’ll circle back with details.

Thanks for your patience!


I just downloaded and installed Dremio to connect to Elasticsearch and got this error. Any update/ETA?
Thanks.

Hi @woundvision - I’m not sure if this is the same underlying issue, but we expect this to be addressed in our 1.2 version, which should be available in a few weeks.

Hi @kelly - can you confirm whether or not this is resolved in version 1.2?

If not, same question as @woundvision - can you provide an ETA?

Yes, this was addressed in 1.2. See https://docs.dremio.com/release-notes/121-release-notes.html

1.3.1 is current and I recommend this as the release to use at this time.

Currently ES6 is not yet supported. We are working on that and hope to have this addressed in the next few weeks.

@kelly

Just downloaded Dremio 1.4.4 the other day on Mac High Sierra 10.13.2
I’m running Dremio by specifying Java 8:
JAVA_HOME=$(/usr/libexec/java_home -v 1.8) open /Applications/Dremio.app

I’m attempting to connect to an ES cluster (running 5.6.7) hosted by Elastic (cloud.elastic.co). I’m using credentials for a superuser and can curl any endpoint, including _cluster/health, so I can confirm there are no auth issues. Yet I’m getting the same error that started this thread:

2018-02-20 22:35:47,427 [qtp103272111-196] INFO  c.d.p.elastic.ElasticConnectionPool - User Error Occurred [ErrorId: 39cb7e12-5d4d-4175-bd9d-e5e9314c3190]
com.dremio.common.exceptions.UserException: Encountered a problem while executing com.dremio.plugins.elastic.ElasticActions$Health@7d4379eb. Cannot get cluster health information.  Please make sure that the user has [cluster:monitor/health] privilege.

I can easily connect to a local instance of Elasticsearch but am seemingly unable to connect to a hosted cluster.

Any guidance you can provide would be great. Thanks.

Just encountered the same issue with Dremio 1.4.9 attempting to connect to an ES cluster (running 2.4.6) hosted by Elastic (cloud.elastic.co).

java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)

Any help would be greatly appreciated.
Thanks

It works!
I just needed to check the “Query whitelisted hosts only” option.


Glad you sorted that out!