Dremio API 104 closed by remote host error

R_Dirden · January 11, 2019, 9:46pm

While trying to kick off a string of API calls from different Docker containers, I’m getting a 104 error: ('Connection aborted.', error(104, 'Connection reset by peer'))

I can run the API calls from one container, but the instant I kick off the second group of calls, which queries a different table, both sets of code fail because the connection is closed by the host. I even tried creating different user accounts for the API calls. Any help here would be great.

balaji.ramaswamy · January 11, 2019, 10:28pm

Hi @R_Dirden

Do you see any jobs for this on the Jobs page in the Dremio UI? If yes, can you please send us the profile? If not, kindly send us the server.log.

Dremio Logs
Share a Query Profile

Thanks
@balaji.ramaswamy

R_Dirden · January 14, 2019, 4:24pm

The server.log file is not being updated. Also, there’s no failed query to pull a profile from, because the call isn’t making it through to Dremio. Is there some type of setting governing API calls from different sessions?

doron · January 14, 2019, 5:20pm

Hi,

If the logs are not being updated, then it sounds like a networking issue. Can you ping both the Dremio UI and the first container from the second container?
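
Since ping only exercises ICMP, a plain TCP connection attempt to the Dremio port may be a closer check of what the API client actually needs. The following is a minimal sketch and not something taken from this thread: `can_connect` is a hypothetical helper, and the host name and port mentioned afterwards are assumptions about the setup.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    This only shows that something is listening and reachable; it does
    not prove that the REST API behind the port is healthy.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Run from inside each Python container, a call such as `can_connect("dremio", 9047)` would be the check, where `dremio` is an assumed container host name and 9047 is assumed to be the port the Dremio UI and REST API listen on. If it returns True in both containers, basic reachability is unlikely to be the problem.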

R_Dirden · January 14, 2019, 7:10pm

I’m able to kick off the code and continuously submit queries from either of the 2 containers individually. They only fail the moment I try to run the code from the 2 containers in parallel. So the 2 containers can definitely talk to Dremio and actively submit and receive data via the API.

balaji.ramaswamy · January 14, 2019, 7:15pm

From the coordinator, node activity, do you see both nodes?

R_Dirden · January 14, 2019, 7:47pm

I’m using a single-node Dremio instance, and I’m accessing that instance from 2 separate Python containers.

balaji.ramaswamy · January 14, 2019, 7:52pm

Hi @R_Dirden

We would need Dremio to be running on the containers if they need to execute queries… We are a little confused by your architecture. Can you please explain?

Thanks
@balaji.ramaswamy

R_Dirden · January 14, 2019, 8:03pm

Dremio 3.0 is running as a single node within a container. I also have 2 idle Python containers that I use to start and stop processes. If I go into either of the 2 Python containers and start a process that uses Dremio’s API, it works without any issues. However, if I go into the container that’s not running and start the process while the first Python process is still running, both fail immediately. I’m not sure why they run without issues individually but fail when I run them at the same time.

balaji.ramaswamy · January 14, 2019, 8:14pm

Hi @R_Dirden

When you say "if I go into either of the 2 python containers and start a process that use Dremio’s API", do you mean running a query?

Thanks
@balaji.ramaswamy

doron · January 14, 2019, 8:50pm

@R_Dirden

Can you give us the exact HTTP response when it’s failing? Is there perhaps a proxy in front of the Dremio instance?
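
One way to collect that information is to record either the full HTTP response or, when the connection drops before any response arrives, the transport-level error. The sketch below is only an illustration using Python 3's standard `http.client`; `capture_response` is a hypothetical helper and not part of any Dremio client.

```python
import http.client


def capture_response(host, port, path, headers=None, timeout=10.0):
    """Issue one GET and return what came back, or the error that occurred."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path, headers=headers or {})
        resp = conn.getresponse()
        body = resp.read().decode("utf-8", "replace")
        return {
            "status": resp.status,
            "reason": resp.reason,
            "headers": dict(resp.getheaders()),
            "body": body,
            "error": None,
        }
    except (OSError, http.client.HTTPException) as exc:
        # A reset connection ends up here: there is no HTTP response at all.
        return {"status": None, "reason": None, "headers": {},
                "body": None, "error": repr(exc)}
    finally:
        conn.close()
```

A result with `status` set to None and a reset error in `error` would suggest the connection was dropped below the HTTP layer, whereas a 4xx or 5xx status with a body would point at whatever is answering on that port, whether that is Dremio or a proxy.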

R_Dirden · January 15, 2019, 5:35pm

Container 1 runs, then fails almost immediately after starting container 2:
Status {u'queueId': u'LARGE', u'jobState': u'RUNNING', u'resourceSchedulingStartedAt': u'2019-01-15T17:31:55.858Z', u'errorMessage': u'', u'queryType': u'REST', u'rowCount': 0, u'resourceSchedulingEndedAt': u'2019-01-15T17:31:56.606Z', u'startedAt': u'2019-01-15T17:31:54.139Z', u'queueName': u'LARGE'}
Status {u'queueId': u'LARGE', u'jobState': u'RUNNING', u'resourceSchedulingStartedAt': u'2019-01-15T17:31:55.858Z', u'errorMessage': u'', u'queryType': u'REST', u'rowCount': 0, u'resourceSchedulingEndedAt': u'2019-01-15T17:31:56.606Z', u'startedAt': u'2019-01-15T17:31:54.139Z', u'queueName': u'LARGE'}
Status {u'queueId': u'LARGE', u'jobState': u'RUNNING', u'resourceSchedulingStartedAt': u'2019-01-15T17:31:55.858Z', u'errorMessage': u'', u'queryType': u'REST', u'rowCount': 0, u'resourceSchedulingEndedAt': u'2019-01-15T17:31:56.606Z', u'startedAt': u'2019-01-15T17:31:54.139Z', u'queueName': u'LARGE'}
Traceback (most recent call last):
  File "Correlation.py", line 171, in <module>
    qryRowCount, qryRowArray = checkRunStatus(arrJobList)
  File "Correlation.py", line 70, in checkRunStatus
    response = requests.request("GET", urlStatus + sJobId, headers=headers)
  File "/usr/local/lib/python2.7/site-packages/requests/api.py", line 58, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/requests/sessions.py", line 508, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python2.7/site-packages/requests/sessions.py", line 618, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/requests/adapters.py", line 490, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', error(104, 'Connection reset by peer'))

Container 2 fails after making a few successful calls:
Status {u'queueId': u'LARGE', u'jobState': u'RUNNING', u'resourceSchedulingStartedAt': u'2019-01-15T17:31:24.609Z', u'errorMessage': u'', u'queryType': u'REST', u'rowCount': 0, u'resourceSchedulingEndedAt': u'2019-01-15T17:31:25.275Z', u'startedAt': u'2019-01-15T17:31:24.314Z', u'queueName': u'LARGE'}
Status {u'queueId': u'LARGE', u'jobState': u'RUNNING', u'resourceSchedulingStartedAt': u'2019-01-15T17:31:24.609Z', u'errorMessage': u'', u'queryType': u'REST', u'rowCount': 0, u'resourceSchedulingEndedAt': u'2019-01-15T17:31:25.275Z', u'startedAt': u'2019-01-15T17:31:24.314Z', u'queueName': u'LARGE'}
Status {u'queueId': u'LARGE', u'jobState': u'RUNNING', u'resourceSchedulingStartedAt': u'2019-01-15T17:31:24.609Z', u'errorMessage': u'', u'queryType': u'REST', u'rowCount': 0, u'resourceSchedulingEndedAt': u'2019-01-15T17:31:25.275Z', u'startedAt': u'2019-01-15T17:31:24.314Z', u'queueName': u'LARGE'}
Status {u'queueId': u'LARGE', u'jobState': u'RUNNING', u'resourceSchedulingStartedAt': u'2019-01-15T17:31:24.609Z', u'errorMessage': u'', u'queryType': u'REST', u'rowCount': 0, u'resourceSchedulingEndedAt': u'2019-01-15T17:31:25.275Z', u'startedAt': u'2019-01-15T17:31:24.314Z', u'queueName': u'LARGE'}
Status {u'queueId': u'LARGE', u'jobState': u'RUNNING', u'resourceSchedulingStartedAt': u'2019-01-15T17:31:24.609Z', u'errorMessage': u'', u'queryType': u'REST', u'rowCount': 0, u'resourceSchedulingEndedAt': u'2019-01-15T17:31:25.275Z', u'startedAt': u'2019-01-15T17:31:24.314Z', u'queueName': u'LARGE'}
Traceback (most recent call last):
  File "Correlation2.py", line 89, in <module>
    tagCount, tagRowCounts = checkRunStatus(arrTagListId)
  File "Correlation2.py", line 70, in checkRunStatus
    response = requests.request("GET", urlStatus + sJobId, headers=headers)
  File "/usr/local/lib/python2.7/site-packages/requests/api.py", line 58, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/requests/sessions.py", line 508, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python2.7/site-packages/requests/sessions.py", line 618, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/requests/adapters.py", line 490, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', error(104, 'Connection reset by peer'))

doron · January 15, 2019, 7:41pm

On the Jobs page in the Dremio UI, are there any failed jobs listed? If not, then it seems like it’s a networking setup issue, as the queries are not even arriving at Dremio.

R_Dirden · January 15, 2019, 7:49pm

Yeah, I believe it may be some sort of networking issue as well. I may just be making way too many calls back to back.
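
If back-to-back calls are indeed the trigger, one option is to pause between status checks and to back off after a connection error rather than failing on the first one. The sketch below assumes that is the cause, which this thread does not confirm. `poll_job_status` and its parameters are hypothetical, and `fetch_status` stands in for whatever function performs the actual HTTP call and returns the decoded status dictionary.

```python
import time


def poll_job_status(fetch_status, job_id, interval=1.0, max_retries=5,
                    backoff=2.0, in_progress=("RUNNING",), sleep=time.sleep):
    """Poll until the job leaves an in-progress state.

    Waits `interval` seconds between successful polls. After a connection
    error the wait is multiplied by `backoff` for each consecutive failure,
    and the error is re-raised once `max_retries` failures occur in a row.
    Only "RUNNING" appears in the thread's output; any other in-progress
    state names would have to be added by the caller.
    """
    failures = 0
    delay = interval
    while True:
        try:
            status = fetch_status(job_id)
        except OSError:
            # requests' ConnectionError and the built-in ConnectionResetError
            # are both OSError subclasses.
            failures += 1
            if failures > max_retries:
                raise
            delay *= backoff
            sleep(delay)
            continue
        # A successful call resets the failure count and the delay.
        failures = 0
        delay = interval
        if status.get("jobState") not in in_progress:
            return status
        sleep(interval)
```

With the `requests` calls shown in the tracebacks, `fetch_status` could be something like `lambda job_id: requests.get(urlStatus + job_id, headers=headers).json()`. Passing the fetch function in keeps the pacing logic independent of the HTTP library, so the same loop can be used unchanged in both containers.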
