Circuit_breaking_exception for an aggregated query

Hi
I want to plug Tableau desktop to my ElasticSearch (5.6.9) index with Dremio (2.0.5) in-between.
The index contains 15 millions rows and is on day level
In tableau I only need the month level so I tried to define the query on top of this virtual dataset

_SELECT _
_“Year” AS “Year”, _
_“Month” AS “Month”, _
_“MonthLabel” AS “Monthlabel”, _
_UC, _
Continent,
Country,
Device,
Query,
“Query without accent”,
sum(Clicks),
sum(Impressions),
sum(“Position”)
FROM “ES GCS Data”.“gge-gcsdata-index”.generictype
group by
_“Year”, _
_“Month”, _
_“MonthLabel”, _
_UC, _
Continent,
Country,
Device,
Query,
“Query without accent”

And I get an error from ES when running the query.
Here is the query that was pushed down to ES, and I can see that dremio ask ES to perform the aggregation

Query {
“size” : 0,
“query” : {
“match_all” : { }
},
“aggregations” : {
“Year” : {
“terms” : {
“field” : “Year”,
“missing” : -2147483648,
“size” : 2147483647
},
“aggregations” : {
“Month” : {
“terms” : {
“field” : “Month”,
“missing” : -2147483648,
“size” : 2147483647
},
“aggregations” : {
“Monthlabel” : {
“terms” : {
“field” : “MonthLabel”,
“missing” : “NULL_STRING_TAG”,
“size” : 2147483647
},
“aggregations” : {
“UC” : {
“terms” : {
“field” : “UC”,
“missing” : “NULL_STRING_TAG”,
“size” : 2147483647
},
“aggregations” : {
“Continent” : {
“terms” : {
“field” : “Continent”,
“missing” : “NULL_STRING_TAG”,
“size” : 2147483647
},
“aggregations” : {
“Country” : {
“terms” : {
“field” : “Country”,
“missing” : “NULL_STRING_TAG”,
“size” : 2147483647
},
“aggregations” : {
“Device” : {
“terms” : {
“field” : “Device”,
“missing” : “NULL_STRING_TAG”,
“size” : 2147483647
},
“aggregations” : {
“Query” : {
“terms” : {
“field” : “Query”,
“missing” : “NULL_STRING_TAG”,
“size” : 2147483647
},
“aggregations” : {
“Query without accent” : {
“terms” : {
“field” : “Query without accent”,
“missing” : “NULL_STRING_TAG”,
“size” : 2147483647
},
“aggregations” : {
“EXPR$9” : {
“sum” : {
“field” : “Clicks”
}
},
“EXPR$10” : {
“sum” : {
“field” : “Impressions”
}
},
“EXPR$11” : {
“sum” : {
“field” : “Position”
}
}
}
}
}
}
}
}
}
}
}
}
}
}
}
}
}
}
}
}
}
}

And here is the error I get from ES

Response Status 503
Response Reason Service Unavailable
Response Body {“error”:{“root_cause”:[{“type”:“circuit_breaking_exception”,“reason”:“[request] Data too large, data for [<agg [EXPR$9]>] would be [2556063392/2.3gb], which is larger than the limit of [2556061286/2.3gb]”,“bytes_wanted”:2556063392,“bytes_limit”:2556061286}],“type”:“search_phase_execution_exception”,“reason”:“all shards failed”,“phase”:“query”,“grouped”:true,“failed_shards”:[{“shard”:4,“index”:“gge-gcsdata-index”,“node”:“4OWoPVXCQGWriLS_V56GhA”,“reason”:{“type”:“circuit_breaking_exception”,“reason”:“[request] Data too large, data for [<agg [EXPR$9]>] would be [2556063392/2.3gb], which is larger than the limit of [2556061286/2.3gb]”,“bytes_wanted”:2556063392,“bytes_limit”:2556061286}}]},“status”:503}
SqlOperatorImpl ELASTICSEARCH_AGGREGATOR_SUB_SCAN
Location 3:0:3
Fragment 3:0

[Error Id: 5283fb4f-2f87-4d02-8867-af1afbe66557 on localhost:31010]

(javax.ws.rs.client.ResponseProcessingException) javax.ws.rs.ServiceUnavailableException: HTTP 503 Service Unavailable
org.glassfish.jersey.client.JerseyInvocation.convertToException():1043
org.glassfish.jersey.client.JerseyInvocation.access$900():92
org.glassfish.jersey.client.JerseyInvocation$7.completed():952
org.glassfish.jersey.client.ClientRuntime.processResponse():196
org.glassfish.jersey.client.ClientRuntime.access$300():74
org.glassfish.jersey.client.ClientRuntime$2$1$1.run():166
org.glassfish.jersey.internal.Errors$1.call():271
org.glassfish.jersey.internal.Errors$1.call():267
org.glassfish.jersey.internal.Errors.process():315
org.glassfish.jersey.internal.Errors.process():297
org.glassfish.jersey.internal.Errors.process():267
org.glassfish.jersey.process.internal.RequestScope.runInScope():340
org.glassfish.jersey.client.ClientRuntime$2$1.response():164
org.glassfish.jersey.client.internal.HttpUrlConnector$3.run():297
java.util.concurrent.Executors$RunnableAdapter.call():511
java.util.concurrent.FutureTask.run():266
jersey.repackaged.com.google.common.util.concurrent.MoreExecutors$DirectExecutorService.execute():299
java.util.concurrent.AbstractExecutorService.submit():112
jersey.repackaged.com.google.common.util.concurrent.AbstractListeningExecutorService.submit():50
jersey.repackaged.com.google.common.util.concurrent.AbstractListeningExecutorService.submit():37
org.glassfish.jersey.client.internal.HttpUrlConnector.apply():293
org.glassfish.jersey.client.ClientRuntime$2.run():180
org.glassfish.jersey.internal.Errors$1.call():271
org.glassfish.jersey.internal.Errors$1.call():267
org.glassfish.jersey.internal.Errors.process():315
org.glassfish.jersey.internal.Errors.process():297
org.glassfish.jersey.internal.Errors.process():267
org.glassfish.jersey.process.internal.RequestScope.runInScope():340
org.glassfish.jersey.client.ClientRuntime$3.run():208
java.util.concurrent.Executors$RunnableAdapter.call():511
java.util.concurrent.FutureTask.run():266
java.util.concurrent.ThreadPoolExecutor.runWorker():1149
java.util.concurrent.ThreadPoolExecutor$Worker.run():624
java.lang.Thread.run():748
Caused By (javax.ws.rs.ServiceUnavailableException) HTTP 503 Service Unavailable
org.glassfish.jersey.client.JerseyInvocation.convertToException():1035
org.glassfish.jersey.client.JerseyInvocation.access$900():92
org.glassfish.jersey.client.JerseyInvocation$7.completed():952
org.glassfish.jersey.client.ClientRuntime.processResponse():196
org.glassfish.jersey.client.ClientRuntime.access$300():74
org.glassfish.jersey.client.ClientRuntime$2$1$1.run():166
org.glassfish.jersey.internal.Errors$1.call():271
org.glassfish.jersey.internal.Errors$1.call():267
org.glassfish.jersey.internal.Errors.process():315
org.glassfish.jersey.internal.Errors.process():297
org.glassfish.jersey.internal.Errors.process():267
org.glassfish.jersey.process.internal.RequestScope.runInScope():340
org.glassfish.jersey.client.ClientRuntime$2$1.response():164
org.glassfish.jersey.client.internal.HttpUrlConnector$3.run():297
java.util.concurrent.Executors$RunnableAdapter.call():511
java.util.concurrent.FutureTask.run():266
jersey.repackaged.com.google.common.util.concurrent.MoreExecutors$DirectExecutorService.execute():299
java.util.concurrent.AbstractExecutorService.submit():112
jersey.repackaged.com.google.common.util.concurrent.AbstractListeningExecutorService.submit():50
jersey.repackaged.com.google.common.util.concurrent.AbstractListeningExecutorService.submit():37
org.glassfish.jersey.client.internal.HttpUrlConnector.apply():293
org.glassfish.jersey.client.ClientRuntime$2.run():180
org.glassfish.jersey.internal.Errors$1.call():271
org.glassfish.jersey.internal.Errors$1.call():267
org.glassfish.jersey.internal.Errors.process():315
org.glassfish.jersey.internal.Errors.process():297
org.glassfish.jersey.internal.Errors.process():267
org.glassfish.jersey.process.internal.RequestScope.runInScope():340
org.glassfish.jersey.client.ClientRuntime$3.run():208
java.util.concurrent.Executors$RunnableAdapter.call():511
java.util.concurrent.FutureTask.run():266
java.util.concurrent.ThreadPoolExecutor.runWorker():1149
java.util.concurrent.ThreadPoolExecutor$Worker.run():624
java.lang.Thread.run():748

Can someone explain this error and the way to fix it?

Thanks

Xavier

It seems like ES is returning an error because there’s too much data. I see Data too large, data for [<agg [EXPR$9]>] would be [2556063392/2.3gb], which is larger than the limit of [2556061286/2.3gb]

I’m not an ES expert but a quick Google search can probably find some ways to up this limit - https://stackoverflow.com/questions/30811046/fielddata-data-is-too-large
Additionally, for testing purposes, you could also try to alter the Tableau query (maybe select less dimensions/filters in Tableau) so it selects less data.