Elasticsearch: Validation Failed: 1: [size] cannot be [0] in a scroll context

Hi there,

I’m working on a PoC in which we want to visualize data from Elasticsearch using Dremio as the bridge, i.e. without reflections activated.

Elasticsearch (6.2.2) runs in single-node mode with auth turned off – basically, with default settings. Dremio: 1.4.9 with the latest JDBC driver, running on Windows 7 (did I mention that this is a PoC? ;)).
Executing basic queries, such as select *, works fine. However, as soon as aggregate functions are included in the query, Elasticsearch returns an HTTP 400: "Validation Failed: 1: [size] cannot be [0] in a scroll context".
The simplest example I can give is select count(*) FROM …, which translates into
Query { "size" : 0, "query" : { "match_all" : { } } }
The HTTP Request looks like this: http://10.21.46.50:9200/ny_taxi_data/rides/_search?scroll=100000ms

It seems to me that the combination of scroll (pagination) and size: 0 is invalid.
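To make the suspected conflict concrete, here is a minimal Python sketch that rebuilds the request above without sending anything. The helper names `build_search_request` and `scroll_size_conflict` are hypothetical, and the check is only a local stand-in for the rule implied by the error message; it is not Elasticsearch's or Dremio's actual code.

```python
import json
from urllib.parse import urlencode


def build_search_request(base_url, index, doc_type, body, scroll=None):
    """Return (url, json_body) for a _search call. Nothing is sent."""
    url = f"{base_url}/{index}/{doc_type}/_search"
    if scroll is not None:
        url += "?" + urlencode({"scroll": scroll})
    return url, json.dumps(body)


def scroll_size_conflict(body, scroll=None):
    """Assumed rule, taken from the error text: size 0 plus scroll is rejected."""
    return scroll is not None and body.get("size") == 0


# Body generated for select count(*), as quoted in the post.
count_body = {"size": 0, "query": {"match_all": {}}}

# The request from the post: size 0 together with a scroll parameter.
failing_url, failing_body = build_search_request(
    "http://10.21.46.50:9200", "ny_taxi_data", "rides", count_body,
    scroll="100000ms",
)

# The same body without a scroll parameter does not trip the check.
plain_url, plain_body = build_search_request(
    "http://10.21.46.50:9200", "ny_taxi_data", "rides", count_body,
)

print(failing_url, scroll_size_conflict(count_body, "100000ms"))
print(plain_url, scroll_size_conflict(count_body))
```

If the guess is right, only the first request would be rejected: a size of 0 asks for no hits, so there is nothing for a scroll to page through.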

Any hints what I’m doing wrong? The problem occurs both when accessing via Dremio UI and JDBC.

Thanks and best regards, Tim

75d19ac9-601b-47fa-846b-be1f69c97777.zip (5.5 KB)

Hey Tim, Dremio doesn’t yet support 6.x, so that may be the issue here.

https://docs.dremio.com/data-sources/elasticsearch.html

Any chance you can try the same on 5.x? We won’t have 6.x until later this year.

What you’re trying to do normally works fine.

Hopefully you get the chance to experiment with Reflections; they can make a big performance difference. :)

Hi Kelly,

thanks a lot for the quick reaction – awesome, just like the product ;)
Seems like I didn’t check the manual properly ;( I actually assumed that 6.X must be working, because when I enable Reflections, Dremio happily sucks 15 million records out of Elasticsearch and allows me to query the data in essentially no time (100 ms or less, depending on the complexity of the query, of course).
I also tried importing some huge CSV files and benchmarked it against our existing Spark 2.2 implementation (on the same hardware) – wow ;) Dremio beats Spark+Parquet by an order of magnitude (quick tests only, nothing that would qualify as a professional benchmark).

We depend on 6.X… Our project is not that urgent, so “later this year” could work…

Thanks, Tim

Hello Tim,
Support for Elasticsearch 6.0 is now available in our latest release, Dremio 2.1. More details here.

Hi Lucio,

thanks for the information! I’ll give the latest version a try.