Apache Calcite version

Hello,

I am currently evaluating Dremio as an alternative to our existing stack.

I was wondering what version of Apache Calcite Dremio 3.3 is using, and if there are any plans to upgrade it to at least version 1.18 (released on 2018-12-21).

The reason is that I am dealing with an EAV model and would like to interact with it using the JSON functions that were added to Calcite in 1.18, however trying to use any of these functions in Dremio 3.3 just returns a ‘no match found for function signature’ error.

Dremio claims to remove the need for ETL, but at least for this use-case I still need to use an ETL tool.

Thanks!

Hey! Welcome to the Dremio community site :smiley:

Which functions are you interested in from calcite? We don’t directly expose calcite functions however we do support json manipulation. Most notably you can select or filter on json subfields:

SELECT "business.json".hours.Monday FROM "business.json"
    where "business.json".attributes.GoodForKids='True' 

You can also flatten or extract fields: https://docs.dremio.com/sql-reference/sql-functions/nested-data.html

You can see all the functions supported in dremio here: https://docs.dremio.com/sql-reference/sql-functions/

Best,
Ryan

To answer the Apache Calcite version part more specifically, Dremio uses a version closely similar to Apache Calcite 1.16, and we are regularly updating the version.
That said, using a newer version of Calcite doesn’t mean that new functions added to Calcite are automatically available in Dremio: this is because Dremio and Calcite do not use the same execution engine, and simply support a different set of functions.

Like @rymurr, Dremio already has some support for JSON functions, and so you might be able to not use an ETL anymore. If not, feel free to share which functions would be missing, and our product team could then evaluate them for inclusion in the product.

Thanks guys, its a bit clearer now. I was missing the connection that you’re just using Calcite’s parser and you’ve implemented your own execution engine.

My use-case was actually pivoting a table that looks something like:

itemId | attribute | value
1 | name | foo
1 | color | green
1 | size | small

To something like:

itemId | attributes
1 | {“name”: “foo”, “color”: “green”, “size”: “small”}

And then allowing users to pick out attributes in their own datasets based on mine, eg:

select itemId, json_value(attributes, ‘$.size’) as size from table

I can’t see a way to do that in Dremio at present

You are correct, currently there is no way to pivot a table into a json document. In your example above what Dremio can do is select itemId, "value" as size where "attribute"='size'. That will produce the same result as your example above. Does that help at all?