How to query list field in dremio

I have one table which consists of list data in one column.

company|brand
'ABC' | [1]
'DEF' | [2,3]
'XYZ' | [1,3]

i’m using where condition with in

select * from table where brand in (1,3)

but im getting error

Failure finding function: equal(list<int32>, int32) Failure finding function: equal(list<int32>, int32)

Hi @Jaikumar_Chandrapalk

Are you getting the above error on Dremio? Is this a table on Hive.What happens if you run the same SQL via Hive shell or beeline?

Thanks
@balaji.ramaswamy

@balaji.ramaswamy I’m using Dremio & dataset is of Mongo

1 Like

@Jaikumar_Chandrapalk were you able to find a solution to this issue? I’m experiencing it as well.

I’m looking for a solution to this problem as well.

@balaji.ramaswamy can you help?

cc/ @rhanks

Hey Guys,
Even I’m experiencing the same problem while querying a list data type(json array) in Dremio while it works well in hive.

Query i have ran:
SELECT * FROM <table_name> where <column_name> != ‘[]’

Error i’m getting is:
Failure finding function: equal(list<struct<start::varchar, company_name::varchar, last_updated::varchar, title::varchar, linkedin_company_url::varchar, end::varchar, location::varchar, description::varchar>>, varchar)

Can anyone help me on this. We are looking for an early solution to consider Dremio for our business usecases.

Thanks,
Jalandhar

Hello @jalandhar,

First of all, you need to unnest the column to run queries on it.Please review the below link.

https://docs.dremio.com/working-with-datasets/data-curation.html#dealing-with-mixed-types

Thanks,
@Rakesh_Malugu

Thanks @Rakesh_Malugu for the info.

I have already went through that link.

My use case is completely different. I have a huge dataset where one of the field is a list (array of jsons). In few of the fields we do have an empty list ([]) and we need to filter them.
If i unnest or flatten it, it will expand to a different dataset and i won’t get the exact count by that way.

I simply need to query the records having that field as empty list.

Thanks,
Jalandhar

@jalandhar

Is it possible to provide the dataset?

You can change the data values no worries thats not the priority. It is just the format I need.

Thanks,
@Rakesh_Malugu

@Rakesh_Malugu
As i can’t share the exact dataset, i have prepared a sample dataset for sharing purpose.
Please find the below 2 sample json records for this use case:
{“column1”:[],“column2”:[],“column3”:“AB”}
{“column1”:[{“org”:“ABC”,“date”:“Feb 2010”,“type”:“Scholarship”}],“column2”:[],“column3”:“AB”}

So, what i need is to get all the records where column1 is [] (empty list).

Thanks,
Jalandhar

@jalandhar

I’m not able to read the provided sample data from Dremio.

I would like you to re-format and test in Dremio and update it back please.

Thanks,
@Rakesh_Malugu

@Rakesh_Malugu,
Thanks for checking that.
Looks like the json format got changed while copy pasting into the editor.

I’m attaching the sample file for which i have tested in Dremio.

dremio_test.json.zip (695 Bytes)

Please let me know if you are still not able to read the file in dremio.

Thanks,
Jalandhar

@jalandhar

This is a known issue with Dremio. FLATTEN function eliminating rows when they contain an empty array.

This will be get fixed in future release.

Thanks,
@Rakesh_Malugu

Thanks @Rakesh_Malugu for the clarification.

For we have found an alternative to query empty list using below query:
SELECT count(*) FROM <table_name> where list_field[0] is null;

Hey @Rakesh_Malugu I am having a similar issue, where I am trying to filter rows based on array equality. I am using following syntax

WHERE listColumn = ARRAY['value']

But I am getting following error

java.lang.AssertionError: todo: implement syntax SPECIAL(ARRAY(‘value’))

Is it something that is not supported, but will be supported in the future? Or is there a different syntax I can use to filter rows based on array comparison?

Thank you, Igor

Hello @igorsechyn

You cannot use the array directly in the query. However you can unnest the column so that it will split into multiple records with diff array values. Please use the below link for steps.

Thanks,
Rakesh

Hi @Rakesh_Malugu

Thank you for your answer. I have seen the unnest option, this would not work for our use case however. Is supporting ARRAY['value'] on a roadmap?

Cheers, Igor