How to query list field in dremio

I have one table which consists of list data in one column.

company|brand
'ABC' | [1]
'DEF' | [2,3]
'XYZ' | [1,3]

i’m using where condition with in

select * from table where brand in (1,3)

but im getting error

Failure finding function: equal(list<int32>, int32) Failure finding function: equal(list<int32>, int32)

Hi @Jaikumar_Chandrapalk

Are you getting the above error on Dremio? Is this a table on Hive.What happens if you run the same SQL via Hive shell or beeline?

Thanks
@balaji.ramaswamy

@balaji.ramaswamy I’m using Dremio & dataset is of Mongo

1 Like

@Jaikumar_Chandrapalk were you able to find a solution to this issue? I’m experiencing it as well.

I’m looking for a solution to this problem as well.

@balaji.ramaswamy can you help?

cc/ @rhanks

Hey Guys,
Even I’m experiencing the same problem while querying a list data type(json array) in Dremio while it works well in hive.

Query i have ran:
SELECT * FROM <table_name> where <column_name> != ‘[]’

Error i’m getting is:
Failure finding function: equal(list<struct<start::varchar, company_name::varchar, last_updated::varchar, title::varchar, linkedin_company_url::varchar, end::varchar, location::varchar, description::varchar>>, varchar)

Can anyone help me on this. We are looking for an early solution to consider Dremio for our business usecases.

Thanks,
Jalandhar

Hello @jalandhar,

First of all, you need to unnest the column to run queries on it.Please review the below link.

https://docs.dremio.com/working-with-datasets/data-curation.html#dealing-with-mixed-types

Thanks,
@Rakesh_Malugu

Thanks @Rakesh_Malugu for the info.

I have already went through that link.

My use case is completely different. I have a huge dataset where one of the field is a list (array of jsons). In few of the fields we do have an empty list ([]) and we need to filter them.
If i unnest or flatten it, it will expand to a different dataset and i won’t get the exact count by that way.

I simply need to query the records having that field as empty list.

Thanks,
Jalandhar

@jalandhar

Is it possible to provide the dataset?

You can change the data values no worries thats not the priority. It is just the format I need.

Thanks,
@Rakesh_Malugu

@Rakesh_Malugu
As i can’t share the exact dataset, i have prepared a sample dataset for sharing purpose.
Please find the below 2 sample json records for this use case:
{“column1”:[],“column2”:[],“column3”:“AB”}
{“column1”:[{“org”:“ABC”,“date”:“Feb 2010”,“type”:“Scholarship”}],“column2”:[],“column3”:“AB”}

So, what i need is to get all the records where column1 is [] (empty list).

Thanks,
Jalandhar

@jalandhar

I’m not able to read the provided sample data from Dremio.

I would like you to re-format and test in Dremio and update it back please.

Thanks,
@Rakesh_Malugu

@Rakesh_Malugu,
Thanks for checking that.
Looks like the json format got changed while copy pasting into the editor.

I’m attaching the sample file for which i have tested in Dremio.

dremio_test.json.zip (695 Bytes)

Please let me know if you are still not able to read the file in dremio.

Thanks,
Jalandhar

@jalandhar

This is a known issue with Dremio. FLATTEN function eliminating rows when they contain an empty array.

This will be get fixed in future release.

Thanks,
@Rakesh_Malugu

Thanks @Rakesh_Malugu for the clarification.

For we have found an alternative to query empty list using below query:
SELECT count(*) FROM <table_name> where list_field[0] is null;