I am evaluating Dremio for one of my client partner's use cases and am wondering whether Dremio has any capabilities for unstructured data. Any supporting documentation would also be helpful.
Do you have any specific sources in mind, such as Elasticsearch, Mongo, or JSON?
Hi,
I have the same need for JSON files.
Does Dremio have the capability to parse JSON files dynamically and recursively?
Are there predefined functions to help parse a JSON file and store the result in a relational schema?
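To make "dynamically and recursively" concrete, here is a minimal sketch of the kind of flattening I mean. It is plain Python, not a Dremio function; the `flatten` helper and the sample record are made up for illustration only.

```python
from typing import Any, Dict


def flatten(record: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Recursively flatten nested objects into dotted column names."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        column = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into nested objects, whatever their depth.
            flat.update(flatten(value, column))
        else:
            # Scalars and lists are kept as-is under the dotted name.
            flat[column] = value
    return flat


# Hypothetical record, used only to exercise the helper.
sample = {
    "id": 1,
    "hours": {"Monday": {"open": "08:00", "close": "17:00"}},
    "tags": ["a", "b"],
}
row = flatten(sample)
```

Each nested level becomes part of the column name (for example `hours.Monday.open`), so the depth does not need to be known in advance.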
You should be able to promote the JSON folder to a PDS (physical dataset), after which Dremio sees it like a relational schema. On top of that, if you create reflections, they would be Dremio-generated Parquet files.
Thanks
@balaji.ramaswamy
Hi balaji,
Dremio sees the JSON as a physical dataset, and that's fine,
but let me explain the problem with a sample used in a Dremio tutorial.
The expected relational schema has at least three relational tables:
table 1 : business
table 2 : business type
table 3 : service hours
So, if I want to get this schema, I have to parse the text of each record in Dremio
to get all the service hours records and store them in a new table,
and things can get tougher if there are multiple nesting levels!
So my question is: is Dremio capable of doing this transformation easily?
{
  "business_id": "vcNAWiLM4dR7D2nwwJ7nCA",
  "full_address": "4840 E Indian School Rd\nSte 101\nPhoenix, AZ 85018",
  "hours": {
    "Tuesday": {
      "close": "17:00",
      "open": "08:00"
    },
    "Friday": {
      "close": "17:00",
      "open": "08:00"
    },
    "Monday": {
      "close": "17:00",
      "open": "08:00"
    },
    "Wednesday": {
      "close": "17:00",
      "open": "08:00"
    },
    "Thursday": {
      "close": "17:00",
      "open": "08:00"
    }
  },
  "open": true,
  "categories": [
    "Doctors",
    "Health & Medical"
  ],
  "city": "Phoenix",
  "review_count": 7,
  "name": "Eric Goldberg, MD",
  "neighborhoods": [],
  "longitude": -111.983758,
  "state": "AZ",
  "stars": 3.5,
  "latitude": 33.499313,
  "attributes": {
    "By Appointment Only": true
  },
  "type": "business"
}
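
To show the target shape, here is a minimal sketch of the split into three tables, written in plain Python outside Dremio. The `normalize` helper is hypothetical, mapping the "business type" table to the `categories` array is my assumption, and the record is abbreviated from the sample above.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def normalize(record: Row) -> Tuple[Row, List[Row], List[Row]]:
    """Split one nested business record into three flat tables."""
    business_id = record["business_id"]

    # Table 1: business, keeping only the scalar top-level fields.
    business = {
        key: value
        for key, value in record.items()
        if not isinstance(value, (dict, list))
    }

    # Table 2: business type, one row per category (assumed mapping).
    business_types = [
        {"business_id": business_id, "category": category}
        for category in record.get("categories", [])
    ]

    # Table 3: service hours, one row per day found under "hours".
    service_hours = [
        {
            "business_id": business_id,
            "day": day,
            "open": hours["open"],
            "close": hours["close"],
        }
        for day, hours in record.get("hours", {}).items()
    ]
    return business, business_types, service_hours


# Abbreviated copy of the sample record above.
record = {
    "business_id": "vcNAWiLM4dR7D2nwwJ7nCA",
    "hours": {
        "Tuesday": {"close": "17:00", "open": "08:00"},
        "Friday": {"close": "17:00", "open": "08:00"},
    },
    "categories": ["Doctors", "Health & Medical"],
    "city": "Phoenix",
    "name": "Eric Goldberg, MD",
}
business, business_types, service_hours = normalize(record)
```

Every child row carries `business_id`, so the three tables can be joined back together. This is the result I would like to obtain from Dremio without writing such code by hand for every nesting level.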