Create a dataset from a subset of Parquet files

I can see how I would create datasets from a single file. I can also see how to create a dataset from all files in a directory. Both of those are covered in the UI docs here. And I see that I can do both of those from the API as documented here.

What I want to do is slightly different. I want to use the API to create a dataset from 100 files in an S3 bucket that holds 100,000 files. I don’t want Dremio to scan the whole S3 bucket, because with that many files it will take FOREVER. But I have the file names, and I’d like to tell Dremio, “hey, make a single dataset from these files. Here are the names…”

Is that what the propertyList parameter is for? I can’t find any documentation or examples showing which key/value pairs one can put into propertyList, or why. Or is there a completely different method?


@jdlong Currently there is no way to do a partial promotion. A not-so-great workaround would be to move the 100 files into a subfolder and promote that subfolder.
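For anyone who wants to script that workaround, here is a rough sketch in Python. It is not an official recipe. It assumes the files are copied (not moved) with a boto3 S3 client passed in as `s3`, that file names stay unique once the folder structure is flattened, and that the subfolder is then promoted with a `POST` to the catalog endpoint. The function names, the `subset/run1` prefix and the `s3src` source name are all made up for illustration, and the promotion request shape is my reading of the catalog API docs, so please verify it against your Dremio version.

```python
import posixpath
from urllib.parse import quote


def plan_copies(keys, dest_prefix):
    """Map each source key to a key directly under dest_prefix (file name only)."""
    dest_prefix = dest_prefix.strip("/")
    plan = {}
    seen = set()
    for key in keys:
        dest = f"{dest_prefix}/{posixpath.basename(key)}"
        if dest in seen:
            # Two source files would land on the same name in the subfolder.
            raise ValueError(f"duplicate file name after flattening: {key}")
        seen.add(dest)
        plan[key] = dest
    return plan


def copy_subset(s3, bucket, keys, dest_prefix):
    """Copy the listed objects into dest_prefix within the same bucket.

    `s3` is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    copy_object handles objects up to 5 GB in a single call.
    """
    plan = plan_copies(keys, dest_prefix)
    for src, dest in plan.items():
        s3.copy_object(
            Bucket=bucket,
            Key=dest,
            CopySource={"Bucket": bucket, "Key": src},
        )
    return plan


def promotion_request(source_name, bucket, dest_prefix):
    """Build the URL path and JSON body for promoting the subfolder.

    ASSUMPTION: the folder id is the URL-encoded "dremio:/<path>" string and
    the body uses type PHYSICAL_DATASET with a Parquet format. Check the
    catalog API docs for your Dremio version before relying on this.
    """
    path = [source_name, bucket, *dest_prefix.strip("/").split("/")]
    entity_id = quote("dremio:/" + "/".join(path), safe="")
    body = {
        "entityType": "dataset",
        "id": entity_id,
        "path": path,
        "type": "PHYSICAL_DATASET",
        "format": {"type": "Parquet"},
    }
    return f"/api/v3/catalog/{entity_id}", body
```

Typical use would be `copy_subset(boto3.client("s3"), "my-bucket", my_100_keys, "subset/run1")`, then sending the body from `promotion_request("s3src", "my-bucket", "subset/run1")` to your Dremio host with your auth token. Copying rather than moving leaves the original files in place, at the cost of storing those 100 files twice.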