We would like to be able to upload a file, create a new dataset from it, and later overwrite that file with added data. This way we don’t have to re-create our downstream SQL statements.
The best way to do this is to use the NAS source connection to connect to a file or directory:
https://docs.dremio.com/data-sources/files-and-directories.html
Each time a query is issued to Dremio, it is pushed down to the file system, so it always runs against the latest data there.
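If it helps, here is a rough sketch of how the file on the share could be overwritten so that a query never lands on a half-written file. It is plain Python against the file system and does not use any Dremio API. The function name `replace_file_atomically` and the file name `sales.csv` are made up for illustration, and the rename is only known to be atomic on a local POSIX file system; behavior on a network share may differ.

```python
import os
import tempfile


def replace_file_atomically(target_path: str, new_contents: str) -> None:
    """Swap in a new version of target_path without a window where it is missing.

    Hypothetical helper for illustration; not part of Dremio.
    """
    directory = os.path.dirname(os.path.abspath(target_path))
    # Write the new version to a temporary file in the same directory so the
    # final rename stays on one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8", newline="") as tmp:
            tmp.write(new_contents)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes reach the disk
        # os.replace overwrites the destination if it already exists.
        os.replace(tmp_path, target_path)
    except BaseException:
        # Clean up the temporary file if anything went wrong before the swap.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


if __name__ == "__main__":
    # Assumed example: overwrite sales.csv in the current directory.
    replace_file_atomically("sales.csv", "id,amount\n1,10\n2,20\n")
```

One caveat: if the source points at a whole directory rather than a single file, the temporary `.tmp` file would briefly sit in that directory too, so you may prefer to stage it in a sibling directory on the same file system.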
Will this work for you?
Yeah, that would be a good workaround. As a future roadmap item, it might be good to allow uploaded files to be overwritten.
@Russ_Wilson alternatively, you can delete the file from Dremio and upload the new version with the same name. With this approach, there will be a short period where connected datasets fail, but they will start working again once the file is back in place, so there is no need to re-create downstream statements. This applies to all connected datasets in Dremio, not only uploaded files.
Actually, that does not work. If the file is connected to a dataset, we get an error message when we try to delete it.
What error message are you getting? It would be great to capture it. Are you referring to the warning about “This will affect connected datasets”?
I did a quick test with the following steps, and I wonder what’s different in your environment:
- upload File
- create dataset that selects * from File
- delete File
- observe dataset preview fails
- upload File again with the same name
- successfully preview dataset
Interesting, it worked for me just now as well. Let me have the person who was doing it try again and see if we still get that error.
Thanks!