I’d like to develop spaces, data sources, virtual data sources, users, permissions, etc. in a development environment and then sync them to the production environment when ready. Has anyone come up with tools or techniques for doing this?
In my case, I need physical datasources to point to different servers (e.g. Dremio dev server points to Oracle dev server, Dremio prod points to Oracle prod.)
The REST API makes it (more) possible. I’m hoping to find something easy. I need to make sure that running Dremio is a net-positive for my team, and that we’re not spending a lot of effort (including creating bespoke tools) syncing environments.
If you want to migrate definitions from one environment to the next with some level of granularity, you’ll have to make your own scripts to do it. We have a flyway-like system for our environment, but unfortunately I can’t share the code with you. I’d highly recommend copying their model though! It’s pretty simple.
Using the REST API for this is pretty simple. Your dataset definitions (the underlying SQL) remain the same for both environments, you simply create sources that point at the appropriate server depending on the deployment type (production vs development).
Simply creating sources ignores syncing deletion of resources, and I fear that pushing updates that have no real change could trigger refreshing reflections unnecessarily. Also, I wonder whether the API allows creating sources with invalid queries that are made valid with later updates; if not, ordering of updates is important.
I like @desidero’s thoughts around using a migrations tool. It allows a gitops style where changesets can be used as a more granular way of specifying what’s ready to promote to production. I found a promising but possibly stale library that could help, mutagen:
Mutagen takes the general principals of database versioning from frameworks like Flyway and abstracts them so that they can be applied to virtually any resource, not just SQL-based databases.