Docker-compose example

Hello, I would like to try a minimal local Dremio cluster to read local data (or even S3 data later on).

I would like to use the dremio/dremio-oss Docker image from Docker Hub.

Do you have any documentation or a working docker-compose example? Thank you.

Hi,

Why do you want to use docker-compose? If you want a standalone instance you can just run something like this:

docker run -p 9047:9047 -p 31010:31010 -p 45678:45678 -p 32010:32010 --name Dremio -v "C:\MYSHARE:/MNT/MYSHARE" dremio/dremio-oss

Way easier.
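
On Linux or macOS the same command works with a Unix-style host path, for example (the host path here is only an illustration, adjust to your machine):

docker run -p 9047:9047 -p 31010:31010 -p 45678:45678 -p 32010:32010 --name Dremio -v /home/me/myshare:/MNT/MYSHARE dremio/dremio-oss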

But if you want to use docker-compose you can create a file named "docker-compose.yml" with content like this:

version: "3.2"
services:
  mssql:
    container_name: mssql
    image: microsoft/mssql-server-linux:2017-latest
    ports:
      - "1433:1433"
    environment:
      SA_PASSWORD: "A_P@ssw0rd"
      ACCEPT_EULA: "Y"
  dremio:
    container_name: dremio
    image: dremio/dremio-oss
    volumes:
      - C:\MYSHARE:/MNT/MYSHARE
    ports:
      - "9047:9047"
      - "31010:31010"
      - "32010:32010"
      - "45678:45678"
    links:
      - mssql

Then just type "docker-compose up" and it should start both containers.
Then create an account in Dremio (you can access it at "localhost:9047" from your host)
and try to add an External Source (for the MSSQL container above, the host is "mssql", the port is 1433, and the credentials are the "sa" user with the SA_PASSWORD from the compose file).


You can untick "Verify server certificate" under Advanced Options.

You can also put some Parquet file (or whatever) in the "C:\MYSHARE" folder of your host,
add a "NAS" source under Data Lakes,
promote your file,
and then you can query it from there.
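
For example, assuming the NAS source was named MYSHARE and the promoted file is data.parquet (both names are just placeholders for whatever you used), a query could look like:

SELECT * FROM MYSHARE."data.parquet" LIMIT 100;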

Hope it helps a little bit :wink:

You can swap in other Docker images instead of MSSQL; if you want to give Postgres or Greenplum a try, for instance, the approach should be the same.

Hello, thank you very much for your answer.

Why do you want to use docker-compose?
It's a good tool for running multiple containers locally. I could run a Helm chart in kind, but I prefer to keep it simple at the start.

Also, I would like to run a real cluster: a coordinator node and some executor nodes, all in separate containers launched by docker-compose.

Something like this:

version: "3.2"
services:
  mssql:
    container_name: mssql
    image: microsoft/mssql-server-linux:2017-latest
    ports:
      - "1433:1433"
    environment:
      SA_PASSWORD: "A_P@ssw0rd"
      ACCEPT_EULA: "Y"

  dremio_coordinator:
    container_name: dremio_coordinator
    image: dremio/dremio-oss
    volumes:
      - C:\MYSHARE:/MNT/MYSHARE
    environment:
      CONF: "is_coordinator"
      EXECUTORS: "dremio_executor_1,dremio_executor_2"
    ports:
      - "9047:9047"
      - "31010:31010"
      - "32010:32010"
      - "45678:45678"
    links:
      - mssql
  dremio_executor_1:
    container_name: dremio_executor_1
    image: dremio/dremio-oss
    volumes:
      - C:\MYSHARE:/MNT/MYSHARE
    environment:
      CONF: "is_executor"
      ID: "1"
    links:
      - dremio_coordinator
  dremio_executor_2:
    container_name: dremio_executor_2
    image: dremio/dremio-oss
    volumes:
      - C:\MYSHARE:/MNT/MYSHARE
    environment:
      CONF: "is_executor"
      ID: "2"
    links:
      - dremio_coordinator

Couchbase has an example of running multiple containers locally to form a cluster:

→ Install Couchbase Server Using Docker | Couchbase Docs

@YEN provided a great tutorial on the quick docker run approach for basic functionality.

Can you explain what you hope to achieve from Docker Compose? Do you wish to simulate one or more of Dremio’s production architecture(s)? Do you want to tweak or customize the Dremio front end from dremio-oss?

The main purpose of running a Docker version of Dremio for me is a local development and test environment. My main Dremio runs on AWS. I want to test or validate a few queries and try changing some settings that Dremio uses both on AWS and in Docker. I want to see whether my laptop is faster at refreshing reflections all locally, and whether it may be more cost-effective to run Dremio from a custom Docker image or a docker-compose setup that includes another Oracle or SQL Server config.
I deployed Oracle Enterprise 19c on Docker, then used it as a data source from Dremio-OSS, for example.

If your interest is more in the multi-node approach, or you have a hosted Docker enterprise environment that can leverage scalability or run as Kubernetes, running a dremio-oss image and a few data sources won't be enough.

If you don't plan to use the AWS or Azure versions of Dremio, you need to consider the complexity of running Hadoop or MapR as a prerequisite to enabling your docker-compose components, a.k.a. engines, clusters, or swarms. Most of those multi-node deployments are so complex that other companies provide custom images and managed services around those more open solutions.

If you want to simulate the self-hosted Hadoop or MapR environment of Dremio, then I could see some value in a docker-compose setup; just remember the bulk of the work will not be Dremio-specific. Instead, you'd need a suite of Hadoop Docker image configurations that Dremio would fit well with, perhaps a few cloud images to integrate with a cloud storage layer, or even a managed hybrid-cloud YARN cluster that could share both cloud and local nodes to run Dremio queries.

Dremio on Hadoop YARN, I think, could be fully open source in a docker-compose setup; that would really benefit the community with a pre-built "test suite" of images for simulating or testing a multi-node Hadoop engine with Dremio as the query engine and front end.
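
That said, if the goal is just a coordinator plus a couple of executors on one machine, a sketch along these lines might be a starting point. It is untested: it assumes the dremio/dremio-oss image honors the DREMIO_JAVA_SERVER_EXTRA_OPTS variable from dremio-env, that Dremio's services.coordinator.enabled / services.executor.enabled / zookeeper settings can be overridden via -D system properties, and that the coordinator's embedded ZooKeeper listens on its default port 2181, so verify those against the docs:

version: "3.2"
services:
  dremio-coordinator:
    container_name: dremio-coordinator
    image: dremio/dremio-oss
    volumes:
      - C:\MYSHARE:/MNT/MYSHARE
    ports:
      - "9047:9047"
      - "31010:31010"
      - "32010:32010"
    environment:
      # coordinator-only node; embedded ZooKeeper stays enabled on its default port
      DREMIO_JAVA_SERVER_EXTRA_OPTS: "-Dservices.executor.enabled=false"
  dremio-executor-1:
    container_name: dremio-executor-1
    image: dremio/dremio-oss
    volumes:
      - C:\MYSHARE:/MNT/MYSHARE
    depends_on:
      - dremio-coordinator
    environment:
      # executor-only node registering with the coordinator's embedded ZooKeeper
      DREMIO_JAVA_SERVER_EXTRA_OPTS: "-Dservices.coordinator.enabled=false -Dservices.coordinator.master.enabled=false -Dservices.coordinator.master.embedded-zookeeper.enabled=false -Dzookeeper=dremio-coordinator:2181"
  dremio-executor-2:
    container_name: dremio-executor-2
    image: dremio/dremio-oss
    volumes:
      - C:\MYSHARE:/MNT/MYSHARE
    depends_on:
      - dremio-coordinator
    environment:
      DREMIO_JAVA_SERVER_EXTRA_OPTS: "-Dservices.coordinator.enabled=false -Dservices.coordinator.master.enabled=false -Dservices.coordinator.master.embedded-zookeeper.enabled=false -Dzookeeper=dremio-coordinator:2181"

Only the coordinator needs published ports here. For anything beyond a quick smoke test, all nodes would also need to share the same paths.dist location (the distributed store for reflections, uploads, and so on), for example a common volume mounted at the same path in every container.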


Can you explain what you hope to achieve from Docker Compose

Already said → a minimal local Dremio cluster (not a single one-node container) to read local data.

I don't need AWS, Oracle, Azure, Hadoop, MapR, or YARN.

It's so easy with Spark, Trino, CouchDB, Redis…

I will handle it myself, thanks for trying.