Is reflection storage intended to be private per node or per cluster?

Are reflections meant to be stored on a shared data source, or are they private to each node in a cluster?

They should be stored on a shared storage layer, such as S3, HDFS, or NAS.

You can set the config here:

# storage area for the accelerator cache.
accelerator: ${paths.dist}/accelerator
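
As a rough sketch of how this might look in `dremio.conf`: because the accelerator path is derived from `paths.dist` by default, pointing `paths.dist` at shared storage should move reflection data there too. The host name, port, and directories below are made-up placeholders, not values from this thread, and the `hdfs://` form is only one option; check the distributed storage docs for the exact URI to use with S3 or NAS.

```hocon
paths: {
  # Local, per-node directory (placeholder path, adjust for your install).
  local: "/var/lib/dremio"

  # Shared location that every node in the cluster can reach.
  # "namenode.example.com", the port, and "/dremio/dist" are hypothetical.
  dist: "hdfs://namenode.example.com:8020/dremio/dist"

  # Default shown above: reflections live under paths.dist, so no separate
  # override is needed once dist points at shared storage.
  accelerator: ${paths.dist}/accelerator
}
```

With a layout like this, only `paths.local` stays on the node itself; the reflection data sits on the shared layer.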


Also, see: Storing reflection data in S3

I think we could do a better job documenting this.


@swarren some details on how to configure for different distributed storage locations:


Thanks, I’ll look into that. I’ve started learning on a single-node instance, which has one data directory with the ZooKeeper and reflection storage under it. So it’s not clear to me yet what a production multi-node cluster looks like.

So I’m now thinking that nodes are disposable, with the exception of the ZooKeeper data (for nodes participating as embedded ZooKeeper instances).

Yes, except for the master/coordinator, all other nodes are disposable.