Deprecated reflection cleanup

I am using Dremio 2.0.5, deployed in OpenShift containers across 10 nodes.

I am looking for solutions to the following:

  1. Is there any option available to purge deprecated reflections?
  2. The distributed data store is on a shared mount point in OpenShift. If any one of the nodes goes down, the reflection status becomes INCOMPLETE. Since the store is on a shared mount, one node going down should not have an impact. I looked at the code; it checks the reflection's partitions against the active nodes. Is there any solution for this?

Thanks in advance. Looking forward to a solution.

For 1, do you mean the reflection has been disabled? The reflection files should be cleaned up automatically on a schedule.

For 2, how do you have your nodes configured for coordinators and executors?

This may help: https://docs.dremio.com/advanced-administration/high-availability.html

Thank you very much for the response.

For 1: say the reflection refresh policy is every 2 hours with expiry after 5 hours. When will the reflection be deleted from the distributed store after expiry? Is there an advanced option/support key available to configure this? I want to purge reflections that are disabled, as well as deprecated reflections.

For 2: yes, one of the nodes acts as both coordinator and executor. I will go through the high availability docs and let you know if anything comes up.

Once again, thanks a lot.

Hi,

By default, deprecated reflections are removed 4 hours after they are marked as deprecated (primarily to let any queries that are still using the reflection finish). We do have a system option called reflection.deletion.grace_seconds that determines the grace period before they are removed. Just be aware that if a reflection is deleted mid-query while it is still in use, things will go wrong.
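For illustration, the grace period could be shortened with that support key from the SQL console. This is a sketch based on the option name above; the value is assumed to be in seconds, with 14400 corresponding to the 4-hour default:

```sql
-- Shorten the deprecated-reflection grace period to 1 hour (assumed unit: seconds)
ALTER SYSTEM SET "reflection.deletion.grace_seconds" = 3600;
```

A shorter grace period frees distributed-store space sooner, at the cost of a higher risk of deleting a reflection that an in-flight query is still reading.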

Thanks, doron, for the response.

Regarding point 2: INCOMPLETE reflection state.

As mentioned, I have 10 nodes: 2 masters (one coordinator-only with no executor, and one master + executor), 8 executor nodes, and an external ZooKeeper configured.

I created one reflection whose partitions live on some of the nodes, e.g. node1 and node2, and the reflection status is CAN_ACCELERATE.

If node1 goes down, the reflection status becomes INCOMPLETE, which is not the expected behavior since the distributed store is on a shared mount point.

ReflectionUtils.hasMissingPartitions checks the active hosts against the reflection's partitions; if any partition's host is missing, the status is set to INCOMPLETE.
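The check described above can be sketched as follows. This is a hypothetical illustration of the logic, not Dremio's actual code: a reflection is treated as INCOMPLETE if any host recorded in its data_partitions is not among the currently active executor hosts.

```python
def has_missing_partitions(partition_hosts, active_hosts):
    """Return True if any partition lives on a host that is not currently active.

    Hypothetical sketch of the behavior attributed to
    ReflectionUtils.hasMissingPartitions in the thread above.
    """
    active = set(active_hosts)
    return any(host not in active for host in partition_hosts)

# Example: node1 is down, so a reflection partitioned on node1/node2
# is flagged as having missing partitions (status would be INCOMPLETE).
print(has_missing_partitions(["node1", "node2"], ["node2", "node3"]))  # True
```

This illustrates why the check trips even with a shared mount: the status is derived purely from host membership, not from whether the files are actually reachable.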

This is a major blocker for me in adopting Dremio. Please let me know if there are any options or solutions to address this.

Looking forward to a response.

Thanks

When you say node1 is down, do you mean the executor was shut down?

Yes, the executor was shut down. I am trying to test auto-scaling features.

The first question I have is how you configured your shared mount point in your dremio.conf file. Are you using something like dist: "pdfs://"${paths.local}"/pdfs"?

You mention INCOMPLETE - so when you run select * from sys.reflections the STATUS for your reflection is INCOMPLETE? If so, can you run select * from sys.materializations - what does the data_partitions column say for your reflection?
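The two diagnostic queries above, run from the Dremio SQL console:

```sql
SELECT * FROM sys.reflections;        -- check the STATUS column
SELECT * FROM sys.materializations;   -- check the data_partitions column
```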

Thanks, doron, for the response.

In dremio.conf:

paths: {
  local: "/var/lib/dremio/local",
  dist: "pdfs:///var/lib/dremio/share"
}

Yes, the sys.reflections status is INCOMPLETE.
The data_partitions column in the sys.materializations table contains the IP addresses of node1 and node2 (xx.xx.xx.xx, xx.xx.xx.xx).

For pdfs, the current expected behavior is that if any of the nodes is unreachable, the reflection will not work. I will open an internal ticket to see whether we should be doing that check for pdfs or not.

Basically, pdfs does not guarantee that the data is available on all nodes since it is only pseudo-distributed - you could be pointing pdfs at a local folder on each node rather than a shared drive.

One other thing you could do is use file:// instead of pdfs:// if you are using a shared drive.

Thank you very much for the suggestion.

I tried with file://; it does not store the IP addresses of the nodes in the data_partitions column of the sys.materializations table.

If I bring one of the nodes down, the reflection state remains CAN_ACCELERATE.

Hello doron,

To configure NAS as distributed storage, the doc mentioned here doesn't use any prefix before the path.

We want to use a shared mounted NFS volume as distributed storage, which I think is a subset of the NAS method. I tried something like

dist: "/shared_mount_path"

but Dremio reported errors parsing the config. Do I need to put file:// as the prefix as well?

Yes file:// should work for you.
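For example, assuming the NFS volume is mounted at the same path on every node (the paths below are illustrative), the dremio.conf entry might look like:

```
paths: {
  local: "/var/lib/dremio/local",
  dist: "file:///shared_mount_path"
}
```

With file:// Dremio treats the location as a plain shared filesystem, so materializations are not tied to the IP addresses of individual executors the way they are with pdfs://.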