Reflection "Failed to spill on disks" on Azure Data Lake

vplauzon · April 21, 2021, 7:43pm

Hi,

I’m new to Dremio.

I’m trying to create reflections on a dataset stored on Azure Data Lake as csv.gz blobs.

I successfully loaded the data lake. I then created a dataset for each of the 4 subfolders. The datasets total 1.2 TB.

I then activated a raw reflection on each dataset, specifying a column with a cardinality of about 1K. Each reflection fails with “Failed to spill to disk. Please check space availability”.

I run a 5-VM cluster on the E16-v3 SKU, which has 400 GB of disk space. So that should be plenty for the 1.2 TB, shouldn’t it? Not even one reflection works.

Here are some details on one of them:

Input Bytes:	131.64 GB
Input Records:	458,525,083

Any tips on why this isn’t working?

balaji.ramaswamy · April 22, 2021, 3:30am

@vplauzon

Would it be possible to share the query profile?

vplauzon · April 22, 2021, 2:09pm

Hi Balaji,

Of course, here it is. I scrubbed a couple of details (email + storage account name).

header.zip (47.3 KB)

Thank you!

balaji.ramaswamy · April 23, 2021, 5:27am

@vplauzon

It is possible that Dremio cannot write to “/mnt/resource/dremio/spill” because of permissions. Can we validate this?

vplauzon · April 23, 2021, 1:15pm

How can we validate that?

It’s a vanilla image fresh from the Azure Marketplace (VMSS). I didn’t tweak the config in any way.

balaji.ramaswamy · April 25, 2021, 5:22am

@vplauzon Log on to “mydremioq000001.internal.cloudapp.net” and go to “/mnt/resource/dremio/spill” and check
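
A minimal sketch of what such a check could look like on an executor, assuming a POSIX shell. The helper name check_spill_dir is hypothetical and not part of Dremio; it only tests that the directory exists, is writable by the user running the check, and reports the free space on its filesystem. The writability test is only meaningful when run as the account the Dremio service uses, which this thread does not name.

```shell
# Hypothetical helper (not a Dremio tool): report whether a spill directory
# exists, is writable by the current user, and how much space is available.
check_spill_dir() {
  dir="$1"
  if [ ! -d "$dir" ]; then
    echo "missing: $dir"
    return 1
  fi
  if [ ! -w "$dir" ]; then
    echo "not writable: $dir"
    return 2
  fi
  # df -P gives POSIX single-line output per filesystem; -k reports 1K blocks.
  # Column 4 of the second line is the available space.
  avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
  echo "ok: $dir (${avail_kb} KB available)"
  return 0
}

# Example call, using the path mentioned in this thread:
#   check_spill_dir /mnt/resource/dremio/spill
```

A "missing" result would suggest the directory was never created, while "not writable" would point to a permissions problem.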

vplauzon · May 3, 2021, 9:59pm

Hi Balaji,

After logging in to the first executor, I got the following when trying to cd with sudo:

-bash: cd: /mnt/resource/dremio/spill: No such file or directory

The “highest” I could get through the hierarchy is:

$ sudo ls /mnt/resource
DATALOSS_WARNING_README.txt lost+found

I did update the VM scale set image…

Is there an issue with the Azure Marketplace image?

vplauzon · May 7, 2021, 9:04pm

Any ideas?

Do you think there was a bug with the Az Marketplace version I used?

balaji.ramaswamy · May 8, 2021, 2:31pm

@vplauzon Let me check. For Azure, we recommend using AKS. Is that possible?

vplauzon · May 9, 2021, 10:55pm

The Azure Marketplace image uses a VM scale set. I know AKS is also possible.

balaji.ramaswamy · May 10, 2021, 4:44am

@vplauzon On Azure, the recommended deployment is AKS. Would that be OK to try?

http://docs.dremio.com/deployment/azure-aks/azure-aks/

vplauzon · May 10, 2021, 2:12pm

Sure, I’ll give it a shot.
