Reflection always says 'in progress'

I am on Windows 10 with 8GB of memory… I have a sql server 2012 locally installed.
After submitting to create reflection, the process took a very long time (hours). Even though, the queries submitted to sql server took about 3 seconds and only returns 2118 rows.

When I check the data set through the settings, it would say that the reflection is ‘In progress’. But when I click on the Job’s log and located the data set materializing job, it would say’completed’
No matter, the reflection would always say that ‘This reflection is not ready to accelerate queries’.

What i am not doing right?

Thanks for your help with this.
Nabil

Hi @nakdouche,

Couple of things here,

First the reflection job still in progress. Can you please send us the profile? see below URL on how to download profile

How To Share A Query Profile

For the reflection not getting used, we would like to understand the tables or data sets that were used in the reflection and the actual query you are running that is not getting accelarated

Thanks,
@balaji.ramaswamy

Good to note which version of Dremio you’re running.

187801a1-2d54-4825-af44-6938abc9f83d.zip (6.4 KB)

Hi balaji.ramaswamy and Kelly.

Here is the log from profiler.
Also here the version: Dremio 1.2. installed files :

  • dremio-community-1.2.1-201710030121530889-8e49316
  • dremio-connector-odbc-1.3.11.1034-winx64

Here is the query from AdvantureWorks database from SQL server 2012.

SELECT [ModifiedDate], [OrderDate], [DueDate], SUM([SalesOrderDetailID]) AS [agg-0-0], MAX([SalesOrderDetailID]) AS [agg-0-1], COUNT([SalesOrderDetailID]) AS [agg-0-2], MIN([SalesOrderDetailID]) AS [agg-0-3],
SUM([CustomerID]) AS [agg-11-0], MAX([CustomerID]) AS [agg-11-1], COUNT([CustomerID]) AS [agg-11-2], MIN([CustomerID]) AS [agg-11-3], SUM([SalesOrderID]) AS [agg-5-0], MAX([SalesOrderID]) AS [agg-5-1],
COUNT([SalesOrderID]) AS [agg-5-2], MIN([SalesOrderID]) AS [agg-5-3], SUM([ShipToAddressID]) AS [agg-14-0], MAX([ShipToAddressID]) AS [agg-14-1], COUNT([ShipToAddressID]) AS [agg-14-2], MIN([ShipToAddressID]) AS
[agg-14-3], SUM([BillToAddressID]) AS [agg-13-0], MAX([BillToAddressID]) AS [agg-13-1], COUNT([BillToAddressID]) AS [agg-13-2], MIN([BillToAddressID]) AS [agg-13-3], SUM([SubTotal]) AS [agg-16-0], MAX([SubTotal]) AS
[agg-16-1], COUNT([SubTotal]) AS [agg-16-2], MIN([SubTotal]) AS [agg-16-3], SUM([TotalDue]) AS [agg-19-0], MAX([TotalDue]) AS [agg-19-1], COUNT([TotalDue]) AS [agg-19-2], MIN([TotalDue]) AS [agg-19-3],
SUM([Freight]) AS [agg-18-0], MAX([Freight]) AS [agg-18-1], COUNT([Freight]) AS [agg-18-2], MIN([Freight]) AS [agg-18-3], SUM([TaxAmt]) AS [agg-17-0], MAX([TaxAmt]) AS [agg-17-1], COUNT([TaxAmt]) AS [agg-17-2],
MIN([TaxAmt]) AS [agg-17-3], SUM([LineTotal]) AS [agg-3-0], MAX([LineTotal]) AS [agg-3-1], COUNT([LineTotal]) AS [agg-3-2], MIN([LineTotal]) AS [agg-3-3], SUM([ProductID]) AS [agg-20-0], MAX([ProductID]) AS
[agg-20-1], COUNT([ProductID]) AS [agg-20-2], MIN([ProductID]) AS [agg-20-3], SUM([Weight]) AS [agg-32-0], MAX([Weight]) AS [agg-32-1], COUNT([Weight]) AS [agg-32-2], MIN([Weight]) AS [agg-32-3], SUM([UnitPrice]) AS
[agg-2-0], MAX([UnitPrice]) AS [agg-2-1], COUNT([UnitPrice]) AS [agg-2-2], MIN([UnitPrice]) AS [agg-2-3], SUM([OrderQty]) AS [agg-1-0], MAX([OrderQty]) AS [agg-1-1], COUNT([OrderQty]) AS [agg-1-2], MIN([OrderQty]) AS [agg-1-3], SUM([StandardCost]) AS [agg-27-0], MAX([StandardCost]) AS [agg-27-1], COUNT([StandardCost]) AS [agg-27-2], MIN([StandardCost]) AS [agg-27-3], SUM([ListPrice]) AS [agg-28-0], MAX([ListPrice]) AS [agg-28-1], COUNT([ListPrice]) AS [agg-28-2], MIN([ListPrice]) AS [agg-28-3], SUM([ProductModelID]) AS [agg-37-0], MAX([ProductModelID]) AS [agg-37-1], COUNT([ProductModelID]) AS [agg-37-2], MIN([ProductModelID]) AS [agg-37-3], SUM([SalesPersonID]) AS [agg-12-0], MAX([SalesPersonID]) AS [agg-12-1], COUNT([SalesPersonID]) AS [agg-12-2], MIN([SalesPersonID]) AS [agg-12-3], SUM([SalesLastYear]) AS [agg-46-0], MAX([SalesLastYear]) AS [agg-46-1], COUNT([SalesLastYear]) AS [agg-46-2], MIN([SalesLastYear]) AS [agg-46-3], SUM([SalesYTD]) AS [agg-45-0], MAX([SalesYTD]) AS [agg-45-1], COUNT([SalesYTD]) AS [agg-45-2], MIN([SalesYTD]) AS [agg-45-3], SUM([DaysToManufacture]) AS [agg-33-0], MAX([DaysToManufacture]) AS [agg-33-1], COUNT([DaysToManufacture]) AS [agg-33-2], MIN([DaysToManufacture]) AS [agg-33-3], SUM([ReorderPoint]) AS [agg-26-0], MAX([ReorderPoint]) AS [agg-26-1], COUNT([ReorderPoint]) AS [agg-26-2], MIN([ReorderPoint]) AS [agg-26-3], SUM([SafetyStockLevel]) AS [agg-25-0], MAX([SafetyStockLevel]) AS [agg-25-1], COUNT([SafetyStockLevel]) AS [agg-25-2], MIN([SafetyStockLevel]) AS [agg-25-3], SUM([ShipMethodID]) AS [agg-15-0], MAX([ShipMethodID]) AS [agg-15-1], COUNT([ShipMethodID]) AS [agg-15-2], MIN([ShipMethodID]) AS [agg-15-3], SUM([Status]) AS [agg-9-0], MAX([Status]) AS [agg-9-1], COUNT([Status]) AS [agg-9-2], MIN([Status]) AS [agg-9-3], COUNT([$f28]) AS [CONVERT_COUNT_STAR]
FROM (SELECT CAST([t2].[ModifiedDate] AS DATE) AS [ModifiedDate], CAST([t2].[OrderDate] AS DATE) AS [OrderDate], CAST([t2].[DueDate] AS DATE) AS [DueDate], [t2].[SalesOrderDetailID], [t2].[OrderQty], [t2].[UnitPrice], [t2].[LineTotal], [t2].[SalesOrderID], [t2].[Status], [t2].[CustomerID], [t2].[SalesPersonID], [t2].[BillToAddressID], [t2].[ShipToAddressID], [t2].[ShipMethodID], [t2].[SubTotal], [t2].[TaxAmt], [t2].[Freight], [t2].[TotalDue], [t2].[ProductID], [t2].[SafetyStockLevel], [t2].[ReorderPoint], [t2].[StandardCost], [t2].[ListPrice], [t2].[Weight], [t2].[DaysToManufacture], [t2].[ProductModelID], [SalesTerritory].[SalesYTD], [SalesTerritory].[SalesLastYear], 1 AS [$f28]
FROM (SELECT [t1].[SalesOrderDetailID], [t1].[OrderQty], [t1].[UnitPrice], [t1].[LineTotal], [t1].[ModifiedDate], [t1].[SalesOrderID], [t1].[OrderDate], [t1].[DueDate], [t1].[ShipDate], [t1].[Status], [t1].[OnlineOrderFlag], [t1].[CustomerID], [t1].[SalesPersonID], [t1].[TerritoryID], [t1].[BillToAddressID], [t1].[ShipToAddressID], [t1].[ShipMethodID], [t1].[SubTotal], [t1].[TaxAmt], [t1].[Freight], [t1].[TotalDue], [t1].[ProductID], [t1].[Name], [t1].[MakeFlag], [t1].[FinishedGoodsFlag], [t1].[Color], [t1].[SafetyStockLevel], [t1].[ReorderPoint], [t1].[StandardCost], [t1].[ListPrice], [t1].[Size], [t1].[SizeUnitMeasureCode], [t1].[WeightUnitMeasureCode], [t1].[Weight], [t1].[DaysToManufacture], [t1].[ProductLine], [t1].[Class], [t1].[Style], [t1].[ProductModelID], [t1].[SellStartDate], [t1].[SellEndDate], [t1].[Product Category Name] AS [Product Sub Category Name], [ProductCategory].[Name] AS [Product Category Name]
FROM (SELECT [t0].[SalesOrderDetailID], [t0].[OrderQty], [t0].[UnitPrice], [t0].[LineTotal], [t0].[ModifiedDate], [t0].[SalesOrderID], [t0].[OrderDate], [t0].[DueDate], [t0].[ShipDate], [t0].[Status], [t0].[OnlineOrderFlag], [t0].[CustomerID], [t0].[SalesPersonID], [t0].[TerritoryID], [t0].[BillToAddressID], [t0].[ShipToAddressID], [t0].[ShipMethodID], [t0].[SubTotal], [t0].[TaxAmt], [t0].[Freight], [t0].[TotalDue], [t0].[ProductID], [t0].[Name], [t0].[MakeFlag], [t0].[FinishedGoodsFlag], [t0].[Color], [t0].[SafetyStockLevel], [t0].[ReorderPoint], [t0].[StandardCost], [t0].[ListPrice], [t0].[Size], [t0].[SizeUnitMeasureCode], [t0].[WeightUnitMeasureCode], [t0].[Weight], [t0].[DaysToManufacture], [t0].[ProductLine], [t0].[Class], [t0].[Style], [t0].[ProductModelID], [t0].[SellStartDate], [t0].[SellEndDate], [t0].[ProductSubcategoryID], [ProductSubcategory].[ProductCategoryID], [ProductSubcategory].[Name] AS [Product Category Name]
FROM (SELECT [t].[SalesOrderDetailID], [t].[OrderQty], [t].[UnitPrice], [t].[LineTotal], [t].[ModifiedDate], [t].[SalesOrderID], [t].[OrderDate], [t].[DueDate], [t].[ShipDate], [t].[Status], [t].[OnlineOrderFlag], [t].[CustomerID], [t].[SalesPersonID], [t].[TerritoryID], [t].[BillToAddressID], [t].[ShipToAddressID], [t].[ShipMethodID], [t].[SubTotal], [t].[TaxAmt], [t].[Freight], [t].[TotalDue], [t].[ProductID], [Product].[Name], [Product].[MakeFlag], [Product].[FinishedGoodsFlag], [Product].[Color], [Product].[SafetyStockLevel], [Product].[ReorderPoint], [Product].[StandardCost], [Product].[ListPrice], [Product].[Size], [Product].[SizeUnitMeasureCode], [Product].[WeightUnitMeasureCode], [Product].[Weight], [Product].[DaysToManufacture], [Product].[ProductLine], [Product].[Class], [Product].[Style], [Product].[ProductSubcategoryID], [Product].[ProductModelID], [Product].[SellStartDate], [Product].[SellEndDate]
FROM (SELECT [SalesOrderDetail].[SalesOrderDetailID], [SalesOrderDetail].[OrderQty], [SalesOrderDetail].[ProductID], [SalesOrderDetail].[UnitPrice], [SalesOrderDetail].[LineTotal], [SalesOrderDetail].[ModifiedDate], [SalesOrderDetail].[SalesOrderID], [SalesOrderHeader].[OrderDate], [SalesOrderHeader].[DueDate], [SalesOrderHeader].[ShipDate], [SalesOrderHeader].[Status], [SalesOrderHeader].[OnlineOrderFlag], [SalesOrderHeader].[CustomerID], [SalesOrderHeader].[SalesPersonID], [SalesOrderHeader].[TerritoryID], [SalesOrderHeader].[BillToAddressID], [SalesOrderHeader].[ShipToAddressID], [SalesOrderHeader].[ShipMethodID], [SalesOrderHeader].[SubTotal], [SalesOrderHeader].[TaxAmt], [SalesOrderHeader].[Freight], [SalesOrderHeader].[TotalDue]
FROM [AdventureWorks2012_Data].[Sales].[SalesOrderDetail]
INNER JOIN [AdventureWorks2012_Data].[Sales].[SalesOrderHeader] ON [SalesOrderDetail].[SalesOrderID] = [SalesOrderHeader].[SalesOrderID]) AS [t]
FULL JOIN [AdventureWorks2012_Data].[Production].[Product] ON [t].[ProductID] = [Product].[ProductID]) AS [t0]
INNER JOIN [AdventureWorks2012_Data].[Production].[ProductSubcategory] ON [t0].[ProductSubcategoryID] = [ProductSubcategory].[ProductSubcategoryID]) AS [t1]
INNER JOIN [AdventureWorks2012_Data].[Production].[ProductCategory] ON [t1].[ProductCategoryID] = [ProductCategory].[ProductCategoryID]) AS [t2]
INNER JOIN [AdventureWorks2012_Data].[Sales].[SalesTerritory] ON [t2].[TerritoryID] = [SalesTerritory].[TerritoryID]) AS [t3]
GROUP BY [ModifiedDate], [OrderDate], [DueDate]

Please let me know if you would need anything else.

Thanks,
Nabil

Hi @nakdouche

It does look like from the profile the reflection was still running and that could be the same reason the subsequent query was not accelarated

Can you please try this?

  1. drop the reflection
  2. Re create the reflection and makes sure the job completes
  3. Try your query again
  4. If not accelerated upload the profile of the query that was run

Thanks,
@balaji.ramaswamy

92d227df-e5bb-4ffe-8110-c2edb19ae5b5.zip (6.5 KB)

Hi Balaji.

I went ahead and restarted Dremio service.
Cleared all reflections of that dataset.
Recreated the raw reflection.
It took 11 hours to say ‘completed’ at the job status. But when you click on the dataset, reflection and go to the settings, it would still say ‘In progress’ and it would say ‘This reflection is not ready to accelerate queries’

Please find the profile file attached here.

Thank you for your help with this.
Nabil

Hi @nakdouche

Can you also please send me a screenshot of the reflection settings page?

Thanks
@balaji.ramaswamy

Screenshots of reflection

Hi @nakdouche

As we troubleshoot this we would like one more piece of information for this. Can you please send me your refresh period and grace period values for the dataset/source?

Probably a screenshot of the data source settings including the adavanced

Thanks,
@balaji.ramaswamy

Hi Balaji

I do not think I am getting that screen with the dataset that I am working with. I can get to that screen only if I am at the table level in the database source ( but not via a virtual dataset).

Please let me know if this helps.

Thanks,
Nabil

Hi Balaji.

I have not heard back from you so I thought to check with you and see if you needed any thing else from me.

Thanks for helping with this issue.

Nabil

Hi @nakdouche

Sorry for the delay in response. I wanted to ask if you can try something so we can narrow down the issue

Could you please drop this reflection and start with fewer measures to start with?

Lets make sure that reflection gets created and we can throttle up

Sounds ok?

Thanks,
@balaji.ramaswamy

That sounds good Balaji.
I will start with one table and build on it until it breaks.

Thanks for your help with this.
Nabil

Hi Balaji.

I started a new dataset (using MS AdventureWorks2012_Data). This time, I started with one single table (“SQL Server 2012”.AdventureWorks2012_Data.Production.ProductSubcategory) . I was able to create a reflection and that went in a flash.
I then joined another table to the first table. I had to clear the reflection I created on the dataset. Then I tried to create a reflection on the new dataset. Unfortunately, it did not go as smooth as the first one (I let it run for about 20 min and then killed it).
While the first reflection was still running, I thought that it could be the size of the data which would nothing to what I think dremio would handle, I selected a tiny table (504 rows), I removed most of the columns. When I launched Raw Reflection, it did not go through (showing in Progress) as the other reflection was still running.

I tried to cancel the first reflection by hitting the cancel button, it would not do anything (still says in progress). As if the job is not there (zombie job). So, I went into the reflection and I tried to clear all and that cleared all. The second reflection with the tiny table and small number of columns was still running (7 min and running). I tried to cancel and same thing happened for the second reflection. It did not cancel. I had to clear all reflections to kill it.

Now that I do not have anything running, I went back to the small table dataset and attempted to create a raw reflection again.
Here is the dump of the profile for the tiny table:

1c513533-a05c-48f1-b038-15746d0010bd.zip (4.3 KB)

I cleared the running raw reflection creation and decided to create a third tiny data set (“SQL Server 2012”.AdventureWorks2012_Data.Production.ProductSubcategory ) and then create a raw reflection. This third reflection did not even show a status ‘in Progress’. In the job tab, you can see it running but in the dataset setting, the reflection shows as New.
Third dataset is one table, 5 columns, 37 rows:
SELECT * FROM “SQL Server 2012”.AdventureWorks2012_Data.Production.ProductSubcategory
At this point, the dremio engine looks like it is corrupt.

I restarted dremio. I went back to the third data set and tried to create s reflection again and that created the raw reflection instantly.
I edited the query to join to another tiny table (
SQL Server 2012".AdventureWorks2012_Data.Production.ProductCategory
). Saved the query.

SELECT nested_0.ProductSubcategoryID AS ProductSubcategoryID, nested_0.Name AS Name, nested_0.rowguid AS rowguid, nested_0.ModifiedDate AS ModifiedDate, nested_0.ProductCategoryID AS ProductCategoryID, join_ProductCategory.ProductCategoryID AS ProductCategoryID0, join_ProductCategory.Name AS Name0, join_ProductCategory.rowguid AS rowguid0, join_ProductCategory.ModifiedDate AS ModifiedDate0
FROM (
SELECT ProductSubcategoryID, ProductCategoryID, Name, rowguid, ModifiedDate
FROM “SQL Server 2012”.AdventureWorks2012_Data.Production.ProductSubcategory
) nested_0
INNER JOIN “SQL Server 2012”.AdventureWorks2012_Data.Production.ProductCategory AS join_ProductCategory ON nested_0.ProductCategoryID = join_ProductCategory.ProductCategoryID

I went back to the raw reflections, cleared it and then created a new raw reflections and that worked this time too. I pulled in a third small table (“SQL Server 2012”.AdventureWorks2012_Data.Production.Product) . That also work.
But as soon as I pulled in ‘SQL Server 2012.AdventureWorks2012_Data.Sales.SalesOrderDetail’ and I refreshed the reflection, things went south (reflection kept saying in progress).
Not sure what is the issue with this table but I thought I would give it one more chance.
I restarted the dremio service and created a new dataset with this table. I trimmed the columns all but few(7 columns). I created a raw reflection and that worked.
I joined to product table. Trimmed columns to (20) 0nly. I then cleared the raw reflection and tried to create it again. And that worked.
I joined the Sales Header table. And that broke it.

Here is the profile file for this one while it is still running:

2ff424ca-295e-41de-8ae8-760f1a04170a.zip (5.9 KB)

This is the data set query: when I run it in sql server , it takes 4 minutes and generates 121317 rows.

SELECT nested_1.SalesOrderDetailID AS SalesOrderDetailID, nested_1.OrderQty AS OrderQty, nested_1.UnitPrice AS UnitPrice, nested_1.UnitPriceDiscount AS UnitPriceDiscount, nested_1.LineTotal AS LineTotal, nested_1.ProductID AS ProductID, nested_1.Name AS Name, nested_1.ProductNumber AS ProductNumber, nested_1.MakeFlag AS MakeFlag, nested_1.FinishedGoodsFlag AS FinishedGoodsFlag, nested_1.Color AS Color, nested_1.SafetyStockLevel AS SafetyStockLevel, nested_1.ReorderPoint AS ReorderPoint, nested_1.StandardCost AS StandardCost, nested_1.ListPrice AS ListPrice, nested_1.ProductLine AS ProductLine, nested_1.ProductSubcategoryID AS ProductSubcategoryID, nested_1.ProductModelID AS ProductModelID, nested_1.SalesOrderID AS SalesOrderID, join_SalesOrderHeader.SalesOrderID AS SalesOrderID0, join_SalesOrderHeader.RevisionNumber AS RevisionNumber, join_SalesOrderHeader.OrderDate AS OrderDate, join_SalesOrderHeader.DueDate AS DueDate, join_SalesOrderHeader.ShipDate AS ShipDate, join_SalesOrderHeader.Status AS Status, join_SalesOrderHeader.OnlineOrderFlag AS OnlineOrderFlag, join_SalesOrderHeader.SalesOrderNumber AS SalesOrderNumber, join_SalesOrderHeader.PurchaseOrderNumber AS PurchaseOrderNumber, join_SalesOrderHeader.AccountNumber AS AccountNumber, join_SalesOrderHeader.CustomerID AS CustomerID, join_SalesOrderHeader.SalesPersonID AS SalesPersonID, join_SalesOrderHeader.TerritoryID AS TerritoryID, join_SalesOrderHeader.BillToAddressID AS BillToAddressID, join_SalesOrderHeader.ShipToAddressID AS ShipToAddressID, join_SalesOrderHeader.ShipMethodID AS ShipMethodID, join_SalesOrderHeader.CreditCardID AS CreditCardID, join_SalesOrderHeader.CreditCardApprovalCode AS CreditCardApprovalCode, join_SalesOrderHeader.CurrencyRateID AS CurrencyRateID, join_SalesOrderHeader.SubTotal AS SubTotal, join_SalesOrderHeader.TaxAmt AS TaxAmt, join_SalesOrderHeader.Freight AS Freight, join_SalesOrderHeader.TotalDue AS TotalDue, join_SalesOrderHeader.Comment AS Comment, join_SalesOrderHeader.rowguid AS rowguid, join_SalesOrderHeader.ModifiedDate AS ModifiedDate
FROM (
SELECT nested_0.SalesOrderID AS SalesOrderID, nested_0.SalesOrderDetailID AS SalesOrderDetailID, nested_0.OrderQty AS OrderQty, nested_0.UnitPrice AS UnitPrice, nested_0.UnitPriceDiscount AS UnitPriceDiscount, nested_0.LineTotal AS LineTotal, nested_0.ProductID AS ProductID, join_Product.Name AS Name, join_Product.ProductNumber AS ProductNumber, join_Product.MakeFlag AS MakeFlag, join_Product.FinishedGoodsFlag AS FinishedGoodsFlag, join_Product.Color AS Color, join_Product.SafetyStockLevel AS SafetyStockLevel, join_Product.ReorderPoint AS ReorderPoint, join_Product.StandardCost AS StandardCost, join_Product.ListPrice AS ListPrice, join_Product.ProductLine AS ProductLine, join_Product.ProductSubcategoryID AS ProductSubcategoryID, join_Product.ProductModelID AS ProductModelID
FROM (
SELECT SalesOrderID, SalesOrderDetailID, OrderQty, ProductID, UnitPrice, UnitPriceDiscount, LineTotal
FROM “SQL Server 2012”.AdventureWorks2012_Data.Sales.SalesOrderDetail
) nested_0
INNER JOIN “SQL Server 2012”.AdventureWorks2012_Data.Production.Product AS join_Product ON nested_0.ProductID = join_Product.ProductID
) nested_1
INNER JOIN “SQL Server 2012”.AdventureWorks2012_Data.Sales.SalesOrderHeader AS join_SalesOrderHeader ON nested_1.SalesOrderID = join_SalesOrderHeader.SalesOrderID

Please let me know of your thoughts and ideas.

Thanks,
Nabil

Hi Balaji

Do you think upgrading to the latest version would help with this issue? Just an FYI. I am on windows 10.

Thank you.
Nabil

Hi @nakdouche

You should be able to download the latest version now and try and see if the reflection creates

Thanks,
@balaji.ramaswamy

Same problem here (v1.4.9), running with 8 workers on YARN. But I’m building only one raw reflection for one parquet table (some hundreds of GB).

Will try to restart everything from scratch and see if this problem persists. This problem should be located and fixed in the newer release.