Join row_number over function in dremio getting Wrong Result

I have used this query in SSMS and get correct Result
select EntityId,EntityName,WellType,ProductionDate,EffectiveDate,Field,CondyMeasured,OilMeasured,WaterMeasured,CGRComponentRatio,WGRComponentRatio from
(select EntityId,EntityName,WellType,ProductionDate,EffectiveDate,Field,CondyMeasured,OilMeasured,WaterMeasured,CGRComponentRatio,WGRComponentRatio
,row_number() over(partition by EntityId,ProductionDate order by EntityId,ProductionDate,EffectiveDate desc) as roworder from
(select * from (SELECT FDCWellRecords.EntityId,FDCWellRecords.WellType,FDCWellRecords.ProductionDate,FDCWellRecords.EntityName,Field,CondyMeasured,OilMeasured,WaterMeasured,EffectiveDate,CGRComponentRatio,
WGRComponentRatio FROM FDCWellRecords inner Join FDCComponentRatio on FDCWellRecords.EntityId = FDCComponentRatio.EntityId ) temp where temp.EffectiveDate is not null and temp.WellType = ‘Effluent’) temp1 where ProductionDate >= EffectiveDate) temp2 where roworder = 1

But same query in Dremio getting Wrong Result here is the attached screenshots

I am using Dremio 3.1.1-201901281837360699-30c9d74

here is attached Query Profile
e1509cdf-dd55-458f-b464-20b979d6d568.zip (22.1 KB)

i have also tested on latest Dremio version 3.1.10-201904162146020182-adf690d but still i am getting diffrent result

Any Alternative we can use instaid of Row _function() i see similar Thred here

@Vikash_Singh,

The SQL in the profile is different from what you supplied in the comments. Please run the below query in Dremio and in SSMS and include a screenshot of the results with the rownumber, ProductionDate and EffectiveDate clearly visible in each.

select
    EntityId,
    EntityName,
    WellType,
    ProductionDate,
    EffectiveDate,
    Field,
    CondyMeasured,
    OilMeasured,
    WaterMeasured,
    CGRComponentRatio,
    WGRComponentRatio 
from
    (select
        EntityId,
        EntityName,
        WellType,
        ProductionDate,
        EffectiveDate,
        Field,
        CondyMeasured,
        OilMeasured,
        WaterMeasured,
        CGRComponentRatio,
        WGRComponentRatio ,
        row_number() over(partition 
    by
        EntityId,
        ProductionDate 
    order by
        EntityId,
        ProductionDate,
        EffectiveDate desc) as roworder 
    from
        (select
            * 
        from
            (SELECT
                FDCWellRecords.EntityId,
                FDCWellRecords.WellType,
                FDCWellRecords.ProductionDate,
                FDCWellRecords.EntityName,
                Field,
                CondyMeasured,
                OilMeasured,
                WaterMeasured,
                EffectiveDate,
                CGRComponentRatio,
                WGRComponentRatio 
            FROM
                FDCWellRecords 
            inner Join
                FDCComponentRatio 
                    on FDCWellRecords.EntityId = FDCComponentRatio.EntityId ) temp 
        where
            temp.EffectiveDate is not null 
            and temp.WellType = ‘Effluent’
        ) temp1 
    where
        ProductionDate >= EffectiveDate) temp2 
where
    roworder = 1

Here is the attached screenshots

SSMS

dremio

660a0d58-d05d-4ac3-b423-c95f51e14d6d.zip (16.8 KB)

Any suggestion on above Query?

Is see that EffectiveDate is different in Dremio, but you are executing a Preview, not a Run. Please run the query and check the results:
21%20AM

i have triyed Run too but getting same result as previous there was not changing in query
result

Please attach the profile for this Run so we can examine what query Dremio is pushing down into the SQL Server source.

thanks ,here is attached Profile for above querya035c4da-3e53-4833-a679-261a3956e2b5.zip (22.5 KB)

This is the query that Dremio is pushing down into the MSSQL source:

SELECT
        [EntityId] COLLATE LATIN1_GENERAL_BIN2 AS [EntityId],
        [EntityName] COLLATE LATIN1_GENERAL_BIN2 AS [EntityName],
        CAST('Effluent' COLLATE LATIN1_GENERAL_BIN2 AS VARCHAR(8000)) AS [WellType],
        [ProductionDate],
        [EffectiveDate],
        [Field] COLLATE LATIN1_GENERAL_BIN2 AS [Field],
        [CondyMeasured],
        [OilMeasured],
        [WaterMeasured],
        [CGRComponentRatio],
        [WGRComponentRatio] 
    FROM
        (SELECT
            [EntityId] COLLATE LATIN1_GENERAL_BIN2 AS [EntityId],
            [EntityName] COLLATE LATIN1_GENERAL_BIN2 AS [EntityName],
            CAST('Effluent' COLLATE LATIN1_GENERAL_BIN2 AS VARCHAR(8000)) AS [WellType],
            [ProductionDate],
            [EffectiveDate],
            [Field] COLLATE LATIN1_GENERAL_BIN2 AS [Field],
            [CondyMeasured],
            [OilMeasured],
            [WaterMeasured],
            [CGRComponentRatio],
            [WGRComponentRatio],
            ROW_NUMBER() OVER (PARTITION 
        BY
            [EntityId] COLLATE LATIN1_GENERAL_BIN2,
            [ProductionDate] 
        ORDER BY
            CASE 
                WHEN [EntityId] COLLATE LATIN1_GENERAL_BIN2 IS NULL THEN 1 
                ELSE 0 
            END,
            [EntityId] COLLATE LATIN1_GENERAL_BIN2,
            CASE 
                WHEN [ProductionDate] IS NULL THEN 1 
                ELSE 0 
            END,
            [ProductionDate],
            CASE 
                WHEN [EffectiveDate] IS NULL THEN 1 
                ELSE 0 
            END DESC,
            [EffectiveDate]) AS [roworder] 
        FROM
            (SELECT
                [EntityId] COLLATE LATIN1_GENERAL_BIN2 AS [EntityId],
                CAST('Effluent' COLLATE LATIN1_GENERAL_BIN2 AS VARCHAR(8000)) AS [WellType],
                [ProductionDate],
                [EntityName] COLLATE LATIN1_GENERAL_BIN2 AS [EntityName],
                [Field] COLLATE LATIN1_GENERAL_BIN2 AS [Field],
                [CondyMeasured],
                [OilMeasured],
                [WaterMeasured],
                [EffectiveDate],
                [CGRComponentRatio],
                [WGRComponentRatio] 
            FROM
                (SELECT
                    [FDCWellRecords].[EntityId] COLLATE LATIN1_GENERAL_BIN2 AS [EntityId],
                    [FDCWellRecords].[WellType] COLLATE LATIN1_GENERAL_BIN2 AS [WellType],
                    [FDCWellRecords].[ProductionDate],
                    [FDCWellRecords].[EntityName] COLLATE LATIN1_GENERAL_BIN2 AS [EntityName],
                    [FDCWellRecords].[Field] COLLATE LATIN1_GENERAL_BIN2 AS [Field],
                    [FDCWellRecords].[CondyMeasured],
                    [FDCWellRecords].[OilMeasured],
                    [FDCWellRecords].[WaterMeasured],
                    [FDCComponentRatio].[EffectiveDate],
                    [FDCComponentRatio].[CGRComponentRatio],
                    [FDCComponentRatio].[WGRComponentRatio] 
                FROM
                    [tourmalineqadb].[dbo].[FDCWellRecords] 
                INNER JOIN
                    [tourmalineqadb].[dbo].[FDCComponentRatio] 
                        ON [FDCWellRecords].[EntityId] COLLATE LATIN1_GENERAL_BIN2 = [FDCComponentRatio].[EntityId] COLLATE LATIN1_GENERAL_BIN2) AS [t] 
            WHERE
                [t].[WellType] COLLATE LATIN1_GENERAL_BIN2 = 'Effluent' COLLATE LATIN1_GENERAL_BIN2 
                AND [t].[EffectiveDate] IS NOT NULL
            ) AS [t] 
        WHERE
            [ProductionDate] >= [EffectiveDate]) AS [t] 
    WHERE
        [roworder] = 1

Does this give the same results as Dremio when you submit it with SSMS (I’m assuming it will).

Do you have any special options set, like “Enable legacy dialect”?
advanced-options

Hello Ben,

You are right SSMS also Queries same Results and also I did not enabled any Legacy dialect setting

Hi @Vikash_Singh, apologies for the very late response here.

Dremio orders the null values differently than SQL Server; Dremio ranks them “highest” while MSSQL ranks them lowest. So when Dremio pushes down a query into the MSSQL source, it changes the query to make sure the records sent back are ordered as Dremio would order them.

That’s why are you seeing the difference.