Glitches between Dremio and Vertica

Greetings to you all.

We have some problem with Dremio and Vertica.
For example we use this SQL-query in Dremio (I will write here only part of our query cause it is big enough):

Code in Dremio

with transactions as(

select

case /*when TRX_DATE between TO_TIMESTAMP(‘2009-10-01 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) and TO_TIMESTAMP(‘2010-09-30 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) then 2010

when TRX_DATE between TO_TIMESTAMP(‘2010-10-01 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) and TO_TIMESTAMP(‘2011-09-30 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) then 2011

when TRX_DATE between TO_TIMESTAMP(‘2011-10-01 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) and TO_TIMESTAMP(‘2012-09-30 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) then 2012

when TRX_DATE between TO_TIMESTAMP(‘2012-10-01 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) and TO_TIMESTAMP(‘2013-09-30 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) then 2013

when TRX_DATE between TO_TIMESTAMP(‘2013-10-01 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) and TO_TIMESTAMP(‘2014-09-30 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) then 2014

when TRX_DATE between TO_TIMESTAMP(‘2014-10-01 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) and TO_TIMESTAMP(‘2015-09-30 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) then 2015*/

when TRX_DATE between TO_TIMESTAMP(‘2015-10-01 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) and TO_TIMESTAMP(‘2016-09-30 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) then 2016

when TRX_DATE between TO_TIMESTAMP(‘2016-10-01 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) and TO_TIMESTAMP(‘2017-09-30 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) then 2017

when TRX_DATE between TO_TIMESTAMP(‘2017-10-01 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) and TO_TIMESTAMP(‘2018-09-30 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) then 2018

when TRX_DATE between TO_TIMESTAMP(‘2018-10-01 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) and TO_TIMESTAMP(‘2019-09-30 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) then 2019

when TRX_DATE between TO_TIMESTAMP(‘2019-10-01 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) and TO_TIMESTAMP(‘2020-09-30 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) then 2020

when TRX_DATE between TO_TIMESTAMP(‘2020-10-01 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) and TO_TIMESTAMP(‘2021-09-30 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) then 2021

when TRX_DATE between TO_TIMESTAMP(‘2021-10-01 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) and TO_TIMESTAMP(‘2022-09-30 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) then 2022

when TRX_DATE between TO_TIMESTAMP(‘2022-10-01 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) and TO_TIMESTAMP(‘2023-09-30 00:00:00.000’, ‘YYYY-MM-DD HH24:MI:SS.FFF’) then 2023

end flight_period,

cast (FFP_NO as bigint) ffp_no,

TRANSACTION_ID,

min (TRX_DATE) TRX_DATE

And on the Vertica side we have this query transformed like this:

What we have in vertica

SELECT “ffp_transactions_V”.“FFP_NO”,
“ffp_transactions_V”.“TRX_DATE”,
“ffp_transactions_V”.“ACCRUAL_REDEMPTION_STATUS”,
“ffp_transactions_V”.“TRX_STATUS”,
“ffp_transactions_V”.“TEST_ACCOUNT_FLAG”,
“ffp_transactions_V”.“TRX_TRANSFER_FLAG”,
“ffp_transactions_V”.“TRANSACTION_DESC”,
“ffp_transactions_V”.“BUSINESS_LINE”
FROM “edw2”.“dm”.“ffp_transactions_V”

As I see the Dremio just wants to get all the data (without using any of the specified ranges).
Is there any possibility to tune the Vertica connector using file in this repo https://github.com/luciodaza/dremio-vertica-connector/blob/master/src/main/resources/arp/implementation/vertica-arp.yaml and build our own version with needed functionality?

Of course, these are community built ARP connectors. Feel free to fork and fix any issues (check out other forks as well). Here’s a quick tutorial on what’s important in an ARP connector → How to Create an ARP Connector | Dremio

Also, a quick workaround would be to use External Queries. See here.

SELECT * FROM table(<vertica_datasource-name>.external_query('<vertica-query'))

Thanks for the info.

I have tried this query, but it fails:

select * FROM table(EDW2.edw2.dm.ffp_transactions_V.external_query(‘where TRX_DATE between ‘‘2015-10-01’’ and ‘‘2023-09-30’’’))

The error is:

No match found for function signature external_query().

I think this is due to the wrong syntax, but I’m using the double simple quotes here (there is no double quotes).

Am I missing something?

  1. Is EDW2.edw2.dm.ffp_transactions_V your Source name? You can generally find the source name in the bottom left side. For example, here are mine:
    image

  2. You need to put the entire Vertica query (as how you would run it in Vertica) between quotes. So for example, if I want to query some table from the source “pg” from the above image. I run the following query:

    SELECT * FROM table(pg.external_query('SELECT * FROM public.weather_nyc'))
    

Thanks for your help!
The variant with the use of external_query works now.
You were right - it has been the wrong syntax.

Also I want to know about configuring the Vertica connector through vertica-arp.yaml (https://github.com/luciodaza/dremio-vertica-connector/blob/master/src/main/resources/arp/implementation/vertica-arp.yaml).
I’m not good in Java and coding.
Maybe there is some manual about how to tune and build the connector through this vertica-arp.yaml file?

Great to hear that it works!

Here’s the guide that may help you: Dremio Tutorial: Creating ARP Connector IMO, building an ARP connector isn’t too difficult. Try it out and let us know if you need any help.