Hi all,
I work with Dremio and have the following setup:
- an Iceberg table that I want to update
- a view with the differential source data that should be appended to the target Iceberg table with dbt, using an incremental update strategy.
When I run dbt run in VS Code, I get the following error message:
14:52:25.120274 [error] [MainThread]: Runtime Error in model raw_demo_dbt_incr_updates (models/raw/demo_dbt_incr_updates/raw_demo_dbt_incr_updates.sql)
ERROR: SYSTEM ERROR: NoSuchElementException
SqlOperatorImpl NESTED_LOOP_JOIN
Location 0:0:25
ErrorOrigin: EXECUTOR
[Error Id: 19bd9ddd-6824-4d6e-b6d3-d24af8ba3bcd ]
(java.util.NoSuchElementException) null
java.util.ArrayList$Itr.next():1000
com.dremio.sabot.op.join.nlje.NLJEOperator.getInitialMatchState():188
com.dremio.sabot.op.join.nlje.NLJEOperator.noMoreToConsumeRight():257
com.dremio.sabot.driver.SmartOp$SmartDualInput.noMoreToConsumeRight():517
com.dremio.sabot.driver.WyePipe.pump():83
com.dremio.sabot.driver.Pipeline.doPump():134
com.dremio.sabot.driver.Pipeline.pumpOnce():124
com.dremio.sabot.exec.fragment.FragmentExecutor$DoAsPumper.run():655
com.dremio.sabot.exec.fragment.FragmentExecutor.run():560
com.dremio.sabot.exec.fragment.FragmentExecutor$AsyncTaskImpl.run():1234
com.dremio.sabot.task.AsyncTaskWrapper.run():130
com.dremio.sabot.task.single.DedicatedFragmentRunnable.run():53
java.util.concurrent.Executors$RunnableAdapter.call():515
java.util.concurrent.FutureTask.run():264
java.util.concurrent.ThreadPoolExecutor.runWorker():1128
java.util.concurrent.ThreadPoolExecutor$Worker.run():628
java.lang.Thread.run():829
How can I fix this error?
@memyself This job should have produced a profile; are you able to download the profile and send it?
Hi Balaji,
I'm sending you some data.
The table I want to update incrementally is stored under the Nessie main branch.
The Dremio SQL code from the log file:
/* {"app": "dbt", "dbt_version": "1.9.6", "profile_name": "dpf_dremio_project", "target_name": "dev", "node_id": "model.dpf_dremio_project.raw_demo_dbt_incr_updates"} */
create table "$scratch"."raw"."raw_demo_dbt_incr_updates__dbt_tmp"
as (
select
id,
dater,
"domain_name",
"ip_v4_address",
"ip_v6_address",
url,
city
from "test-nessie-dev"."raw"."raw_demo_dbt_incr_updates_view"
WHERE id > (SELECT MAX(id) FROM "$scratch"."raw"."raw_demo_dbt_incr_updates")
)
Here I get the error: NoSuchElementException.
Regards,
memyself
@memyself Got it! The failure would have generated a job. Run the create table again and, once it fails, go to the Jobs page, find the failed job, download the profile, and send it. The error in the profile will tell us where the failure is happening.
Hi Balaji,
When I look at the raw profile in Dremio, I see:
On the Query tab:
/* {"app": "dbt", "dbt_version": "1.9.6", "profile_name": "dpf_dremio_project", "target_name": "dev", "node_id": "model.dpf_dremio_project.raw_demo_dbt_incr_updates"} */
create table "$scratch"."raw"."raw_demo_dbt_incr_updates__dbt_tmp"
The $scratch variable is the problem.
When I use the database name instead (here the Nessie catalog is used), the table is created.
That seems like a bug to me. Can you confirm it?
On the Error tab, I get the same message as above:
SYSTEM ERROR: NoSuchElementException
SqlOperatorImpl NESTED_LOOP_JOIN
Location 0:0:17
ErrorOrigin: EXECUTOR
[Error Id: dadcf005-5f56-4dbe-b4f0-38959c4a7ad5 ]
(java.util.NoSuchElementException) null
java.util.ArrayList$Itr.next():1000
com.dremio.sabot.op.join.nlje.NLJEOperator.getInitialMatchState():188
com.dremio.sabot.op.join.nlje.NLJEOperator.noMoreToConsumeRight():257
com.dremio.sabot.driver.SmartOp$SmartDualInput.noMoreToConsumeRight():517
com.dremio.sabot.driver.WyePipe.pump():83
com.dremio.sabot.driver.Pipeline.doPump():134
com.dremio.sabot.driver.Pipeline.pumpOnce():124
com.dremio.sabot.exec.fragment.FragmentExecutor$DoAsPumper.run():655
com.dremio.sabot.exec.fragment.FragmentExecutor.run():560
com.dremio.sabot.exec.fragment.FragmentExecutor$AsyncTaskImpl.run():1234
com.dremio.sabot.task.AsyncTaskWrapper.run():130
com.dremio.sabot.task.single.DedicatedFragmentRunnable.run():53
java.util.concurrent.Executors$RunnableAdapter.call():515
java.util.concurrent.FutureTask.run():264
java.util.concurrent.ThreadPoolExecutor.runWorker():1128
java.util.concurrent.ThreadPoolExecutor$Worker.run():628
java.lang.Thread.run():829
What does your ~/.dbt/profiles.yml look like?
For example, with my dbt profile set up as below (see the sketch further down):
Tables will be created at hive_source.test_dbt
Views will be created at test_dbt_space.test_dbt_space_folder
$scratch is the default value for the storage path/source in case you don't specify it when setting up dbt.
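A minimal profiles.yml sketch of that kind of setup (the profile name, host, and credentials are placeholders; the source/space names match the example above, adjust them to your environment):

my_dremio_project:              # placeholder profile name
  target: dev
  outputs:
    dev:
      type: dremio
      software_host: localhost  # placeholder
      port: 9047
      use_ssl: false
      user: my_user             # placeholder
      password: my_password     # placeholder
      # where dbt materializes tables; defaults to $scratch if not set
      object_storage_source: hive_source
      object_storage_path: test_dbt
      # where dbt creates views
      dremio_space: test_dbt_space
      dremio_space_folder: test_dbt_space_folder
      threads: 1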
Ah, I guess I'm a step closer to the goal.
I added the object_storage_source to the profiles.yml file, and the error mentioned above disappears.
But I get a new one:
Dremio job log, SQL:
create or replace view "test-nessie-dev"."raw"."raw_demo_dbt_incr_updates"
as select *
from
"test-nessie-dev"."raw"."raw_demo_dbt_incr_updates"
with error: A non-view table with given name ["test-nessie-dev".raw.raw_demo_dbt_incr_updates] already exists in schema ["test-nessie-dev".raw].
To recap my understanding: I've created an Iceberg table that I want to update incrementally.
I've added an entry for the table in the schema.yml and created a SQL file with the name of the table.
Inside the SQL file I've entered:
{{ config(
    materialized='incremental',
    unique_key='id',
    incremental_strategy='append',
    schema='raw'
) }}
What is wrong in my code?
I need some info, like:
Your model file name; for example, tableA.sql will create/insert data into the table/view at source.database.tableA.
Also, the full query of your model .sql file; you only show the {{ config }} part. I need the source table and the destination table.
Also, maybe show your storage/space settings in profiles.yml for more info.
For example, in my case:
testdbtdesinc.sql
{{
  config(
    materialized = 'incremental',
    unique_key = 'id',
    alias = 'TestDBTDesInc',
    tags = ["example"]
  )
}}
select
  *
from hive_source.test_dbt.TestDBTSourceInc
{% if is_incremental() %}
where
  id not in (select distinct id from {{ this }})
{% endif %}
This will insert new data from hive_source.test_dbt.TestDBTSourceInc into the hive_source.test_dbt.testdbtdesinc table.
dbt run --select testdbtdesinc.sql --debug
This generates 2 important queries:
create table "hive_source"."test_dbt"."testdbtdesinc__dbt_tmp"
as (
-- incremental cdc
select
*
from hive_source.test_dbt.TestDBTSourceInc
where
id not in (select distinct id from "hive_source"."test_dbt"."testdbtdesinc")
)
and
insert into "hive_source"."test_dbt"."testdbtdesinc"( "id", "data" )
select "id", "data" from "hive_source"."test_dbt"."testdbtdesinc__dbt_tmp"
Hi @quangbilly, thanks for the fast answer. I've run dbt in debug mode and see the following steps:
-
drop table if exists "test-nessie-dev"."raw"."raw_demo_dbt_incr_updates__dbt_tmp"
-
create table "test-nessie-dev"."raw"."raw_demo_dbt_incr_updates__dbt_tmp"
as select from my view
-
insert into "test-nessie-dev"."raw"."raw_demo_dbt_incr_updates"( "id", "dater", "domain_name", "ip_v4_address", "ip_v6_address", "url", "city" )
select "id", "dater", "domain_name", "ip_v4_address", "ip_v6_address", "url", "city" from "test-nessie-dev"."raw"."raw_demo_dbt_incr_updates__dbt_tmp"
-
create or replace view "test-nessie-dev"."raw"."raw_demo_dbt_incr_updates"
as select * from "test-nessie-dev"."raw"."raw_demo_dbt_incr_updates"
The last step produces the error:
ERROR: A non-view table with given name ["csp-dpf-nessie-dev".raw.raw_demo_dbt_incr_updates] already exists in schema ["csp-dpf-nessie-dev".raw]
Can you explain why the last step is executed?
As I said, I think you're inserting a table into itself (raw_demo_dbt_incr_updates).
I asked the same questions before:
What is the source table, and what is the destination table (the table that the data is written to)?
And what is your model file name? Is it raw_demo_dbt_incr_updates.sql?
And please also share the content of that file.
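For reference, based on the config block and the columns in your debug log, I would expect the model to look roughly like the sketch below (the source view reference is my guess from your earlier SQL; please confirm whether your file differs):

{{ config(
    materialized='incremental',
    unique_key='id',
    incremental_strategy='append',
    schema='raw'
) }}

select
    id,
    dater,
    domain_name,
    ip_v4_address,
    ip_v6_address,
    url,
    city
from "test-nessie-dev"."raw"."raw_demo_dbt_incr_updates_view"
{% if is_incremental() %}
where id > (select max(id) from {{ this }})
{% endif %}

If your file looks different, please post it so we can see where the table and view names collide.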