If I replace the first subselect with the actual value of “folder” (which is “24”) the query works:
before:
SELECT wikiNode.uuid, wikiNodePropName.string_value AS title
FROM (SELECT *
FROM alf_node
WHERE audit_creator <> 'System' AND store_id = 6) AS n
JOIN alf_qname AS nodeType ON (nodeType.id = n.type_qname_id AND nodeType.local_name = 'site')
JOIN alf_child_assoc AS aca ON (aca.parent_node_id = n.id)
JOIN alf_node AS wikiNode ON (aca.child_node_id = wikiNode.id AND wikiNode.type_qname_id IN (SELECT id
FROM alf_qname
WHERE local_name = 'folder'))
JOIN alf_node_properties AS wikiNodePropName ON (wikiNodePropName.node_id = wikiNode.id)
WHERE wikiNodePropName.qname_id IN (SELECT id
FROM alf_qname
WHERE local_name = 'name')
AND wikiNodePropName.string_value = 'wiki'
after:
SELECT wikiNode.uuid, wikiNodePropName.string_value AS title
FROM (SELECT *
FROM alf_node
WHERE audit_creator <> 'System' AND store_id = 6) AS n
JOIN alf_qname AS nodeType ON (nodeType.id = n.type_qname_id AND nodeType.local_name = 'site')
JOIN alf_child_assoc AS aca ON (aca.parent_node_id = n.id)
JOIN alf_node AS wikiNode ON (aca.child_node_id = wikiNode.id AND wikiNode.type_qname_id = 24)
JOIN alf_node_properties AS wikiNodePropName ON (wikiNodePropName.node_id = wikiNode.id)
WHERE wikiNodePropName.qname_id IN (SELECT id
FROM alf_qname
WHERE local_name = 'name')
AND wikiNodePropName.string_value = 'wiki'
The other subselects on the qname table work just fine.
Can you please run the below commands and retry your query?
Do you have any reflections on the physical dataset? If yes, those will be dropped and need to be recreated
ALTER PDS Alfresco.public.alf_node FORGET METADATA;
ALTER PDS Alfresco.public.alf_qname FORGET METADATA;
ALTER PDS Alfresco.public.alf_child_assoc FORGET METADATA;
ALTER PDS Alfresco.public.alf_node_properties FORGET METADATA;
ALTER PDS Alfresco.public.alf_node REFRESH METADATA;
ALTER PDS Alfresco.public.alf_qname REFRESH METADATA;
ALTER PDS Alfresco.public.alf_child_assoc REFRESH METADATA;
ALTER PDS Alfresco.public.alf_node_properties REFRESH METADATA;
your commands completed successful. However, my query produces the same output.
I did not set any reflections within Dremio. (Because I do not know what these are for.) If you want to take a look I can offer a short video call.
I have now tried running the query in a docker instance of dremio running on my Windows machine.
I still get the same error. I can’t figure out how this issue is cause by the Alfresco db or my setup. Any chance you can give this issue some attention?
I have now set up a local installation of Dremio 24.0.0 via RPM on a fresh instance of rocky. Then I have reconnected to the PSQL-DB and set “raw reflection” for all tables (in the query) to “on”. This far everything worked fine and there were no errors in the “jobs” tab.
However after executing the evil-query there was the same error message as in my very first message.
SYSTEM ERROR: AssertionError: Field ordinal 9 is invalid for type 'RecordType(BIGINT id)'
(com.dremio.exec.work.foreman.ForemanException) Unexpected exception during fragment initialization: java.lang.AssertionError: Field ordinal 9 is invalid for type 'RecordType(BIGINT id)'
com.dremio.exec.work.foreman.AttemptManager.run():474
com.dremio.context.RequestContext.run():96
com.dremio.common.concurrent.ContextMigratingExecutorService.lambda$decorate$4():212
java.util.concurrent.ThreadPoolExecutor.runWorker():1128
java.util.concurrent.ThreadPoolExecutor$Worker.run():628
java.lang.Thread.run():829
Caused By (java.util.concurrent.ExecutionException) java.lang.AssertionError: Field ordinal 9 is invalid for type 'RecordType(BIGINT id)'
java.util.concurrent.CompletableFuture.reportGet():395
java.util.concurrent.CompletableFuture.get():1999
com.dremio.exec.work.foreman.AttemptManager.run():427
com.dremio.context.RequestContext.run():96
com.dremio.common.concurrent.ContextMigratingExecutorService.lambda$decorate$4():212
java.util.concurrent.ThreadPoolExecutor.runWorker():1128
java.util.concurrent.ThreadPoolExecutor$Worker.run():628
java.lang.Thread.run():829
Caused By (java.lang.AssertionError) Field ordinal 9 is invalid for type 'RecordType(BIGINT id)'
org.apache.calcite.rex.RexBuilder.makeFieldAccess():196
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.lambda$lookupExp$1():4816
org.apache.calcite.sql2rel.SqlToRelConverter.convertIdentifier():3925
org.apache.calcite.sql2rel.SqlToRelConverter.access$2700():227
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit():5236
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit():4453
org.apache.calcite.sql.SqlIdentifier.accept():341
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression():5096
org.apache.calcite.sql2rel.SqlToRelConverter.substituteSubQuery():1119
org.apache.calcite.sql2rel.SqlToRelConverter.replaceSubQueries():1072
org.apache.calcite.sql2rel.SqlToRelConverter.convertWhere():1039
org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl():686
org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect():664
org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive():3358
com.dremio.exec.planner.sql.DremioSqlToRelConverter.convertQueryRecursive():219
org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery():590
com.dremio.exec.planner.sql.SqlValidatorAndToRelContext.toConvertibleRelRoot():149
com.dremio.exec.planner.sql.SqlValidatorAndToRelContext.toConvertibleRelRoot():130
com.dremio.exec.planner.sql.handlers.PrelTransformer.toConvertibleRelRoot():999
com.dremio.exec.planner.sql.handlers.PrelTransformer.convertToRelRoot():1021
com.dremio.exec.planner.sql.handlers.PrelTransformer.convertToRel():1041
com.dremio.exec.planner.sql.handlers.PrelTransformer.validateAndConvert():193
com.dremio.exec.planner.sql.handlers.PrelTransformer.validateAndConvert():179
com.dremio.exec.planner.sql.handlers.query.NormalHandler.getPlan():68
com.dremio.exec.planner.sql.handlers.commands.HandlerToExec.plan():59
com.dremio.exec.work.foreman.AttemptManager.plan():524
com.dremio.exec.work.foreman.AttemptManager.lambda$run$4():422
com.dremio.service.commandpool.ReleasableBoundCommandPool.lambda$getWrappedCommand$3():140
com.dremio.service.commandpool.CommandWrapper.run():70
com.dremio.context.RequestContext.run():96
com.dremio.common.concurrent.ContextMigratingExecutorService.lambda$decorate$4():212
com.dremio.common.concurrent.ContextMigratingExecutorService$ComparableRunnable.run():192
java.util.concurrent.Executors$RunnableAdapter.call():515
java.util.concurrent.FutureTask.run():264
java.util.concurrent.ThreadPoolExecutor.runWorker():1128
java.util.concurrent.ThreadPoolExecutor$Worker.run():628
java.lang.Thread.run():829
I have now tested another PSQL DB from another instance of Alfresco (a newer version). The issue persits. It has to be related to the subselect in the JOIN.
Hi @Muffex
Sorry to bother you, are you able to do 2 things?
Send the latest profile on 24.x
Create a reflection on the final set of datasets and see if the query works, last time you tried it, it did not work as you were still uaing PDFS to store reflections. This is no longer supported, so you have to create reflection on a dist store,
tbh I have not yet understood the need of the reflections for my use case. Since it does currently not work at all, why should it work when accelerated?
(Nevertheless, I will now also try to do create a reflection on a dist store.)
Attached you can find the current profile.zip (without having reflections enabled).
I have now tried to use HDFS as dist storage. The setup worked. The creation of reflections for all the included tables (in “the query”) seems to work. There is a green checkmark and the size is shown in the “Reflections” Tab of the “Dataset settings” of each selected table in the space.
The issue persists.
The reflections aren’t used (if i understand correctly):
First I only enabled “Raw reflections” then I also enabled “Aggregation Reflections”. Same error.
Does the subquery run? If yes, can we drop all reflections, save the below query as a VDS and create a raw reflection on the VDS with all columns and then try the outer query on the VDS?
SELECT *
FROM alfresco_alfNode
WHERE audit_creator <> 'System' AND store_id = 6) AS n
JOIN alfresco_alfQname AS nodeType ON (nodeType.id = n.type_qname_id AND nodeType.local_name = 'site')
JOIN alfresco_alfChildAssoc AS aca ON (aca.parent_node_id = n.id)
JOIN alfresco_alfNode AS wikiNode ON (aca.child_node_id = wikiNode.id AND wikiNode.type_qname_id IN (SELECT id
FROM alfresco_alfQname
WHERE local_name = 'folder'))
JOIN alfresco_alfNodeProperties AS wikiNodePropName ON (wikiNodePropName.node_id = wikiNode.id)
WHERE wikiNodePropName.qname_id IN (SELECT id
FROM alfresco_alfQname
WHERE local_name = 'name'
This query does produce the same error. (I slightly modified your query becuase some brackets were missing.) Q1:
SELECT *
FROM (SELECT * FROM alfresco_alfNode
WHERE audit_creator <> 'System' AND store_id = 6) AS n
JOIN alfresco_alfQname AS nodeType ON (nodeType.id = n.type_qname_id AND nodeType.local_name = 'site')
JOIN alfresco_alfChildAssoc AS aca ON (aca.parent_node_id = n.id)
JOIN alfresco_alfNode AS wikiNode ON (aca.child_node_id = wikiNode.id AND wikiNode.type_qname_id IN (SELECT id
FROM alfresco_alfQname
WHERE local_name = 'folder'))
JOIN alfresco_alfNodeProperties AS wikiNodePropName ON (wikiNodePropName.node_id = wikiNode.id)
WHERE wikiNodePropName.qname_id IN (SELECT id
FROM alfresco_alfQname
WHERE local_name = 'name')
The following subqueries work: Q2:
SELECT id
FROM alfresco_alfQname
WHERE local_name = 'folder'
Q3:
SELECT id
FROM alfresco_alfQname
WHERE local_name = 'name'
When replacing the “in Q2”-part in Q1 with the actual value (24) the query executes successfully: Q4
SELECT *
FROM (SELECT * FROM alfresco_alfNode
WHERE audit_creator <> 'System' AND store_id = 6) AS n
JOIN alfresco_alfQname AS nodeType ON (nodeType.id = n.type_qname_id AND nodeType.local_name = 'site')
JOIN alfresco_alfChildAssoc AS aca ON (aca.parent_node_id = n.id)
JOIN alfresco_alfNode AS wikiNode ON (aca.child_node_id = wikiNode.id AND wikiNode.type_qname_id = 24) -- here
JOIN alfresco_alfNodeProperties AS wikiNodePropName ON (wikiNodePropName.node_id = wikiNode.id)
WHERE wikiNodePropName.qname_id IN (SELECT id
FROM alfresco_alfQname
WHERE local_name = 'name')