Using REGEXP_LIKE on a PostgreSQL column spins up a thread consuming 100% CPU

select 
       REGEXP_LIKE(email, '.*\+.*\@.*') as is_internal_email       
from   "keycloakdb".public."user_entity"
where  email like '%+%'

The query gets stuck at the planning phase, and we can see 100% (single-core) CPU utilization. If I run the query again from another window, I get the same behavior, and usage rises to 200%. This doesn’t happen if I use REGEXP_MATCHES.

So, this is a very serious bug, and while workarounds exist, I don’t think this should be left open.

Judging by the stack trace, it’s a PostgreSQL-only issue. Also, it only happens if the regex contains an escaped + sign.
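
To illustrate what the pattern is meant to match, here is a minimal Python sketch. It uses Python's `re` module rather than Dremio's or PostgreSQL's regex engine, so it only shows the intent of the pattern and says nothing about the planner hang itself. The function name `is_internal_email` (borrowed from the column alias in the query) and the sample addresses are made up for illustration.

```python
import re

# Same pattern as in the query: `\+` is a literal plus sign and
# `\@` is simply a literal at-sign.
PATTERN = re.compile(r".*\+.*\@.*")


def is_internal_email(email: str) -> bool:
    """Return True if the address has a '+' somewhere before an '@'."""
    return PATTERN.fullmatch(email) is not None


# Hypothetical sample addresses, not taken from the real table.
for sample in ["john+test@example.com", "john@example.com", "x@y+z"]:
    print(sample, is_internal_email(sample))
```

Only addresses with a plus sign before the at-sign should match, which is what the `is_internal_email` column is presumably meant to flag.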

I was able to reproduce this on Dremio 24.0.0, 24.1.0, 24.2.6, and 24.3.0.

Here’s the thread dump of the rogue planner thread:

"1a5b0a50-c9fa-4cde-f237-4661ddd35600/0:foreman-planning" #319 daemon prio=10 os_prio=0 cpu=1352243.25ms elapsed=1355.16s tid=0x00007fd3904ef000 nid=0x2536c runnable  [0x00007fd3640df000]
   java.lang.Thread.State: RUNNABLE
        at com.dremio.exec.store.jdbc.dialect.PostgreSQLDialect.supportsRegexString(PostgreSQLDialect.java:112)
        at com.dremio.exec.store.jdbc.rules.JdbcExpressionSupportCheck.supportsRegexString(JdbcExpressionSupportCheck.java:375)
        at com.dremio.exec.store.jdbc.rules.JdbcExpressionSupportCheck.visitCall(JdbcExpressionSupportCheck.java:125)
        at com.dremio.exec.store.jdbc.rules.JdbcExpressionSupportCheck.visitCall(JdbcExpressionSupportCheck.java:57)
        at org.apache.calcite.rex.RexCall.accept(RexCall.java:197)
        at com.dremio.exec.store.jdbc.rules.JdbcExpressionSupportCheck.hasOnlySupportedFunctions(JdbcExpressionSupportCheck.java:68)
        at com.dremio.exec.store.jdbc.rules.JdbcRuleContext$3.load(JdbcRuleContext.java:72)
        at com.dremio.exec.store.jdbc.rules.JdbcRuleContext$3.load(JdbcRuleContext.java:69)
        at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3533)
        at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2282)
        at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2159)
        - locked <0x00000007fc6aa640> (a com.google.common.cache.LocalCache$StrongAccessWriteEntry)
        at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2049)
        at com.google.common.cache.LocalCache.get(LocalCache.java:3966)
        at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3989)
        at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4950)
        at com.dremio.exec.store.jdbc.rules.JdbcProjectRule.matches(JdbcProjectRule.java:90)
        at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:553)
        at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:416)
        at org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:281)
        at org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:74)
        at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:212)
        at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:199)
        at com.dremio.exec.planner.DremioHepPlanner.findBestExp(DremioHepPlanner.java:74)
        at com.dremio.exec.planner.sql.handlers.PrelTransformer.lambda$transform$0(PrelTransformer.java:544)
        at com.dremio.exec.planner.sql.handlers.PrelTransformer$$Lambda$1511/0x0000000801317040.get(Unknown Source)
        at com.dremio.exec.planner.sql.handlers.PrelTransformer.doTransform(PrelTransformer.java:612)
        at com.dremio.exec.planner.sql.handlers.PrelTransformer.transform(PrelTransformer.java:591)
        at com.dremio.exec.planner.sql.handlers.PrelTransformer.getPostLogical(PrelTransformer.java:342)
        at com.dremio.exec.planner.sql.handlers.PrelTransformer.convertToDrel(PrelTransformer.java:260)
        at com.dremio.exec.planner.sql.handlers.PrelTransformer.convertToDrel(PrelTransformer.java:403)
        at com.dremio.exec.planner.sql.handlers.query.NormalHandler.getPlan(NormalHandler.java:82)
        at com.dremio.exec.planner.sql.handlers.commands.HandlerToExec.plan(HandlerToExec.java:59)
        at com.dremio.exec.work.foreman.AttemptManager.plan(AttemptManager.java:524)
        at com.dremio.exec.work.foreman.AttemptManager.lambda$run$4(AttemptManager.java:422)
        at com.dremio.exec.work.foreman.AttemptManager$$Lambda$1201/0x0000000801171840.get(Unknown Source)
        at com.dremio.service.commandpool.ReleasableBoundCommandPool.lambda$getWrappedCommand$3(ReleasableBoundCommandPool.java:140)
        at com.dremio.service.commandpool.ReleasableBoundCommandPool$$Lambda$1180/0x0000000801170440.get(Unknown Source)
        at com.dremio.service.commandpool.CommandWrapper.run(CommandWrapper.java:70)
        at com.dremio.context.RequestContext.run(RequestContext.java:96)
        at com.dremio.common.concurrent.ContextMigratingExecutorService.lambda$decorate$4(ContextMigratingExecutorService.java:212)
        at com.dremio.common.concurrent.ContextMigratingExecutorService$$Lambda$995/0x0000000800eab440.run(Unknown Source)
        at com.dremio.common.concurrent.ContextMigratingExecutorService$ComparableRunnable.run(ContextMigratingExecutorService.java:192)
        at java.util.concurrent.Executors$RunnableAdapter.call(java.base@11.0.21/Executors.java:515)
        at java.util.concurrent.FutureTask.run(java.base@11.0.21/FutureTask.java:264)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.21/ThreadPoolExecutor.java:1128)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.21/ThreadPoolExecutor.java:628)
        at java.lang.Thread.run(java.base@11.0.21/Thread.java:829)

Interesting find.
It does seem like a defect - thank you for sharing!

We’ll triage this with the team.

@sheinbergon If you have it handy, can you please send us the profile so we can look at the SQL pushed down to PG?