Window function: frame definition

Is it possible to add a frame_start and frame_end statement within a window function similar to:

SELECT SalesMonth, SalesAmount, SUM(SalesAmount) OVER( ORDER BY SalesMonth ROWS BETWEEN 11 PRECEDING AND CURRENT ROW) AS SalesAmount_12M FROM …

If not, are there plans to implement this.

Hi @bh650803 , from what I can recall, RANGE is implemented, but not ROWS.

Thanks for your help
an example with RANGE would be helpful.

My first trial:
… SUM(SalesAmount) OVER( PARTITION BY … ORDER BY SalesMonth RANGE BETWEEN 11 PRECEDING AND CURRENT ROW) AS SalesAmount_12M…
it doesn’t work (error message in this case: “Data Type Mismatch: ORDER BY … RANGE …”)

2nd trial:
If RowNumbers are given: (so DATA TYPE is INTEGER)
… SUM(SalesAmount) OVER( PARTITION BY … ORDER BY RowNumber RANGE BETWEEN 11 PRECEDING AND CURRENT ROW) AS SalesAmount_12M…
it doesn’t work (error message in this case: “currently not supported”)

3rd trial:
… SUM(SalesAmount) OVER( PARTITION BY … ORDER BY SalesMonth RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Sales_Amount_ALL…
this works and brings correct results, but not a 12 months rolling sum

@bh650803, something like below?

SELECT
  FIRST_NAME,
  LAST_NAME,
  SALARY,
  SUM(SALARY) OVER(
    PARTITION BY DEPARTMENT_ID
    ORDER BY FIRST_NAME ASC
    RANGE UNBOUNDED PRECEDING
  ) AS range_cumulative_revenue
FROM EMPLOYEES;

Hi Bert,

I think that dremio currently implements only RANGE BETWEEN + UNBOUNDED PRECEDING|UNBOUNDED FOLLOWING|CURRENT ROW.

Someone in the dremio’s staff will tell you more for sure.

Hi, @balaji.ramaswamy
I wonder if this window frame syntax is supported:

ROWS BETWEEN 1 PRECEDING AND 0 FOLLOWING

My test reports an error:


  (com.dremio.exec.work.foreman.UnsupportedFunctionException) This type of window frame is currently not supported
    com.dremio.exec.exception.UnsupportedOperatorCollector.convertException():41
    com.dremio.exec.planner.sql.parser.UnsupportedOperatorsVisitor.convertException():75
    com.dremio.exec.planner.sql.handlers.PrelTransformer.validateNode():207
    com.dremio.exec.planner.sql.handlers.PrelTransformer.validateAndConvert():180
    com.dremio.exec.planner.sql.handlers.PrelTransformer.validateAndConvert():176
    com.dremio.exec.planner.sql.handlers.query.NormalHandler.getPlan():66
    com.dremio.exec.planner.sql.handlers.commands.HandlerToExec.plan():59
    com.dremio.exec.work.foreman.AttemptManager.plan():489
    com.dremio.exec.work.foreman.AttemptManager.lambda$run$1():388
    com.dremio.service.commandpool.ReleasableBoundCommandPool.lambda$getWrappedCommand$3():138
    com.dremio.service.commandpool.CommandWrapper.run():62
    com.dremio.context.RequestContext.run():95
    com.dremio.common.concurrent.ContextMigratingExecutorService.lambda$decorate$3():199
    com.dremio.common.concurrent.ContextMigratingExecutorService$ComparableRunnable.run():180
    java.util.concurrent.Executors$RunnableAdapter.call():511
    java.util.concurrent.FutureTask.run():266
    java.util.concurrent.ThreadPoolExecutor.runWorker():1149
    java.util.concurrent.ThreadPoolExecutor$Worker.run():624
    java.lang.Thread.run():750

Change it to

ROWS UNBOUNDED PRECEDING

It can run, but it is not the result I want. If according to the above friend @fetanchaud , dremio does not support complex windows frame statements at this stage, is there any plan to support it?
image

You may find supported OVER features here :

@bigfacewo Where is your data? RDBMS (which one?) or querying PARQUET on S3 etc?

I am running parquet on my local storage:
orders.zip (403.7 KB)
SQL:

SELECT
	子类别,
	COUNT(MAX( "类别" )) OVER ( ORDER BY MAX( "类别" ) ROWS BETWEEN 1 PRECEDING AND 0 FOLLOWING) AS "WINDOW_COUNT" 
FROM
	orders 
GROUP BY
	"子类别" 

dremio.env:

#
# Copyright (C) 2017-2019 Dremio Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

#
# Dremio environment variables used by Dremio daemon
#

#
# Directory where Dremio logs are written
# Default to $DREMIO_HOME/log
#
#DREMIO_LOG_DIR=${DREMIO_HOME}/log

#
# Send logs to console and not to log files. The DREMIO_LOG_DIR is ignored if set.
#
#DREMIO_LOG_TO_CONSOLE=1

#
# Directory where Dremio pidfiles are written
# Default to $DREMIO_HOME/run
#
#DREMIO_PID_DIR=${DREMIO_HOME}/run

#
# Max total memory size (in MB) for the Dremio process
#
# If not set, default to using max heap and max direct.
#
# If both max heap and max direct are set, this is not used
# If one is set, the other is calculated as difference
# of max memory and the one that is set.
#
#DREMIO_MAX_MEMORY_SIZE_MB=

#
# Max heap memory size (in MB) for the Dremio process
#
# Default to 4096 for server
#
#DREMIO_MAX_HEAP_MEMORY_SIZE_MB=4096

#
# Max direct memory size (in MB) for the Dremio process
#
# Default to 8192 for server
#
#DREMIO_MAX_DIRECT_MEMORY_SIZE_MB=8192

#
# Max permanent generation memory size (in MB) for the Dremio process
# (Only used for Java 7)
#
# Default to 512 for server
#
#DREMIO_MAX_PERMGEN_MEMORY_SIZE_MB=512

#
# Garbage collection logging is enabled by default. Set the following
# parameter to "no" to disable garbage collection logging.
#
#DREMIO_GC_LOGS_ENABLED="yes"

#
# Send GC logs to console and not to log files. The DREMIO_LOG_DIR is ignored if set.
# Default is set to "no"
#
#DREMIO_GC_LOG_TO_CONSOLE="no"

#
# By default G1GC is used as java garbage collection.
# This can be overriden by changing this parameter
#
#DREMIO_GC_OPTS="-XX:+UseG1GC"

#
# Java version will be checked by default.
# Currently only java 8 is supported by dremio.
# This check can be disabled by changing value to false.
#
#DREMIO_JAVA_VERSION_CHECK="true"

#
# The scheduling priority for the server
#
# Default to 0
#
# DREMIO_NICENESS=0
#

#
# Number of seconds after which the server is killed forcibly it it hasn't stopped
#
# Default to 120
#
#DREMIO_STOP_TIMEOUT=120

# Extra Java options - shared between dremio and dremio-admin commands
#
#DREMIO_JAVA_EXTRA_OPTS=

# Extra Java options - client only (dremio-admin command)
#
#DREMIO_JAVA_CLIENT_EXTRA_OPTS=

# Extra Java options - server only (dremio command)
#
DREMIO_JAVA_SERVER_EXTRA_OPTS='-Dsaffron.default.charset=UTF-8 -Dsaffron.default.nationalcharset=UTF-8 -Dsaffron.default.collation.name=UTF-8$en_US'

@fetanchaud

Thanks, kindly send us the profile so we can replicate the exact same SQL

Any update? @balaji.ramaswamy