Resampling/Group By Time Series Data

Hi Team

I would like to know if there are any functions in Dremio that can assist in resampling time series data.

For example if I had a table with 3 columns ts, tag and value, with ts being the timestamp at intervals of every minute.

Is there any functions we can used to resample this data by taking the average of X min intervals for each tag. Where X could be 5 min, 10 min 15 min etc

Thanks!

I note that DATE TRUNC can be used to GROUP BY a particular unit of time i.e. second, minute, hour etc. But still haven’t found a solution for a multiple of this time units i.e. 2 mins or 5 mins etc.

@nikhil.makan,

General formula is: SELECT TO_TIMESTAMP(UNIX_TIMESTAMP(your_timestamp_column, 'YYYY-MM-DD HH24:MI:SS.FFF') - UNIX_TIMESTAMP(your_timestamp_column,'YYYY-MM-DD HH24:MI:SS.FFF')%your_granularity_in_seconds)

So in your case, the query will for 5 minutes (300 seconds) will be:

SELECT tag, TO_TIMESTAMP(UNIX_TIMESTAMP(ts, 'YYYY-MM-DD HH24:MI:SS.FFF') - UNIX_TIMESTAMP(ts,'YYYY-MM-DD HH24:MI:SS.FFF')%300), avg(val)
FROM nikm
GROUP BY 1, 2
ORDER BY 2

Here’s a full example with some sample data:

WITH nikm AS
  (
      select TO_TIMESTAMP('2023-01-31 10:00:02', 'YYYY-MM-DD HH24:MI:SS')  as ts, 'A' as tag, 10 as val
      UNION ALL
      select TO_TIMESTAMP('2023-01-31 10:08:02', 'YYYY-MM-DD HH24:MI:SS')  as ts, 'A' as tag, 5 as val
      UNION ALL
      select TO_TIMESTAMP('2023-01-31 10:09:02', 'YYYY-MM-DD HH24:MI:SS')  as ts, 'A' as tag, 20 as val
      UNION ALL
      select TO_TIMESTAMP('2023-01-31 10:20:02', 'YYYY-MM-DD HH24:MI:SS')  as ts, 'B' as tag, 6 as val
  )
SELECT tag, TO_TIMESTAMP(UNIX_TIMESTAMP(ts, 'YYYY-MM-DD HH24:MI:SS.FFF') - UNIX_TIMESTAMP(ts,'YYYY-MM-DD HH24:MI:SS.FFF')%300) as ts_round, avg(val) as avg_val
FROM nikm
GROUP BY 1, 2
ORDER BY 2

Hope that’s what you’re looking for. Based on my previous answer here.

@lenoyjacob awesome that’s quite a robust way of doing it!