Summary Job Statistics

candlergrimes · February 1, 2021, 9:59pm

We’re rolling out Dremio internally and want to measure utilization. The target metric is Jobs per month. Ideally, we could filter using the same criteria in the Jobs UI (start time, status, user, etc).

What’s the best way to collect this information? There are no summary stats in the Jobs UI (that I can see), and the REST endpoint can only pull jobs by specific jobId.

Note: This article states there is a sys.job_result table, but I am not seeing that when I SHOW TABLES IN SYS. If this does exist, I may be able to use it to group results …

balaji.ramaswamy · February 2, 2021, 7:44am

@candlergrimes

Dremio writes a file called queries.json to the coordinator log folder; times are in UTC. It records every job that hits Dremio, whether it comes through the REST API, JDBC, ODBC or the UI. The file is moved to archive every 24 hours and the archives are kept for 30 days. Retention can be configured in conf/logback.xml (restart required). You can copy the queries.json files to S3, HDFS or Azure storage, promote the folder containing the 30 days of queries.json, and run SQL on it. You get very rich information like queryID, querytext, username, start time, finish time, etc.

Thanks
Bali
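
If you would rather count jobs straight from the log files than promote them as a dataset, the idea above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: it assumes queries.json holds one JSON object per line, and the field names `start` (taken to be epoch milliseconds, UTC), `username` and `outcome` are assumptions, as are the sample records. Check them against a line from your own file. `jobs_per_month` is a hypothetical helper, not part of Dremio.

```python
import json
from collections import Counter
from datetime import datetime, timezone


def jobs_per_month(lines, keep=None, start_field="start"):
    """Count jobs per UTC calendar month from queries.json-style lines.

    lines       -- iterable of strings, one JSON object per line (assumed layout)
    keep        -- optional predicate on the parsed record, e.g. to filter by user
    start_field -- name of the start-time field, assumed to be epoch milliseconds
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if keep is not None and not keep(record):
            continue
        started_ms = record.get(start_field)
        if started_ms is None:
            continue  # record has no start time under the assumed field name
        started = datetime.fromtimestamp(started_ms / 1000, tz=timezone.utc)
        counts[started.strftime("%Y-%m")] += 1
    return dict(counts)


# Made-up sample records for illustration only; real lines carry more fields.
sample = [
    '{"queryId": "a1", "username": "alice", "start": 1609459200000, "outcome": "COMPLETED"}',
    '{"queryId": "b2", "username": "bob", "start": 1612216740000, "outcome": "FAILED"}',
    '{"queryId": "c3", "username": "alice", "start": 1614556799000, "outcome": "COMPLETED"}',
]

all_jobs = jobs_per_month(sample)
alice_completed = jobs_per_month(
    sample,
    keep=lambda r: r.get("username") == "alice" and r.get("outcome") == "COMPLETED",
)
print(all_jobs)
print(alice_completed)
```

For real data you could pass an open file handle instead of `sample`, once per archived file, and sum the resulting counts. The `keep` predicate stands in for the filters available in the Jobs UI (status, user, and so on).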

vrb · March 27, 2022, 1:18am

Hi,
Is there an option to log additional attributes (records returned, input size, output size, peak memory consumed) in the queries.json file?

balaji.ramaswamy · April 2, 2022, 5:45am

@vrb

Currently there is no option to add fields, but at some point Dremio will have a system table for these queries, and it will have more information.

cesar-santos · April 25, 2024, 4:53pm

@balaji.ramaswamy any updates on this “Dremio will have a system table for these queries and will have more information”?
I’m currently trying to extract metrics from Dremio usage based on my OS Deployment.

ben · April 25, 2024, 10:35pm

In Dremio 25 we have a new “Monitor” page that provides cluster usage metrics, like job count over time and top 10 longest running jobs.
