Documentation of exposed Prometheus metrics

serra · April 19, 2024, 8:37am

I have set up Prometheus to monitor my Dremio instance, and this works very well.

I am overwhelmed by the number of exposed metrics. Although every metric exposes a help text, it is not very helpful. E.g.:

# HELP jobs_active Generated from Dropwizard metric import (metric=jobs.active, type=com.dremio.telemetry.api.metrics.Metrics$$Lambda$94/1750950410)
# TYPE jobs_active gauge
jobs_active 0.0

This “Generated from Dropwizard metric import (...” is all over the place and not helpful to me.

Are the metrics' meanings documented somewhere else?
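One partial workaround, sketched here in Python: the generated help text still carries the original Dropwizard name (metric=jobs.active in the example above), and that name can be pulled out and used as a search term in other documentation. The function name dropwizard_names and the regular expression are illustrative assumptions, not part of Dremio or Prometheus.

```python
import re

# Matches HELP lines of the form shown above, e.g.
# "# HELP jobs_active Generated from Dropwizard metric import (metric=jobs.active, type=...)"
HELP_RE = re.compile(r"^# HELP (\S+) .*\(metric=([^,)]+)")


def dropwizard_names(exposition_text):
    """Map each Prometheus metric name to the Dropwizard name found in its HELP line."""
    names = {}
    for line in exposition_text.splitlines():
        match = HELP_RE.match(line)
        if match:
            names[match.group(1)] = match.group(2)
    return names


# Sample copied from the metrics output quoted above.
sample = """\
# HELP jobs_active Generated from Dropwizard metric import (metric=jobs.active, type=com.dremio.telemetry.api.metrics.Metrics$$Lambda$94/1750950410)
# TYPE jobs_active gauge
jobs_active 0.0
"""

print(dropwizard_names(sample))
```

This only recovers the original names; it does not explain what a metric measures.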

txalaparta · April 22, 2024, 7:53am

Dear Serra,
Some time ago I tried to set up Prometheus in my Kubernetes Dremio-OSS instance (Helm chart), with no success.
Here's what I did in local.values.yaml:

# Dremio Coordinator
coordinator:
  podAnnotations:
    prometheus.io/scrape: "true" # Enable scraping for this pod
    prometheus.io/scheme: "http" # If the metrics endpoint is secured, set this to https; otherwise keep the default http
    prometheus.io/path: "/metrics" # If the metrics path is not /metrics, define it with this annotation
    prometheus.io/port: "9010"

Either port 9010 was not accessible or it was never even created.

Could you please explain how you made it work?
Thanks in advance
Txalaparta

serra · April 22, 2024, 1:14pm

Hm, not sure what you have to do to get this running in k8s.

I host it with Docker Compose.

I had to add this to my dremio.conf:

  web-admin.host: "0.0.0.0"
  web-admin.port: 9090

Then let the container expose port 9090 and configure my Prometheus instance to scrape it.

You can test if Dremio is set up correctly by browsing (or curling) to http://your_dremio_host:your_web_admin_port/metrics (in my case http://dremio:9090/metrics) to get the (long) list of metrics. Hth!
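The same check can be scripted. Below is a minimal Python sketch using only the standard library; the function names count_samples and check_metrics_endpoint are made up for illustration, and the URL in the commented example is just the host and port from my setup.

```python
import urllib.request


def count_samples(exposition_text):
    """Count sample lines, skipping blank lines and '#' comment lines (HELP/TYPE)."""
    return sum(
        1
        for line in exposition_text.splitlines()
        if line.strip() and not line.startswith("#")
    )


def check_metrics_endpoint(url, timeout=5):
    """Fetch the metrics page and return how many samples it contains."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        text = response.read().decode("utf-8")
    return count_samples(text)


# Example call (assumes the host and port from the post above):
# print(check_metrics_endpoint("http://dremio:9090/metrics"))
```

If the port is not reachable, urlopen raises an exception (urllib.error.URLError) rather than returning a count, which usually means the web-admin settings or the exposed port need another look.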

txalaparta · April 23, 2024, 5:41am

Thanks Serra
I will try and let you know

txalaparta · April 26, 2024, 11:27am

I made it work in K8S following the very same approach.
Many thanks again!!

serra · April 26, 2024, 12:16pm

That is good to hear!

I am still curious about the metrics documentation, so if you happen to come across any, do let me know.

cesar-santos · June 5, 2024, 12:10pm

@txalaparta can you please share your values.yaml?

Thank you!

Benny_Chow · June 7, 2024, 5:33am

There’s JMX documentation here: Monitoring Dremio Nodes | Dremio Documentation

The ones that start with jobs and reflections are the easiest to start with.
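To narrow the long /metrics output down to those two groups, a small Python sketch along these lines might help. The function name filter_samples is hypothetical, and apart from jobs_active (quoted earlier in this thread) the sample metric names are invented placeholders, not real Dremio metrics.

```python
def filter_samples(exposition_text, prefixes=("jobs", "reflections")):
    """Return sample lines whose metric name starts with one of the prefixes."""
    selected = []
    for line in exposition_text.splitlines():
        # Skip blank lines and HELP/TYPE comment lines.
        if not line or line.startswith("#"):
            continue
        if line.startswith(tuple(prefixes)):
            selected.append(line)
    return selected


# jobs_active comes from the thread; the other two names are invented placeholders.
sample = """\
jobs_active 0.0
reflections_placeholder 1.0
some_other_metric 2.0
"""

for line in filter_samples(sample):
    print(line)
```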

txalaparta · June 7, 2024, 11:47am

Dear Cesar, I am attaching my values here. In my case I am using MinIO for distributed storage. Good luck!

image: dremio/dremio-oss
imageTag: 24.2

annotations: {}
podAnnotations: {}
labels: {}
podLabels: {}
nodeSelector: {}
tolerations: []

# Dremio Coordinator
coordinator:
  # CPU & Memory
  # Memory allocated to each coordinator, expressed in MB.
  # CPU allocated to each coordinator, expressed in CPU cores.
  cpu: 1
  memory: 4200

  # This count is used for slave coordinators only.
  # The total number of coordinators will always be count + 1
  count: 0

  # Coordinator data volume size (applies to the master coordinator only).
  # In most managed Kubernetes environments (AKS, GKE, etc.), the size of the disk has a direct impact on
  # the provisioned and maximum performance of the disk.
  volumeSize: 10Gi

  # Kubernetes Service Account
  # Uncomment below to use a custom Kubernetes service account for the coordinator.
  #serviceAccount: ""

  
  extraStartParams: >-
    -Dservices.web-admin.port=9090
    -Dservices.web-admin.enabled=true
    -Dservices.web-admin.host=0.0.0.0

  podAnnotations:
    prometheus.io/scrape: "true" #Enable scraping for this pod
    prometheus.io/scheme: "http" #If the metrics endpoint is secured then you will need to set this to `https`, if not default `http`
    prometheus.io/path: "/metrics" #If the metrics path is not /metrics, define it with this annotation.
    prometheus.io/port: "9090" #If port is not 9102 use this annotation

  # Web UI
  web:
    port: 9047
    tls:
      # To enable TLS for the web UI, set the enabled flag to true and provide
      # the appropriate Kubernetes TLS secret.
      enabled: true

      # To create a TLS secret, use the following command:
      # kubectl create secret tls ${TLS_SECRET_NAME} --key ${KEY_FILE} --cert ${CERT_FILE}
      secret: dremio-tls-secret-ui

  # ODBC/JDBC Client
  client:
    port: 31010
    tls:
      # To enable TLS for the client endpoints, set the enabled flag to
      # true and provide the appropriate Kubernetes TLS secret. Client
      # endpoint encryption is available only on Dremio Enterprise
      # Edition and should not be enabled otherwise.
      enabled: true

      # To create a TLS secret, use the following command:
      # kubectl create secret tls ${TLS_SECRET_NAME} --key ${KEY_FILE} --cert ${CERT_FILE}
      secret: dremio-tls-secret-client

  # Flight Client
  flight:
    port: 32010
    tls:
      # To enable TLS for the Flight endpoints, set the enabled flag to
      # true and provide the appropriate Kubernetes TLS secret.
      enabled: true

      # To create a TLS secret, use the following command:
      # kubectl create secret tls ${TLS_SECRET_NAME} --key ${KEY_FILE} --cert ${CERT_FILE}
      secret: dremio-tls-secret-ui

# Dremio Executor
executor:
  # CPU & Memory
  # Memory allocated to each executor, expressed in MB.
  # CPU allocated to each executor, expressed in CPU cores.
  cpu: 1
  memory: 4200

  engines: ["default"]
  count: 1

  # Executor volume size.
  volumeSize: 5Gi


  extraStartParams: >-
    -Dservices.web-admin.port=9090
    -Dservices.web-admin.enabled=true
    -Dservices.web-admin.host=0.0.0.0

  podAnnotations:
    prometheus.io/scrape: "true" #Enable scraping for this pod
    prometheus.io/scheme: "http" #If the metrics endpoint is secured then you will need to set this to `https`, if not default `http`
    prometheus.io/path: "/metrics" #If the metrics path is not /metrics, define it with this annotation.
    prometheus.io/port: "9090" #If port is not 9102 use this annotation


  cloudCache:
    enabled: true

    volumes:
    - size: 100Gi


# Zookeeper
zookeeper:
  # The Zookeeper image used in the cluster.
  image: zookeeper
  imageTag: latest

  # CPU & Memory
  # Memory allocated to each zookeeper, expressed in MB.
  # CPU allocated to each zookeeper, expressed in CPU cores.
  cpu: 0.5
  memory: 1024
  count: 1

  volumeSize: 10Gi


# Control where uploaded files are stored for Dremio.
# For more information, see https://docs.dremio.com/deployment/distributed-storage.html
distStorage:
  type: "aws"

  aws:
    bucketName: "dremio"
    path: "/"
    authentication: "accessKeySecret"
    credentials:
     accessKey: "XXXXXXXXXXXXXXXX"
     secret: "XXXXXXXXXXXXXXX"

    extraProperties: |
     <property>
       <name>fs.s3a.endpoint</name>
       <value>XXXXXXXXXXX</value>
     </property>
     <property>
       <name>fs.s3a.path.style.access</name>
       <value>true</value>
     </property>
     <property>
       <name>dremio.s3.compat</name>
       <value>true</value>
     </property>
     <property>
       <name>fs.s3a.connection.ssl.enabled</name>
       <description>Value can either be true or false, set to true to use SSL with a secure Minio server.</description>
       <value>false</value>
     </property>

serra · June 7, 2024, 12:09pm

Thank you @Benny_Chow , that was the information I was looking for. Not sure how I could have missed it.
