Cheat Sheet Kubernetes

This document provides a cheatsheet for monitoring Kubernetes clusters. It lists cluster state, container, node resource, job, service, and event metrics, giving for each one its metric name in kube-state-metrics or Prometheus and the kubectl command that shows more detail. It also lists the Datadog checks and metrics that cover the same data.

Cheatsheet: Kubernetes Monitoring

Cluster state metrics

DESCRIPTION                                             NAME IN KUBE-STATE-METRICS                      COMMAND
Running pods                                            kube_pod_status_phase                           kubectl get pods
Number of pods desired for a Deployment                 kube_deployment_spec_replicas                   kubectl get deployment <DEPLOYMENT>
Number of pods desired for a DaemonSet                  kube_daemonset_status_desired_number_scheduled  kubectl get daemonset <DAEMONSET>
Number of pods currently running in a Deployment        kube_deployment_status_replicas                 kubectl get deployment <DEPLOYMENT>
Number of pods currently running in a DaemonSet         kube_daemonset_status_current_number_scheduled  kubectl get daemonset <DAEMONSET>
Number of pods currently available in a Deployment      kube_deployment_status_replicas_available       kubectl get deployment <DEPLOYMENT>
Number of pods currently available in a DaemonSet       kube_daemonset_status_number_available          kubectl get daemonset <DAEMONSET>
Number of pods currently not available in a Deployment  kube_deployment_status_replicas_unavailable     kubectl get deployment <DEPLOYMENT>
Number of pods currently not available in a DaemonSet   kube_daemonset_status_number_unavailable        kubectl get daemonset <DAEMONSET>

Container metrics

DESCRIPTION                     NAME IN KUBE-STATE-METRICS                COMMAND
Containers running on a pod     kube_pod_container_info                   kubectl describe pod <POD_NAME>
Containers restarted on a pod   kube_pod_container_status_restarts_total  kubectl describe pod <POD_NAME>
Containers terminated on a pod  kube_pod_container_status_terminated      kubectl describe pod <POD_NAME>

Disk I/O & Network metrics

DESCRIPTION              PROMETHEUS METRIC NAME                    COMMAND
Network in per node      container_network_receive_bytes_total     kubectl get --raw /api/v1/nodes/<NODE_NAME>/proxy/metrics/cadvisor
Network out per node     container_network_transmit_bytes_total    kubectl get --raw /api/v1/nodes/<NODE_NAME>/proxy/metrics/cadvisor
Disk writes per node     container_fs_writes_bytes_total           kubectl get --raw /api/v1/nodes/<NODE_NAME>/proxy/metrics/cadvisor
Disk reads per node      container_fs_reads_bytes_total            kubectl get --raw /api/v1/nodes/<NODE_NAME>/proxy/metrics/cadvisor
Network errors per node  container_network_receive_errors_total,   kubectl get --raw /api/v1/nodes/<NODE_NAME>/proxy/metrics/cadvisor
                         container_network_transmit_errors_total
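The cAdvisor endpoint used for the disk I/O and network metrics returns Prometheus text-format output with one line per series, so a per-node figure has to be aggregated by hand. The following is a minimal sketch, assuming the usual `name{labels} value [timestamp]` line layout; `sum_metric` is a hypothetical helper name, not part of kubectl. It is a naive sum over every series of the metric, so depending on the labels cAdvisor emits it may count the same traffic more than once, and the values are cumulative counters rather than rates.

```shell
# Hypothetical helper: sum every sample of one metric found in
# Prometheus text-format input read from stdin.
# Assumed line layout: metric_name{label="v",...} value [timestamp]
sum_metric() {
  awk -v name="$1" '
    /^#/ { next }                          # skip HELP/TYPE comment lines
    {
      metric = $0
      sub(/[{ ].*/, "", metric)            # metric name ends at "{" or a space
      if (metric != name) next
      rest = $0
      sub(/^[^{ ]*([{].*[}])?[ ]+/, "", rest)   # drop name and label set
      split(rest, parts, " ")              # parts[1] is the sample value
      total += parts[1]
    }
    END { printf "%.0f\n", total }         # %.0f avoids 32-bit %d limits in some awks
  '
}

# Usage against a live cluster (not run here):
#   kubectl get --raw /api/v1/nodes/<NODE_NAME>/proxy/metrics/cadvisor \
#     | sum_metric container_network_receive_bytes_total
```

Running the helper twice, some seconds apart, and dividing the difference by the interval gives an approximate bytes-per-second rate.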

Node resource and status metrics

DESCRIPTION                                NAME IN KUBE-STATE-METRICS                         COMMAND
Current health status of a node (kubelet)  kube_node_status_condition                         kubectl describe node <NODE_NAME>
Total memory requests (bytes) per node     kube_pod_container_resource_requests_memory_bytes  kubectl describe node <NODE_NAME>
Total memory in use on a node              N/A                                                kubectl describe node <NODE_NAME>
Total CPU requests (cores) per node        kube_pod_container_resource_requests_cpu_cores     kubectl describe node <NODE_NAME>
Total CPU in use on a node                 N/A                                                kubectl describe node <NODE_NAME>

Kubernetes events

DESCRIPTION  COMMAND
List events  kubectl get events
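For a quick cluster-wide view of node health without describing each node in turn, the STATUS column of `kubectl get nodes` can be filtered. This is a minimal sketch, assuming the default column order (NAME, STATUS, ROLES, AGE, VERSION); `not_ready_nodes` is a hypothetical helper name.

```shell
# Hypothetical helper: print the names of nodes whose STATUS does not
# start with "Ready". Expects `kubectl get nodes --no-headers` on stdin.
# Assumed columns: NAME STATUS ROLES AGE VERSION
not_ready_nodes() {
  awk '{
    split($2, status, ",")        # STATUS may be e.g. "Ready,SchedulingDisabled"
    if (status[1] != "Ready") print $1
  }'
}

# Usage against a live cluster (not run here):
#   kubectl get nodes --no-headers | not_ready_nodes
```

A cordoned node that is otherwise healthy still counts as Ready here; `kubectl describe node <NODE_NAME>` remains the place to read the individual conditions.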

Job metrics

DESCRIPTION                NAME IN KUBE-STATE-METRICS  COMMAND
Number of successful jobs  kube_job_status_succeeded   kubectl get jobs --all-namespaces | grep "succeeded"
Number of failed jobs      kube_job_status_failed      kubectl get jobs --all-namespaces | grep "failed"
Number of active jobs      kube_job_status_active      kubectl get jobs --all-namespaces
Number of CronJobs         kube_cronjob_info           kubectl get cronjobs --all-namespaces
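The job counts can also be read from each Job's status fields instead of from the table output. The sketch below assumes tab-separated input with one Job per line in the order name, succeeded, failed, active, where a missing field is left empty; `count_jobs` is a hypothetical helper name. It counts Jobs that have at least one pod in each state, which is not the same as the per-Job pod counts the kube-state-metrics series report.

```shell
# Hypothetical helper: count Jobs with at least one succeeded, failed,
# or active pod. Expects tab-separated lines on stdin:
#   <namespace>/<name> TAB <succeeded> TAB <failed> TAB <active>
# Empty fields are treated as 0.
count_jobs() {
  awk -F'\t' '
    $2 + 0 > 0 { ok++ }
    $3 + 0 > 0 { bad++ }
    $4 + 0 > 0 { run++ }
    END { printf "succeeded=%d failed=%d active=%d\n", ok, bad, run }
  '
}

# Usage against a live cluster (not run here); the jsonpath template
# emits the four fields in the order count_jobs expects:
#   kubectl get jobs --all-namespaces -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name}{"\t"}{.status.succeeded}{"\t"}{.status.failed}{"\t"}{.status.active}{"\n"}{end}' | count_jobs
```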

Service metrics

DESCRIPTION                        NAME IN KUBE-STATE-METRICS  COMMAND
Service types per cluster          kube_service_info           kubectl get services --all-namespaces
Number of pods running by service  N/A                         kubectl get pods --selector=<SERVICE_SELECTOR> -o=name
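To turn the service listing into a count per service type, the TYPE column can be tallied. This is a minimal sketch, assuming the default `--all-namespaces` column order (NAMESPACE, NAME, TYPE, CLUSTER-IP, EXTERNAL-IP, PORT(S), AGE); `service_type_counts` is a hypothetical helper name.

```shell
# Hypothetical helper: count Services per type.
# Expects `kubectl get services --all-namespaces --no-headers` on stdin.
# Assumed columns: NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service_type_counts() {
  awk '{ count[$3]++ } END { for (t in count) print t, count[t] }' | sort
}

# Usage against a live cluster (not run here):
#   kubectl get services --all-namespaces --no-headers | service_type_counts
# Pods behind one service can be counted in the same spirit:
#   kubectl get pods --selector=<SERVICE_SELECTOR> -o=name | wc -l
```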
Cheatsheet: Kubernetes Monitoring with Datadog
1. Cluster state metrics
METRIC DESCRIPTION                                      DATADOG STATUS CHECK/METRIC NAME
Running pods                                            kubernetes.pods.running
Number of pods desired for a Deployment                 kubernetes_state.deployment.replicas_desired
Number of pods desired for a DaemonSet                  kubernetes_state.daemonset.desired
Number of pods currently running in a Deployment        kubernetes_state.deployment.replicas
Number of pods currently running in a DaemonSet         kubernetes_state.daemonset.scheduled
Number of pods currently available in a Deployment      kubernetes_state.deployment.replicas_available
Number of pods currently available in a DaemonSet       kubernetes_state.daemonset.ready
Number of pods currently not available in a Deployment  kubernetes_state.deployment.replicas_unavailable
Number of pods currently not available in a DaemonSet   kubernetes_state.daemonset.desired - kubernetes_state.daemonset.ready
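The last row derives the unavailable count by subtracting ready pods from desired pods. The same arithmetic can be checked from the command line. This is a minimal sketch, assuming the default `--all-namespaces` column order for DaemonSets (NAMESPACE, NAME, DESIRED, CURRENT, READY, ...); `daemonset_not_ready` is a hypothetical helper name.

```shell
# Hypothetical helper: print "<namespace>/<name> <desired - ready>" for
# each DaemonSet, mirroring the desired-minus-ready subtraction.
# Expects `kubectl get daemonset --all-namespaces --no-headers` on stdin.
# Assumed columns: NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE ...
daemonset_not_ready() {
  awk '{ printf "%s/%s %d\n", $1, $2, $3 - $5 }'
}

# Usage against a live cluster (not run here):
#   kubectl get daemonset --all-namespaces --no-headers | daemonset_not_ready
```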

2. Node resource and status metrics


METRIC DESCRIPTION                         DATADOG METRIC NAME
Current health status of a node (kubelet)  kubernetes.kubelet.check
Total memory requests (bytes) per node     kubernetes.memory.requests
Total memory in use on a node              kubernetes.memory.usage
Total CPU requests (cores) per node        kubernetes.cpu.requests
Total CPU in use on a node                 kubernetes.cpu.usage.total

3. Job metrics
METRIC DESCRIPTION         DATADOG METRIC NAME
Number of successful jobs  kubernetes_state.job.succeeded
Number of failed jobs      kubernetes_state.job.failed
Number of active jobs      kubernetes_state.job.count
Number of CronJobs         kubernetes_state.job.count (filtered by the owner_kind:cronjob tag)

4. Service metrics
METRIC DESCRIPTION                 DATADOG METRIC NAME
Service types per cluster          kubernetes_state.service.count
Number of pods running by service  kubernetes.pods.running

5. Container metrics
METRIC DESCRIPTION              DATADOG METRIC NAME
Containers running on a pod     kubernetes_state.container.running
Containers restarted on a pod   kubernetes_state.container.restarts
Containers terminated on a pod  kubernetes_state.container.terminated

6. Disk I/O & Network metrics


METRIC DESCRIPTION       DATADOG METRIC NAME
Network in per node      kubernetes.network.rx_bytes
Network out per node     kubernetes.network.tx_bytes
Disk writes per node     kubernetes.io.write_bytes
Disk reads per node      kubernetes.io.read_bytes
Network errors per node  kubernetes.network.rx_errors, kubernetes.network.tx_errors

7. Events
Kubernetes events appear in the Datadog Events Explorer and in event widgets on dashboards.