
Kubelet: Add a metrics in kubelet to track how long it takes for pod to fully start #124892

Open
JeffLuoo opened this issue May 15, 2024 · 12 comments · May be fixed by #124935
Assignees
Labels
kind/feature: Categorizes issue or PR as related to a new feature.
sig/instrumentation: Categorizes an issue or PR as relevant to SIG Instrumentation.
sig/node: Categorizes an issue or PR as relevant to SIG Node.
triage/accepted: Indicates an issue or PR is ready to be actively worked on.

Comments

@JeffLuoo commented May 15, 2024

What would you like to be added?

Add a new metric to record the end-to-end startup latency of a pod, from pod creation to the pod becoming ready for the first time. The metric will cover all stages of the pod lifecycle, such as scheduling and image pulling.

Metric name: kubelet_pod_first_ready_latency_seconds{namespace=<namespace_name>, pod=<pod_name>, uid=<uid>, node=<node_name>}

Metric type: Gauge

Metric unit: seconds

The metric exists for the lifetime of the pod.

Why is this needed?

Kubelet currently reports a histogram metric, pod_start_total_duration_seconds, that gives users an overview of pod end-to-end startup latency from pod creation to pod running. However, pod readiness is usually the signal that a pod can serve traffic.

The new metric will let users track how long pods in their workloads take to fully start and become ready to serve traffic. With the node_name label, it can also supplement the existing pod_start_total_duration_seconds metric for users who want to track node-level pod end-to-end startup latency from creation to readiness.

Users could also aggregate the metric by workload (Deployment, StatefulSet, etc.) to present workload-level pod end-to-end startup latency.

@JeffLuoo JeffLuoo added the kind/feature Categorizes issue or PR as related to a new feature. label May 15, 2024
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 15, 2024
@JeffLuoo (Author)

cc: @ruiwen-zhao for review.

@JeffLuoo (Author)

/sig instrumentation

@k8s-ci-robot k8s-ci-robot added sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels May 15, 2024
@JeffLuoo JeffLuoo changed the title Add a metrics in kubelet to track how long it takes for pod to fully start Kubelet: Add a metrics in kubelet to track how long it takes for pod to fully start May 15, 2024
@ruiwen-zhao (Contributor)

/sig node

@k8s-ci-robot k8s-ci-robot added the sig/node Categorizes an issue or PR as relevant to SIG Node. label May 16, 2024
@ruiwen-zhao (Contributor)

Just to bring up a previous discussion around metric cardinality: adding both the pod name and the node name as metric labels might introduce too much cardinality. We need to come up with a way to address this.

cc @SergeyKanzhelev @logicalhan @dashpole

@JeffLuoo commented May 16, 2024

Thank you Ruiwen. On the cardinality issue, I have some comments:

  1. Kubernetes already has scheduler metrics that include the pod name, namespace, and node name as labels:

     []string{"namespace", "pod", "node", "scheduler", "priority", "resource", "unit"},

  2. Kubernetes has another metric, kubelet_container_log_filesystem_used_bytes, that also uses the pod name and namespace as labels:

     "uid",
     "namespace",
     "pod",

  3. KSM (kube-state-metrics) also exports pod metrics in Prometheus format, and some of them have the pod name and namespace as labels: https://github.com/kubernetes/kube-state-metrics/blob/main/docs/metrics/workload/pod-metrics.md.

@dgrisonnet (Member)

cc @dgrisonnet @richabanker

@yujuhong (Contributor)

> Thank you Ruiwen, for the cardinality issue I have some comments on it:

This is very different from the existing implementation of pod_start_total_duration_seconds. Waiting for @dashpole or others from sig-instrumentation to give some advice on the best way to record one-time per-pod metrics like this.

@JeffLuoo (Author)

@yujuhong Yes. pod_start_total_duration_seconds is a distribution over all pods on the node, whereas this feature proposes a gauge metric that reports the exact time it took a single pod to become ready.

@dashpole Hi David, could you please provide some insights here? Thanks!

@dashpole (Contributor)

A few questions to get the discussion started:

  • Why a gauge instead of a histogram? A gauge is OK when looking at a single stream, or if you want to graph the average. But durations are often best represented by a histogram, as you can graph percentiles, or show a distribution. But if you graph a bunch of gauges, you will just see lots of lines on the graph, which isn't that helpful.
  • Does this need to be in the kubelet? IIUC, this metric is produced by watching pods, and emitting a metric when it becomes ready for the first time. It doesn't need any special knowledge that the kubelet has, right?
  • How long would the metric exist for? The startup will occur at the very beginning of the pod's life (in a single instant). Most pod-level metrics exist for the lifetime of the pod, but doing that would mean any aggregation would be less meaningful. Averaging the startup time of all currently-running pods in the cluster won't tell you if pod startup is currently slow. We could emit the metric for an arbitrary amount of time (e.g. 5 minutes), but that risks a scraper missing a pod entirely.

Bikeshedding: From the names, kubelet_pod_full_startup_duration_seconds vs pod_start_total_duration_seconds, I wouldn't know what the difference is. Would pod_ready_duration_seconds or pod_first_ready_duration_seconds be better?

@JeffLuoo (Author)

@dashpole Hi David, thank you for the comment.

> Why a gauge instead of a histogram? A gauge is OK when looking at a single stream, or if you want to graph the average. But durations are often best represented by a histogram, as you can graph percentiles, or show a distribution. But if you graph a bunch of gauges, you will just see lots of lines on the graph, which isn't that helpful.

I want to use a gauge because I want to record the exact startup time of the pod, and it will allow users to know the exact time it takes for their pods to become ready to serve. With the pod-level metric, users could also group them together under the workload (e.g. deployment).

> Does this need to be in the kubelet? IIUC, this metric is produced by watching pods, and emitting a metric when it becomes ready for the first time. It doesn't need any special knowledge that the kubelet has, right?

I chose the kubelet because it already tracks the status of each pod in pod_startup_latency_tracker and watches for pod status changes. Also, the kubelet is usually the first layer to process pod status, and it is a stable component (compared to other cluster components like kube-state-metrics, which I often see run into out-of-memory issues). Do you have any recommendation for other places to add such a metric?

> How long would the metric exist for? The startup will occur at the very beginning of the pod's life (in a single instant). Most pod-level metrics exist for the lifetime of the pod, but doing that would mean any aggregation would be less meaningful. Averaging the startup time of all currently-running pods in the cluster won't tell you if pod startup is currently slow. We could emit the metric for an arbitrary amount of time (e.g. 5 minutes), but that risks a scraper missing a pod entirely.

For "Most pod-level metrics exist for the lifetime of the pod, but doing that would mean any aggregation would be less meaningful", can you provide more context here to help me understand? Thanks!

> From the names, kubelet_pod_full_startup_duration_seconds vs pod_start_total_duration_seconds, I wouldn't know what the difference is. Would pod_ready_duration_seconds or pod_first_ready_duration_seconds be better?

pod_first_ready_duration_seconds looks good to me!

@yujuhong (Contributor)

> Does this need to be in the kubelet? IIUC, this metric is produced by watching pods, and emitting a metric when it becomes ready for the first time. It doesn't need any special knowledge that the kubelet has, right?

Would something like kube-state-metrics be more suitable for this?
https://kubernetes.io/docs/concepts/cluster-administration/kube-state-metrics/

@dashpole (Contributor)

/assign @JeffLuoo
/assign
/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 30, 2024