Event Hub - incorrect metrics values #5784

Duri9292 · 2024-05-06T08:39:09Z

Report

I'm getting incorrect values from the external metric. Sometimes the external metrics provide the correct values but most of the time the current values are way off.

Metrics examples:
Here you can see that averageValue is 1040334m which does not make sense and it will trigger the maximum possible scaling.

currentMetrics:
  - external:
      current:
        averageValue: 1040334m
      metric:
        name: s0-azure-eventhub-onb
        selector:
          matchLabels:
            scaledobject.keda.sh/name: event-hub-scaler

From time to time the averageValue is more accurate and it looks more realistic.

currentMetrics:
  - external:
      current:
        averageValue: "814"
      metric:
        name: s0-azure-eventhub-onb
        selector:
          matchLabels:
            scaledobject.keda.sh/name: event-hub-scaler

Here are the incoming-message metrics directly from Azure. As you can see, we usually have an average of 100 incoming messages per minute.

Expected Behavior

The average values should be more consistent and reflect the real values.

Actual Behavior

The current values jump between 600 and 580334m, while the real average of incoming messages is usually around 100. We process approximately 22,000 messages per day, so an average value like 580334m does not make any sense.

Steps to Reproduce the Problem

Configure the azure-eventhub trigger
Monitor the HPA average values

Logs from KEDA operator

2024-05-04T18:45:09Z	INFO	Reconciling ScaledObject	{"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"event-hub-scaler","namespace":"my-app-namespace"}, "namespace": "my-app-namespace", "name": "event-hub-scaler", "reconcileID": "eb42d4de-1c9f-4dce-b243-32901de7ce0e"}
2024-05-04T18:45:24Z	INFO	Reconciling ScaledObject	{"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"event-hub-scaler","namespace":"my-app-namespace"}, "namespace": "my-app-namespace", "name": "event-hub-scaler", "reconcileID": "f30d4ff1-8640-4cf8-8bc1-414fe92bd72c"}
2024-05-04T18:45:40Z	INFO	Reconciling ScaledObject	{"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"event-hub-scaler","namespace":"my-app-namespace"}, "namespace": "my-app-namespace", "name": "event-hub-scaler", "reconcileID": "5e95c24b-f0d5-4a0e-bf19-b437ea3b6d71"}
2024-05-04T18:45:55Z	INFO	Reconciling ScaledObject	{"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"event-hub-scaler","namespace":"my-app-namespace"}, "namespace": "my-app-namespace", "name": "event-hub-scaler", "reconcileID": "3b326319-9811-441b-899c-c4c712d4451c"}
2024-05-04T18:50:44Z	INFO	Reconciling ScaledObject	{"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"event-hub-scaler","namespace":"my-app-namespace"}, "namespace": "my-app-namespace", "name": "event-hub-scaler", "reconcileID": "9fd2fdf5-3de0-4f12-ba91-9733a41a2670"}
2024-05-04T18:51:00Z	INFO	Reconciling ScaledObject	{"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"event-hub-scaler","namespace":"my-app-namespace"}, "namespace": "my-app-namespace", "name": "event-hub-scaler", "reconcileID": "a6bf465f-fe21-472c-9059-bc16ddf56617"}
2024-05-04T18:53:51Z	INFO	Reconciling ScaledObject	{"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"event-hub-scaler","namespace":"my-app-namespace"}, "namespace": "my-app-namespace", "name": "event-hub-scaler", "reconcileID": "ec65a58d-8f68-418b-bca4-ee34a3a3f952"}
2024-05-04T18:55:08Z	INFO	Reconciling ScaledObject	{"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"event-hub-scaler","namespace":"my-app-namespace"}, "namespace": "my-app-namespace", "name": "event-hub-scaler", "reconcileID": "35cb23fa-6803-459a-a459-dd09beafb8b1"}
2024-05-04T18:55:24Z	INFO	Reconciling ScaledObject	{"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"event-hub-scaler","namespace":"my-app-namespace"}, "namespace": "my-app-namespace", "name": "event-hub-scaler", "reconcileID": "98d6080b-9c5e-4f14-a4b0-883f7325648c"}
2024-05-04T18:56:26Z	INFO	Reconciling ScaledObject	{"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"event-hub-scaler","namespace":"my-app-namespace"}, "namespace": "my-app-namespace", "name": "event-hub-scaler", "reconcileID": "43613e30-d318-410e-8633-d46cec379c31"}
2024-05-04T18:56:42Z	INFO	Reconciling ScaledObject	{"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"event-hub-scaler","namespace":"my-app-namespace"}, "namespace": "my-app-namespace", "name": "event-hub-scaler", "reconcileID": "f0ffdb33-b0a5-4bd1-82d7-1238e817f960"}
2024-05-04T18:56:57Z	INFO	Reconciling ScaledObject	{"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"event-hub-scaler","namespace":"my-app-namespace"}, "namespace": "my-app-namespace", "name": "event-hub-scaler", "reconcileID": "45100633-32b2-42fb-bd93-0831d15b4ac9"}
2024-05-04T18:57:13Z	INFO	Reconciling ScaledObject	{"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"event-hub-scaler","namespace":"my-app-namespace"}, "namespace": "my-app-namespace", "name": "event-hub-scaler", "reconcileID": "dec7edf4-7580-4c96-91ed-8c6766ea98c3"}
2024-05-04T18:57:28Z	INFO	Reconciling ScaledObject	{"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"event-hub-scaler","namespace":"my-app-namespace"}, "namespace": "my-app-namespace", "name": "event-hub-scaler", "reconcileID": "6eec8aec-30e1-44e9-8aec-fca3dc955665"}

KEDA Version

2.11.2

Kubernetes Version

1.28

Platform

Microsoft Azure

Scaler Details

azure-eventhub

Anything else?

No response


JorTurFer · 2024-05-06T13:06:33Z

Hello,
What is the problem exactly? The value using m (1040334m)?
In the K8s context, 'm' means milli, and it's used when the value is a float number because k8s doesn't use float numbers. When you see 1040334m, it means 1040.334. In the same way, jumps between 600 and 580334m are quite normal because the value is moving between 600 and 580.334.
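
As an illustration of the milli suffix described above, here is a minimal Python sketch that converts the quantity strings seen in this thread into plain numbers. The helper name `parse_quantity` is hypothetical, and it handles only the `m` suffix and suffix-free values, not the full Kubernetes quantity grammar.

```python
from decimal import Decimal


def parse_quantity(text: str) -> Decimal:
    """Convert a quantity such as '1040334m' or '814' to a plain number.

    Simplified on purpose: only the milli suffix 'm' and values without
    a suffix are supported (an assumption made for this example).
    """
    text = text.strip().strip('"')
    if text.endswith("m"):
        # 'm' means thousandths, so divide the integer part by 1000.
        return Decimal(text[:-1]) / Decimal(1000)
    return Decimal(text)


for raw in ("1040334m", "580334m", "814"):
    print(raw, "->", parse_quantity(raw))
```

Read this way, 1040334m and 814 are on the same scale: roughly 1040 and 814.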

Duri9292 · 2024-05-06T14:05:23Z

Hello @JorTurFer, thank you for your quick response. The issue is that once the number is in milli scale, the HPA always scales to the maximum possible replica count. When the average value is a non-float number, the scaler decreases the replicas or scales as expected.

e.g. current average number: 1741
replica:1

current average number: 580334m
replica:3
I configured the trigger value to 5000 and the scaler is always active, which should not be the case here.
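
For context, the numbers above can be run through the replica formula from the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). The Python sketch below is a simplification under stated assumptions: it applies the default 10% tolerance and ignores min/max replica bounds and stabilization windows, and the function name `desired_replicas` is hypothetical.

```python
import math


def desired_replicas(current_replicas: int,
                     current_average: float,
                     target_average: float,
                     tolerance: float = 0.1) -> int:
    """Simplified HPA rule: ceil(currentReplicas * current / target).

    Assumptions: default tolerance of 0.1; no min/max bounds and no
    scale-down stabilization are modelled.
    """
    ratio = current_average / target_average
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        return current_replicas
    return max(1, math.ceil(current_replicas * ratio))


# Values reported in this thread, with the trigger value of 5000.
print(desired_replicas(1, 1741.0, 5000.0))
print(desired_replicas(3, 580.334, 5000.0))
```

Under these assumptions, both reported values sit well below the target of 5000, so this formula alone would not ask for more replicas in either case.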

JorTurFer · 2024-05-06T14:26:28Z

Are you scraping the Prometheus metrics generated by KEDA? I'm almost sure that you have a peak which justifies the scaling out since, as you said, you're under the threshold. The only other explanation for that behaviour without a peak is that you have changed the target value and the HPA controller is still within the scaling cooldown (300 seconds after the last scale-out).

Duri9292 · 2024-05-06T14:46:19Z

The thing is that we turned off Event Hub data ingestion for the last 24 hrs, which means that we are getting 0 incoming messages (we wanted to test scaling to 0). So there are no peaks. Even a value like 1741 does not make much sense, but if it is calculating the average value for the last few days, it could be relevant. I will monitor the behavior once we enable data ingestion again.

Below is a graph for incoming messages to Event Hub (past 48hrs)
Data granularity: 5 minutes

JorTurFer · 2024-05-07T20:36:13Z

No no, it doesn't use the average value at all. KEDA uses the current value, so if it's 0 in the eventhub and you don't see 0 in KEDA, it can be a misconfiguration or a bug. Do you see any value different from 0? You can manually query the metric value and check what KEDA returns: https://keda.sh/docs/2.14/operate/metrics-server/#querying-metrics-exposed-by-keda-metrics-server
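
As a sketch of the manual query mentioned above, the Python snippet below builds the raw API path for the external metric and prints a `kubectl get --raw` command for it. The path layout follows the pattern shown in the linked KEDA documentation as far as I know, so please verify it against the docs for your version. The helper name `external_metric_path` is hypothetical, and the namespace, metric name, and ScaledObject name are taken from earlier in this thread.

```python
from urllib.parse import quote


def external_metric_path(namespace: str, metric_name: str,
                         scaled_object: str) -> str:
    """Build the external metrics API path for one KEDA metric.

    Assumption: the metric is selected with the
    'scaledobject.keda.sh/name' label, as in the HPA status above.
    """
    # Percent-encode '/' and '=' so the selector is safe in a query string.
    selector = quote(f"scaledobject.keda.sh/name={scaled_object}", safe="")
    return (
        "/apis/external.metrics.k8s.io/v1beta1"
        f"/namespaces/{namespace}/{metric_name}"
        f"?labelSelector={selector}"
    )


path = external_metric_path(
    "my-app-namespace", "s0-azure-eventhub-onb", "event-hub-scaler"
)
print(f'kubectl get --raw "{path}"')
```

The value in the response is the raw metric before the HPA divides it by the replica count.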

Duri9292 · 2024-05-14T06:50:37Z

Hello @JorTurFer, to answer your question "Do you see any value different from 0?": yes, even when the Event Hub was turned off, the HPA always had some number in the metrics.

We enabled the event hub again and for some reason, we stopped getting float values, and scaling is working as expected. Or at least I did not catch any float number during my observation since there is no history of this value I cannot confirm. But it seems that once the float values stopped occurring the scaling is ok.

> No no, it doesn't use the average value at all.

The documentation mentions that these are average values, and we are using the default. If that is not true, then sorry, I must have missed it.

However, the values from the HPA metrics still do not match the values from the Event Hub metrics.

Event Hub (sum) for the past 30 min

Event Hub (avg) for past 24 hrs

JorTurFer · 2024-05-26T14:17:59Z

> The documentation mentions that these are average values, and we are using the default. If that is not true, then sorry, I must have missed it.

My bad, I understood you to mean that KEDA retrieves the average value from the Event Hub. You are right: the k8s workload will be scaled based on the average value calculated from the instant Event Hub value.

JorTurFer · 2024-05-26T14:19:23Z

> We enabled the event hub again and for some reason, we stopped getting float values, and scaling is working as expected. Or at least I did not catch any float number during my observation since there is no history of this value I cannot confirm. But it seems that once the float values stopped occurring the scaling is ok.

Float values are correct and they can happen: if the Event Hub returns 7 and you have 4 pods, you'll get a float value as the average (7 / 4 = 1.75, shown as 1750m).
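
To make that arithmetic concrete, here is a small Python sketch that divides a total metric value by the replica count and prints the result the way it appears in the HPA status. The helper name `average_as_quantity` is hypothetical, and rounding up at milli precision is an assumption about how the HPA reports the average.

```python
import math
from fractions import Fraction


def average_as_quantity(total: int, replicas: int) -> str:
    """Return total / replicas formatted like an HPA averageValue.

    Assumption: the average is rounded up to whole milli units.
    """
    milli = math.ceil(Fraction(total * 1000, replicas))
    if milli % 1000 == 0:
        # Whole numbers are shown without the 'm' suffix.
        return str(milli // 1000)
    return f"{milli}m"


print(average_as_quantity(7, 4))      # 7 events over 4 pods
print(average_as_quantity(1741, 1))   # value reported with 1 replica
print(average_as_quantity(1741, 3))   # same total spread over 3 replicas
```

With these assumptions, a total of 1741 gives 1741 on one replica and 580334m on three, which is consistent with the two values reported earlier in the thread.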

Duri9292 added the bug Something isn't working label May 6, 2024

Duri9292 changed the title ~~Event Hub - incorrect metrics al~~ Event Hub - incorrect metrics values May 6, 2024
