feat(provider): add External Metrics provider #1863
base: main
Force-pushed from 85c595d to 139a34a
Force-pushed from 3757b5a to 72ad54a
Force-pushed from 72ad54a to 9925699
The Datadog provider often hits API rate limits on larger deployments. The Datadog Cluster Agent can batch metric queries and expose them through an endpoint compatible with the Kubernetes External Metrics API. This implementation makes it possible to use that endpoint, or any other server implementing the Kubernetes External Metrics API, including the k8s API server itself.

Co-authored-by: Johan Lore <johan.lore@decathlon.com>
Co-authored-by: Maxime Véroone <maxime.veroone@decathlon.com>
Signed-off-by: Johan Lore <johan.lore@decathlon.com>
Force-pushed from 9925699 to 2ee47e0
Force-pushed from 31f6e7d to eeeccfc
Signed-off-by: Maxime Véroone <maxime.veroone@decathlon.com>
Force-pushed from eeeccfc to 86cc361
aryan9600
left a comment
thank you for opening this PR!
	applicationBearerToken = "token"
)

// ExternalMetricsProvider executes datadog queries
- // ExternalMetricsProvider executes datadog queries
+ // ExternalMetricsProvider fetches metrics from an ExternalMetricsProvider.
	bearerToken string

	timeout time.Duration
	client  *http.Client
Can we use an ExternalMetricsClient object to fetch the external_metrics.ExternalMetricValueList? We can create one using the NewForConfig function in this package. It takes care of loading the service account token automatically and provides a nice interface to fetch the metrics.
Proposed addition
The current Datadog metric provider relies on Datadog's Metrics API.
However, this API has fairly low rate limits, and people with a moderately sized infrastructure tend to reach them quite easily when scaling their usage of Flagger or of Datadog-based autoscaling (like KEDA).
Datadog offers a more scalable alternative by having its Cluster Agent batch requests in groups of 35 (see Cluster Agent Autoscaling Metrics). The agent then makes these metrics available within the cluster by exposing an endpoint that follows the Kubernetes External Metrics API.
Note
This endpoint is not documented by Datadog, as they expect people to register the agent with the control plane as the cluster's external metrics provider, which makes these metrics available through the k8s API server and removes the need to query the endpoint directly.
However, because the endpoint implements a Kubernetes API, its behavior is predictable and stable enough to be used directly.
We relied on the way KEDA implemented a similar feature during design and implementation. However, Flagger is not an autoscaling solution, so we're not going to mimic the metric proxy KEDA operates. We simply propose to query the external metrics server directly. In doing so, we also chose to make the provider generic and compatible with any external metrics server. The downside is that we cannot abstract away the way Datadog names its metrics, which isn't trivial.
fix: #1235
Any alternatives you've considered?
We considered modifying the Datadog metric provider instead of adding an external metrics provider, but we felt that a separate provider had the benefit of supporting other external metrics servers and keeping the code Datadog-agnostic.
We could theoretically make it even more generic and support any Kubernetes metrics API (standard, Custom or External), but I think Flagger already offers this.
Disclaimer