Improve alert aggregations for multiple clusters #633
ff2b9c3 to 9f786b1
Hi, is there anything else I can do to move this along? I just found #524, which I missed when making this PR. It largely does the same thing and has been hanging around since last year.
@@ -32,11 +32,11 @@
// label exists for 2 values. This avoids "many-to-many matching
// not allowed" errors when joining with kube_pod_status_phase.
expr: |||
sum by (namespace, pod) (
max by(namespace, pod) (
sum by (namespace, pod, %(clusterGroupLabelsStr)s) (
Could this be using `clusterLabel`? I don't see why we would need another variable for this.
Same question for all cases with `clusterGroupLabelsStr`.
The default value for `clusterGroupLabels` is `clusterLabel`, so in the simple case they are the same thing.
But `clusterLabel` is used in other places (e.g. in dashboards as a selector like `%(clusterLabel)s="$cluster"`) and so can only be a single label.
Using a different variable for grouping labels gives more flexibility and allows use cases like the one I described in the PR, where I ideally want to group by `$clusterLabel, dc` and not just the `clusterLabel`.
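To make the distinction concrete, here is a minimal sketch of the two uses side by side. It assumes `clusterGroupLabels` is a list and `clusterGroupLabelsStr` is its comma-joined form; the `config` object, the `selector` and `grouping` field names, and the elided `(...)` query body are illustrative only and not taken from the PR.

```jsonnet
// Hypothetical stand-in for the mixin config; only the three field names
// below come from the discussion, their definitions are assumptions.
local config = {
  // A single label name, because it is spliced into equality selectors.
  clusterLabel: 'cluster',
  // A list of labels to keep when aggregating; here extended with `dc`.
  clusterGroupLabels: [self.clusterLabel, 'dc'],
  // Comma-separated form for use inside PromQL `by (...)` clauses.
  clusterGroupLabelsStr: std.join(', ', self.clusterGroupLabels),
};

{
  // Dashboard-style selector: needs exactly one label name.
  selector: '%(clusterLabel)s="$cluster"' % config,
  // Alert-style aggregation: can carry any number of grouping labels.
  grouping: 'sum by (namespace, pod, %(clusterGroupLabelsStr)s) (...)' % config,
}
```

Under these assumptions `selector` renders as `cluster="$cluster"` while `grouping` renders as `sum by (namespace, pod, cluster, dc) (...)`, which is why a single variable cannot serve both purposes.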
9f786b1 to 597ee3d
fa29ff2 to 5d51afc
This PR has been automatically marked as stale because it has not […] The next time this stale check runs, the stale label will be […] Thank you for your contributions!
Many of the alerts aggregate on metrics and throw away the cluster label.
This isn't the end of the world, since pod names are probably going to be unique, but if you have many clusters, figuring out which one `kube-proxy-4jlm2` is running on is a pain.
Because these labels are currently hardcoded into strings, it's also difficult to override them in your own jsonnet.
This PR adds extra labels to aggregations. They are added only if multi-cluster is enabled, and the set of labels is extensible.
For example, in my environment I also inject a `dc` label with the AWS region for all metrics, as well as the `cluster` label.
So in my config I can override the grouping labels and include that `dc` label in aggregations.
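A rough sketch of what such an override might look like, assuming the mixin keeps `clusterGroupLabels` as a list under a hidden `_config` field and derives `clusterGroupLabelsStr` from it. The `mixin` object below is a hypothetical stand-in rather than the PR's actual code, and the `(...)` query body is elided.

```jsonnet
// Hypothetical stand-in for the upstream mixin; the real one defines far more.
local mixin = {
  _config:: {
    clusterLabel: 'cluster',
    // Assumed default: group only by the cluster label.
    clusterGroupLabels: [self.clusterLabel],
    clusterGroupLabelsStr: std.join(', ', self.clusterGroupLabels),
  },
  // `$` is late-bound, so overrides of _config are picked up here.
  expr: 'sum by (namespace, pod, %(clusterGroupLabelsStr)s) (...)' % $._config,
};

// User-side override: append `dc` to the grouping labels.
mixin {
  _config+:: {
    clusterGroupLabels+: ['dc'],
  },
}
```

With this stand-in, the rendered `expr` becomes `sum by (namespace, pod, cluster, dc) (...)`. Appending with `+:` rather than replacing the list keeps whatever default the mixin ships.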