-
Notifications
You must be signed in to change notification settings - Fork 486
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(collector): actually enable the deployment of PrometheusRule when using PodMonitor #1416
base: main
Are you sure you want to change the base?
Changes from all commits
d66946d
a73d0f8
d2d253a
ba5fd56
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -1,4 +1,4 @@ | ||||||
{{- if and .Values.prometheusRule.enabled .Values.serviceMonitor.enabled }} | ||||||
{{- if and .Values.prometheusRule.enabled (or (eq (include "opentelemetry-collector.serviceEnabled" .) "true") (or .Values.serviceMonitor.enabled .Values.podMonitor.enabled)) }} | ||||||
apiVersion: monitoring.coreos.com/v1 | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
@crutonjohn this sentence reads as if the condition should be
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. in hindsight i probably shouldn't have tried to gotmpl that late in the day 😅 i think i mostly have it figured out:
Suggested change
however, with this logic a user could use these values: mode: "deployment"
image:
repository: "otel/opentelemetry-collector-k8s"
prometheusRule:
enabled: true
ports:
metrics:
enabled: true
podMonitor:
enabled: false
serviceMonitor:
enabled: false
service:
enabled: false and believe they are going to get a PrometheusRule, but instead get no PrometheusRule/ServiceMonitor/PodMonitor because neither service nor pod monitors are enabled. would it be considered acceptable UX to prevent the user from getting a PrometheusRule CR when they explicitly say they want one? or should we remain somewhat opinionated in that you should only be deploying a PrometheusRule CR when you have a Pod/ServiceMonitor getting deployed in tandem with said PrometheusRule? happy to accommodate either way, i just don't know what's best for the project. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hmm thats a good point. Is there any harm in creating a prometheusRule even if pod/service monitor don't exist? What will happen if something tries to use the Prometheus rule but the monitors dont exist? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. i'd have to do some verification, but iirc a PrometheusRule is just what it sounds like. it would effectively create the rule(s) contained within the file. there is some nuance with how prometheus operator is deployed that can affect how the PrometheusRule can "land" on the cluster...for example if you are running multiple prometheus instances in the same cluster scraping different targets. the principle is still the same. the rules in the collector chart are generic enough (read: have low granularity due to promql label selector omission) that there could be a scenario wherein a metric like otelcol_exporter_send_failed_log_records exists in the TSDB from a source that isn't our deployed collector (or the user has this chart deployed multiple times for different situations), users could hypothetically be getting alerts for something unrelated to our collector deployment. this would be kind of a "perfect storm" type scenario to be fair, but definitely possible. i've seen some pretty funky prometheus operator setups before. i think the crux of the issue is that there are tons of potential deployment scenarios that we couldn't possibly account for every one with the logic within the chart. part of the responsibility is on the end user being familiar with what the chart's behavior enough to know "hey i don't need to enable the PrometheusRule if i'm not going to be using the Service/PodMonitor". could add some docs somewhere, idk where the best place would be tho. sorry, not trying to be a headache. |
||||||
kind: PrometheusRule | ||||||
metadata: | ||||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
run
make generate-examples CHARTS=opentelemetry-collector
to generate the examples.