kube_cronjob_status_active doesn't tell if a cronjob is running #2429

Open
koote opened this issue Jun 20, 2024 · 4 comments
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@koote

koote commented Jun 20, 2024

What happened:
I am using kube_cronjob_status_active to monitor my cronjob, which runs every 5 minutes.
In the given time window, kube_cronjob_status_active{cronjob="my-job"} shows that there is no active job for the cronjob:
[screenshot: kube_cronjob_status_active for my-job, no active jobs]
However, if I check kube_job_status_start_time{job_name=~"my-job.*"} within the same time window (my-job is a cronjob, so every job instance has a name like my-job-12345678), it shows that jobs are started approximately every 5 minutes:
[screenshot: kube_job_status_start_time for my-job-*, jobs starting about every 5 minutes]

What you expected to happen:
I would expect kube_cronjob_status_active to match kube_job_status_start_time, i.e. to be non-zero while those jobs are running.

How to reproduce it (as minimally and precisely as possible):
It is running in our internal k8s cluster, so I don't know how to let others reproduce it, but I am happy to provide as much information as needed.

# An example: https://github.com/kubernetes/kube-state-metrics/issues/2223#issuecomment-1792850276
minikube start
...
go run main.go --custom-resource-state-only --custom-resource-state-config-file ksm-2223/custom-resource-config-file.yaml --kubeconfig ~/.kube/config

Anything else we need to know?:

Environment: AWS EKS

  • kube-state-metrics version:
  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration: AWS EKS
  • Other info:
@koote koote added the kind/bug Categorizes issue or PR as related to a bug. label Jun 20, 2024
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jun 20, 2024
@dgrisonnet
Member

/assign @rexagod
/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jun 27, 2024
@Haleygo

Haleygo commented Oct 29, 2024

Hi @koote ,
How long does your job usually take to complete? You can find out by querying kube_job_status_completion_time{job_name=~"my-job.*"} - kube_job_status_start_time{job_name=~"my-job.*"}.
If the duration is very short (e.g., several seconds) and your scrape interval is relatively long (30s or 1m), the scrapes may miss the window in which kube_cronjob_status_active is non-zero, so you only get zero values in the database, because kube_cronjob_status_active reflects the jobs running at scrape time.
To validate this, try extending your job's execution time with a deliberate sleep command, or reduce the scrape interval for the KSM job.
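
The sampling effect described above can be illustrated with a small simulation. This is only a sketch under idealized assumptions: jobs start exactly on schedule, scrapes are exactly periodic, and the gauge is non-zero for the whole job duration. The function name `observed_runs` and its parameters are hypothetical and not part of kube-state-metrics or Prometheus.

```python
def observed_runs(period_s, duration_s, scrape_interval_s, scrape_offset_s, runs):
    """Count job runs that at least one scrape sees as active.

    Assumptions (idealized, for illustration only):
    - run k is active during [k * period_s, k * period_s + duration_s)
    - scrapes happen at scrape_offset_s + n * scrape_interval_s, n = 0, 1, 2, ...
    """
    seen = 0
    for k in range(runs):
        start = k * period_s
        end = start + duration_s
        # Index of the first scrape at or after the job start (ceiling division).
        n = -(-(start - scrape_offset_s) // scrape_interval_s)
        n = max(n, 0)
        first_scrape = scrape_offset_s + n * scrape_interval_s
        # The run is observed only if that scrape lands before the job ends.
        if first_scrape < end:
            seen += 1
    return seen


if __name__ == "__main__":
    # A cronjob every 5 minutes, observed over one hour (12 runs).
    for duration, interval, offset in [(5, 60, 30), (40, 30, 10), (40, 60, 45), (40, 60, 20)]:
        count = observed_runs(300, duration, interval, offset, runs=12)
        print(f"duration={duration}s interval={interval}s offset={offset}s -> {count}/12 runs observed")
```

In this model, a run is guaranteed to be observed only when its duration is at least one scrape interval. When the duration is shorter than the interval, whether a run is seen depends on how the scrape times line up with the job start times.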

@koote
Author

koote commented Oct 29, 2024

[screenshot]
Thanks @Haleygo. I checked the last 3 days: the job usually runs for 40-80 seconds, so that is not the issue.

@Haleygo

Haleygo commented Oct 30, 2024

@koote Hmm, that's odd. It works in my test with a cronjob that runs every 5m and completes in 1m.
[screenshot]

What's your scrape interval for the KSM job, then? Can you get non-zero values for kube_cronjob_status_active by calling {ksm-address}/metrics directly?
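
One way to check the raw endpoint output is to filter the text exposition for kube_cronjob_status_active samples. The sketch below parses a hard-coded sample rather than a live endpoint; the `SAMPLE` text, its values, and the helper name `active_cronjobs` are hypothetical, and the parser only handles the simple `name{labels} value` line form.

```python
import re

# Hypothetical excerpt of a /metrics response, for illustration only.
SAMPLE = """\
# TYPE kube_cronjob_status_active gauge
kube_cronjob_status_active{namespace="default",cronjob="my-job"} 1
kube_cronjob_status_active{namespace="default",cronjob="other-job"} 0
kube_job_status_start_time{namespace="default",job_name="my-job-12345678"} 1.7187e+09
"""

# metric name, optional {label set}, then the sample value
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
# label="value" pairs, allowing backslash-escaped characters inside the value
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def active_cronjobs(text, metric="kube_cronjob_status_active"):
    """Return {(namespace, cronjob): value} for every sample of `metric`."""
    result = {}
    for line in text.splitlines():
        # Skip blank lines and HELP/TYPE comments.
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if not match or match.group("name") != metric:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        key = (labels.get("namespace"), labels.get("cronjob"))
        result[key] = float(match.group("value"))
    return result


if __name__ == "__main__":
    for (namespace, cronjob), value in active_cronjobs(SAMPLE).items():
        print(f"{namespace}/{cronjob}: active={value}")
```

To run this against a real instance, the response body of {ksm-address}/metrics could be fetched (for example with `urllib.request.urlopen`) and passed in place of `SAMPLE`. Polling it while a job is known to be running would show whether the endpoint itself ever reports a non-zero value.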
