What is CronJob Guardian?
CronJob Guardian is a Kubernetes operator that monitors CronJobs with SLA tracking, intelligent alerting, and a built-in dashboard. It ensures your critical scheduled jobs run successfully and alerts you when something goes wrong.
The Problem
CronJobs power critical operations like backups, ETL pipelines, and reports, but Kubernetes provides no built-in monitoring for them. When jobs fail silently or stop running, you only find out when it’s too late. Common issues that go undetected:
- Silent failures: Jobs fail but no one knows until data is missing
- Jobs stop running: Schedule issues or resource constraints prevent execution
- Performance degradation: Jobs slow down gradually over time
- Resource leaks: Failed jobs consume cluster resources
How Guardian Helps
CronJob Guardian watches your CronJobs and alerts you when something goes wrong, with rich context to help you diagnose and fix issues quickly.
Key Features
Dead-Man's Switch
Alert when CronJobs don’t run within expected windows. Automatically calculates thresholds from cron schedules or set custom intervals.
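The idea behind a dead-man's switch can be sketched in a few lines. This is a minimal illustration, not Guardian's implementation: the function name `is_overdue`, the `grace` buffer, and the sample timestamps are all assumptions made for the example, and the expected interval is passed in directly rather than derived from a cron expression.

```python
from datetime import datetime, timedelta, timezone


def is_overdue(last_success: datetime, expected_interval: timedelta,
               grace: timedelta, now: datetime) -> bool:
    """Return True when no successful run landed inside the allowed window."""
    # The window closes one interval after the last success, plus a buffer.
    deadline = last_success + expected_interval + grace
    return now > deadline


# Hypothetical nightly job: expected every 24 hours, with a 1 hour grace buffer.
last_ok = datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)
interval = timedelta(hours=24)
grace = timedelta(hours=1)

# 30 minutes late is still inside the grace buffer.
on_time = is_overdue(last_ok, interval, grace,
                     datetime(2024, 1, 2, 2, 30, tzinfo=timezone.utc))
# 2 hours late is past the deadline, so an alert would fire.
missed = is_overdue(last_ok, interval, grace,
                    datetime(2024, 1, 2, 4, 0, tzinfo=timezone.utc))
```

The key property is that the check fires on the absence of a run, so it catches jobs that never start as well as jobs that fail.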
SLA Tracking
Monitor success rates, duration percentiles (P50/P95/P99), and detect regressions. Set minimum success rates and maximum duration thresholds.
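To make the SLA terms concrete, here is a rough sketch of how a success rate and duration percentiles could be computed from execution history. It assumes the nearest-rank percentile definition, and the helper names, sample data, and thresholds are hypothetical; Guardian's own calculation may differ.

```python
import math


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def success_rate(outcomes: list[bool]) -> float:
    """Percentage of runs that succeeded."""
    if not outcomes:
        raise ValueError("no runs")
    return 100.0 * sum(outcomes) / len(outcomes)


# Hypothetical history: ten runs, one failure, durations in seconds.
outcomes = [True] * 9 + [False]
durations = [30, 32, 31, 29, 35, 33, 30, 34, 90, 31]

rate = success_rate(outcomes)
p50 = percentile(durations, 50)
p95 = percentile(durations, 95)

# Hypothetical thresholds: minimum success rate and maximum P95 duration.
MIN_SUCCESS_RATE = 95.0
MAX_DURATION_SECONDS = 60
sla_breached = rate < MIN_SUCCESS_RATE or p95 > MAX_DURATION_SECONDS
```

In this sample the median stays low while a single slow run pushes P95 past the threshold, which is why tail percentiles are tracked alongside the median.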
Intelligent Alerts
Get rich context with pod logs, Kubernetes events, and suggested fixes. Alerts include everything you need to diagnose the issue.
Multiple Channels
Send alerts to Slack, PagerDuty, webhooks, or email. Route different severities to different channels.
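Severity-based routing amounts to a lookup from severity to a list of channels. The sketch below is illustrative only: the severity names, channel names, and fallback are assumptions, not Guardian's configuration schema.

```python
# Hypothetical routing table: severity -> channel names.
ROUTES = {
    "critical": ["pagerduty", "slack"],
    "warning": ["slack"],
    "info": ["email"],
}

# Assumed fallback for severities that have no explicit route.
DEFAULT_CHANNELS = ["slack"]


def channels_for(severity: str) -> list[str]:
    """Return the channels that should receive an alert of this severity."""
    return ROUTES.get(severity, DEFAULT_CHANNELS)


critical_targets = channels_for("critical")
unknown_targets = channels_for("debug")
```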
Built-in Dashboard
Feature-rich web UI with charts, heatmaps, execution history, and CSV exports. No external tools required.
Prometheus Metrics
Export Prometheus metrics that integrate with your existing observability stack.
Architecture
CronJob Guardian runs as a single operator pod in your cluster with three main components:
- Operator: Watches CronJobs and Jobs, tracks execution history, calculates SLA metrics
- Storage: SQLite (default), PostgreSQL, or MySQL for execution history and metrics
- Dashboard: Embedded web UI for viewing metrics and execution history
Two custom resources, CronJobMonitor and AlertChannel, define what to monitor and where to alert.
Who Should Use This?
CronJob Guardian is ideal for teams that:
- Run critical scheduled jobs (backups, ETL, reports)
- Need to maintain SLA commitments
- Want to catch failures before customers notice
- Need visibility into job performance and trends
- Want centralized monitoring across multiple namespaces or clusters
Example Use Cases
Database Backups
Ensure nightly backups run successfully with 100% success rate monitoring.
ETL Pipelines
Monitor data pipelines with duration regression detection.
Financial Reports
Quiet alerts during planned maintenance.
What’s Next?
Quickstart
Get CronJob Guardian running in 5 minutes
Installation
Detailed installation guide with all configuration options
Core Concepts
Learn about monitors, alert channels, and SLA tracking
Examples
Real-world configuration examples