Maintenance windows allow you to suppress alerts during planned maintenance, deployments, or testing periods. This prevents alert fatigue and false alarms when you expect CronJobs to fail or miss schedules.
How Maintenance Windows Work
Maintenance windows are defined using cron expressions and durations. During a window:
Alerts are suppressed: No notifications are sent to alert channels.
Monitoring continues: Guardian still tracks executions and metrics.
Status is visible: The dashboard shows when a window is active.
Basic Maintenance Window
Define a weekly maintenance window every Sunday at 2 AM for 4 hours:
apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: my-monitor
  namespace: production
spec:
  maintenanceWindows:
    - name: weekly-maintenance
      schedule: "0 2 * * 0"  # Every Sunday at 2 AM
      duration: 4h
      timezone: America/New_York
      suppressAlerts: true
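The schedule field uses standard cron syntax and is evaluated in the zone given by the timezone field (default: UTC). The duration field sets how long the window stays open after each scheduled start.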
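To make the relationship between schedule and duration concrete, the following Python sketch treats a window as the interval that opens at a scheduled start and lasts for the configured duration. It is an illustration only, not Guardian's implementation: parse_duration and window_is_active are hypothetical helper names, only the h and m suffixes are handled, and the half-open interval (the end instant itself falls outside the window) is an assumption.

```python
from datetime import datetime, timedelta, timezone


def parse_duration(text: str) -> timedelta:
    """Parse a simple duration such as '4h' or '30m' (hypothetical helper)."""
    units = {"h": "hours", "m": "minutes"}
    return timedelta(**{units[text[-1]]: int(text[:-1])})


def window_is_active(start: datetime, duration: str, now: datetime) -> bool:
    """Return True if `now` lies in [start, start + duration).

    Assumption: the window is half-open, so the end instant is excluded.
    """
    return start <= now < start + parse_duration(duration)


# A window that opens on Sunday 2024-01-21 at 02:00 UTC and lasts 4h.
start = datetime(2024, 1, 21, 2, 0, tzinfo=timezone.utc)
print(window_is_active(start, "4h", start + timedelta(hours=3)))  # True
print(window_is_active(start, "4h", start + timedelta(hours=4)))  # False
```

Keeping the check as a pure function of start, duration, and the current time makes it easy to test a planned window before you rely on it.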
Multiple Maintenance Windows
You can define multiple windows for different maintenance scenarios:
apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: production-monitor
  namespace: production
spec:
  maintenanceWindows:
    # Weekly database maintenance
    - name: database-maintenance
      schedule: "0 2 * * 0"  # Sunday 2 AM
      duration: 4h
      timezone: UTC
      suppressAlerts: true
    # Monthly platform updates
    - name: platform-updates
      schedule: "0 0 1 * *"  # First day of month, midnight
      duration: 6h
      timezone: UTC
      suppressAlerts: true
    # Daily backup window
    - name: backup-window
      schedule: "0 3 * * *"  # Every day at 3 AM
      duration: 1h
      timezone: America/Los_Angeles
      suppressAlerts: true
Timezone Support
Maintenance windows support timezone-aware scheduling:
maintenanceWindows:
  - name: us-east-maintenance
    schedule: "0 2 * * 0"
    duration: 4h
    timezone: America/New_York  # Eastern Time
  - name: europe-maintenance
    schedule: "0 2 * * 6"
    duration: 4h
    timezone: Europe/London  # British Time
  - name: asia-maintenance
    schedule: "0 2 * * 5"
    duration: 4h
    timezone: Asia/Tokyo  # Japan Time
Choose your timezone
Use IANA timezone names (e.g., America/New_York, Europe/London, Asia/Tokyo).
Define the schedule in local time
The cron expression is interpreted in the specified timezone, not UTC.
Verify the window times
Check the CronJobMonitor status to see when the next window is scheduled.
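Because the cron expression is interpreted in the window's own zone, the same local start time maps to different UTC instants across regions and across daylight saving changes. The sketch below uses Python's standard zoneinfo module to show this for the example zones. local_start_to_utc is a hypothetical helper, and the dates are arbitrary examples chosen to fall on the scheduled weekday; none of this is Guardian code.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def local_start_to_utc(local_start: datetime, tz_name: str) -> datetime:
    """Attach the IANA zone to a naive local start time and convert it to UTC."""
    return local_start.replace(tzinfo=ZoneInfo(tz_name)).astimezone(timezone.utc)


# "0 2 * * 0" in America/New_York: Sunday 02:00 local time.
winter = local_start_to_utc(datetime(2024, 1, 21, 2, 0), "America/New_York")
summer = local_start_to_utc(datetime(2024, 7, 21, 2, 0), "America/New_York")
print(winter.isoformat())  # 2024-01-21T07:00:00+00:00 (EST, UTC-5)
print(summer.isoformat())  # 2024-07-21T06:00:00+00:00 (EDT, UTC-4)

# "0 2 * * 5" in Asia/Tokyo: Friday 02:00 local is still Thursday in UTC.
tokyo = local_start_to_utc(datetime(2024, 1, 19, 2, 0), "Asia/Tokyo")
print(tokyo.isoformat())   # 2024-01-18T17:00:00+00:00 (JST, UTC+9)
```

If you compare a window against timestamps shown in UTC, expect the offset to shift by an hour when the zone enters or leaves daylight saving time.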
Cron Schedule Examples
Description                         Cron Expression
Every Sunday at 2 AM                0 2 * * 0
Every day at midnight               0 0 * * *
First day of month at 3 AM          0 3 1 * *
Every weekday at 1 AM               0 1 * * 1-5
Every Saturday at 4 AM              0 4 * * 6
Twice a week (Wed, Sun) at 2 AM     0 2 * * 0,3
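To check one of these expressions against a concrete date, a small matcher is enough. The Python sketch below handles only the syntax used in this guide (`*`, single values, comma lists, and ranges) and requires all five fields to match, which agrees with standard cron behavior as long as at most one of day-of-month and day-of-week is restricted. expand and matches are hypothetical names, and this is not the parser Guardian uses.

```python
from datetime import datetime

# (min, max) for minute, hour, day of month, month, day of week (0 = Sunday)
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def expand(field: str, lo: int, hi: int) -> set[int]:
    """Expand '*', 'a', 'a-b', or a comma list of those into a set of ints."""
    if field == "*":
        return set(range(lo, hi + 1))
    values: set[int] = set()
    for part in field.split(","):
        if "-" in part:
            first, last = part.split("-")
            values.update(range(int(first), int(last) + 1))
        else:
            values.add(int(part))
    return values


def matches(expr: str, when: datetime) -> bool:
    """Return True if `when` (already in the window's timezone) matches `expr`."""
    minute, hour, dom, month, dow = (
        expand(field, lo, hi)
        for field, (lo, hi) in zip(expr.split(), FIELD_RANGES)
    )
    cron_dow = (when.weekday() + 1) % 7  # Python: Monday=0; cron: Sunday=0
    return (when.minute in minute and when.hour in hour and when.day in dom
            and when.month in month and cron_dow in dow)


print(matches("0 2 * * 0,3", datetime(2024, 1, 24, 2, 0)))  # Wednesday: True
print(matches("0 1 * * 1-5", datetime(2024, 1, 21, 1, 0)))  # Sunday: False
```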
Conditional Alert Suppression
You can optionally disable alert suppression while keeping monitoring active:
maintenanceWindows:
  - name: observation-window
    schedule: "0 2 * * 0"
    duration: 4h
    suppressAlerts: false  # Monitor but still send alerts
With suppressAlerts: false, alerts are still sent during the window. This is useful for observation periods in which you want the window to be tracked and shown as active, but you are not making disruptive changes and still want to hear about failures.
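The effect of suppressAlerts can be summarized as a small decision function. This Python sketch assumes that a notification is held back only while a window is active and that window has suppressAlerts set to true (the documented default). should_notify is a hypothetical name, not part of Guardian.

```python
def should_notify(window_active: bool, suppress_alerts: bool = True) -> bool:
    """Return True if an alert should reach the alert channels.

    Assumption: suppression applies only while a window is active and only
    if that window has suppressAlerts set to true.
    """
    return not (window_active and suppress_alerts)


print(should_notify(window_active=True))                         # False
print(should_notify(window_active=True, suppress_alerts=False))  # True
print(should_notify(window_active=False))                        # True
```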
Real-World Examples
Weekly Database Maintenance
apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: database-backups
  namespace: databases
spec:
  selector:
    matchLabels:
      type: backup
  maintenanceWindows:
    - name: weekly-db-maintenance
      schedule: "0 2 * * 0"  # Sunday 2 AM
      duration: 4h
      timezone: America/New_York
      suppressAlerts: true
  deadManSwitch:
    enabled: true
    maxTimeSinceLastSuccess: 25h
  alerting:
    channelRefs:
      - name: pagerduty-dba
Deployment Windows
apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: api-cronjobs
  namespace: production
spec:
  maintenanceWindows:
    # Weekly deployment window
    - name: weekly-deploy
      schedule: "0 20 * * 2"  # Tuesday 8 PM
      duration: 2h
      timezone: UTC
      suppressAlerts: true
    # Emergency hotfix window
    - name: emergency-window
      schedule: "0 22 * * *"  # Daily at 10 PM
      duration: 30m
      timezone: UTC
      suppressAlerts: true
Multi-Region Maintenance
apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: global-monitor
  namespace: production
spec:
  selector:
    allNamespaces: true
  maintenanceWindows:
    # US region maintenance (Sunday 2 AM Eastern Time)
    - name: us-maintenance
      schedule: "0 2 * * 0"
      duration: 4h
      timezone: America/New_York
      suppressAlerts: true
    # EU region maintenance (Saturday 2 AM UK time)
    - name: eu-maintenance
      schedule: "0 2 * * 6"
      duration: 4h
      timezone: Europe/London
      suppressAlerts: true
    # APAC region maintenance (Friday 2 AM JST)
    - name: apac-maintenance
      schedule: "0 2 * * 5"
      duration: 4h
      timezone: Asia/Tokyo
      suppressAlerts: true
Checking Active Maintenance Windows
Via kubectl
kubectl describe cronjobmonitor my-monitor -n production
Look for the maintenance windows section in the status:
Status:
  Phase: Active
  Maintenance Windows:
    - Name: weekly-maintenance
      Next Start: 2024-01-21T02:00:00Z
      Duration: 4h
      Active: false
Via Dashboard
The web dashboard shows:
Active windows: Highlighted in yellow while a window is in progress
Next scheduled window: Countdown timer to the next window
Window history: Past windows and their durations
Via API
curl http://localhost:8080/api/v1/monitors/production/my-monitor
The response includes the maintenance window configuration and status.
Impact on Monitoring
What Happens During a Window
Alerts Suppressed: No notifications are sent to alert channels (Slack, PagerDuty, email).
Monitoring Active: Guardian continues tracking job executions and updating metrics.
Status Updated: The dashboard and API show that the window is active.
History Retained: All executions are recorded for post-maintenance analysis.
What Happens After a Window
Alert suppression ends: Normal alerting resumes immediately
Pending alerts sent: If issues persist after the window, alerts are sent
SLA calculations continue: Success rates and durations include executions from the maintenance period
Interaction with Other Features
Dead-Man’s Switch
Maintenance windows pause dead-man’s switch timers:
deadManSwitch:
  enabled: true
  maxTimeSinceLastSuccess: 25h
maintenanceWindows:
  - name: weekly-maintenance
    schedule: "0 2 * * 0"
    duration: 4h
    suppressAlerts: true
During the window, the dead-man’s switch won’t trigger even if jobs don’t run.
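One way to read "pause" is that time spent inside a window does not count toward maxTimeSinceLastSuccess. The Python sketch below works through that interpretation with the values from the example above. It is an assumption about the semantics rather than Guardian's actual logic: effective_elapsed is a hypothetical helper, the timestamps are made up, and windows are assumed not to overlap each other.

```python
from datetime import datetime, timedelta


def effective_elapsed(last_success, now, windows):
    """Time since the last success, excluding time spent inside windows.

    `windows` is a list of (start, end) pairs that do not overlap each other.
    """
    elapsed = now - last_success
    for start, end in windows:
        overlap_start = max(start, last_success)
        overlap_end = min(end, now)
        if overlap_end > overlap_start:
            elapsed -= overlap_end - overlap_start
    return elapsed


last_success = datetime(2024, 1, 20, 22, 0)                           # Saturday 22:00
window = (datetime(2024, 1, 21, 2, 0), datetime(2024, 1, 21, 6, 0))   # Sunday 02:00, 4h
now = datetime(2024, 1, 21, 23, 30)                                   # Sunday 23:30

raw = now - last_success                                  # 25h30m of wall-clock time
paused = effective_elapsed(last_success, now, [window])   # 21h30m once the window is excluded
print(raw > timedelta(hours=25), paused > timedelta(hours=25))  # True False
```

Under this reading, a job whose last success was 25.5 hours ago would not trip a 25h switch, because 4 of those hours fell inside the window.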
SLA Tracking
SLA calculations include maintenance window executions:
sla:
  enabled: true
  minSuccessRate: 95
  windowDays: 7
maintenanceWindows:
  - name: weekly-maintenance
    schedule: "0 2 * * 0"
    duration: 4h
    suppressAlerts: true
Failed jobs during maintenance count toward the success rate but don’t trigger alerts.
If you want to exclude maintenance windows from SLA calculations, consider using suspendedHandling.pauseMonitoring or temporarily suspending CronJobs during maintenance.
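A quick calculation shows how maintenance failures affect the SLA. In this Python sketch, success_rate is a hypothetical helper and the run counts are made-up illustration values. The only behavior taken from this guide is that failures inside a window still count toward the rate.

```python
def success_rate(results: list[bool]) -> float:
    """Percentage of successful runs; every run counts, maintenance or not."""
    return 100.0 * sum(results) / len(results)


# 20 runs in the SLA window: 18 succeeded, 2 failed during maintenance.
runs = [True] * 18 + [False] * 2
rate = success_rate(runs)
print(rate)        # 90.0
print(rate < 95)   # True: below minSuccessRate even though no alert was sent
```

This is why a monitor can show a breached success rate after a quiet maintenance period: the alerts were suppressed, but the failed runs were still recorded.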
Suspended CronJobs
Maintenance windows work alongside suspended CronJob handling:
suspendedHandling:
  pauseMonitoring: true  # Pause monitoring when CronJob is suspended
  alertIfSuspendedFor: 168h  # Alert if suspended for 7 days
maintenanceWindows:
  - name: weekly-maintenance
    schedule: "0 2 * * 0"
    duration: 4h
    suppressAlerts: true
If a CronJob is suspended during a maintenance window, monitoring is paused (if pauseMonitoring: true).
Troubleshooting
Window Not Activating
Verify the cron expression
Test your cron expression with crontab.guru or a similar tool, and confirm that it follows the five-field form used throughout this guide (minute, hour, day of month, month, day of week).
Check the timezone
Ensure the timezone is correct. Guardian logs the next scheduled window at startup:
kubectl logs -n cronjob-guardian deployment/cronjob-guardian-controller | grep "maintenance window"
Verify monitor status
Check that the monitor is in the Active phase:
kubectl get cronjobmonitor my-monitor -n production -o jsonpath='{.status.phase}'
Alerts Still Sent During Window
Ensure suppressAlerts: true is set. The default value is true, but if you explicitly set it to false, alerts will still be sent.
Check your configuration:
kubectl get cronjobmonitor my-monitor -n production -o yaml | grep -A 5 maintenanceWindows
Wrong Timezone
If windows activate at the wrong time:
Verify the timezone name is correct (use IANA format)
Check Guardian’s timezone configuration
Confirm the cron expression is in local time, not UTC
# List available timezones
timedatectl list-timezones | grep America
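If timedatectl is not available, you can check a zone name with Python's standard zoneinfo module. This sketch only validates the name against the IANA database on the machine where it runs, which may differ from the one in Guardian's container; is_known_timezone is a hypothetical helper name.

```python
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def is_known_timezone(name: str) -> bool:
    """Return True if `name` resolves in the local IANA timezone database."""
    try:
        ZoneInfo(name)
    except (ZoneInfoNotFoundError, ValueError):
        return False
    return True


print(is_known_timezone("America/New_York"))  # True
print(is_known_timezone("America/New_Yrok"))  # False (misspelled)
```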
Best Practices
Use Descriptive Names: Name windows clearly, for example weekly-db-maintenance rather than window-1.
Coordinate Across Teams: Share maintenance schedules with all teams to avoid conflicts.
Don't Overuse: Too many windows reduce monitoring effectiveness. Only suppress alerts during actual maintenance.
Document Windows: Add comments explaining why each window exists and what maintenance occurs.
Next Steps
Dashboard: View active maintenance windows in the UI
Troubleshooting: Common issues and solutions