Cloud Monitoring adds long-lookback alert policies for PromQL
Choosing the threshold for an alert policy can be a headache. You have to analyze historical data, aggregate it into semantically meaningful time series, and choose a threshold that separates real problems from normal variation. If the workload grows, your previously set static threshold might become too low, and your alert might fire too frequently. New workloads might require new thresholds, and setting separate thresholds for separate workloads means creating separate policies, which leaves you managing a fleet of mostly similar policies.
Not to mention, some metrics can’t even be alerted on using static thresholds. If your metric varies by time of day, like many e-commerce metrics do, then no single threshold will work. For example, what do you do if your metric looks like this:
Clearly something went wrong in the middle of that chart… but because the anomalous value is within the normal range of the daily data, no static...
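To make the problem concrete, here is a minimal Python sketch using synthetic data. It is not taken from the article and does not claim to show how Cloud Monitoring evaluates its policies. The hourly sampling, the sine-shaped daily cycle, the injected dip, and the function names `static_alerts` and `lookback_alerts` are all assumptions chosen for illustration. It compares a static range check against a check that compares each sample with the value seen one day earlier.

```python
import math

SAMPLES_PER_DAY = 24  # assumption: one sample per hour


def daily_metric(hour_index):
    """Synthetic metric with a 24-hour cycle, ranging from 100 to 300."""
    phase = 2 * math.pi * (hour_index % SAMPLES_PER_DAY) / SAMPLES_PER_DAY
    return 200 + 100 * math.sin(phase)


# Three days of perfectly periodic data.
series = [daily_metric(i) for i in range(3 * SAMPLES_PER_DAY)]

# Inject an anomaly on day 3 at hour 6, where the metric normally peaks at 300.
# The anomalous value (150) is still inside the normal daily range.
anomaly_index = 2 * SAMPLES_PER_DAY + 6
series[anomaly_index] = 150.0


def static_alerts(values, low, high):
    """Return indices whose value falls outside a fixed [low, high] range."""
    return [i for i, v in enumerate(values) if v < low or v > high]


def lookback_alerts(values, lookback, max_relative_change):
    """Return indices whose value differs from the value `lookback` samples
    earlier by more than `max_relative_change` (a fraction of the old value)."""
    alerts = []
    for i in range(lookback, len(values)):
        previous = values[i - lookback]
        if abs(values[i] - previous) > max_relative_change * abs(previous):
            alerts.append(i)
    return alerts


# The static range covers the whole daily swing, so it never fires.
print(static_alerts(series, low=90, high=310))
# Comparing against the same hour one day earlier flags only the injected dip.
print(lookback_alerts(series, lookback=SAMPLES_PER_DAY, max_relative_change=0.25))
```

In this toy data, any static range wide enough to tolerate the daily swing also tolerates the dip, while the day-over-day comparison isolates it. Real metrics are noisier than this, so the 25% tolerance here is only a placeholder.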
Copyright of this story solely belongs to google.com.