A Hybrid ML and Rule-Based Approach to SQL Backup Monitoring

https://hackernoon.imgix.net/images/sql-monitoring-tlv5btan57zxxw4rr2we2v5q.png

SQL Server backup history is typically like a troubleshooting artifact, something checked when an error occurs, not necessarily analyzed over time. In this article, we will go over how to implement a structured backup telemetry table based on data and use Isolation Forest (a machine learning model) in conjunction with simple rules to detect pre-incident performance drift, I/O degradation, and recoverability risks.

The Problem with Binary Backup Monitoring

Backup monitoring is typically binary. Either the backup is successful or not, and that is about as far as your analysis goes.

The issue with this approach is that most backup problems do not begin with failure. They take slow steps, such as longer run times, lower throughput, and lower compression ratios. A change that can be detected in the metrics long before it manifests into an operational incident. I have seen backups drift from 20 minutes to 55 minutes, and no...

Copyright of this story solely belongs to hackernoon.com. To see the full text click HERE

Read more