Archive

Archive for October, 2016

Monitoring and Alerting for Availability Groups and Why I Love PASS Summit

October 31, 2016 Leave a comment

I have returned from my eighth PASS Summit and, as in years past, they just keep getting better.  I am amazed and humbled that I am a part of the PASS community.  PASS has given me the opportunity to meet and learn from many of the best and brightest in the SQL world.  When I meet first-timers, I tell them to talk to people, no matter who they are.  If you share an interest, they will share with you.  If you have never been to PASS Summit, I recommend it.  If you cannot make Summit, get involved with your local PASS Chapter or SQL Saturday.  Virtual Chapters are another great way to find out what PASS has to offer.

One of the highlights of this year’s Summit was Shawn Meyers’ (t) presentation on Monitoring and Alerting of Availability Groups. (You will need to be logged in to the PASS website to access the link.)  I have worked with Availability Group since its release in SQL Server 2012 and Shawn provided the best solution for monitoring and alerting.

Shawn suggested implementing three alerts:

  • 1480 – AG Role Change
  • 35264 – AG Data Movement Suspended
  • 35265 – AG Data Movement Resumed

Additionally, Shawn provided an MSDN link with recommendations for using Policy-Based Management (PBM) to monitor AGs.  The custom policies monitor the recovery time objective (RTO) and recovery point objective (RPO) for your availability groups.  Microsoft recommends 600 seconds, or 10 minutes, for RTO, and 3600 seconds, or 60 minutes, for RPO. You can set your own values when defining the policy conditions.

I have scripted out the PBM conditions, policies, and SQL Server Agent alerts.  The conditions and policies can be used as-is, because they use system schedules.  The alerts need to updated to replace the TestOperator with an SQL Agent operator of your own.

I hope you can make use of these scripts.  And I encourage you to get involved with PASS. You will be amazed at what it can do for you.

agmonitoringandalertingscripts