I’ve been meaning to write about this for a while, because I was thrilled when I found this new configuration element in the expression filter module when SCOM 2012 hit the press.
For reference, here are the differences in the expression filters:
SCOM 2007: http://msdn.microsoft.com/en-us/library/ee692962.aspx
SCOM 2012: http://msdn.microsoft.com/en-us/library/jj129836.aspx
Previously, the System.ExpressionFilter did not include suppression – today it does!
What this means is that we can now count the number of passes through a condition detection, and the filter only passes data to the next module once the number of matches reaches the configured MatchCount value.
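As a minimal sketch of the new element on its own, here is a hypothetical condition detection over performance data. The module ID CD_OverThreshold, the Value XPath, and the threshold of 90 are assumptions for illustration only and do not come from any shipped pack. The SuppressionSettings block is the part that is new in SCOM 2012.

```xml
<!-- Hypothetical condition detection: ID, XPath and threshold are illustrative only. -->
<ConditionDetection ID="CD_OverThreshold" TypeID="System!System.ExpressionFilter">
  <Expression>
    <SimpleExpression>
      <ValueExpression>
        <XPathQuery Type="Double">Value</XPathQuery>
      </ValueExpression>
      <Operator>Greater</Operator>
      <ValueExpression>
        <Value Type="Double">90</Value>
      </ValueExpression>
    </SimpleExpression>
  </Expression>
  <!-- New in SCOM 2012: hold back output until the expression has matched this many times. -->
  <SuppressionSettings>
    <MatchCount>3</MatchCount>
  </SuppressionSettings>
</ConditionDetection>
```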
It doesn’t sound like a big deal really – but it is. I’ve had cases where I needed to count condition passes, and the only way to do it before was to include a consolidation module. This was not fun and it turned out to be a lot more work than was necessary – and it was confusing to the customer when they looked at the code.
What I do not like so much is that Microsoft doesn’t expose this new configuration in their base monitoring at this time. For example, it’s not possible to override the match count for a service monitor that you created using the service monitoring template, or even the interval for that matter. To me, it doesn’t make sense to introduce a new configuration element without providing a way to override it, especially one as valuable as this.
The default monitoring for Windows services (at this time) is to sample every 30 seconds with a match count of 2, which equates to a state change within 60 seconds of service downtime.
What I am providing here is a Windows service monitoring VSAE fragment that will allow you to override both the interval as well as the match count. I’ve also included an additional state value to account for service not found conditions. I added this condition because sometimes a pack needs to take into account upgrade scenarios where a service name changes – you don’t want an alert on a service that had been renamed due to an upgrade!
By the way, MatchCount has nothing to do with service monitoring – it’s a part of the expression filter, and can be used anywhere. This is just a working example of how you can use it in a custom service monitor type.
Here you go!
<ManagementPackFragment SchemaVersion="2.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <TypeDefinitions>
    <MonitorTypes>
      <UnitMonitorType ID="Example.CustomeModuleLibrary.MonitorType.CheckServiceState" Accessibility="Public">
        <MonitorTypeStates>
          <MonitorTypeState ID="MTS_Running" />
          <MonitorTypeState ID="MTS_NotRunning" />
        </MonitorTypeStates>
        <Configuration>
          <xsd:element name="ComputerName" type="xsd:string" />
          <xsd:element name="ServiceName" type="xsd:string" />
          <xsd:element name="IntervalSeconds" type="xsd:integer" />
          <xsd:element name="MatchCount" type="xsd:integer" />
        </Configuration>
        <OverrideableParameters>
          <OverrideableParameter ID="IntervalSeconds" Selector="$Config/IntervalSeconds$" ParameterType="int" />
          <OverrideableParameter ID="MatchCount" Selector="$Config/MatchCount$" ParameterType="int" />
        </OverrideableParameters>
        <MonitorImplementation>
          <MemberModules>
            <DataSource ID="DS" TypeID="Windows!Microsoft.Windows.Win32ServiceInformationProvider">
              <ComputerName>$Config/ComputerName$</ComputerName>
              <ServiceName>$Config/ServiceName$</ServiceName>
              <Frequency>$Config/IntervalSeconds$</Frequency>
            </DataSource>
            <ProbeAction ID="Probe" TypeID="Windows!Microsoft.Windows.Win32ServiceInformationProbe">
              <ComputerName>$Config/ComputerName$</ComputerName>
              <ServiceName>$Config/ServiceName$</ServiceName>
            </ProbeAction>
            <ConditionDetection ID="CD_ServiceRunning" TypeID="System!System.ExpressionFilter">
              <Expression>
                <RegExExpression>
                  <ValueExpression>
                    <XPathQuery Type="Integer">Property[@Name='State']</XPathQuery>
                  </ValueExpression>
                  <Operator>MatchesRegularExpression</Operator>
                  <Pattern>^(4|8)$</Pattern>
                </RegExExpression>
              </Expression>
            </ConditionDetection>
            <ConditionDetection ID="CD_ServiceNotRunning" TypeID="System!System.ExpressionFilter">
              <Expression>
                <RegExExpression>
                  <ValueExpression>
                    <XPathQuery Type="Integer">Property[@Name='State']</XPathQuery>
                  </ValueExpression>
                  <Operator>DoesNotMatchRegularExpression</Operator>
                  <Pattern>^(4|8)$</Pattern>
                </RegExExpression>
              </Expression>
              <SuppressionSettings>
                <MatchCount>$Config/MatchCount$</MatchCount>
              </SuppressionSettings>
            </ConditionDetection>
          </MemberModules>
          <RegularDetections>
            <RegularDetection MonitorTypeStateID="MTS_Running">
              <Node ID="CD_ServiceRunning">
                <Node ID="DS" />
              </Node>
            </RegularDetection>
            <RegularDetection MonitorTypeStateID="MTS_NotRunning">
              <Node ID="CD_ServiceNotRunning">
                <Node ID="DS" />
              </Node>
            </RegularDetection>
          </RegularDetections>
          <OnDemandDetections>
            <OnDemandDetection MonitorTypeStateID="MTS_Running">
              <Node ID="CD_ServiceRunning">
                <Node ID="Probe" />
              </Node>
            </OnDemandDetection>
            <OnDemandDetection MonitorTypeStateID="MTS_NotRunning">
              <Node ID="CD_ServiceNotRunning">
                <Node ID="Probe" />
              </Node>
            </OnDemandDetection>
          </OnDemandDetections>
        </MonitorImplementation>
      </UnitMonitorType>
    </MonitorTypes>
  </TypeDefinitions>
  <LanguagePacks>
    <LanguagePack ID="ENU" IsDefault="true">
      <DisplayStrings>
        <DisplayString ElementID="Example.CustomeModuleLibrary.MonitorType.CheckServiceState" SubElementID="IntervalSeconds">
          <Name>Interval (seconds)</Name>
          <Description>Check service state interval.</Description>
        </DisplayString>
        <DisplayString ElementID="Example.CustomeModuleLibrary.MonitorType.CheckServiceState" SubElementID="MatchCount">
          <Name>Match Count</Name>
          <Description>Number of intervals service is not running before changing monitor state.</Description>
        </DisplayString>
      </DisplayStrings>
    </LanguagePack>
  </LanguagePacks>
</ManagementPackFragment>
Now you can implement new unit monitors that use this monitor type, and extend to your operators the ability to override interval and match count. You might want to replace "Example" with your company name before implementing in your library.
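For anyone wondering what to put in the configuration, here is a rough sketch of a unit monitor built on this monitor type, plus a sample override. Everything specific in it is an assumption for illustration: the monitor and override IDs, the Spooler service, the Windows Computer target, the values 30, 2 and 4, and the Health! alias (which presumes a reference to System.Health.Library in your pack). The TypeID has no alias because the sketch assumes the monitor type lives in the same pack; add one if it doesn't.

```xml
<ManagementPackFragment SchemaVersion="2.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Monitoring>
    <Monitors>
      <!-- Hypothetical monitor: ID, target and service name are illustrative only. -->
      <UnitMonitor ID="Example.Monitor.SpoolerServiceState" Accessibility="Public" Enabled="true"
                   Target="Windows!Microsoft.Windows.Computer"
                   ParentMonitorID="Health!System.Health.AvailabilityState"
                   Remotable="true" Priority="Normal"
                   TypeID="Example.CustomeModuleLibrary.MonitorType.CheckServiceState"
                   ConfirmDelivery="false">
        <Category>AvailabilityHealth</Category>
        <!-- Map the two monitor type states onto health states. -->
        <OperationalStates>
          <OperationalState ID="Running" MonitorTypeStateID="MTS_Running" HealthState="Success" />
          <OperationalState ID="NotRunning" MonitorTypeStateID="MTS_NotRunning" HealthState="Error" />
        </OperationalStates>
        <!-- One value for each element the monitor type declares in its Configuration. -->
        <Configuration>
          <ComputerName>$Target/Property[Type="Windows!Microsoft.Windows.Computer"]/NetworkName$</ComputerName>
          <ServiceName>Spooler</ServiceName>
          <IntervalSeconds>30</IntervalSeconds>
          <MatchCount>2</MatchCount>
        </Configuration>
      </UnitMonitor>
    </Monitors>
    <Overrides>
      <!-- Hypothetical override raising the match count from 2 to 4. -->
      <MonitorConfigurationOverride ID="Example.Override.SpoolerServiceState.MatchCount"
                                    Context="Windows!Microsoft.Windows.Computer" Enforced="false"
                                    Monitor="Example.Monitor.SpoolerServiceState" Parameter="MatchCount">
        <Value>4</Value>
      </MonitorConfigurationOverride>
    </Overrides>
  </Monitoring>
</ManagementPackFragment>
```

Note that the configuration only accepts the four elements the monitor type declares, so anything else from the stock service monitor would have to be added to the monitor type first.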
🙂
Hi Jonathan,
Thanks for this post. Reading this, I realize that I can generate a service monitor which will change its health state only after the service stop condition is detected x number of times (which we can provide through an override). However, I am not sure how I can implement this in my existing service monitors. Could you please provide an example of how this can be used in a service monitor?
Hi. How do we expose this new monitor type? Once imported into VSAE, I can use this monitor type to create a monitor, but I don’t know what to put in the configuration. Should this new monitor type be exposed in the Console now? Many Thanks
I think I’ve figured this out, but the new monitor doesn’t like true which is usually a configuration item in a Service Monitor no?
That stripped out my test. It should say that it won’t accept the CheckStartUp Configuration property in this new monitor. Should this really be included?
Hello Jonathan, I tried to use MatchCount on a custom log monitor and it didn’t work. I use some modules from the NiceLog MP. All my condition modules are working: I can flip the monitor from healthy to warning and to error with MatchCount set to 1. When I set MatchCount to 2 or more, it doesn’t work. I need some help, please.
Hi Jonathan,
I am curious whether this can be implemented with the “Application Pool availability” monitor from the IIS management pack. We often restart Application Pools, and that results in alerts that close in the next interval. It would be good for us if we could create an “Application Pool availability” monitor that alerts only if the Application Pool is still disabled after, say, 5 checks/intervals.
Thanks,
Nike
To implement MatchCount in a sealed pack, my suggestion is to “forklift” those unit monitors and put them into your own “extended” pack, and implement MatchCount there. Then disable the monitors that are in the sealed pack.
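A rough sketch of the "disable" half of that suggestion, assuming your extended pack references the sealed pack under a hypothetical alias Sealed. The override ID, the class ID, and the monitor ID below are placeholders, not real identifiers from any Microsoft pack. Look up the actual IDs in the sealed pack you are working with.

```xml
<ManagementPackFragment SchemaVersion="2.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Monitoring>
    <Overrides>
      <!-- Placeholder IDs: replace Sealed!Some.Sealed.Class and Sealed!Some.Sealed.Monitor
           with the real target class and monitor from the sealed pack. -->
      <MonitorPropertyOverride ID="Example.Override.Disable.SealedMonitor"
                               Context="Sealed!Some.Sealed.Class" Enforced="false"
                               Monitor="Sealed!Some.Sealed.Monitor" Property="Enabled">
        <Value>false</Value>
      </MonitorPropertyOverride>
    </Overrides>
  </Monitoring>
</ManagementPackFragment>
```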