Suppressing Module Events (event id 21405 example)

I was recently faced with an uncommon scenario, where I needed a script-based discovery to exit without submitting data. Exiting a script-based discovery without submitting discovery data, even if it’s an empty data item, is not considered a good practice. But in this particular scenario, I understood the reasons and the resulting instance space was minimal, so I was open to finding a trick that would accomplish this without having 21405 events logged on the agent-managed computer at each interval if the instance did not pass the test.

What I discovered was the CommandExecuterEventPolicyType in the System.CommandExecuterSchema. This is a Schema Type defined in the System.Library.

All script-based probes have a default event policy that describes how exit codes, standard error, and standard out are handled by the system. If any of these data streams match an event policy expression, the system will log an event to the Operations Manager log on that agent-managed computer.

For example, the Microsoft.Windows.ScriptProbeDiscoveryBase has a default policy as follows:

<StdOutMatches Operator="DoesNotMatchRegularExpression">&lt;DataItem.+/DataItem\b*&gt;}|{&lt;DataItem.*/&gt;}</StdOutMatches>
In my particular scenario, this default policy was capturing an exited discovery script as an error and writing the following event 21405 on those agents.
The process started at 2:04:14 PM failed to create System.Discovery.Data, no errors detected in the output
By overriding the default event policy, without having to change anything else in the module, those errors built into the product may be ignored by specifying event policy configuration as follows.

In this example, I documented the override by specifying an expression that describes my reason for overriding the policy. This expression will never match actual standard out or standard error, so it serves two purposes.

Overriding a default event policy like this is very uncommon, and I do not recommend it unless you have a good understanding of how it may impact your workflow – but it is a nice trick to use if you find yourself in a unique situation that calls for it.

As a side note – this scenario probably could be handled in a more sophisticated way, by composing a new module that would filter the data stream before it even reaches a discovery module. This way would produce no errors, so overriding event policies would not be required. Food for thought!

Passing Boolean values to Powershell modules

Often times we need to pass a Boolean value to a Powershell provider, and a perfect example is for the purpose of a “debug” flag.

In my experience, Powershell probe and write action modules always type cast the script input parameter as String, whether the configuration element is of type Boolean or String. This means the default value in your workflow could be true, false, 1, or 0, and it will always equal true if you are handling this input parameter as Boolean in the script.

Obviously, this will cause some unexpected behaviors, and most likely will be viewed as a bug in the management pack.

The way I work around this issue (at least until this is fixed in the product) is NOT typing the script parameter, and then converting the variable to Boolean just after the param block like this:

$debugFlag = [System.Convert]::ToBoolean($debugFlag)

By handling a Boolean input parameter like this in the script, we can effectively use the Boolean type in module configuration; the default value in the monitoring workflow can be true or false, the value will be handled appropriately by the script, the operator will see the default value as it should appear in console, and the operator will be presented with a meaningful (True|False) selector in the override interface.

UPDATE (09/02/2015) – This problem applies to Boolean, Integer, and Double values!

After writing this post, I wanted to get down to the bottom of the problem, so I poked around the Windows Library and what I found made it clear.

Taking a look at the implementation of any Powershell probe module, you will notice the Parameters element type is NamedParametersType, and this is defined in the Microsoft.Windows.PowerShellSchema. Notice the name-value pair are of type ID and string, respectively. This is the reason any type of value will be cast as String in a Powershell module.

<SchemaType ID="Microsoft.Windows.PowerShellSchema" Accessibility="Public"> 
<xsd:complexType name="NamedParametersType">
<xsd:element name="Parameter" minOccurs="0" maxOccurs="unbounded">
<xsd:element name="Name" type="xsd:ID" />
<xsd:element name="Value" type="xsd:string" />
<xsd:complexType name="SnapInsType">
<xsd:element name="SnapIn" minOccurs="0" maxOccurs="unbounded" type="xsd:string" />
<xsd:simpleType name="NonNullString">
<xsd:restriction base="xsd:string">
<xsd:minLength value="1" />

It seems to me, the way to resolve this is to update the library with a choice for value type. This would allow the developer the option to type cast, and would remove the assumption that type casting is occurring when it is not. Since this schema exists in the Microsoft Windows Library, it would make the most sense for Microsoft to fix this.

Unable to add the domain to the subject

Quick break/fix post here, because I was unable to find the solution to this subject anywhere in the community blogs or in KB articles.

I ran into this error while attempting to perform a manual installation of the System Center 2012 SP1 Operations Manager agent on a Linux (Ubuntu 10.04) server.

Full context of this error is as follows:

jonathan@ubuntu-01:/etc/opt/microsoft/scx$ sudo dpkg -i /home/jonathan/scx-1.4.0-906.universald.1.x64.deb (Reading database ... 44245 files and directories currently installed.) Preparing to replace scx (using .../scx-1.4.0-906.universald.1.x64.deb) ... * Shutting down Microsoft SCX CIM Server: [fail] invoke-rc.d: initscript scx-cimd, action "stop" failed. Unpacking replacement scx ... Setting up scx ( ... Checking existence of /lib64/ and /lib64/ ... Checking existence of /lib64/ and /lib64/ ... Found /lib64/ and /lib64/ ... Generating certificate with hostname="ubuntu-01", domainname="" WARNING! Could not read 256 bytes of random data from /dev/random. Will revert to less secure /dev/urandom. See the security guide for how to regenerate certificates at a later time when more random data might be available. Error generating SSL certificate: 'Unable to add the domain to the subject.' dpkg: error processing scx (--install): subprocess installed post-installation script returned error exit status 3 Processing triggers for ureadahead ... Errors were encountered while processing: scx

Long story short, the problem was resolved by modifying the /etc/resolv.conf file. Specifically, removing the trailing “dot” at the end of

Here is the full context after modifying that file, which resulted in a successful installation:

jonathan@ubuntu-01:/etc/opt/microsoft/scx/ssl$ sudo nano /etc/resolv.conf jonathan@ubuntu-01:/etc/opt/microsoft/scx/ssl$ sudo rm /etc/opt/microsoft/scx//ssl/scx-key.pem jonathan@ubuntu-01:/etc/opt/microsoft/scx/ssl$ sudo dpkg -i /home/jonathan/scx-1.4.0-906.universald.1.x64.deb (Reading database ... 44245 files and directories currently installed.) Preparing to replace scx (using .../scx-1.4.0-906.universald.1.x64.deb) ... * Shutting down Microsoft SCX CIM Server: [fail] invoke-rc.d: initscript scx-cimd, action "stop" failed. Unpacking replacement scx ... Setting up scx ( ... Checking existence of /lib64/ and /lib64/ ... Checking existence of /lib64/ and /lib64/ ... Found /lib64/ and /lib64/ ... Generating certificate with hostname="ubuntu-01", domainname="" WARNING! Could not read 256 bytes of random data from /dev/random. Will revert to less secure /dev/urandom. See the security guide for how to regenerate certificates at a later time when more random data might be available. * Starting Microsoft SCX CIM Server: [ OK ] Processing triggers for ureadahead ...

Notice that I first removed the scx-key.pem file that was generated by the failed install, and then ran the installer package again – don’t actually know if this was necessary, but I thought it might be best to clean it up. As you can see, the final result is a signing of the certificate and the SCX CIM Server started successfully.

A little more background to the problem (if your interested):

The sequence of events that (I believe) led to this situation was the fact that I initially had the Linux server directly connected to an internet accessible access point, and the Ubuntu box was also configured to receive its network settings via DHCP. For some reason, DHCP added an extra “dot” in the resolv.conf file domain lines, and this apparently was an invalid configuration in the certificate signing process.

Hope this helps someone out there in a similar situation.

WFAnalyzer.exe has stopped working

Just a quick post here, for those management pack developers that have run into the problem of simulating a workflow using the Visual Studio Authoring Extensions.

I have been missing the Workflow Analyzer companion to the MP Simulator for QUITE SOME TIME. I’ve tried troubleshooting the problem on several occasions, probably spending more than 10-12 hours burning midnight oil over the past months researching, debugging, sifting through logs and many Stack Overflow pages regarding .NET exceptions.

I’ve uninstalled and installed again the VSAE, and probably Visual Studio as well at some point, to no avail – the catastrophic behavior of Workflow Analyzer was clearly the bane of my development work. “The system cannot find the file specified?” What a benign message that is, especially when there is no file specified in the exception!

Just fixed it, though – out of what seems to be sheer luck.

I uninstalled Microsoft Monitoring Agent from my development workstation and installed the System Center 2012 SP1 agent – low and behold, the Workflow Analyzer sprung to life!

So happy now that I can actually see runtime data!


Here are a couple other references that didn’t provide a solution for me, but this issue seems very eluding and they might work in your case.

Coupling time offset to monitoring interval

The requirements gathering phase of the management pack development lifecycle is critically important to the success of the project. Something that may come out of this phase is receiving company health check scripts, and this is an excellent opportunity to incorporate familiar company knowledge into a new monitoring solution.

These scripts might be used to check for some condition that may have occurred in the past n minutes or hours – n is referred to as a time offset in this case. This article will briefly describe a simple concept to a best practice around implementing this type of script in a custom data source.

This concept can be broken down into the simplest term, where n and monitoring interval share configuration.

For example, a script executes the following SQL query:

SELECT COUNT(Column1) as [Count], Name 
FROM MyDatabase

The part I want to draw your attention to is the WHERE clause in the SQL query, because this is where time offset comes into the picture – it is how time offset is identified, and allows for the implementation of this coupling concept.

The query above would return records that have been written in the past 60 minutes from now. When the script is plugged into a data source, “now” is the monitoring interval, which is configured on the scheduler that triggers script execution.

So, we conclude that “now” is IntervalSeconds on the simple scheduler module.

Now that we know we can couple time offset with monitoring interval, we can easily use the same value for both by sharing the same configuration. In order to do this, two minor changes need to be made in any script you plan to incorporate using this concept:

1. Ensure time offset is in seconds.
2. Replace the time offset value with the IntervalSeconds configuration.

In this scenario, we cover points 1 and 2 above by updating the 1st and 2nd arguments in the DATEADD function like this:
WHERE Timestamp BETWEEN DATEADD(second,-$Config/IntervalSeconds$,GETDATE()) AND GETDATE()

Now compose the module as usual…
Why is using this concept a good practice?

Monitoring interval is a standard override parameter, and inevitably it will be overridden – maybe not on this particular monitor, and maybe not until you’re long gone. But don’t assume the customer is going to keep the default interval – ever.

By coupling script time offsets to monitoring intervals, a basic interval override will not cause monitor state skewing.