Management Pack Variable Notation Default Values

This is not really news – more so a reminder for myself that might also help someone else.

We’ve been aware of using default values in various areas of SCOM, especially in command channel and notification workflows. I’ve found this helpful in the past, but where I’ve used default values most often is in variable notation elements when developing management packs. It’s especially useful when using modules such as the System.OleDbProbe modules, because often times when we query a database in a workflow the results may be empty; there are some reliable and consistent data returned using this module, but if no records are returned then we need might need to use a default variable if further processing is required, like in a script write action.

Here is an example of using a default value in a recovery, which is helpful if no records are returned by a query probe in a diagnostic.

<Value>$Data/[Default='']/Diagnostic/DataItem/RowLength$</Value>

Also, a little tip for getting all data if a simulation isn’t possible and you’re working blindly, as is the case when developing diagnostic and recovery tasks; you can dump the entire data stream into the recovery output by simply using <Value>$Data$</Value>.

Now you can reference something like this article to figure out the value xpath.

OleDbProbe – RegLocation Configuration – Design Flaw

I’ve been messing around with the OleDbProbe module this week, and have hit a couple issues that have been a bit frustrating and cost quite a bit of time. The first issue I ran into is described here, and now I’ll tell you about another problem.

Taking a snippet from the module definition below, we can inspect the registry to set the DatabaseName and ServerName configuration values for the SQL connection string.

<ProbeActionModuleType ID="System.OleDbProbe" Accessibility="Public" PassThrough="false" Batching="false">
<Configuration>
<xsd:element name="ConnectionString" type="xsd:string"/>
<xsd:element name="Query" type="xsd:string" minOccurs="0" maxOccurs="1"/>
<xsd:element name="GetValue" type="xsd:boolean" minOccurs="0" maxOccurs="1"/>
<xsd:element name="IncludeOriginalItem" type="xsd:boolean" minOccurs="0" maxOccurs="1"/>
<xsd:element name="OneRowPerItem" type="xsd:boolean" minOccurs="0" maxOccurs="1"/>
<xsd:element name="DatabaseNameRegLocation" type="xsd:string" minOccurs="0" maxOccurs="1"/>
<xsd:element name="DatabaseServerNameRegLocation" type="xsd:string" minOccurs="0" maxOccurs="1"/>
<xsd:element name="QueryTimeout" type="xsd:integer" minOccurs="0" maxOccurs="1"/>
<xsd:element name="GetFetchTime" type="xsd:boolean" minOccurs="0" maxOccurs="1"/>
</Configuration>
<ModuleImplementation Isolation="Any">
<Native>
<ClassID>B5A35748-86F5-46A3-9BC2-F9A494E36B25</ClassID>
</Native>
</ModuleImplementation>
<OutputType>System.OleDbData</OutputType>
<InputType>System.BaseData</InputType>
</ProbeActionModuleType>

The example registry key path and string name provided on the MSDN page is SOFTWARE\Company\Product\1.0\DatabaseName and SOFTWARE\Company\Product\1.0\ServerName.

Ok – no problem. Good example. The problem is, it’s not an example – the string name must be exactly DatabaseName and ServerName.

It’s not exactly a useful configuration, unless your registry string names happen to conform to the name specified in the native code for the OleDbProbe module (which, of course, is hidden from us). For example, if you want to use this module to query the Operations Manager databases, forget about using this registry configuration option because the string names for either the Operational database or Data Warehouse database do not exactly match what the OleDbModule expects. I can say, without a doubt, this is a design flaw.

If you really want to use a registry location of your choice, you’ll need to add another module before the OleDbProbe to read the registry string name, and then pass that into the connection string element of the OleDbProbe module. (see below for an example)

Here are the events you might see on the target server that attempts to execute the workflow without having the expected registry string names.

- <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
- <System>
<Provider Name="HealthService" />
<EventID Qualifiers="49152">4511</EventID>
<Level>2</Level>
<Task>1</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2016-06-07T02:19:20.000000000Z" />
<EventRecordID>7520157</EventRecordID>
<Channel>Operations Manager</Channel>
<Computer>ms01.scomskills.com</Computer>
<Security />
</System>
- <EventData>
<Data>2012-SP1</Data>
<Data>Your.Workflow</Data>
<Data>ms01.scomskills.com</Data>
<Data>{2C420F32-475D-019B-F7D2-E6F92B60E0C0}</Data>
<Data>OleDbProbe</Data>
<Data>{B5A35748-86F5-46A3-9BC2-F9A494E36B25}</Data>
<Data>The system cannot find the file specified.</Data>
</EventData>
</Event>

- <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
- <System>
<Provider Name="Health Service Modules" />
<EventID Qualifiers="49152">11851</EventID>
<Level>2</Level>
<Task>0</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2016-06-07T02:19:20.000000000Z" />
<EventRecordID>7520156</EventRecordID>
<Channel>Operations Manager</Channel>
<Computer>ms01.scomskills.com</Computer>
<Security />
</System>
- <EventData>
<Data>2012-SP1</Data>
<Data>Your.Workflow</Data>
<Data>ms01.scomskills.com</Data>
<Data>{2C420F32-475D-019B-F7D2-E6F92B60E0C0}</Data>
<Data><Configuration><ConnectionString>Provider=SQLOLEDB;Integrated Security=SSPI</ConnectionString><Query>Your Query</Query><GetValue>true</GetValue><OneRowPerItem>true</OneRowPerItem><DatabaseNameRegLocation>SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Setup\DataWarehouseDBName</DatabaseNameRegLocation><DatabaseServerNameRegLocation>SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Setup\DataWarehoueDBServerName</DatabaseServerNameRegLocation><QueryTimeout>60</QueryTimeout><GetFetchTime>false</GetFetchTime></Configuration></Data>
<Data>0x80070002</Data>
<Data>The system cannot find the file specified.</Data>
</EventData>
</Event>
 
Here’s an example to solve the problem. String these together in a module if you want to grab a registry string name/value of your choice.
 
<ProbeAction ID="RegistryProbe" TypeID="Windows!Microsoft.Windows.RegistryProbe">
<ComputerName>$Config/ComputerName$</ComputerName>
<RegistryAttributeDefinitions>
<RegistryAttributeDefinition>
<AttributeName>ServerName</AttributeName>
<Path>SOFTWARE\WhateverYouWant</Path>
<PathType>1</PathType>
<AttributeType>1</AttributeType>
</RegistryAttributeDefinition>
<RegistryAttributeDefinition>
<AttributeName>DatabaseName</AttributeName>
<Path>SOFTWARE\WhateverYouWant</Path>
<PathType>1</PathType>
<AttributeType>1</AttributeType>
</RegistryAttributeDefinition>
</RegistryAttributeDefinitions>
</ProbeAction>
<ProbeAction ID="Query" TypeID="System!System.OleDbProbe">
<ConnectionString>Provider=SQLOLEDB;Server=$Data/Values/ServerName$;Database=$Data/Values/DatabaseName$;Integrated Security=SSPI</ConnectionString>
<Query>$Config/Sql$</Query>
<GetValue>$Config/GetValue$</GetValue>
<OneRowPerItem>true</OneRowPerItem>
<QueryTimeout>$Config/QueryTimeoutSeconds$</QueryTimeout>
<GetFetchTime>false</GetFetchTime>
</ProbeAction>

OleDbProbe – VariantType=”0” – Empty

Today I found an issue with the OleDbProbe module in which I found no solution in my searches, so it was one of those cases where I needed to step through various debugging tools and scour through code to solve the problem. It’s can be rewarding, because it gives me something to write about, but it sure can be time consuming. I’ve used this module many times in the past, but this is the first time I’ve seen odd behavior in the results.

The Problem

I had plugged my query into the module and simulated the workflow, and the results of each column was <Column VariantType=”0” />, as shown below.

- <DataItems>
- <DataItem type="System.OleDbData" time="2016-06-04T14:15:45.9305344-05:00" sourceHealthServiceId="D4E9691B-9F54-0F31-786C-BAF110FB769F">
<HRResult>0</HRResult>
<ResultLength>8</ResultLength>
<Result>Success</Result>
<InitializationTime>909</InitializationTime>
<OpenTime>0</OpenTime>
<ExecutionTime>20</ExecutionTime>
<FetchTime>0</FetchTime>
<RowLength>5</RowLength>
- <Columns>
<Column VariantType="0" />
</Columns>
- <Columns>
<Column VariantType="0" />
</Columns>
- <Columns>
<Column VariantType="0" />
</Columns>
- <Columns>
<Column VariantType="0" />
</Columns>
- <Columns>
<Column VariantType="0" />
</Columns>
<OriginalDataLength>0</OriginalDataLength>
<ErrorDescriptionLength>0</ErrorDescriptionLength>
<ResultCode>0</ResultCode>
</DataItem>
</DataItems>

As we can see from the results, the query succeeded. One would assume with a successful query comes valid and filled columns. Wondering why the results were all empty, I expanded the query selection to include all columns, and what I found was quite interesting. Below is just the columns section. As we can see here, about half the columns filled with data and with their corresponding types. And about half did not fill, with VariantType=”0”, which means it’s empty.

<Columns>
<Column VariantType="3">147</Column>
<Column VariantType="3">1</Column>
<Column VariantType="8">{A6B2A91A-BA8E-6D80-06BD-3F12BF28652D}</Column>
<Column VariantType="3">64</Column>
<Column VariantType="3">147</Column>
<Column VariantType="0" />
<Column VariantType="0" />
<Column VariantType="0" />
<Column VariantType="0" />
<Column VariantType="0" />
<Column VariantType="0" />
<Column VariantType="3">64</Column>
<Column VariantType="8">{EA99500D-8D52-FC52-B5A5-10DCD1E9D2BD}</Column>
<Column VariantType="3">8</Column>
<Column VariantType="8">Microsoft.Windows.Computer</Column>
<Column VariantType="8">Windows Computer</Column>
<Column VariantType="0" />
<Column VariantType="3">273</Column>
<Column VariantType="3">147</Column>
<Column VariantType="0" />
<Column VariantType="0" />
<Column VariantType="0" />
<Column VariantType="1" />
<Column VariantType="0" />
<Column VariantType="0" />
</Columns>

I thought there might be some sort of pattern to this, but I haven’t discovered it. I looked at the column properties in the database, and did not find anything resembling a consistent pattern that would indicate any reason why some columns are filled during execution and some are not. I gave up on trying to answer the question of “why” some columns refuse to fill, and found the solution by casting in the sql SELECT statement.

The Solution

The problem was solved by casting the columns that did not fill in the SELECT statement.

Here is the first query that returned empty columns.

SELECT me.Name
FROM vManagedEntity AS me
inner join vManagedEntityType AS met ON met.ManagedEntityTypeRowId = me.ManagedEntityTypeRowId
inner join vManagedEntityProperty AS mep ON mep.ManagedEntityRowId = me.ManagedEntityRowId
WHERE met.ManagedEntityTypeSystemName = 'Microsoft.Windows.Computer' AND
mep.ToDateTime IS NULL

And here is the modified query with CAST in the SELECT statement.

SELECT CAST (me.Name as NVARCHAR)
FROM vManagedEntity AS me
inner join vManagedEntityType AS met ON met.ManagedEntityTypeRowId = me.ManagedEntityTypeRowId
inner join vManagedEntityProperty AS mep ON mep.ManagedEntityRowId = me.ManagedEntityRowId
WHERE met.ManagedEntityTypeSystemName = 'Microsoft.Windows.Computer' AND
mep.ToDateTime IS NULL

That’s it folks. Hope this helps if you find yourself in the same bind as me.

Suppressing Module Events (event id 21405 example)

I was recently faced with an uncommon scenario, where I needed a script-based discovery to exit without submitting data. Exiting a script-based discovery without submitting discovery data, even if it’s an empty data item, is not considered a good practice. But in this particular scenario, I understood the reasons and the resulting instance space was minimal, so I was open to finding a trick that would accomplish this without having 21405 events logged on the agent-managed computer at each interval if the instance did not pass the test.

What I discovered was the CommandExecuterEventPolicyType in the System.CommandExecuterSchema. This is a Schema Type defined in the System.Library.

All script-based probes have a default event policy that describes how exit codes, standard error, and standard out are handled by the system. If any of these data streams match an event policy expression, the system will log an event to the Operations Manager log on that agent-managed computer.

For example, the Microsoft.Windows.ScriptProbeDiscoveryBase has a default policy as follows:

<DefaultEventPolicy> 
<StdOutMatches Operator="DoesNotMatchRegularExpression">&lt;DataItem.+/DataItem\b*&gt;}|{&lt;DataItem.*/&gt;}</StdOutMatches>
<StdErrMatches>\a+</StdErrMatches>
<ExitCodeMatches>[^0]+</ExitCodeMatches>
</DefaultEventPolicy>
 
In my particular scenario, this default policy was capturing an exited discovery script as an error and writing the following event 21405 on those agents.
 
The process started at 2:04:14 PM failed to create System.Discovery.Data, no errors detected in the output
 
By overriding the default event policy, without having to change anything else in the module, those errors built into the product may be ignored by specifying event policy configuration as follows.
 
<EventPolicy> 
<StdOutMatches>^suppress_event_21405$</StdOutMatches>
<StdErrMatches>^suppress_event_21405$</StdErrMatches>
</EventPolicy>

In this example, the override has been documented by specifying an expression that describes a reason for overriding the policy. This expression will never match actual standard out or standard error, so it serves two purposes.

Overriding a default event policy like this is very uncommon, and I do not recommend it unless you have a good understanding of how it may impact your workflow – but it is a nice trick to use if you find yourself in a unique situation that calls for it.

As a side note – this scenario probably could be handled in a more sophisticated way, by composing a new module that would filter the data stream before it even reaches a discovery module. This way would produce no errors, so overriding event policies would not be required. Food for thought!