OleDbProbe – RegLocation Configuration – Design Flaw

I’ve been working with the OleDbProbe module this week and have hit a couple of issues that were frustrating and cost quite a bit of time. The first issue is described in an earlier post; this post covers a second problem.

According to the module definition below, the OleDbProbe module can read the registry to obtain the database name and server name for the SQL connection string, through the DatabaseNameRegLocation and DatabaseServerNameRegLocation configuration elements.

<ProbeActionModuleType ID="System.OleDbProbe" Accessibility="Public" PassThrough="false" Batching="false">
<Configuration>
<xsd:element name="ConnectionString" type="xsd:string"/>
<xsd:element name="Query" type="xsd:string" minOccurs="0" maxOccurs="1"/>
<xsd:element name="GetValue" type="xsd:boolean" minOccurs="0" maxOccurs="1"/>
<xsd:element name="IncludeOriginalItem" type="xsd:boolean" minOccurs="0" maxOccurs="1"/>
<xsd:element name="OneRowPerItem" type="xsd:boolean" minOccurs="0" maxOccurs="1"/>
<xsd:element name="DatabaseNameRegLocation" type="xsd:string" minOccurs="0" maxOccurs="1"/>
<xsd:element name="DatabaseServerNameRegLocation" type="xsd:string" minOccurs="0" maxOccurs="1"/>
<xsd:element name="QueryTimeout" type="xsd:integer" minOccurs="0" maxOccurs="1"/>
<xsd:element name="GetFetchTime" type="xsd:boolean" minOccurs="0" maxOccurs="1"/>
</Configuration>
<ModuleImplementation Isolation="Any">
<Native>
<ClassID>B5A35748-86F5-46A3-9BC2-F9A494E36B25</ClassID>
</Native>
</ModuleImplementation>
<OutputType>System.OleDbData</OutputType>
<InputType>System.BaseData</InputType>
</ProbeActionModuleType>

The example registry key paths and string names provided on the MSDN page are SOFTWARE\Company\Product\1.0\DatabaseName and SOFTWARE\Company\Product\1.0\ServerName.

Ok – no problem. Good example. The problem is, it’s not an example – the string name must be exactly DatabaseName and ServerName.

It’s not exactly a useful configuration unless your registry string names happen to match the names specified in the native code for the OleDbProbe module (which, of course, is hidden from us). For example, if you want to use this module to query the Operations Manager databases, forget about the registry configuration option: the string names for both the Operational database and the Data Warehouse database do not match what the OleDbProbe module expects. I can say, without a doubt, this is a design flaw.
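The constraint can be checked before a workflow is deployed. The following is a minimal Python sketch, not part of the module. It assumes, based only on the behavior described above, that the final segment of each path must be exactly DatabaseName or ServerName. The helper names (EXPECTED_VALUE_NAMES, value_name, is_usable) are hypothetical.

```python
# Value names the OleDbProbe module appears to require, per the behavior
# described above (an observation, not documented module internals).
EXPECTED_VALUE_NAMES = {
    "DatabaseNameRegLocation": "DatabaseName",
    "DatabaseServerNameRegLocation": "ServerName",
}


def value_name(reg_location):
    """Return the last segment of a registry path, i.e. the string name."""
    return reg_location.rsplit("\\", 1)[-1]


def is_usable(element, reg_location):
    """True if the path ends in the name the given element expects."""
    return value_name(reg_location) == EXPECTED_VALUE_NAMES[element]


# First path is the MSDN example; second is the Data Warehouse setup key.
print(is_usable("DatabaseNameRegLocation",
                r"SOFTWARE\Company\Product\1.0\DatabaseName"))
print(is_usable("DatabaseNameRegLocation",
                r"SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Setup\DataWarehouseDBName"))
```

Under that assumption, only the first path passes the check, which matches the failure seen when pointing the module at the Operations Manager setup key.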

If you really want to use a registry location of your choice, you’ll need to add another module before the OleDbProbe to read the registry values, and then pass them into the ConnectionString element of the OleDbProbe module (see below for an example).

Here are the events you might see on a target server that attempts to execute the workflow when the expected registry string names do not exist.

<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="HealthService" />
<EventID Qualifiers="49152">4511</EventID>
<Level>2</Level>
<Task>1</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2016-06-07T02:19:20.000000000Z" />
<EventRecordID>7520157</EventRecordID>
<Channel>Operations Manager</Channel>
<Computer>ms01.scomskills.com</Computer>
<Security />
</System>
<EventData>
<Data>2012-SP1</Data>
<Data>Your.Workflow</Data>
<Data>ms01.scomskills.com</Data>
<Data>{2C420F32-475D-019B-F7D2-E6F92B60E0C0}</Data>
<Data>OleDbProbe</Data>
<Data>{B5A35748-86F5-46A3-9BC2-F9A494E36B25}</Data>
<Data>The system cannot find the file specified.</Data>
</EventData>
</Event>

<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Health Service Modules" />
<EventID Qualifiers="49152">11851</EventID>
<Level>2</Level>
<Task>0</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2016-06-07T02:19:20.000000000Z" />
<EventRecordID>7520156</EventRecordID>
<Channel>Operations Manager</Channel>
<Computer>ms01.scomskills.com</Computer>
<Security />
</System>
<EventData>
<Data>2012-SP1</Data>
<Data>Your.Workflow</Data>
<Data>ms01.scomskills.com</Data>
<Data>{2C420F32-475D-019B-F7D2-E6F92B60E0C0}</Data>
<Data><Configuration><ConnectionString>Provider=SQLOLEDB;Integrated Security=SSPI</ConnectionString><Query>Your Query</Query><GetValue>true</GetValue><OneRowPerItem>true</OneRowPerItem><DatabaseNameRegLocation>SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Setup\DataWarehouseDBName</DatabaseNameRegLocation><DatabaseServerNameRegLocation>SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Setup\DataWarehoueDBServerName</DatabaseServerNameRegLocation><QueryTimeout>60</QueryTimeout><GetFetchTime>false</GetFetchTime></Configuration></Data>
<Data>0x80070002</Data>
<Data>The system cannot find the file specified.</Data>
</EventData>
</Event>
 
Here’s an example that solves the problem. Chain these two probe actions together in a composite module if you want to read registry string values of your choice.
 
<ProbeAction ID="RegistryProbe" TypeID="Windows!Microsoft.Windows.RegistryProbe">
<ComputerName>$Config/ComputerName$</ComputerName>
<RegistryAttributeDefinitions>
<RegistryAttributeDefinition>
<AttributeName>ServerName</AttributeName>
<Path>SOFTWARE\WhateverYouWant</Path>
<PathType>1</PathType>
<AttributeType>1</AttributeType>
</RegistryAttributeDefinition>
<RegistryAttributeDefinition>
<AttributeName>DatabaseName</AttributeName>
<Path>SOFTWARE\WhateverYouWant</Path>
<PathType>1</PathType>
<AttributeType>1</AttributeType>
</RegistryAttributeDefinition>
</RegistryAttributeDefinitions>
</ProbeAction>
<ProbeAction ID="Query" TypeID="System!System.OleDbProbe">
<ConnectionString>Provider=SQLOLEDB;Server=$Data/Values/ServerName$;Database=$Data/Values/DatabaseName$;Integrated Security=SSPI</ConnectionString>
<Query>$Config/Sql$</Query>
<GetValue>$Config/GetValue$</GetValue>
<OneRowPerItem>true</OneRowPerItem>
<QueryTimeout>$Config/QueryTimeoutSeconds$</QueryTimeout>
<GetFetchTime>false</GetFetchTime>
</ProbeAction>

OleDbProbe – VariantType=”0” – Empty

Today I found an issue with the OleDbProbe module for which my searches turned up no solution, so it was one of those cases where I needed to step through various debugging tools and scour through code to solve the problem. It can be rewarding, because it gives me something to write about, but it sure can be time consuming. I’ve used this module many times in the past, but this is the first time I’ve seen odd behavior in the results.

The Problem

I plugged my query into the module and simulated the workflow, and the result for every column was <Column VariantType="0" />, as shown below.

<DataItems>
<DataItem type="System.OleDbData" time="2016-06-04T14:15:45.9305344-05:00" sourceHealthServiceId="D4E9691B-9F54-0F31-786C-BAF110FB769F">
<HRResult>0</HRResult>
<ResultLength>8</ResultLength>
<Result>Success</Result>
<InitializationTime>909</InitializationTime>
<OpenTime>0</OpenTime>
<ExecutionTime>20</ExecutionTime>
<FetchTime>0</FetchTime>
<RowLength>5</RowLength>
<Columns>
<Column VariantType="0" />
</Columns>
<Columns>
<Column VariantType="0" />
</Columns>
<Columns>
<Column VariantType="0" />
</Columns>
<Columns>
<Column VariantType="0" />
</Columns>
<Columns>
<Column VariantType="0" />
</Columns>
<OriginalDataLength>0</OriginalDataLength>
<ErrorDescriptionLength>0</ErrorDescriptionLength>
<ResultCode>0</ResultCode>
</DataItem>
</DataItems>

As we can see from the results, the query succeeded. One would assume that a successful query returns valid, populated columns. Wondering why the results were all empty, I expanded the SELECT list to include all columns, and what I found was quite interesting. Below is just the columns section. About half the columns were filled with data and carry their corresponding types; the other half came back with VariantType="0", which means the value is empty.

<Columns>
<Column VariantType="3">147</Column>
<Column VariantType="3">1</Column>
<Column VariantType="8">{A6B2A91A-BA8E-6D80-06BD-3F12BF28652D}</Column>
<Column VariantType="3">64</Column>
<Column VariantType="3">147</Column>
<Column VariantType="0" />
<Column VariantType="0" />
<Column VariantType="0" />
<Column VariantType="0" />
<Column VariantType="0" />
<Column VariantType="0" />
<Column VariantType="3">64</Column>
<Column VariantType="8">{EA99500D-8D52-FC52-B5A5-10DCD1E9D2BD}</Column>
<Column VariantType="3">8</Column>
<Column VariantType="8">Microsoft.Windows.Computer</Column>
<Column VariantType="8">Windows Computer</Column>
<Column VariantType="0" />
<Column VariantType="3">273</Column>
<Column VariantType="3">147</Column>
<Column VariantType="0" />
<Column VariantType="0" />
<Column VariantType="0" />
<Column VariantType="1" />
<Column VariantType="0" />
<Column VariantType="0" />
</Columns>
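To see at a glance how many columns came back empty, the output can be tallied by type. The sketch below is a minimal Python example, assuming the VariantType attribute carries the standard COM VARIANT type codes (0 = VT_EMPTY, 1 = VT_NULL, 3 = VT_I4, 8 = VT_BSTR). The function name summarize_columns and the shortened SAMPLE are illustrative only.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Subset of the COM VARIANT type codes that appear in the output above.
VARIANT_NAMES = {0: "VT_EMPTY", 1: "VT_NULL", 3: "VT_I4", 8: "VT_BSTR"}


def summarize_columns(columns_xml):
    """Count the <Column> elements of one <Columns> block by type name."""
    root = ET.fromstring(columns_xml)
    counts = Counter()
    for column in root.iter("Column"):
        code = int(column.get("VariantType"))
        # Codes outside the table are reported numerically, e.g. "VT_11".
        counts[VARIANT_NAMES.get(code, "VT_%d" % code)] += 1
    return dict(counts)


# Shortened, illustrative sample in the same shape as the output above.
SAMPLE = """<Columns>
<Column VariantType="3">147</Column>
<Column VariantType="8">Windows Computer</Column>
<Column VariantType="0" />
<Column VariantType="1" />
</Columns>"""

print(summarize_columns(SAMPLE))
```

Note that a code of 1 (VT_NULL) is a database NULL, which is different from the empty (0) columns this post is about.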

I thought there might be some sort of pattern to this, but I haven’t discovered it. I looked at the column properties in the database, and did not find anything resembling a consistent pattern that would indicate why some columns are filled during execution and some are not. I gave up on trying to answer the question of “why” some columns refuse to fill, and instead found a solution: casting in the SQL SELECT statement.

The Solution

The problem was solved by casting the columns that did not fill in the SELECT statement.

Here is the first query that returned empty columns.

SELECT me.Name
FROM vManagedEntity AS me
inner join vManagedEntityType AS met ON met.ManagedEntityTypeRowId = me.ManagedEntityTypeRowId
inner join vManagedEntityProperty AS mep ON mep.ManagedEntityRowId = me.ManagedEntityRowId
WHERE met.ManagedEntityTypeSystemName = 'Microsoft.Windows.Computer' AND
mep.ToDateTime IS NULL

And here is the modified query with CAST in the SELECT statement.

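-- 255 is an assumed length; a bare NVARCHAR in CAST defaults to 30 characters and would truncate longer names
SELECT CAST(me.Name AS NVARCHAR(255))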
FROM vManagedEntity AS me
inner join vManagedEntityType AS met ON met.ManagedEntityTypeRowId = me.ManagedEntityTypeRowId
inner join vManagedEntityProperty AS mep ON mep.ManagedEntityRowId = me.ManagedEntityRowId
WHERE met.ManagedEntityTypeSystemName = 'Microsoft.Windows.Computer' AND
mep.ToDateTime IS NULL

That’s it folks. Hope this helps if you find yourself in the same bind as me.

Suppressing Module Events (event id 21405 example)

I was recently faced with an uncommon scenario, where I needed a script-based discovery to exit without submitting data. Exiting a script-based discovery without submitting discovery data, even if it’s an empty data item, is not considered a good practice. But in this particular scenario, I understood the reasons and the resulting instance space was minimal, so I was open to finding a trick that would accomplish this without having 21405 events logged on the agent-managed computer at each interval if the instance did not pass the test.

What I discovered was the CommandExecuterEventPolicyType in the System.CommandExecuterSchema. This is a Schema Type defined in the System.Library.

All script-based probes have a default event policy that describes how exit codes, standard error, and standard out are handled by the system. If any of these data streams match an event policy expression, the system will log an event to the Operations Manager log on that agent-managed computer.

For example, the Microsoft.Windows.ScriptProbeDiscoveryBase has a default policy as follows:

<DefaultEventPolicy> 
<StdOutMatches Operator="DoesNotMatchRegularExpression">&lt;DataItem.+/DataItem\b*&gt;}|{&lt;DataItem.*/&gt;}</StdOutMatches>
<StdErrMatches>\a+</StdErrMatches>
<ExitCodeMatches>[^0]+</ExitCodeMatches>
</DefaultEventPolicy>
 
In my particular scenario, this default policy was capturing an exited discovery script as an error and writing the following event 21405 on those agents.
 
The process started at 2:04:14 PM failed to create System.Discovery.Data, no errors detected in the output
 
By overriding the default event policy, without having to change anything else in the module, those errors built into the product may be ignored by specifying event policy configuration as follows.
 
<EventPolicy> 
<StdOutMatches>^suppress_event_21405$</StdOutMatches>
<StdErrMatches>^suppress_event_21405$</StdErrMatches>
</EventPolicy>

In this example, the override has been documented by specifying an expression that describes the reason for overriding the policy. This expression will never match actual standard out or standard error, so it serves two purposes: it suppresses the event, and it records why the policy was overridden.
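The claim that the placeholder never matches real output is easy to sanity-check. This is a minimal Python sketch using Python's re module, which is an assumption: the regular expression dialect used by the Operations Manager agent may differ, although an anchored literal like this one should behave the same. The function name would_log_event is hypothetical, and the sketch models only the two streams in the override.

```python
import re

# The placeholder expression from the overriding event policy.
PLACEHOLDER = re.compile(r"^suppress_event_21405$")


def would_log_event(stdout, stderr):
    """True if either stream matches the placeholder expression."""
    return bool(PLACEHOLDER.search(stdout) or PLACEHOLDER.search(stderr))


# A discovery script that exits quietly produces no output at all.
print(would_log_event("", ""))
# Ordinary output does not match either.
print(would_log_event("<DataItem type=\"System.Discovery.Data\" />", ""))
```

Only a script that printed exactly suppress_event_21405 would trigger the policy, which is not something a discovery script would do by accident.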

Overriding a default event policy like this is very uncommon, and I do not recommend it unless you have a good understanding of how it may impact your workflow – but it is a nice trick to use if you find yourself in a unique situation that calls for it.

As a side note – this scenario probably could be handled in a more sophisticated way, by composing a new module that would filter the data stream before it even reaches a discovery module. This way would produce no errors, so overriding event policies would not be required. Food for thought!

Passing Boolean values to Powershell modules

We often need to pass a Boolean value to a Powershell provider; a perfect example is a “debug” flag.

In my experience, Powershell probe and write action modules always type cast the script input parameter as String, whether the configuration element is of type Boolean or String. This means the default value in your workflow could be true, false, 1, or 0, and it will always equal true if you are handling this input parameter as Boolean in the script.

Obviously, this will cause some unexpected behaviors, and most likely will be viewed as a bug in the management pack.

The way I work around this issue (at least until this is fixed in the product) is NOT typing the script parameter, and then converting the variable to Boolean just after the param block like this:

param($debugFlag)
$debugFlag = [System.Convert]::ToBoolean($debugFlag)

By handling a Boolean input parameter like this in the script, we can effectively use the Boolean type in the module configuration: the default value in the monitoring workflow can be true or false, the script handles the value appropriately, the operator sees the default value as it should appear in the console, and the operator is presented with a meaningful (True|False) selector in the override interface. Note that [System.Convert]::ToBoolean accepts only the strings true and false (in any letter case); a string such as 1 or 0 throws a format exception, so stick to true and false as workflow values.
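The general pitfall is not specific to Powershell. As an illustration only, here is a minimal Python sketch: Python's bool() treats any non-empty string as true, much as a plain [bool] cast of a string does in Powershell, so the text has to be parsed explicitly. The helper name to_bool is hypothetical and is not the author's code.

```python
def to_bool(text):
    """Parse 'true' or 'false' (any case, surrounding whitespace ignored)."""
    value = text.strip().lower()
    if value == "true":
        return True
    if value == "false":
        return False
    # Anything else, including "1" and "0", is rejected rather than guessed.
    raise ValueError("not a Boolean string: %r" % text)


# Truthiness of a non-empty string ignores what the string says.
print(bool("false"))
# Explicit parsing gives the intended value.
print(to_bool("false"))
```

Rejecting anything other than true and false keeps a mistyped override value from silently turning the flag on.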

UPDATE (09/02/2015) – This problem applies to Boolean, Integer, and Double values!

After writing this post, I wanted to get down to the bottom of the problem, so I poked around the Windows Library and what I found made it clear.

Taking a look at the implementation of any Powershell probe module, you will notice the Parameters element type is NamedParametersType, and this is defined in the Microsoft.Windows.PowerShellSchema. Notice that the Name and Value elements of each pair are of type ID and string, respectively. This is the reason any type of value will be cast as String in a Powershell module.

<SchemaType ID="Microsoft.Windows.PowerShellSchema" Accessibility="Public"> 
<xsd:complexType name="NamedParametersType">
<xsd:sequence>
<xsd:element name="Parameter" minOccurs="0" maxOccurs="unbounded">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="Name" type="xsd:ID" />
<xsd:element name="Value" type="xsd:string" />
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="SnapInsType">
<xsd:sequence>
<xsd:element name="SnapIn" minOccurs="0" maxOccurs="unbounded" type="xsd:string" />
</xsd:sequence>
</xsd:complexType>
<xsd:simpleType name="NonNullString">
<xsd:restriction base="xsd:string">
<xsd:minLength value="1" />
</xsd:restriction>
</xsd:simpleType>
</SchemaType>

It seems to me, the way to resolve this is to update the library with a choice for value type. This would allow the developer the option to type cast, and would remove the assumption that type casting is occurring when it is not. Since this schema exists in the Microsoft Windows Library, it would make the most sense for Microsoft to fix this.

Unable to add the domain to the subject

Quick break/fix post here, because I was unable to find the solution to this subject anywhere in the community blogs or in KB articles.

I ran into this error while attempting to perform a manual installation of the System Center 2012 SP1 Operations Manager agent on a Linux (Ubuntu 10.04) server.

Full context of this error is as follows:

jonathan@ubuntu-01:/etc/opt/microsoft/scx$ sudo dpkg -i /home/jonathan/scx-1.4.0-906.universald.1.x64.deb
(Reading database ... 44245 files and directories currently installed.)
Preparing to replace scx 1.4.0.906 (using .../scx-1.4.0-906.universald.1.x64.deb) ...
* Shutting down Microsoft SCX CIM Server: [fail]
invoke-rc.d: initscript scx-cimd, action "stop" failed.
Unpacking replacement scx ...
Setting up scx (1.4.0.906) ...
Checking existence of /lib64/libssl.so.0.9.8k and /lib64/libcrypto.so.0.9.8k ...
Checking existence of /lib64/libssl.so.0.9.8 and /lib64/libcrypto.so.0.9.8 ...
Found /lib64/libssl.so.0.9.8 and /lib64/libcrypto.so.0.9.8 ...
Generating certificate with hostname="ubuntu-01", domainname="scomskills.com."
WARNING! Could not read 256 bytes of random data from /dev/random. Will revert to less secure /dev/urandom. See the security guide for how to regenerate certificates at a later time when more random data might be available.
Error generating SSL certificate: 'Unable to add the domain to the subject.'
dpkg: error processing scx (--install):
 subprocess installed post-installation script returned error exit status 3
Processing triggers for ureadahead ...
Errors were encountered while processing:
 scx

Long story short, the problem was resolved by modifying the /etc/resolv.conf file. Specifically, I removed the trailing “dot” from the domain name, changing “scomskills.com.” to “scomskills.com”.
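I made the change by hand in an editor, but the same edit can be scripted. The following is a minimal Python sketch, assuming the stray dot sits on the domain and search lines of resolv.conf. The function name strip_trailing_dots and the sample file content (including the nameserver address) are made up for illustration.

```python
def strip_trailing_dots(resolv_conf_text):
    """Remove a trailing dot from each name on 'domain' and 'search' lines."""
    fixed_lines = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if fields and fields[0] in ("domain", "search"):
            # Keep a bare "." untouched instead of turning it into nothing.
            names = [name.rstrip(".") or name for name in fields[1:]]
            line = " ".join([fields[0]] + names)
        fixed_lines.append(line)
    return "\n".join(fixed_lines) + "\n"


# Hypothetical file content; the nameserver address is made up.
SAMPLE = "domain scomskills.com.\nsearch scomskills.com.\nnameserver 192.0.2.1\n"
print(strip_trailing_dots(SAMPLE), end="")
```

Lines other than domain and search are passed through unchanged, so the nameserver entries are not touched.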

Here is the full context after modifying that file, which resulted in a successful installation:

jonathan@ubuntu-01:/etc/opt/microsoft/scx/ssl$ sudo nano /etc/resolv.conf
jonathan@ubuntu-01:/etc/opt/microsoft/scx/ssl$ sudo rm /etc/opt/microsoft/scx//ssl/scx-key.pem
jonathan@ubuntu-01:/etc/opt/microsoft/scx/ssl$ sudo dpkg -i /home/jonathan/scx-1.4.0-906.universald.1.x64.deb
(Reading database ... 44245 files and directories currently installed.)
Preparing to replace scx 1.4.0.906 (using .../scx-1.4.0-906.universald.1.x64.deb) ...
* Shutting down Microsoft SCX CIM Server: [fail]
invoke-rc.d: initscript scx-cimd, action "stop" failed.
Unpacking replacement scx ...
Setting up scx (1.4.0.906) ...
Checking existence of /lib64/libssl.so.0.9.8k and /lib64/libcrypto.so.0.9.8k ...
Checking existence of /lib64/libssl.so.0.9.8 and /lib64/libcrypto.so.0.9.8 ...
Found /lib64/libssl.so.0.9.8 and /lib64/libcrypto.so.0.9.8 ...
Generating certificate with hostname="ubuntu-01", domainname="scomskills.com"
WARNING! Could not read 256 bytes of random data from /dev/random. Will revert to less secure /dev/urandom. See the security guide for how to regenerate certificates at a later time when more random data might be available.
* Starting Microsoft SCX CIM Server: [ OK ]
Processing triggers for ureadahead ...

Notice that I first removed the scx-key.pem file that was generated by the failed install, and then ran the installer package again. I don’t actually know if this was necessary, but I thought it best to clean it up. As you can see, this time the certificate was generated and the SCX CIM Server started successfully.

A little more background to the problem (if you’re interested):

I believe the sequence of events that led to this situation started when I connected the Linux server directly to an internet-accessible access point, with the Ubuntu box configured to receive its network settings via DHCP. For some reason, DHCP added an extra “dot” to the domain lines in the resolv.conf file, and this was apparently an invalid value for the certificate generation process.

Hope this helps someone out there in a similar situation.