Move the Health Service State directory on the RMS

 

1. Ensure the destination volume is formatted with a 64 KB allocation unit size for best performance.
To partition a drive: http://technet.microsoft.com/en-us/library/cc722475.aspx

2. If the Health Service State directory will not be located at the root of the new volume, create the parent folder that will host it.
Note: The Health Service State folder itself is created automatically when the services are restarted, so it is not necessary to create that folder.

3. Stop all SCOM services (Health Service, Config Service and SDK Service).

4. Open REGEDIT and navigate to HKLM\System\CurrentControlSet\Services\HealthService\Parameters

5. Set the State Directory string value to the path of the new Health Service State directory.
Example: H:\Health Service State

6. Close the Registry Editor.

7. Start all SCOM services (Health Service, Config Service and SDK Service).

8. Open the new location that hosts the Health Service State folder and verify the contents have been created successfully.

9. Open the Operations Manager event log and verify there are no errors from the following sources: Health Service, Health Service Modules, and Health Service ESE Store.
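Steps 3 through 7 above can also be expressed as a short command sequence. The following is a minimal sketch that only builds the command lines as strings and does not run anything. The service short names (OMSDK, OMCFG, HealthService) are taken from the ManagementServerConfigTool output later in this post; the REG_SZ type and the stop/start order are assumptions, so check them against your own server before using the commands.

```python
# Sketch only: builds the Windows commands for steps 3-7 as strings; nothing is executed.
REG_KEY = r"HKLM\System\CurrentControlSet\Services\HealthService\Parameters"
# Service short names as they appear in the ManagementServerConfigTool output.
SERVICES = ["OMSDK", "OMCFG", "HealthService"]


def build_move_commands(new_state_dir):
    """Return the ordered command lines to repoint the Health Service State directory."""
    commands = [f"net stop {name}" for name in SERVICES]
    # Assumption: "State Directory" is a plain string value (REG_SZ).
    # /f overwrites the existing value without prompting.
    commands.append(
        f'reg add "{REG_KEY}" /v "State Directory" /t REG_SZ /d "{new_state_dir}" /f'
    )
    # Assumption: start in the reverse order, Health Service first.
    commands.extend(f"net start {name}" for name in reversed(SERVICES))
    return commands


if __name__ == "__main__":
    for line in build_move_commands(r"H:\Health Service State"):
        print(line)
```

Printing the commands first, rather than executing them, leaves room to review the registry path and the target directory before anything on the RMS is changed.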

Move RMS

 

· In my case, I had a clean install of Windows 2008 Enterprise. I installed the prerequisites, including PowerShell, and ran the installation wizard to install a new management server role. You’ll need:

o MSAA and SDK credentials

o SDK account is local administrator

o Current user is local administrator, member of SCOM Admins, and has SA on databases

· On the new MS, copy the ManagementServerConfigTool.exe tool from the SupportTools folder of the installation media to the SCOM program files directory.

· Back up the operational database.

· Note: it is not necessary to stop the SCOM services on the original RMS; the promotion process handles this. The online documentation states that the services must be stopped if the RMS is clustered, but I do not have a clustered RMS in my test environment.

· Open an administrative command prompt, navigate to the SCOM program files directory, and run the secure storage backup executable to restore the encryption key:
SecureStorageBackup.exe Restore EncryptionKeyFile

· While still in the administrative command prompt, run the following command to promote to RMS role:
ManagementServerConfigTool.exe PromoteRMS
Note: there are additional options to add to the command if you plan to decommission the original RMS (/DeleteExistingRMS)

· If you plan to decommission or repurpose the original RMS hardware, skip this step. If you plan to use the original RMS as a MS, then start an administrative command prompt on the original RMS and run the following:
ManagementServerConfigTool.exe UpdateDemotedRMS
Note: this step was not required during my exercise.

Then delete the Health Service State directory and restart the Health Service. Ensure the SDK and Config services are configured as disabled.

· Log on to the Report Server role, navigate to the “…Reporting Services\ReportServer” directory, and open the rsreportserver.config file in Notepad. Locate the two instances of <ServerName>ServerName</ServerName>, and update these to the new RMS computer name.

· While still logged on to the Report Server role, open Registry Editor and locate the following key:
HKEY_LOCAL_MACHINE\Software\Microsoft\Microsoft Operations Manager\3.0\Reporting
Update the DefaultSDKServiceMachine value to the name of the new RMS.
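The rsreportserver.config edit described above is a plain text substitution, which can be sketched as follows. This is only an illustration: the function name is hypothetical, the sample XML is a stand-in for the real file layout, and the server names are the ones from my lab. Work on a backup copy of the file.

```python
import re

# Matches <ServerName>...</ServerName> and captures the tags so only the value changes.
SERVER_NAME = re.compile(r"(<ServerName>)[^<]*(</ServerName>)")


def update_server_name(config_text, new_rms):
    """Return (updated_text, replacements_made) for rsreportserver.config content."""
    return SERVER_NAME.subn(lambda m: m.group(1) + new_rms + m.group(2), config_text)


if __name__ == "__main__":
    # Hypothetical, heavily shortened stand-in for the real configuration file.
    sample = (
        "<Configuration>\n"
        "  <ServerName>MG1-RMS</ServerName>\n"
        "  <ServerName>MG1-RMS</ServerName>\n"
        "</Configuration>\n"
    )
    updated, count = update_server_name(sample, "SRVRMS01")
    print(count)
    print(updated)
```

Returning the replacement count makes it easy to confirm that exactly two instances were updated, which is the number the step above expects.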

c:\Program Files\System Center Operations Manager 2007>SecureStorageBackup.exe restore c:\RMS-Enc-Key-01-20-2011.bin

Please enter the password to use for storage/retrieval

Key successfully restored from c:\RMS-Enc-Key-01-20-2011.bin

c:\Program Files\System Center Operations Manager 2007>ManagementServerConfigTool.exe PromoteRMS

Running this tool can cause irreversible damage to your Operations Manager DB. Please backup your DB before continuing. Continue the PromoteRMS action? (Y/N):y

Collecting discovered data, please be patient…

If you are working with a clustered RMS, make sure the cluster service resources are offline before proceeding. Running this tool could cause irreversible damage if cluster resources or cluster node services are brought back online without following instructions in the OpsGuide. Continue the PromoteRMS action? (Y/N):y

Starting OMSDK service on SRVRMS01..started

Updating local settings on SRVRMS01 for promotion.

Stopping HealthService service on SRVRMS01..stopped.

Warning: the OMSDK service on SRVRMS01 was set to Auto

Stopping OMSDK service on MG1-RMS.opsmgrlab.com..stopped.

Stopping OMCFG service on MG1-RMS.opsmgrlab.com…stopped.

Stopping HealthService service on MG1-RMS.opsmgrlab.com..stopped.

Updating class structure to reflect changes:

Promoting SRVRMS01.opsmgrlab.com

Demoting MG1-RMS.opsmgrlab.com

Please be patient this may take some time.

Adjusting DW old RMS: MG1-RMS.opsmgrlab.com new RMS: SRVRMS01.opsmgrlab.com

Starting HealthService service on SRVRMS01………………started

Starting OMCFG service on SRVRMS01…started

Updating local settings on MG1-RMS.opsmgrlab.com for demotion.

Warning: MG1-RMS.opsmgrlab.com already has a default sdk service

Starting HealthService service on MG1-RMS.opsmgrlab.com………………..started

Warning: the OMCFG service on MG1-RMS.opsmgrlab.com was set to Auto

Warning: the OMSDK service on MG1-RMS.opsmgrlab.com was set to Auto

Updating parent health service on MG1-MS-01.opsmgrlab.com.

Stopping HealthService service on MG1-MS-01.opsmgrlab.com….stopped.

Failed to update secondary MS with the following error:

Could not find a part of the path ‘Microsoft.Windows.InternetInformationServices.CommonLibrary.{EAF90020-2BB1-E014-CC68-8FD82CF9CB08}.{BF5BA96B-70B5-969D-D96E-B8EDC37715F2}.xml’.

Please run the following command locally on MG1-MS-01.opsmgrlab.com

ManagementServerConfigTool.exe UpdateSecondaryMS

Verifying updated discovered data, please be patient…

PromoteRMS performed successfully

c:\Program Files\System Center Operations Manager 2007>

c:\Program Files\System Center Operations Manager 2007>ManagementServerConfigTool.exe UpdateSecondaryMS

Running this tool can cause irreversible damage to your Operations Manager DB. Please backup your DB before continuing. Continue the UpdateSecondaryMS action? (Y/N):y

Collecting discovered data, please be patient…

Updating parent health service on MG1-MS-01.

Starting HealthService service on MG1-MS-01……………………started

Verifying updated discovered data, please be patient…

UpdateSecondaryMS performed successfully

c:\Program Files\System Center Operations Manager 2007>
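Note that the PromoteRMS run above ends with "performed successfully" even though updating one secondary MS failed part-way through, so it is worth scanning the whole output and not just the last line. The sketch below shows one way to do that on captured output. The function name and the returned dictionary layout are hypothetical; the matched phrases are the ones that appear in the transcripts above, and other versions of the tool may word them differently.

```python
import re

# Phrase printed when the tool asks for a manual follow-up on another server.
FOLLOW_UP = re.compile(r"Please run the following command locally on (\S+)")


def summarize_tool_output(output):
    """Collect warnings, failures and follow-up hosts from captured tool output."""
    summary = {"warnings": [], "failures": [], "follow_up_hosts": [], "succeeded": False}
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("Warning:"):
            summary["warnings"].append(line[len("Warning:"):].strip())
        elif line.startswith("Failed to"):
            summary["failures"].append(line)
        elif "performed successfully" in line:
            summary["succeeded"] = True
        match = FOLLOW_UP.search(line)
        if match:
            summary["follow_up_hosts"].append(match.group(1))
    return summary


if __name__ == "__main__":
    captured = (
        "Warning: the OMSDK service on SRVRMS01 was set to Auto\n"
        "Failed to update secondary MS with the following error:\n"
        "Please run the following command locally on MG1-MS-01.opsmgrlab.com\n"
        "PromoteRMS performed successfully\n"
    )
    print(summarize_tool_output(captured))
```

Any host listed under follow_up_hosts is one where ManagementServerConfigTool.exe UpdateSecondaryMS still needs to be run locally, as was done for MG1-MS-01 above.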

Steps outlined in the online documentation (http://technet.microsoft.com/en-us/library/cc540401.aspx) that I did not need to perform:

· Set the broker service to 1 after the promotion. I didn’t have to do anything here because the broker service was still enabled.

· Update the web console server Web.Config file. I didn’t have one to update in my lab, so no output to share on that step.

· Currently, the online documentation states to update the data warehouse security (user mapping) for the SDK account. This is incorrect and will probably be updated by the time you read this. The step should be qualified: the SDK user mappings need to be updated only if you are running the SDK service under the context of Local System (which is an uncommon configuration). Additionally, it’s the OperationsManager database that needs to be updated, not the OperationsManagerDW database.

· Run UpdateDemotedRMS on the original RMS.

Connecting to console

clip_image002

clip_image004

Agent connections

I happened to have all my agents configured to report to the RMS prior to the promotion process just to see how the failover process would be handled. It looks like all agents automatically failed over to the promoted RMS with no problems.


Alerts during the move

After the move I took a look at any alerts that were raised during the move. Two alerts were raised and closed, but no others were raised during the move. I suspect the HS HB failure from the MS was a side-effect of it losing the parent HS relationship with the MG. This was resolved, though, after running the ManagementServerConfigTool on the secondary MS.


Management Server health

Everything looks great, with one exception. According to the online documentation, I am led to believe that the original RMS should no longer be in service, but it is apparently still serving as a MS role (MG1-RMS).


Just for fun, I decided to run the ManagementServerConfigTool UpdateDemotedRMS command on the original RMS, and the output states that this is already marked as a secondary management server.

c:\Program Files\System Center Operations Manager 2007>ManagementServerConfigTool.exe UpdateDemotedRMS

Running this tool can cause irreversible damage to your Operations Manager DB. Please backup your DB before continuing. Continue the UpdateDemotedRMS action? (Y/N):y

Collecting discovered data, please be patient…

Updating local settings on MG1-RMS for demotion.

Stopping HealthService service on MG1-RMS..stopped.

Warning: MG1-RMS is already marked as a secondary RMS!

Warning: MG1-RMS already has a parent health service

Warning: MG1-RMS already has a default sdk service

Starting HealthService service on MG1-RMS…………………started

Warning: the OMCFG service on MG1-RMS was set to Disabled

Warning: the OMSDK service on MG1-RMS was set to Disabled

Verifying updated discovered data, please be patient…

UpdateDemotedRMS performed successfully

c:\Program Files\System Center Operations Manager 2007>

So it seems that if you want to repurpose or decommission the original RMS, you may need to uninstall the SCOM MS role before wiping the machine, otherwise it will not be removed as a parent health service.

I went ahead and uninstalled the SCOM program files from the original RMS and turned off the computer. Now it shows as a grey MS.


I deleted the original RMS from the administration space, and it’s gone.


After performing these steps to manually remove the old RMS, I took another look at the command line options for promoting to an RMS and noticed there is a /DeleteExistingRMS:<true|false> switch. When running PromoteRMS this option defaults to false. If I had explicitly specified true for this option, the promotion process would have also removed the original RMS.

ManagementServerConfigTool.exe <Action> /vs:<Virtual Server NetBios Name>

[/node:<Virtual Server Node> /Disk:<Virtual Server Disk Resource>

/DeleteExistingRMS:<true|false>]
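For a non-clustered RMS like the one in this lab, the switch would simply be appended to the PromoteRMS action. The helper below is a small sketch that assembles that command line; the function name is hypothetical, and this variant was not run during the exercise described here, so treat the result as something to verify against the tool's own usage text.

```python
def promote_rms_command(delete_existing_rms=False):
    """Return the PromoteRMS command line; the tool itself defaults the switch to false."""
    command = ["ManagementServerConfigTool.exe", "PromoteRMS"]
    if delete_existing_rms:
        # Asks the promotion process to also remove the original RMS.
        command.append("/DeleteExistingRMS:true")
    return " ".join(command)


if __name__ == "__main__":
    print(promote_rms_command())
    print(promote_rms_command(delete_existing_rms=True))
```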

After removing the original RMS, I received two alerts: one for HS HB failure, which I’d expect, and another referring to AD Integration problems.


A container for the management group MG1 either does not exist in domain opsmgrlab.com or the Run As Account associated with the AD based agent assignment rule does not have access to the container. Please run MomADAdmin for this Management Group before configuring assignment rules and make sure the associated Run As Account is the member of the Operations Manager Administrator role
Workflow name: CleanerOf__OPSMGRLAB_MG1_MS_01_opsmgrlab.com
Instance name: SRVRMS01.opsmgrlab.com
Instance ID: {5BF207CA-00AF-2E5F-3236-CF9CDAED0690}
Management group: MG1


What this means is I need to add the new RMS computer account to the Operations Manager container in AD in order for the AD Int rules to work again.

If we take a look at the effective permissions on the Operations Manager container for the original RMS, we see it has List contents, Read all properties, Delete subtree and Read permissions.


On this object and all descendant objects.


I deleted the original RMS computer account and added the new RMS computer account, mirroring these permissions on the Operations Manager container.

Note: this is only on the MG container within the Operations Manager container that corresponds to the MG where you’ve performed the RMS promotion. If you have multiple MG’s using AD Int, do not update those containers.


After restarting the Health Service on the RMS, the AD Integration rule succeeded. I verified this by deleting all AD Integration rules for one of the MS’s, and these objects were removed from the Operations Manager container within a couple minutes.

SPN cleanup

I had manually added the SPN’s for the SDK account in my lab, so I needed to make sure the SPN’s that reference the original RMS were removed. I opened ADSIEdit.msc on a DC, navigated to the SDK account and removed the highlighted SPN’s.
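The same cleanup can be scripted around setspn. The sketch below takes a list of SPNs already registered on the SDK account and returns the setspn -D commands for the ones that point at the old RMS; it does not run them. The MSOMSdkSvc service class and the account name in the example are assumptions for illustration, so substitute whatever setspn -L shows for your SDK account.

```python
def stale_spn_commands(spns, old_host, account):
    """Return setspn -D commands for SPNs whose host part names the old RMS."""
    old_short = old_host.split(".")[0].lower()
    commands = []
    for spn in spns:
        _service, _, host = spn.partition("/")
        # Compare only the short host name so NetBIOS and FQDN forms both match.
        if host.split(":")[0].split(".")[0].lower() == old_short:
            commands.append(f"setspn -D {spn} {account}")
    return commands


if __name__ == "__main__":
    registered = [
        "MSOMSdkSvc/MG1-RMS",
        "MSOMSdkSvc/MG1-RMS.opsmgrlab.com",
        "MSOMSdkSvc/SRVRMS01",
        "MSOMSdkSvc/SRVRMS01.opsmgrlab.com",
    ]
    # The account name below is hypothetical.
    for line in stale_spn_commands(registered, "MG1-RMS.opsmgrlab.com", r"OPSMGRLAB\sdkaccount"):
        print(line)
```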


I think that about does it. The RMS role has been moved, the original RMS has been decommissioned, additional cleanup tasks were completed and the MG is functioning. Somewhat of a rough ride, but this was a good exercise.

If SCOM is a mission critical application at your company, I suggest going through this exercise on a regular basis. I remember in previous systems engineer roles having to go through these types of DR procedures about once every year. It always sidetracked whatever project I was working on, but every year we improved recovery time.

SCOM Verbose Tracing

Capture verbose tracing for an Agent or Management Server

This section describes how to capture the trace on the Agent or Management Server.

Prepare for tracing

  • Log on to the Agent computer or Management Server
  • Open a command prompt and navigate to the Operations Manager “tools” directory.
    • Usually located in “%ProgramFiles%\System Center Operations Manager 2007\Tools”
  • Enter the following command to stop the current trace.
    • StopTracing.cmd

  • In Windows Explorer, navigate to “%windir%\Temp\OpsMgrTrace”, and delete all files in this directory. These files are the current and previous trace data, which can be discarded now.

Start tracing

  • Switch back to the command prompt and enter the following command to start a new verbose trace.
    • StartTracing.cmd VER
      Note: “VER” must be capitalized as shown above.

· Keep this command prompt open, as you’ll need to switch back to it in order to stop and format the trace data later.

Wait for issue to reproduce

It is important to wait for the issue to reproduce, and stop tracing immediately after it does. Monitor the problem closely, so that tracing can be stopped immediately after the problem occurs.

Stop tracing

  • Switch back to the Agent or Management Server and enter the following commands in the command prompt we had left open.
    • StopTracing.cmd
    • FormatTracing.cmd

· After formatting completes, Windows Explorer should open to the %WINDIR%\Temp\OpsMgrTrace directory, displaying the files containing the trace data.

  • Select all files and send to compressed folder, as shown below.

· Send the compressed file to your Microsoft support engineer for analysis.
Note: if the compressed file is more than 10MB, ask your support engineer to create an FTP workspace to upload the files. Otherwise, it is usually okay to send via email.
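The last two steps (compress the trace files, then pick email or FTP based on size) can be sketched as below. The function names are hypothetical, and treating 10MB as 10 * 1024 * 1024 bytes is an assumption. Write the archive to a folder other than the trace directory so that the archive does not end up including itself.

```python
import os
import zipfile

# Assumption: "10MB" in the note is read as 10 * 1024 * 1024 bytes.
SIZE_LIMIT_BYTES = 10 * 1024 * 1024


def compress_traces(trace_dir, zip_path):
    """Zip every file directly inside trace_dir and return the archive size in bytes."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in sorted(os.listdir(trace_dir)):
            full_path = os.path.join(trace_dir, name)
            if os.path.isfile(full_path):
                archive.write(full_path, arcname=name)
    return os.path.getsize(zip_path)


def delivery_method(size_bytes):
    """Suggest email for small archives and an FTP workspace for larger ones."""
    return "ftp" if size_bytes > SIZE_LIMIT_BYTES else "email"
```

For example, calling compress_traces on the OpsMgrTrace directory with a zip path on the desktop, and then passing the returned size to delivery_method, gives the same decision the note above describes.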

Capture verbose tracing for an Operations Manager Console session

This section describes how to capture a verbose trace for a particular Operations Manager Console session.

Prepare for tracing

  • Log on to the computer on which you’d like to capture a trace of the Console session.
  • Open a command prompt and navigate to the Operations Manager “tools” directory.
    • Usually located in “%ProgramFiles%\System Center Operations Manager 2007\Tools”

Note: Administrator rights may be required to run tracing on client computers, because the trace writes to the Windows directory, which is usually a protected location. If there are issues with capturing the trace, ensure that the command prompt is launched under the Administrator account.

  • Enter the following command to stop the current trace.
    • StopTracing.cmd

  • In Windows Explorer, navigate to “%windir%\Temp\OpsMgrTrace”, and delete all files in this directory. These files are the current and previous trace data, which can be discarded now.

Start tracing

  • Switch back to the command prompt and enter the following command to start a new verbose trace.
    • StartTracing.cmd VER
      Note: “VER” must be capitalized as shown above.

· Keep this command prompt open, as you’ll need to switch back to it in order to stop and format the trace data later.

Wait for issue to reproduce

It is important to wait for the issue to reproduce, and stop tracing immediately after it does. Monitor the problem closely, so that tracing can be stopped immediately after the problem occurs.

Stop tracing

  • Switch back to the computer hosting the console session and enter the following commands in the command prompt we had left open.
    • StopTracing.cmd
    • FormatTracing.cmd

· After formatting completes, Windows Explorer should open to the %WINDIR%\Temp\OpsMgrTrace directory, displaying the files containing the trace data.

  • Select all files and send to compressed folder, as shown below.

· Send the compressed file to your Microsoft support engineer for analysis.
Note: if the compressed file is more than 10MB, ask your support engineer to create an FTP workspace to upload the files. Otherwise, it is usually okay to send via email.

Group health rollup–increase RMS performance

Disable computer group health rollup across the board
Override Target | Context | Parameter | Default Value | Override Value | Scope | Management Pack Object Type | Enforced | Target | Override Management Pack | Target Management Pack | Override Target Management Pack
Computer Security Health Rollup | Computer Group | Enabled | TRUE | FALSE | Class | Monitor | *FALSE | Computer Group | System Center Core Monitoring (overrides) | System Center Core Library | System Center Core Library
Computer Performance Health Rollup | Computer Group | Enabled | TRUE | FALSE | Class | Monitor | *FALSE | Computer Group | System Center Core Monitoring (overrides) | System Center Core Library | System Center Core Library
Computer Configuration Health Rollup | Computer Group | Enabled | TRUE | FALSE | Class | Monitor | *FALSE | Computer Group | System Center Core Monitoring (overrides) | System Center Core Library | System Center Core Library
Computer Availability Health Rollup | Computer Group | Enabled | TRUE | FALSE | Class | Monitor | *FALSE | Computer Group | System Center Core Monitoring (overrides) | System Center Core Library | System Center Core Library
*You may set the enforced parameter to TRUE to force this setting in case of override conflict (not recommended).
If you disabled across the board, but want to enable only for a specific group
Override Target | Context | Parameter | Default Value | Override Value | Scope | Management Pack Object Type | Enforced | Target | Override Management Pack | Target Management Pack | Override Target Management Pack
Computer Security Health Rollup | Windows Server 2008 Computer Group | Enabled | TRUE | TRUE | Class | Monitor | *FALSE | Windows Server 2008 Computer Group | System Center Core Monitoring (overrides) | Windows Server 2008 Operating System (Discovery) | System Center Core Library
Computer Performance Health Rollup | Windows Server 2008 Computer Group | Enabled | TRUE | TRUE | Class | Monitor | *FALSE | Windows Server 2008 Computer Group | System Center Core Monitoring (overrides) | Windows Server 2008 Operating System (Discovery) | System Center Core Library
Computer Configuration Health Rollup | Windows Server 2008 Computer Group | Enabled | TRUE | TRUE | Class | Monitor | *FALSE | Windows Server 2008 Computer Group | System Center Core Monitoring (overrides) | Windows Server 2008 Operating System (Discovery) | System Center Core Library
Computer Availability Health Rollup | Windows Server 2008 Computer Group | Enabled | TRUE | TRUE | Class | Monitor | *FALSE | Windows Server 2008 Computer Group | System Center Core Monitoring (overrides) | Windows Server 2008 Operating System (Discovery) | System Center Core Library
*Enforced parameter is not necessary to resolve override conflicts in this case.
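The reasoning behind the two footnotes can be illustrated with a small model: an override on a more specific class wins over one on a more general class, and Enforced is only needed to force a setting when that ordering would otherwise pick the other value. The sketch below is a simplified illustration, not Operations Manager's actual resolution logic; the class ancestry, function names, and override dictionaries are all assumptions made for the example.

```python
# Simplified model for illustration; not Operations Manager's actual resolution code.
# Assumed ancestry: the specific group class derives from the generic Computer Group class.
PARENT = {"Windows Server 2008 Computer Group": "Computer Group", "Computer Group": None}


def depth(class_name):
    """Number of ancestors; a larger value means a more specific class."""
    count = 0
    while PARENT.get(class_name) is not None:
        class_name = PARENT[class_name]
        count += 1
    return count


def applies_to(override_class, target_class):
    """True if target_class is override_class or derives from it."""
    while target_class is not None:
        if target_class == override_class:
            return True
        target_class = PARENT.get(target_class)
    return False


def effective_enabled(target_class, overrides, default=True):
    """Pick the winning 'Enabled' value: enforced first, then most specific class."""
    candidates = [o for o in overrides if applies_to(o["context"], target_class)]
    if not candidates:
        return default
    winner = max(candidates, key=lambda o: (o["enforced"], depth(o["context"])))
    return winner["value"]


if __name__ == "__main__":
    overrides = [
        {"context": "Computer Group", "value": False, "enforced": False},
        {"context": "Windows Server 2008 Computer Group", "value": True, "enforced": False},
    ]
    print(effective_enabled("Windows Server 2008 Computer Group", overrides))
    print(effective_enabled("Computer Group", overrides))
```

In this model the specific group stays enabled while every other computer group is disabled, without either override being enforced. Marking the across-the-board override as enforced would flip the specific group to disabled as well.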
If you do not want to disable across the board, but only for specific group
Override Target | Context | Parameter | Default Value | Override Value | Scope | Management Pack Object Type | Enforced | Target | Override Management Pack | Target Management Pack | Override Target Management Pack
Computer Security Health Rollup | Windows Server 2008 Computer Group | Enabled | TRUE | FALSE | Class | Monitor | FALSE | Windows Server 2008 Computer Group | System Center Core Monitoring (overrides) | Windows Server 2008 Operating System (Discovery) | System Center Core Library
Computer Performance Health Rollup | Windows Server 2008 Computer Group | Enabled | TRUE | FALSE | Class | Monitor | FALSE | Windows Server 2008 Computer Group | System Center Core Monitoring (overrides) | Windows Server 2008 Operating System (Discovery) | System Center Core Library
Computer Configuration Health Rollup | Windows Server 2008 Computer Group | Enabled | TRUE | FALSE | Class | Monitor | FALSE | Windows Server 2008 Computer Group | System Center Core Monitoring (overrides) | Windows Server 2008 Operating System (Discovery) | System Center Core Library
Computer Availability Health Rollup | Windows Server 2008 Computer Group | Enabled | TRUE | FALSE | Class | Monitor | FALSE | Windows Server 2008 Computer Group | System Center Core Monitoring (overrides) | Windows Server 2008 Operating System (Discovery) | System Center Core Library