SCOM / WMI

Operations Manager Failed to Start a Process

I hate this alert. As a SCOM admin, your main job is to keep SCOM healthy and ensure it’s monitoring. One of the first things I check in the morning is the Operations Manager Active Alerts (sorted by Repeat Count). This morning I found a few machines with a high number of “Operations Manager Failed to Start a Process” alerts. Here’s how I went about resolving one of them.

1. On the client server, looked in the OpsMgr event log. Too much red/yellow!
2. Noticed a lot of event IDs 21403 and 21402. None of the scripts/rules/monitors were running properly!
3. Found an event that referenced a well-known script. These are scripts that remain in the Monitoring Host Temporary Files folder after being run, and they are often in the first folders in the list, e.g. SCOMpercentageCPUTimeCounter.vbs, MemoryUtilization.vbs, WMIFunctionalCheck.vbs, etc.
4. Manually ran the script using the working directory and parameters from the event log entry.
5. The first time it ran and quickly returned the property bag.
6. Ran it again, and this time it hung. (You can add “wscript.echo time” statements to the script to output the start and finish times if you want to know how long it ran.)
7. I have two choices now: override the script timeout, or find out what is making the script run long. In this case the script ran quickly the first time, so I want to investigate why it ran long the second time.
8. Browsed to the vbs and opened it. I noticed it was making a WMI call to gather data.
9. Opened wbemtest and manually ran the WMI query from the script…and it too hung.
10. Now I know WMI is problematic, even though I see no errors in the application or system event logs. WMI is a very common data source for SCOM and very prone to issues. This would explain why so many rules/monitors were failing.
11. I applied WMI hotfixes recommended by Kevin Holman (http://blogs.technet.com/b/kevinholman/archive/2009/01/27/which-hotfixes-should-i-apply.aspx)
12. Rebooted the server and the errors are gone!
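
Steps 4 to 6 (run the script by hand, then run it again and watch for a hang) can be automated. Below is a rough Python sketch of that idea, not part of the original fix: it runs a command several times with a timeout and prints how long each run took. The function names run_timed and repeat_timed are made up for illustration, and the command line in the comment is a placeholder; copy the real script path, working directory, and parameters from the event log entry.

```python
import subprocess
import time


def run_timed(cmd, timeout_s):
    """Run cmd once and return (elapsed_seconds, timed_out, stdout_text)."""
    start = time.monotonic()
    try:
        done = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
        return time.monotonic() - start, False, done.stdout
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        return time.monotonic() - start, True, ""


def repeat_timed(cmd, runs=3, timeout_s=60):
    """Run cmd several times; in the case above the hang only showed on the second run."""
    results = []
    for i in range(1, runs + 1):
        elapsed, timed_out, _ = run_timed(cmd, timeout_s)
        status = "TIMED OUT" if timed_out else "ok"
        print(f"run {i}: {elapsed:.1f}s {status}")
        results.append((elapsed, timed_out))
    return results


# Placeholder usage (assumption: run from the script's working directory,
# with the parameters from the event log entry appended to the list):
# repeat_timed(["cscript", "//nologo", "MemoryUtilization.vbs"], runs=3, timeout_s=60)
```

If the first run comes back in a second or two and a later run times out, that points at whatever the script calls (WMI, in this case) and not at the script timeout setting.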


3 thoughts on “Operations Manager Failed to Start a Process”

    • Oliver – I applied all WMI hotfixes available for my OS. I believe that included the one you listed. If that doesn’t work for you, I’d recommend searching to see if newer WMI hotfixes have been released that you can apply.

      • Thanks Nicole. I’m always a bit wary about installing Hotfixes where my Symptoms aren’t exactly as the Hotfix description, so was looking for the specific KB that likely fixed this. Seeing similar probs on a Win 2012 server. Will try and see!
