An area where I have had little experience is server monitoring on Windows.  I've done a fair amount with Linux but never really had a need to do it on Windows.  I did some research and looked into Nagios and some commercial software solutions (all starting at around $20,000 with annual licensing fees) but I knew I wanted something smaller and easier to deal with - we're talking about a pair of servers that'll probably grow to a dozen or so over the next year.

I knew about Windows performance counters and I knew I could log them, so I went and had a look to see if anyone had written an Excel spreadsheet to make the pretty pictures everyone enjoys.  When I resurfaced I came back with Performance Analysis of Logs (PAL), which is an incredible piece of VBScript that saved me tons of time and/or money (depending entirely on which path a parallel me chose in an alternate reality).  Essentially, PAL crunches the numbers in your binary or text Performance Monitor logs, processes them with profiles defined in XML files, and produces a great HTML report with graphics, links to further explanation, and most importantly alerts.  PAL includes a GUI tool which you can use to point to a log file, choose a profile, answer some questions (four) about your server hardware, do some additional configuration, and generate a report.

PAL is only part of the solution.  If you're like me, you'll want to retain those log files and reports for comparison in the future so you can see whether your performance changes have the desired effect.  You'll need some place to store the logs (zip archiving them is best) and a sane way to manage the .htm reports PAL generates.  My suggestion (for the manual approach): use MSIE to save your PAL report in *.mht format.  Once it is saved in HTML Archive Format (.mht) you can delete the directory and *.htm file PAL creates, and you have your report in a single file.  If you use Firefox, that's okay: install IE Tab and direct all "file:///C:*" URLs to view in MSIE (you may also find this useful for pointing to your Outlook Web Access URL if your organization uses Exchange).  Finally, the *.mht files are viewable from within OS X: simply give them a *.eml extension and OS X's Mail.app will display them with no issues.  (It should also be noted that Firefox has its own web archive-like format you could use just as well, but in part 2 I'll talk about automating this process and include some source code that does this, and uses CDO to make *.mht files.)
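For the log-archiving half of that, here is a minimal sketch in Python.  This is only an illustration and not the code promised for part 2; the function name archive_log and the folder layout are my own placeholders.  It compresses a single Performance Monitor log into its own zip file so the original can be deleted once the report has been generated.

```python
import zipfile
from pathlib import Path


def archive_log(log_path, archive_dir):
    """Compress one log into archive_dir/<log name>.zip and return the zip path.

    Hypothetical helper: the names and layout are illustrative only.
    """
    log_path = Path(log_path)
    archive_dir = Path(archive_dir)
    # Create the archive folder on first use.
    archive_dir.mkdir(parents=True, exist_ok=True)
    zip_path = archive_dir / (log_path.name + ".zip")
    # ZIP_DEFLATED actually compresses; the default mode only stores the file.
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # arcname keeps just the file name, not the full directory path.
        zf.write(log_path, arcname=log_path.name)
    return zip_path
```

Calling archive_log with the path to a .blg file and an archive folder returns the path of the new zip.  The original log is left untouched, so whether to delete it afterwards is up to you.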

To get started with PAL you'll need a few things: PAL, Microsoft's Log Parser, and Office 2003 (or later) Web Components.  Fortunately all of these items are free.

PAL (the betas have been working for me using SQL Server 2005 and IIS profiles, and... NICE 2008 Hyper-V support in beta 8!)

Microsoft Log Parser 2.2

Microsoft Office 2003 Web Components (later versions should work as well)

So great, enough information to be dangerous now.  But what should you record?  I record the following performance counters on our servers:

  • Memory\*
  • Paging File\*
  • Physical Disk\*
  • Processor(*)\*
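As a rough sketch of how that list could be turned into an actual counter log, the Python below writes the counters to a file and builds a "logman create counter" command line.  The collector name ServerBaseline, the file paths, and the 15 second sample interval are placeholders I made up, not anything PAL requires.  The counter paths use Performance Monitor's own spelling (PhysicalDisk is one word, and Paging File and PhysicalDisk take an instance wildcard).  The script only prints the command; it does not run it.

```python
import subprocess
from pathlib import Path

# Counter paths as Performance Monitor spells them.
COUNTERS = [
    r"\Memory\*",
    r"\Paging File(*)\*",
    r"\PhysicalDisk(*)\*",
    r"\Processor(*)\*",
]


def write_counter_file(path):
    """Write one counter path per line, the layout logman's -cf switch reads."""
    path = Path(path)
    path.write_text("\n".join(COUNTERS) + "\n", encoding="ascii")
    return path


def build_create_command(name, counter_file, output, interval_seconds=15):
    """Return the argument list for 'logman create counter' (nothing is run)."""
    return [
        "logman", "create", "counter", name,
        "-cf", str(counter_file),      # read the counters from a file
        "-si", str(interval_seconds),  # sample interval in seconds
        "-f", "bin",                   # binary (.blg) log format
        "-o", str(output),             # where the log is written
    ]


if __name__ == "__main__":
    counter_file = write_counter_file("counters.txt")
    cmd = build_create_command(
        "ServerBaseline",               # hypothetical collector name
        counter_file,
        r"C:\PerfLogs\ServerBaseline",  # hypothetical output path
    )
    # Quote the arguments the way a Windows command line expects.
    print(subprocess.list2cmdline(cmd))
```

Keeping the counters in a file rather than on the command line makes it easy to use the same list on every server.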

PAL may not analyze every single counter, but it does hit all the big ones and pretty much covers anything that may be important to you, excluding custom performance counters you implement in your own code (more on that in part 3).  There are a number of other performance counters you can track depending on what purpose your server serves.  If it's an IIS web or application host you can track (and display) IIS performance counters; likewise for SQL Server, Exchange, or a host of other applications.  PAL comes with many *.xml profile files that contain information on where and how to display the performance data in the reports it generates.

In terms of overhead, I've read a lot of people claim there is little to no overhead, and that it's a greater risk to just not track performance (and set up alerts).  I agree with these statements, and so far I have not found performance tracking to adversely impact the performance of our production servers.

What about scheduling the logging to run nightly, so there is a fresh log for PAL each morning?  While you can use perfmon.msc to configure the performance logs and the counters they will track, you cannot use it to schedule recurring jobs.  Fortunately logman, a utility included in Windows, allows you to do just that.

logman update logname -b 07/14/2008 23:00:00 -e 07/15/2008 09:00:00 -r

This command will cause the performance log named "logname" to start recording at 11 PM server time on 7/14 and end recording at 9 AM server time on 7/15.  The important switch here, -r, makes the schedule recurring, so the following evening at 11 PM on 7/15 it will begin recording anew and end at 9 AM on 7/16.
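If you would rather not type those dates by hand, the begin and end stamps can be computed.  The sketch below is my own illustration in Python (the function names are made up): it works out an overnight window for a given date and formats it in the same form as the command above.  It assumes the server's regional settings accept dates as month/day/year, as they do in the example.

```python
from datetime import date, datetime, time, timedelta


def nightly_window(day, start=time(23, 0), end=time(9, 0)):
    """Return (begin, finish) datetimes for a window starting on `day`.

    If the end time is not later than the start time, the window is
    assumed to run past midnight and finish on the following day.
    """
    begin = datetime.combine(day, start)
    finish = datetime.combine(day, end)
    if finish <= begin:
        finish += timedelta(days=1)
    return begin, finish


def logman_schedule_command(log_name, begin, finish):
    """Build the 'logman update' line with the computed timestamps."""
    stamp = "%m/%d/%Y %H:%M:%S"  # same form as 07/14/2008 23:00:00
    return (f"logman update {log_name} "
            f"-b {begin.strftime(stamp)} -e {finish.strftime(stamp)} -r")


begin, finish = nightly_window(date(2008, 7, 14))
print(logman_schedule_command("logname", begin, finish))
# prints: logman update logname -b 07/14/2008 23:00:00 -e 07/15/2008 09:00:00 -r
```

Because -r repeats the window daily, this only needs to be run once per log, not every night.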