Archive for August, 2008

Windows Server Performance Monitoring and PAL (Part 1)

An area where I have had little experience is server monitoring on Windows.  I've done a fair amount with Linux but never really had a need to do it on Windows.  I did some research and looked into Nagios and some commercial software solutions (all starting at around $20,000 with annual licensing fees) but I knew I wanted something smaller and easier to deal with - we're talking about a pair of servers that'll probably grow to a dozen or so over the next year.

I knew about Windows performance counters and I knew I could log them so I went and had a look to see if anyone had written an Excel spreadsheet to make the pretty pictures everyone enjoys.  When I resurfaced I came back with Performance Analysis of Logs (PAL), which is an incredible piece of VBScript that saved me tons of time and/or money (depending entirely on which path a parallel me chose in an alternate reality).  Essentially PAL crunches the numbers in your binary or text Performance Monitor logs, processes them with profiles defined in XML files, and produces a great HTML report with graphics, links to further explanation, and most importantly alerts.  PAL includes a GUI tool which you can use to point to a log file, choose a profile, answer some questions (four) about your server hardware, do some additional configuration, and generate a report.

PAL is only part of the solution.  If you're like me, you'll want to retain those log files and reports for performance comparison in the future so you can see if your performance changes have the desired effect.  You'll need some place to store the logs (zip archiving them is best) and a sane way to manage the *.htm reports it generates.  My suggestion (for the manual approach): use MSIE to save your PAL report in *.mht format.  Once saved in HTML Archive Format (*.mht) you can delete the directory and *.htm file PAL creates and you have your report in a single file.  If you use Firefox, it's okay; install IE Tab and direct all "file:///C:\*" to view in MSIE (you may also find this useful for pointing to your Outlook Web Access URL if your organization uses Exchange).  Finally, the *.mht files are viewable from within OS X: simply give them a *.eml extension and OS X's Mail.app will display them with no issues.  (It should also be noted that Firefox has its own web archive-like format you could use just as well, but in part 2 I'll talk about automating this process and include some source code that does this, and uses CDO to make *.mht files.)
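
As a rough sketch of the "zip archive the logs" step, the Python below bundles every Performance Monitor log in a folder into one zip file.  The function name, the folder layout, and the *.blg / *.csv patterns are assumptions for illustration; adjust them to wherever your logs actually land.

```python
import zipfile
from pathlib import Path


def archive_logs(log_dir, archive_path, patterns=("*.blg", "*.csv")):
    """Zip every matching log in log_dir into archive_path.

    Returns the file names that were archived, in the order they were added.
    The default patterns (binary *.blg and text *.csv logs) are an assumption.
    """
    log_dir = Path(log_dir)
    archived = []
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for pattern in patterns:
            for log_file in sorted(log_dir.glob(pattern)):
                # Store only the file name so the archive has no absolute paths.
                zf.write(log_file, arcname=log_file.name)
                archived.append(log_file.name)
    return archived


# Hypothetical usage:
#   archive_logs(r"C:\PerfLogs", r"C:\PerfLogs\archive\2008-07-15.zip")
```

The originals are left in place; delete them yourself once you have checked the archive opens.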

To get started with PAL you'll need a few things: PAL, Microsoft's Log Parser, and Office 2003 (or later) Web Components.  Fortunately all of these items are free.

PAL (the betas have been working for me using SQL Server 2005 and IIS profiles, and... NICE 2008 Hyper-V support in beta 8!)

Microsoft Log Parser 2.2

Microsoft Office 2003 Web Components (later versions should work as well)

So great, enough information to be dangerous now.  But what should you record?  I record the following performance counters on our servers:

  • Memory\*
  • Paging File\*
  • PhysicalDisk\*
  • Processor(*)\*

PAL may not analyze every single counter but it does hit all the big ones and pretty much covers anything that may be important to you, excluding custom performance counters you implement in your own code (more on that in part 3).  There are a number of other performance counters you can track depending on what purpose your server serves.  If it's an IIS web or application host you can track (and display) IIS performance counters; likewise for SQL Server, Exchange, or a host of other applications.  PAL comes with many *.xml profile files that contain information on where and how to display the performance data in the reports it generates.
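
If you keep the counter list in a plain text file, one path per line, it is easy to reuse on every server.  The sketch below writes such a file.  The exact path spellings (the leading backslash and the "(*)" instance wildcards) are my assumption of how the four counter sets listed earlier are written in full, so verify them in perfmon before relying on them.

```python
from pathlib import Path

# Counter paths in the "\Object(instance)\Counter" form.
# ASSUMPTION: these spellings correspond to the four counter sets listed above.
COUNTERS = [
    r"\Memory\*",
    r"\Paging File(*)\*",
    r"\PhysicalDisk(*)\*",
    r"\Processor(*)\*",
]


def write_counter_file(path, counters=COUNTERS):
    """Write one counter path per line and return the path written."""
    Path(path).write_text("\n".join(counters) + "\n", encoding="ascii")
    return path


# Hypothetical usage:
#   write_counter_file(r"C:\PerfLogs\counters.txt")
```

As far as I know, logman's create counter verb can read a file like this through its -cf switch, but check `logman create counter /?` on your version first.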

In terms of overhead I've read a lot of people claim there is little to no overhead, and that it's a greater risk to just not track performance (and set up alerts).  I agree with these statements and I have so far not found performance tracking to adversely impact the performance of our production servers.

What about scheduling PAL to run nightly?  While you can use perfmon.msc to configure the performance logs and the counters they will track, you cannot schedule recurring jobs.  Fortunately logman, a utility included in Windows, allows you to do just that.

logman update logname -b 07/14/2008 23:00:00 -e 07/15/2008 09:00:00 -r

This command will cause the performance log named "logname" to start recording at 11 PM server time on 7/14 and end recording at 9 AM server time on 7/15.  The important switch here, -r, will cause the schedule to be recurring, so the following evening at 11 PM on 7/15 it will begin recording anew and end at 9 AM on 7/16.
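
Typing those dates by hand gets old quickly, so here is a minimal sketch that builds the same command from a start time and a duration.  The function name and parameters are made up for illustration; the -b, -e, and -r switches and the date layout are the ones used in the command above.

```python
from datetime import datetime, timedelta


def logman_schedule_args(log_name, begin, hours, recurring=True):
    """Build the argument list for a `logman update` begin/end window."""
    end = begin + timedelta(hours=hours)
    date_fmt, time_fmt = "%m/%d/%Y", "%H:%M:%S"  # e.g. 07/14/2008 and 23:00:00
    args = [
        "logman", "update", log_name,
        "-b", begin.strftime(date_fmt), begin.strftime(time_fmt),
        "-e", end.strftime(date_fmt), end.strftime(time_fmt),
    ]
    if recurring:
        args.append("-r")  # repeat the same window every day
    return args


# Rebuild the command shown above: start at 11 PM and record for 10 hours.
print(" ".join(logman_schedule_args("logname", datetime(2008, 7, 14, 23, 0, 0), 10)))
```

The list form can be handed straight to subprocess.run on the server if you would rather execute it than print it.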


SharePoint, WebDAV, and http (not https)

Our company loves to co-locate services.  Source control, production servers, HR, you name it, if we can get it out of the building, we do it.  So it comes as no surprise that our document management/repository exists out on the internet hosted by a service provider using Microsoft SharePoint.

A very irksome issue with SharePoint and Vista is that for many people SharePoint shares aren't accessible as mappable network drives or through the common dialog for opening/saving files in Office 2007.  People upgrading from Windows XP find they suddenly just cannot access their previously mapped and usable SharePoint shares.  This problem occurs when you use Vista and attempt to map to a share across http because Vista, coming configured for a much more stringent security model, doesn't do WebDAV over http, only https.  If you are having issues with a Windows 2008 server connecting to a SharePoint (or WebDAV) share over http I would imagine the solution below will work for you as well.  (And yes, SharePoint or WebDAV over http is probably a silly idea in the first place... obviously if it's within your power to https-ify the share that's likely the best solution - even if it's via a self-signed cert).  (Update: Windows Server 2003 instructions appear below.)

(I ran across a lot of people having this problem, some solutions to similar problems, but the most informative source was a post I found in Robert McMurray's blog.)

  1. Fire up regedit.exe (if UAC is enabled you'll have to approve it of course)
  2. Navigate to the key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\WebClient\Parameters
  3. Change the setting BasicAuthLevel from its current value (default 1) to 2 (0 means disabled, 1 is https only, 2 is both http and https)
  4. Optional. Change the setting ServerNotFoundCacheLifeTimeInSec from its current value (default 60) to 0 (change from hexadecimal to decimal) (I made this change just because I had been hammering on a box trying to fix the issue, I didn't want a cache issue to be my undoing)
  5. Right-click a shortcut to cmd.exe and choose Run As Administrator (or just fire up cmd.exe if you have disabled UAC)
  6. Enter the following commands:
    1. net stop webclient
    2. net start webclient
    3. net use x: "http://yoursharepointprovider.tld/somesharename" /user:"whateverdomain\yourusername" /persistent:yes
    4. Enter your password when prompted

The net use command is optional; you can map using the Windows Map Network Drive functionality or even open the share using the common file dialog in Office.
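
If you have to repeat the registry change on several machines, the two values from steps 3 and 4 can be written out as a *.reg file and imported instead of edited by hand.  The sketch below only generates the file text; the helper name is made up, and you still need to restart the WebClient service afterwards as in step 6.

```python
WEBCLIENT_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\WebClient\Parameters"


def reg_file_text(key, dword_values):
    """Render a mapping of DWORD values as the text of a .reg file."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key}]"]
    for name, value in dword_values.items():
        # .reg files store DWORDs as eight hexadecimal digits.
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\r\n".join(lines) + "\r\n"


vista_settings = {
    "BasicAuthLevel": 2,                    # 2 = basic auth over both http and https
    "ServerNotFoundCacheLifeTimeInSec": 0,  # optional, see step 4
}
print(reg_file_text(WEBCLIENT_KEY, vista_settings))
```

Save the output as something like webdav-http.reg (a hypothetical name) and look it over before importing; it changes a machine-wide setting.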

Note that this solution does not fix the fact that with most shared SharePoint hosting providers you cannot map to the root share (that is, http://yoursharepointprovider.tld); you must choose a share.  There are ways around this if you have access to the IIS instance hosting SharePoint but since that wasn't my problem I did not explore them.

And yes, what this means is that if you are automating uploading files to SharePoint you don't have to use the SharePoint web service API or the SharePoint SDK to perform such a simple task - you can just use File.Copy and away you go!
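
File.Copy is the .NET call; the same idea in Python is a plain file copy onto the mapped drive.  This is a minimal sketch that assumes the share is already mapped (X: in the net use example above) and that the target folder already exists.  The function name and example paths are hypothetical.

```python
import shutil
from pathlib import Path


def copy_to_share(source_file, share_root, subfolder=""):
    """Copy source_file into share_root/subfolder and return the destination path.

    Only the file contents are copied; the target folder must already exist.
    """
    destination = Path(share_root) / subfolder / Path(source_file).name
    shutil.copyfile(source_file, destination)
    return destination


# Hypothetical usage once the share is mapped to X:
#   copy_to_share(r"C:\reports\report.mht", "X:\\", "Shared Documents")
```

Copying contents only (rather than timestamps and permissions too) keeps the call from depending on metadata the remote share may not support.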

Update:

For Windows Server 2003, add a DWORD value named UseBasicAuth under the Parameters key above and set it to 1; you'll also want to change the AcceptOfficeAndTahoeServers value from 0 to 1.  You cannot just net stop/start webclient; instead you must net stop mrxdav then net start webclient.  Finally, and this is important if you are using things such as "%20" to represent spaces in your folder names - don't; instead just use spaces or the equivalent character.  I found this out from KB841215.


Flames of War: Dawn Attack on Aa Canal

(This post is very late; the scenario we were playing was being play tested for Historicon 2008.  Because I didn't take copious notes there isn't a whole lot of meat to this post.)

Dawn Attack on the Aa Canal was a scenario run at the Game Vault for play testing purposes before the scenario was run at Historicon 2008.  The game was run by Ron Bingham from the Battle Barn and the scenario had already undergone some previous play testing tweaks.  All of the miniatures and terrain you see came from Battle Barn; only the hands of the people playing [mostly] don't belong to them.

The scenario was really fun and well laid out.  I'd go into more details but I don't have scanned copies of the maps available so it wouldn't be very informative.


Grid Computing: Amazon EC2, Mosso, GoGrid

I work for a startup where we have what I think would be a fairly common configuration: a mixture of development environments and services with the need to connect everything together.  Our primary setup is an ASP.NET/IIS 6 service with some satellite services written in PHP5 hosted on Apache 2.

Up until recently our satellite services were hosted with GoDaddy on a shared machine; needless to say performance was not up to speed for production purposes.  So we began shopping around looking at various grid computing solutions.  We chose to do grid not because we need some massive processing power or many boxes, but because we wanted an environment that was redundant without having to invest in several dedicated boxes to get there.  Our research led us to three grid computing providers: Amazon's Elastic Compute Cloud (EC2), Rackspace's Mosso service, and ServePath's GoGrid.  At the time, none of these grid systems was out of beta so we expected our mileage to vary.

The problem we were trying to solve through grid computing was to host (redundantly) several PHP pages and scripts that process some XML files received by FTP push from a third party.  The end result of some of the PHP pages - used essentially as a web service for our ASP.NET/IIS6 environment - is to generate images.  One large part of our processing generates tons of images and is handled asynchronously on a schedule managed by cron.

Amazon's Elastic Compute Cloud (EC2)

While EC2 was not the first grid solution we looked at we did consider it.  EC2 is a fairly complex setup with a pay-as-you-go model that seems to be typical of grid computing.  It requires that you also use Amazon's Simple Storage Service (S3) for storing your images (think of images as virtual computers).  The idea behind EC2 is that you configure your images and then spin up any number of them to perform a task when the computing power is required.  There's no guarantee that the internal state of an image is retained and it's possible that images may go down (kernel panic in your virtual machine or Amazon's own needs).  To use EC2 optimally you must write your software to take advantage of other Amazon services, such as S3 for file storage or SimpleDB for database access.  You could stand up a MySQL instance on one of your images but with no guarantee that the data you want to store in it will be there the next time your image spins up.  That's the nature of EC2 - store the data on other redundant data stores and spin up images to do the processing as required.

To run an EC2 virtual machine for a month (i.e. 31 days) in a 24/7 accessible manner costs about $72 assuming fairly moderate outbound traffic and some usage of S3 (S3 fees are separate from the CPU time costs of EC2 but included in my estimation figure).  This configuration would be similar to what you would get for a dedicated box running Apache 2 with PHP5 and MySQL to do something like host multiple sites.  All in all not bad.  Additionally you pay a fee for public IP addresses and you link IP addresses to virtual machines.

So EC2 is fairly impressive and with the community tools it's usable.  Like S3, the only tools that you have for working with it are the ones created by the community.  There is a bit of knowledge spin up required, some key management, and of course you must also conform to the Amazon S3/SimpleDB data storage APIs.  This means if you want to install blog software that needs a MySQL database... well you can do it, but no guarantees that your data will remain between machine spin ups.  When a machine is up and running you can ssh into it.

We ended up not using EC2 due to complexity and the necessity that we use Amazon APIs for storing data.  Also after our initial setup we "lost" a box - and I don't mean we could ping it but we didn't know where it was in our apartment, I mean it was inexplicably gone.

Rackspace's Mosso

Mosso is a grid computing solution that sells itself more as a web host reseller than anything else.  Their claims are the ability to host ASP.NET applications under IIS6 (or 7 now) and PHP, ruby, Perl, and Python applications under a specially configured Apache 1.3 instance at the same URL.  Yes, that means: www.yourdomain.com/somepage.aspx and www.yourdomain.com/anotherpage.php.  You could do this with IIS7, FastCGI, and install and configure the various other languages but where would be the fun in that?  They also offer SQL Server (costs more) and MySQL database hosting.

Essentially the virtual machine aspect of the grid is abstracted away from you.  You are presented with a web management front end (one of two, a "final" version that's buggy or a beta AJAXy version that's also buggy) that allows you to create accounts, databases, database users, and configure cron jobs (with minimum 5 minute intervals).  Everything there points to you subleasing Mosso grid time and hosting to other people but you could certainly use it for your own non-subleasing needs.  It appears that you can configure pricing structures for your clients but we didn't need to do anything with that so it's a side we did not explore.

Mosso is pay-as-you-go but with a $100 up-front monthly entry fee, and SQL Server databases cost additional monthly fees.  You pay for bandwidth and processor cycles and it looks like after the $100 it's fairly cheap for a "dedicated" box - as dedicated as a virtual instance abstracted away from you can be.  There is no ssh/scp access but they have temperamental FTP access (it disconnects you constantly).  The tech support is pretty good; there's a live operator text-based chat and a phone line, both of which are accessible (and responsive) at all hours of the day.

We started with Mosso, ran on it for about a month, and encountered so many problems that we had to move away from it.  An annoyance, but not a deal killer, was the naming conventions - for example: ftp1.ftptoyoursite.com bit one of our contractors since he thought "ftptoyoursite" meant he should stick in the hosted domain name.  Paths on the SAN are long and contain an 8 character numeric directory, databases have a numeric prefix, and database users have the same prefix.

We ran into two real problems.  First, the cron jobbing was unreliable - tasks scheduled at midnight actually ran at noon instead, but tasks scheduled for 1 AM ran at 1 AM (not PM as one would expect).  We got around this by adding the task that needs to run at midnight to our Ruby script within a time check - surprisingly the grid's time was correct even if it confused midnight and noon.

The other, larger problem - and this was the deal killer - was output write speed and the technical support that wasn't.  As I said before, one of our PHP processes involved writing out a large number of images to many subdirectories (this uses gd2 and while it's not the most performance friendly code, it executes quickly on local installations).  The execution of this process was taking longer than 6 minutes, and possibly several hours - and this is a process we needed scheduled to run every 15 minutes!  I communicated with Mosso tech support via text chat, email, and phone and after a week the problem was not resolved and I had to badger them to get updates.  Overall a cool idea but it probably needs a few more months in the oven before it's ready for prime time; also the tech support responsiveness could improve.
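
The midnight workaround is simple enough to sketch.  Ours lived in a Ruby script; the version below is Python and only illustrates the idea, with made-up function names.  It assumes the script itself is started by cron every 5 minutes (the minimum interval Mosso allowed) and that the machine's clock can be trusted.

```python
from datetime import datetime, time, timedelta


def in_window(now, target=time(0, 0), width=timedelta(minutes=5)):
    """Return True if `now` falls in [target, target + width) on its own day."""
    start = datetime.combine(now.date(), target)
    return start <= now < start + width


def run_scheduled(now, frequent_task, midnight_task):
    """Run the frequent task every time; run the midnight task only in its window."""
    frequent_task()
    if in_window(now):
        midnight_task()


# Hypothetical usage from the script cron starts every 5 minutes:
#   run_scheduled(datetime.now(), process_xml_files, nightly_cleanup)
```

Matching the window width to the cron interval means exactly one run per day lands inside it, as long as the runs start on time.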

ServePath's GoGrid

GoGrid is another grid computing solution that's more like EC2 - the virtualization is not abstracted away from you.  You choose server images, fire them up, and off you go.  Additionally you get a free load balancer so you don't have to program against an API (a la EC2) to get load balancing.  That's not to say there isn't an API available because there is, we just didn't have to explicitly use it.  Because the implementation is not abstracted you do end up with separate virtual machines running different OSes - Windows for IIS6 and Linux for Apache for example - but through some magic you can get the same level of functionality that Mosso provides.

Like Amazon's EC2 there are many images available - though not quite the cesspool of choices EC2 provides.  Fortunately your choices are limited to one of several Windows 2003, CentOS, or Red Hat Enterprise images.  We didn't stand up a Windows 2003 server but one would assume that to configure it you just RDC to it and you're on your way.  The CentOS image we did use immediately allowed us to ssh to it and yum away our configuration.  Something missing from GoGrid which we've been promised soon is the ability to back up your virtual machine images.  This is certainly unlike EC2 (which is a totally different data storage paradigm anyway) and the fact that our VMs retain state between going up and down is reassuring.  I was immediately comfortable with the management UI presented by GoGrid and, once our instance (a CentOS Apache/MySQL image) was up, with sshing into it and configuring everything.

GoGrid is also pay-as-you-go, but they charge on RAM time instead of processor time.  The load balancer is free (update: you can have multiple free load balancers if you need them).  I think we're looking at $50 a month for a 1 GB RAM machine running 24/7, which is what we require for our purposes.  For redundancy you can spin up another instance of your image, link it to the load balancer, and there you go.  One thing you'll note - their hosting software attempts to place virtual images in different server clusters so if one cluster goes down the other images aren't affected (redundancy!) but to verify that this is indeed the case you need to phone them and have their techs check.
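
To put the monthly figures side by side, here is some back-of-the-envelope arithmetic.  It uses only the rough numbers quoted in this post (about $72 for EC2 and about $50 for GoGrid, each for one machine running all month) and assumes a second GoGrid instance costs the same as the first.  None of these are official rates.

```python
HOURS_PER_MONTH = 31 * 24  # the 31-day month used for the EC2 estimate


def implied_hourly_rate(monthly_cost, hours=HOURS_PER_MONTH):
    """Hourly rate implied by a flat monthly estimate for a 24/7 machine."""
    return monthly_cost / hours


def monthly_total(per_instance, instances=1, base_fee=0.0):
    """Total monthly cost for several identical instances plus any flat fee."""
    return base_fee + per_instance * instances


quotes = {"EC2 (incl. S3 and traffic)": 72.0, "GoGrid (1 GB RAM)": 50.0}
for name, monthly in quotes.items():
    print(f"{name}: about ${implied_hourly_rate(monthly):.3f} per hour")

# Two load-balanced GoGrid instances for redundancy (assumes equal cost each).
print(f"GoGrid x2: ${monthly_total(50.0, instances=2):.2f} per month")
```

By this estimate a redundant GoGrid pair costs about what Mosso charges as its $100 entry fee before any usage.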

We settled on GoGrid because configuration was a breeze, administering our servers did not require community supported tools (standard industry programs were suitable), our code ran exceptionally well, and for a beta their uptime has been incredible thus far (100%).  Something revealed to us only after the fact is their great communication - we know what's going on at GoGrid via email updates about scheduled maintenance (Mosso also had this in the form of a blog - but the number of problems they encountered with their solution was rather distressing).

Updated for grammar, spelling, completeness; content remains essentially the same.