Archive for the ‘Servers’ Category
April 23rd, 2009
So I went to stream a movie (DVD) off my Windows Home Server (Windows Server 2003 based) to my Windows Media Center 2005 (XP 32-bit) and encountered CONSTANT stuttering. The night before I had watched a movie with no problems. I spent about 5 hours trying to figure out what had happened - both machines had a "Your computer was recently updated!" message from automatic updates. I knew I was in serious trouble.
I spent a long time trying to troubleshoot codecs (both audio and video) and going through all manner of issues. I mucked around with the registry on both machines as I narrowed down the problem to horrible, horrible gigabit network performance. I watched the networking performance through Task Manager on the server and saw my network usage NEVER go above 1%.
Then finally I came across the hotfix from Microsoft to unfuck the hotfix automatic updates kindly installed for me:
http://support.microsoft.com/kb/948496/
Now network utilization hangs out at 25% while copying a 7 GB file across my gigabit network.
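To put those Task Manager percentages in perspective, here is a rough back-of-the-envelope sketch. It assumes the percentage is measured against a 1 Gbit/s link, uses decimal units (1 GB = 10**9 bytes), and ignores protocol overhead, so treat the numbers as ballpark figures only.

```python
# Rough throughput arithmetic for a link-utilization percentage.
# Assumptions (not from the post): a 1 Gbit/s link, decimal units,
# and no allowance for protocol overhead.

LINK_BITS_PER_SECOND = 1_000_000_000  # gigabit Ethernet


def throughput_mbit(utilization_percent: float) -> float:
    """Megabits per second carried at the given utilization."""
    return LINK_BITS_PER_SECOND * utilization_percent / 100 / 1_000_000


def transfer_seconds(size_gb: float, utilization_percent: float) -> float:
    """Seconds needed to move size_gb gigabytes at the given utilization."""
    bits = size_gb * 1_000_000_000 * 8
    return bits / (LINK_BITS_PER_SECOND * utilization_percent / 100)


if __name__ == "__main__":
    for percent in (1, 25):
        print(f"{percent}% -> {throughput_mbit(percent):.0f} Mbit/s, "
              f"7 GB in {transfer_seconds(7, percent):.0f} s")
```

Under those assumptions, 25% works out to roughly 250 Mbit/s, so the 7 GB copy should take a few minutes rather than the hour and a half it would need at 1%.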
And I've learned the lesson I seem to learn every 6 months or so - pretty much every time a new install of Windows or a new PC comes online in my house - disable Automatic Update. If you don't - you will regret it.
Update: I also had to install a hotfix rollup for Windows Media Center 2005 available from Windows Update and reboot the machine to put my network bandwidth consumption at something over 0.5% which is apparently what I need to play DVDs without stutters... though /sigh there is *still* some stuttering but not nearly as bad as before.
Update[2]: FINALLY. I ran across this:
http://www.winaims.com/network_patch.html
So on my Windows Media Center machine I fired up regedit again and did:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\lanmanworkstation\parameters
Value name: ReadAheadGranularity
Type: DWORD
Value: 0
rebooted, and now network utilization seems to stay at a constant 20% when copying a 6 GB file over my network.
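If you would rather not click through regedit by hand, the same change can probably be expressed as a .reg file. The sketch below only builds the file text; the key path, value name, type and data are the ones listed above, and the helper name is made up for illustration. I have not tried importing the result on every Windows version, so back up the registry first.

```python
# Builds the text of a .reg file for the ReadAheadGranularity change.
# Key path, value name, type and data come from the post; the function
# name is a hypothetical helper, not part of any Windows tooling.

KEY_PATH = (r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet"
            r"\Services\lanmanworkstation\parameters")


def reg_file_text(value_name: str, dword: int) -> str:
    """Return .reg file contents that set one DWORD value under KEY_PATH."""
    if not 0 <= dword <= 0xFFFFFFFF:
        raise ValueError("a DWORD must fit in 32 bits")
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{KEY_PATH}]\n"
        f'"{value_name}"=dword:{dword:08x}\n'
    )


if __name__ == "__main__":
    # Save the printed text to a .reg file and import it with regedit.
    print(reg_file_text("ReadAheadGranularity", 0))
```

A reboot is still needed afterwards, same as with the manual edit.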
Finally I can sleep with a minor feeling of accomplishment.
October 9th, 2008
The command to get a performance counter log to recur on a set daily schedule under Windows 2000 Server and Windows Server 2003 is:
logman update yourlogname -b 08/12/2008 23:00:00 -e 08/13/2008 09:00:00 -r
So the -r at the end causes it to recur on every date after 8/12-8/13 in those same hour windows (8/13-8/14, 8/14-8/15, and so on).
Server 2008 of course allows you to schedule the logging via the GUI.
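If you need to set this up for several logs, the logman call can be generated from a script. This is a minimal sketch that only assembles the arguments; the -b, -e and -r switches are the ones from the command in this post, the function name is hypothetical, and the MM/DD/YYYY date format is an assumption that matches my example (logman may expect a different format under other regional settings).

```python
from datetime import datetime


def logman_recurring_args(log_name: str, begin: datetime, end: datetime) -> list:
    """Build the argument list for a recurring 'logman update' call."""
    date_format = "%m/%d/%Y"  # assumed; matches the example in the post
    time_format = "%H:%M:%S"
    return [
        "logman", "update", log_name,
        "-b", begin.strftime(date_format), begin.strftime(time_format),
        "-e", end.strftime(date_format), end.strftime(time_format),
        "-r",  # repeat the same window every day
    ]


if __name__ == "__main__":
    args = logman_recurring_args(
        "yourlogname",
        datetime(2008, 8, 12, 23, 0, 0),
        datetime(2008, 8, 13, 9, 0, 0),
    )
    # Print rather than execute, so the command can be checked first.
    print(" ".join(args))
```

On the server itself the list could be handed to subprocess.run, but printing it first and checking it by eye seems safer.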
September 11th, 2008
I have a job that runs on a Windows 2003 server every morning using Windows Task Scheduler, and that job relies on cscript.exe (to execute PAL.vbs). I have spent probably a day trying to figure out why my job is no longer working after a recent automatic Windows Update. I altered the program I wrote to shell to cscript.exe to include logging and stared at the log output with a puzzled expression for a long time. My job ran completely fine in interactive mode, and would run fine if the job was set to run as a logged-in user (without the "only run when this user is logged in" checkbox checked). Completely unattended, however, it was a no go. Security context or paths or something was just not making it down to the shelled cscript.exe.
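For anyone chasing a similar problem, here is roughly the kind of logging wrapper I mean, sketched in Python rather than taken from my actual program. The function name and log file name are made up; it records the working directory, PATH, exit code and output of whatever command it shells out to, which are the things that tend to differ between interactive and unattended runs.

```python
import logging
import os
import subprocess


def run_and_log(command: list, log_path: str) -> int:
    """Run command, append context and output to log_path, return the exit code.

    Returns -1 if the command could not be started at all.
    """
    logger = logging.getLogger("shelled-job")
    handler = logging.FileHandler(log_path)
    handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    try:
        logger.info("cwd=%s", os.getcwd())
        logger.info("PATH=%s", os.environ.get("PATH", ""))
        try:
            result = subprocess.run(command, capture_output=True, text=True)
        except OSError as error:
            logger.info("could not start %r: %s", command, error)
            return -1
        logger.info("command=%r exit=%d", command, result.returncode)
        logger.info("stdout: %s", result.stdout.strip())
        logger.info("stderr: %s", result.stderr.strip())
        return result.returncode
    finally:
        logger.removeHandler(handler)
        handler.close()


# Hypothetical usage on the server:
# run_and_log(["cscript.exe", "PAL.vbs"], "job.log")
```

Comparing the log from an interactive run with the log from a scheduled run is usually the quickest way to see what the unattended context is missing.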
Turns out that the "HID Input Service" causes the problem, and a recent Windows update caused it to emerge on my Windows 2003 system.
To fix it you must disable the HID Input Service (set its startup type to "Disabled"), then reboot the computer. The HID Input Service is responsible for the extra buttons on your keyboard, like the Calculator key that launches calc.exe. Not a big loss.
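The same change can likely be made from a command prompt with sc config instead of the Services console. The sketch below just assembles that command; it assumes the service's short name is HidServ, which is what I believe the HID Input Service is registered as, so check the name in the service's properties before running anything.

```python
def disable_service_args(service_name: str) -> list:
    """Build an 'sc config' command that sets a service's startup type to disabled.

    sc expects a space between 'start=' and the value, hence two list items.
    """
    return ["sc", "config", service_name, "start=", "disabled"]


if __name__ == "__main__":
    # "HidServ" is an assumed short name for the HID Input Service.
    print(" ".join(disable_service_args("HidServ")))
```

The reboot is still required after the startup type is changed.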
Sources:
http://ewbi.blogs.com/develops/2003/09/scheduled_tasks.html
http://support.microsoft.com/default.aspx?scid=kb;en-us;812400
September 3rd, 2008
So an odd thing happened around the 29th/30th of August that turned our production system upside down for a short time: the GoGrid machine we had running for a bit with no problems suddenly mounted the root file system as read only and stopped accepting incoming ssh connections.
Naturally we tried to resolve the problem through their tech support, but all we could get were uninformative replies like "you must have upgraded your kernel" and "I can't get the machine to get an address through DHCP". Of course we hadn't upgraded the kernel or any such thing. At one point the first tech could connect but said the kernel panicked during the boot process. Mmm great, I knew I should have backed up our config.
So we went into disaster recovery mode and tried to stand up another GoGrid instance using CentOS 32-bit. No dice: the machine would boot but we couldn't ssh to it (another one trapped in a kernel panic?). Next we tried a RHEL 5 64-bit instance, which we could ssh to, then a RHEL 4 32-bit instance, which booted but wouldn't accept ssh, and finally another RHEL 4 32-bit instance assigned from the bottom of the IP pool, which we could ssh to. It was very hit or miss, so it was too risky to proceed.
We ended up moving our Linux/Apache/PHP5 system to a Windows 2008/IIS7/PHP5 system we had sitting spare (as a hot spare of our production system actually) and configured FastCGI and had things chugging along in about 4 hours.
Losing a production system is a tough problem to deal with. The day was spent sorting out problems and fixing bad data (a read-only file system combined with file-based caching can produce some really, really bad data), and was essentially lost. This is the risk you take, and sometimes the price you pay, for hosting on a beta platform.
Too bad, we were planning on moving our development, demo, and test servers to GoGrid because it would be cheaper, minus these sorts of events of course.
August 21st, 2008
Amazon's EC2 now has persistent storage, named Elastic Block Store (EBS), that should be much easier to use from EC2 instances. It appears to be as easy as using fstab to mount the EBS volume. Pricing is along the same lines as all the other Amazon services: a pay-as-you-go model, at a rate of $0.10 per allocated GB per month and $0.10 per 1 million I/O operations per volume. They also allow you to make snapshots of a volume, store the snapshots on S3, and then start a new volume from a snapshot. Looks like this is the missing piece for many people to make EC2 a valid option; I know a lack of easily usable persistence was preventing us from using it previously.
Now just need to do a cost analysis...
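As a starting point for that cost analysis, here is a small sketch using only the two rates quoted above. It assumes the per-GB rate is a monthly charge, and it leaves out snapshot storage on S3 and the EC2 instance itself; the usage figures in the example are invented.

```python
# Rates quoted in the post (USD). Everything else here is illustrative.
PRICE_PER_ALLOCATED_GB = 0.10
PRICE_PER_MILLION_IO = 0.10


def ebs_volume_cost(allocated_gb: float, io_operations: int) -> float:
    """Estimated charge for one volume: allocated storage plus I/O operations."""
    storage = allocated_gb * PRICE_PER_ALLOCATED_GB
    io = io_operations / 1_000_000 * PRICE_PER_MILLION_IO
    return storage + io


if __name__ == "__main__":
    # Hypothetical workload: a 100 GB volume doing 50 million I/O operations.
    print(f"${ebs_volume_cost(100, 50_000_000):.2f}")
```

For that made-up workload the estimate comes to about $15, with storage being two thirds of it, so the I/O rate only starts to dominate on very busy volumes.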