Archive for 2008

IDictionary<TKey,TValue>, IXmlSerializable, and lambdas

The greatest problem I have encountered with the .NET Framework (through 3.5) is that Dictionary instances cannot be serialized by XmlSerializer.  You have to write custom serialization routines, and they differ depending on whether you are doing binary or XML serialization.

To me, an important aspect of XML serialization is readability.  If I'm serializing something to XML then I expect it to be readable, and I prefer less nodey to more.  I like to use attributes whenever possible.

I originally used some somewhat dated code I found on Matt Berther's blog, and while it worked, it gave me the nodey version I didn't much care for:

<dictionary>
  <item>
    <key>type</key>
    <value>resource</value>
  </item>
  <item>
    <key>version</key>
    <value>1</value>
  </item>
</dictionary>

The value of the nodey version is that it still works when the types represented by TKey and TValue are themselves serializable as their own XML object graphs.  Ideally I wanted a format that could have output like:

<item key="type" value="resource" />

but still fall back to the more advanced nodey version as necessary, and mix and match (TKey could be a string while TValue could be a more complex serializable type, such as a domain class with many properties).

My solution was to establish which types are "attributable", meaning they can be written as a plain XML attribute value:

public class Dictionary<TKey,TValue> : IDictionary<TKey, TValue>,
    ISerializable,
    IDictionary,
    IXmlSerializable

...

private readonly static List<Type> _attributableTypes;
static Dictionary()
{
    _attributableTypes = new List<Type>
    {
        typeof(Boolean),
        typeof(Byte),
        typeof(Char),
        typeof(DateTime),
        typeof(Decimal),
        typeof(Double),
        typeof(Enum),
        typeof(Guid),
        typeof(Int16),
        typeof(Int32),
        typeof(Int64),
        typeof(SByte),
        typeof(Single),
        typeof(String),
        typeof(TimeSpan),
        typeof(UInt16),
        typeof(UInt32),
        typeof(UInt64)
    };
}
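// Exact type match only: typeof(Enum) in the list matches System.Enum itself,
// so concrete enum types are not treated as attributable.
private static bool IsAttributable(Type t)
{
    return _attributableTypes.Contains(t);
}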

And the meat of the code, the IXmlSerializable interface implementation:

System.Xml.Schema.XmlSchema IXmlSerializable.GetSchema()
{
    return null;
}

void IXmlSerializable.ReadXml(System.Xml.XmlReader reader)
{
    // some types can be stored easily as attributes while others
    // require their own XML rendering
    Func<TKey> readKey;
    Func<TValue> readValue;

    var isAttributable = new { Key = IsAttributable(typeof(TKey)),
        Value = IsAttributable(typeof(TValue)) };

    // keys
    if (isAttributable.Key)
    {
        readKey = () => (TKey)Convert.ChangeType(
            reader.GetAttribute("key"), typeof(TKey)
        );
    }
    else
    {
        var keySerializer = new XmlSerializer(typeof(TKey));
        readKey = () =>
        {
            while (reader.Name != "key")
                reader.Read();
            reader.ReadStartElement("key");
            var key = (TKey)keySerializer.Deserialize(reader);
            reader.ReadEndElement();
            return key;
        };

    }

    // values
    if (isAttributable.Value && isAttributable.Key)
    {
        readValue = () => (TValue)Convert.ChangeType(
             reader.GetAttribute("value"), typeof(TValue)
        );
    }
    else
    {
        var valueSerializer = new XmlSerializer(typeof(TValue));
        readValue = () =>
        {
            while (reader.Name != "value")
                reader.Read();
            reader.ReadStartElement("value");
            var value = (TValue)valueSerializer.Deserialize(reader);
            reader.ReadEndElement();
            return value;
        };
    }

    var wasEmpty = reader.IsEmptyElement;
    reader.Read();

    if (wasEmpty)
        return;

    while (reader.NodeType != System.Xml.XmlNodeType.EndElement)
    {
        while (reader.NodeType == System.Xml.XmlNodeType.Whitespace)
            reader.Read();
        var key = readKey();
        var value = readValue();
        Add(key, value);

        if (!isAttributable.Key || !isAttributable.Value)
            reader.ReadEndElement();
        else
            reader.Read();
        while (reader.NodeType == System.Xml.XmlNodeType.Whitespace)
            reader.Read();
    }
    reader.ReadEndElement();
}

void IXmlSerializable.WriteXml(System.Xml.XmlWriter writer)
{
    Action<TKey> writeKey;
    Action<TValue> writeValue;

    var isAttributable = new     
    {
        Key = IsAttributable(typeof(TKey)),
        Value = IsAttributable(typeof(TValue))
    };

    if (isAttributable.Key)
    {
        writeKey = v => writer.WriteAttributeString("key",
            v.ToString()
        );
    }
    else
    {
        var keySerializer = new XmlSerializer(typeof(TKey));
        writeKey = v =>
            {
                writer.WriteStartElement("key");
                keySerializer.Serialize(writer, v);
                writer.WriteEndElement();
            };
    }

    // when keys aren't attributable, neither are values
    if (isAttributable.Value && isAttributable.Key)
    {
        writeValue = v => writer.WriteAttributeString("value",
            v.ToString()
        );
    }
    else
    {
        var valueSerializer = new XmlSerializer(typeof(TValue));
        writeValue = v =>
        {
            writer.WriteStartElement("value");
            valueSerializer.Serialize(writer, v);
            writer.WriteEndElement();
        };
    }

    foreach (var key in Keys)
    {
        writer.WriteStartElement("item");

        writeKey(key);
        writeValue(this[key]);

        writer.WriteEndElement();
    }
}
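
One caveat with the attribute path: ToString() is culture-sensitive for types such as DateTime and Double, and Convert.ChangeType cannot turn a string into a Guid, a TimeSpan, or a concrete enum type.  Here is a rough sketch of one way around that.  The names AttributeText, ToText, and FromText are hypothetical and not part of the code above, and this is an illustration rather than a tested replacement:

```csharp
using System;
using System.Globalization;

// Hypothetical helper (not part of the original post): culture-invariant
// text conversion for values stored in XML attributes.
public static class AttributeText
{
    public static string ToText(object value)
    {
        // The round-trip ("o") format keeps DateTime precision and kind.
        if (value is DateTime)
            return ((DateTime)value).ToString("o", CultureInfo.InvariantCulture);

        // Numbers, Guid, and similar types format without culture-specific separators.
        var formattable = value as IFormattable;
        if (formattable != null)
            return formattable.ToString(null, CultureInfo.InvariantCulture);

        return value.ToString();
    }

    public static T FromText<T>(string text)
    {
        var type = typeof(T);

        // Convert.ChangeType cannot convert a string to these types,
        // so they are parsed explicitly.
        if (type.IsEnum)
            return (T)Enum.Parse(type, text);
        if (type == typeof(Guid))
            return (T)(object)new Guid(text);
        if (type == typeof(TimeSpan))
            return (T)(object)TimeSpan.Parse(text);
        if (type == typeof(DateTime))
            return (T)(object)DateTime.Parse(text, CultureInfo.InvariantCulture,
                DateTimeStyles.RoundtripKind);

        return (T)Convert.ChangeType(text, type, CultureInfo.InvariantCulture);
    }
}
```

In the lambdas above, v.ToString() would become AttributeText.ToText(v), and the Convert.ChangeType call would become AttributeText.FromText<TKey>(reader.GetAttribute("key")).  For the enum branch to be reached, IsAttributable would also have to accept concrete enum types, for example by checking t.IsEnum.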

Bonus: I also learned that multiline lambdas exist and I snuck in an anonymous type to boot.  I like that I was able to take two separate approaches to XML serialization, distill their interfaces from their inner workings, and then just expose the correct method signature within a method to keep the iterative loop clean of if/else logic (even if I just moved the logic above the loop).  To me this code seems much more readable than if I had kept everything in the loop.

Update: it helps to post working code.  Also, here's a zip file with the implementation and some brief NUnit flavored tests.
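
The zip file is not reproduced here.  As a rough illustration of the kind of round-trip check such tests might make, here is a self-contained sketch.  StringMap and RoundTrip are hypothetical names, and StringMap is a cut-down stand-in that only handles string keys and values (the attribute form), not the full class described above:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

// Hypothetical stand-in for the post's class: string keys and values only,
// so every item can use the attribute form.
[XmlRoot("dictionary")]
public class StringMap : Dictionary<string, string>, IXmlSerializable
{
    public XmlSchema GetSchema() { return null; }

    public void WriteXml(XmlWriter writer)
    {
        foreach (var pair in this)
        {
            writer.WriteStartElement("item");
            writer.WriteAttributeString("key", pair.Key);
            writer.WriteAttributeString("value", pair.Value);
            writer.WriteEndElement();
        }
    }

    public void ReadXml(XmlReader reader)
    {
        var wasEmpty = reader.IsEmptyElement;
        reader.Read();                       // step past the <dictionary> start tag
        if (wasEmpty) return;

        // MoveToContent skips any whitespace between items.
        while (reader.MoveToContent() == XmlNodeType.Element)
        {
            Add(reader.GetAttribute("key"), reader.GetAttribute("value"));
            reader.Skip();                   // step past the current <item />
        }
        reader.ReadEndElement();             // consume </dictionary>
    }
}

public static class RoundTrip
{
    public static string ToXml(StringMap map)
    {
        var serializer = new XmlSerializer(typeof(StringMap));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, map);
            return writer.ToString();
        }
    }

    public static StringMap FromXml(string xml)
    {
        var serializer = new XmlSerializer(typeof(StringMap));
        using (var reader = new StringReader(xml))
        {
            return (StringMap)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        var original = new StringMap();
        original.Add("type", "resource");
        original.Add("version", "1");

        // Serialize, deserialize, and compare the copy with the original.
        var copy = FromXml(ToXml(original));
        Console.WriteLine(copy.Count == original.Count
            && copy["type"] == "resource"
            && copy["version"] == "1");
    }
}
```

In an NUnit test the final comparison would be a set of assertions rather than a Console.WriteLine call.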


HID Input Service, cscript.exe, Task Scheduler

I have a job that runs on a Windows 2003 server every morning using Windows Task Scheduler, and that job relies on cscript.exe (to execute PAL.vbs).  I spent probably a day trying to figure out why the job stopped working after a recent automatic Windows Update.  I altered the program I wrote, which shells out to cscript.exe, to include logging, and stared at the log output with a puzzled expression for a long time.  The job ran completely fine in interactive mode, and would run fine if it was set to run as a user who was logged in (without the "only run when this user is logged in" checkbox checked).  Completely unattended, however, it was a no go.  Security context or paths or something was just not making it down to the shelled cscript.exe.

Turns out that the "HID Input Service" causes the problem, and a recent Windows update caused it to emerge on my Windows 2003 system.

To fix it you must disable the HID Input Service (set its startup type to "Disabled") and then reboot the computer.  The HID Input Service is responsible for the extra buttons on your keyboard, like the Calculator key that launches calc.exe.  Not a big loss.

Sources:

http://ewbi.blogs.com/develops/2003/09/scheduled_tasks.html

http://support.microsoft.com/default.aspx?scid=kb;en-us;812400


GoGrid: No ssh for you!

So an odd thing happened around the 29th/30th of August that turned our production system upside down for a short time: the GoGrid machine we had been running for a while with no problems suddenly mounted the root file system as read-only and stopped accepting incoming ssh connections.

Naturally we tried to resolve the problem through their tech support, but all we could get were uninformative replies like "you must have upgraded your kernel" and "I can't get the machine to get an address through DHCP".  Of course we hadn't upgraded the kernel or any such thing.  At one point the first tech could connect but said the kernel panicked during the boot process.  Mmm great, I knew I should have backed up our config.

So we went into disaster recovery mode and tried to stand up another GoGrid instance using CentOS 32-bit.  No dice: the machine would boot, but we couldn't ssh to it (another one trapped in a kernel panic?).  We tried the same thing with a RHEL 5 64-bit instance, and that one we could ssh to.  Then we tried a RHEL 4 32-bit instance, which booted but gave us no ssh, and finally another RHEL 4 32-bit instance, assigned from the bottom of the IP pool, which we could ssh to.  It was very hit or miss, so it was too risky to proceed.

We ended up moving our Linux/Apache/PHP5 system to a Windows 2008/IIS7/PHP5 system we had sitting spare (as a hot spare of our production system actually) and configured FastCGI and had things chugging along in about 4 hours.

Losing a production system is a tough problem to deal with.  The day was spent sorting out problems and fixing bad data (a read-only file system combined with file-based caching can make some really, really bad data), and was essentially lost.  This is the risk you take, and sometimes the price you pay, for hosting on a beta platform.

Too bad, we were planning on moving our development, demo, and test servers to GoGrid because it would be cheaper, minus these sorts of events of course.


Awesome

There is nothing else I can say:

http://develop-one.net/blog/2008/08/27/HugADeveloper.aspx


More Grid: Elastic Block Store (EBS)

Amazon's EC2 now has persistent storage, named Elastic Block Store (EBS), that should be much easier to use for EC2 instances.  It appears to be as easy as using fstab to mount the EBS volume.  Pricing is along the same lines as all the other Amazon services, a pay-as-you-go model, at a rate of $0.10 per allocated GB per month and $0.10 per 1 million I/O operations made to a volume.  They also allow you to make snapshots of a volume, store the snapshots on S3, and then start a new volume from a snapshot.  Looks like this is the missing piece for many people to make EC2 a valid option; I know a lack of easily usable persistence was preventing us from using it previously.

Now just need to do a cost analysis...
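
As a rough starting point for that cost analysis, here is a small sketch using the two rates quoted above.  The EbsCost class, the monthly billing period, and the workload numbers (a 50 GB volume and 100 million I/O operations in a month) are assumptions made up for illustration, and snapshot storage on S3 is not included:

```csharp
using System;

// Hypothetical estimator: only the two rates come from the post.
public static class EbsCost
{
    // $0.10 per allocated GB (billing period taken to be one month).
    const decimal PerGbMonth = 0.10m;
    // $0.10 per 1 million I/O operations.
    const decimal PerMillionIo = 0.10m;

    public static decimal Monthly(decimal allocatedGb, decimal ioOperations)
    {
        var storage = allocatedGb * PerGbMonth;
        var io = ioOperations / 1000000m * PerMillionIo;
        return storage + io;
    }

    public static void Main()
    {
        // Made-up workload: 50 GB allocated, 100 million I/O operations in a month.
        // Storage: 50 * 0.10 = 5 dollars; I/O: 100 * 0.10 = 10 dollars.
        Console.WriteLine(Monthly(50m, 100000000m));
    }
}
```

For that made-up workload the total comes to 15 dollars a month, with I/O costing twice as much as the storage itself, so the I/O estimate is the number worth measuring carefully.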