C# Code Snippet: Creating an MD5 hash string.
While writing some code to cache objects from the web (actually items in an RSS channel) in a local database, I ran into the problem of uniquely identifying items in the database. RSS 2.0 defines an optional “guid” element that uniquely identifies an item, but many sites don’t provide it. There was no “key” that I could use to see if the data was already in the database, so I wanted to generate one using a hashing algorithm.
The .NET Framework has GetHashCode(), which every object must support and which seems promising, but it returns a 32-bit int. Considering the database could potentially hold hundreds of thousands of entries, two different items would eventually generate the same hash code. Duplication wouldn’t be a catastrophic problem, but if it happened frequently, it would make the caching less useful.
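To put a rough number on that concern, here is a back-of-the-envelope sketch using the standard birthday approximation. It assumes hash values are spread uniformly over the available bits, which GetHashCode() does not promise, and the class and method names are made up for illustration.

```csharp
using System;

static class CollisionEstimate
{
    // Approximate probability that at least two of n items share a hash value,
    // assuming hashes are uniformly distributed over `bits` bits.
    // Birthday approximation: p = 1 - exp(-n(n-1) / (2 * 2^bits)).
    public static double Probability(double n, int bits)
    {
        double x = n * (n - 1.0) / (2.0 * Math.Pow(2.0, bits));
        // For tiny x, 1 - exp(-x) rounds to 0 in double precision,
        // so fall back to the first-order term x.
        return x < 1e-8 ? x : 1.0 - Math.Exp(-x);
    }

    static void Main()
    {
        // 100,000 cached items: a 32-bit hash versus a 128-bit hash
        Console.WriteLine(Probability(100000, 32));
        Console.WriteLine(Probability(100000, 128));
    }
}
```

Under that assumption, 100,000 items hashed to 32 bits are more likely than not to contain at least one collision, while the same estimate for 128 bits is vanishingly small.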
The framework can generate an MD5 hash of an arbitrary piece of data, which is just what I want: a hash unique enough that it’s very unlikely two arbitrary pieces of information would generate the same value. The MD5 algorithm takes as input a message of arbitrary length and produces as output a 128-bit “fingerprint” of the input. The .NET Framework glue for taking the 128-bit binary data and turning it into a hex string was a bit of a pain to figure out, so I’m presenting what I wrote here both to provide an example to anyone else with a similar need, and to ask if there’s a better way.
A full description of the MD5 algorithm is available in RFC 1321.
using System.Security.Cryptography;
using System.Text;

// Create an MD5 sum string of this string
public static string GetMd5Sum(string str)
{
    // First we need to convert the string into bytes, which
    // means using a text encoder.
    Encoder enc = Encoding.Unicode.GetEncoder();

    // Create a buffer large enough to hold the string
    // (UTF-16 uses two bytes per char)
    byte[] unicodeText = new byte[str.Length * 2];
    enc.GetBytes(str.ToCharArray(), 0, str.Length, unicodeText, 0, true);

    // Now that we have a byte array we can ask the CSP to hash it
    MD5 md5 = new MD5CryptoServiceProvider();
    byte[] result = md5.ComputeHash(unicodeText);

    // Build the final string by converting each byte
    // into two-digit hex and appending it to a StringBuilder
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < result.Length; i++)
    {
        sb.Append(result[i].ToString("X2"));
    }

    // And return it
    return sb.ToString();
}
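As one possible answer to the “is there a better way” question, here is a shorter sketch that leans on Encoding.Unicode.GetBytes and BitConverter.ToString instead of a hand-rolled encoder and loop. The helper name GetMd5Hex and the sample RSS link and title are hypothetical, and it keeps the same UTF-16 encoding so it should yield the same digest for the same input.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

static class Md5Example
{
    // Hypothetical helper: returns the MD5 digest of `text` as uppercase hex.
    public static string GetMd5Hex(string text)
    {
        // UTF-16 (little-endian) bytes, no byte-order mark
        byte[] bytes = Encoding.Unicode.GetBytes(text);
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(bytes);
            // BitConverter.ToString produces "D4-1D-8C-..."; strip the dashes
            return BitConverter.ToString(hash).Replace("-", "");
        }
    }

    static void Main()
    {
        // Made-up item fields joined into a single string to derive a cache key
        string key = GetMd5Hex("http://example.com/item/1" + "\n" + "Example title");
        Console.WriteLine(key);
        Console.WriteLine(key.Length); // 32 hex characters for a 128-bit digest
    }
}
```

Which fields to feed into the hash is a design choice: a link plus title is only one guess at what identifies an item, and a feed that edits titles after publishing would produce a new key for the same item.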
Thanks to Johannes Hiemer for pointing out a bug in my initial version.