How will you make it if you never even try?

April 2, 2009

Some more good ideas about parameter validation in C#

Filed under: C# — Tags: , , , , , — charlieflowers @ 6:20 am

As you can tell from several recent posts, I’m very interested in good syntax for parameter validation. The new features in C# 3.0 make so many things possible. I found another excellent post, from John Gilliland. He “amplifies” each plain old argument value into an ArgumentEx<T> instance, and then hangs extension methods such as “NotNull” and “InRange” off of ArgumentEx<T>. He uses an implicit conversion operator to make it easy to treat an ArgumentEx<T> as the plain old argument value.

Very nice, very thorough. Check it out. I’d like to combine elements of his approach with the lambda expression idea that allows you to avoid specifying both the parameter and the parameter name as a string.
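To make the idea concrete, here is a minimal sketch of what such an "amplified argument" type might look like. It is not John Gilliland's actual code: the `Argument.Of` factory and the exact shape of `NotNull` and `InRange` are assumptions made for illustration, and the parameter name is still passed as a string here.

```csharp
using System;

// Hypothetical sketch of the "amplified argument" idea; not the original author's code.
public sealed class ArgumentEx<T>
{
    public ArgumentEx(T value, string name)
    {
        Value = value;
        Name = name;
    }

    public T Value { get; private set; }
    public string Name { get; private set; }

    // Lets an ArgumentEx<T> be used wherever a plain T is expected.
    public static implicit operator T(ArgumentEx<T> arg)
    {
        return arg.Value;
    }
}

public static class Argument
{
    // Hypothetical factory; generic type inference keeps call sites short.
    public static ArgumentEx<T> Of<T>(T value, string name)
    {
        return new ArgumentEx<T>(value, name);
    }
}

public static class ArgumentExExtensions
{
    // Throws if the wrapped reference is null; otherwise passes the wrapper along.
    public static ArgumentEx<T> NotNull<T>(this ArgumentEx<T> arg) where T : class
    {
        if (arg.Value == null)
            throw new ArgumentNullException(arg.Name);
        return arg;
    }

    // Throws if the wrapped value lies outside the inclusive range [min, max].
    public static ArgumentEx<T> InRange<T>(this ArgumentEx<T> arg, T min, T max)
        where T : IComparable<T>
    {
        if (arg.Value.CompareTo(min) < 0 || arg.Value.CompareTo(max) > 0)
            throw new ArgumentOutOfRangeException(arg.Name);
        return arg;
    }
}
```

With this in place, a call such as `int age = Argument.Of(ageInYears, "ageInYears").InRange(0, 150);` validates the value and then converts back to a plain `int` through the implicit operator.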


Example where calling Extension Methods on null references is useful: Parameter Validation

Filed under: C# — Tags: , , , , , — charlieflowers @ 2:56 am

In a recent post, I pointed out that extension methods can be called on null references. For example, this works perfectly fine:

public static void PrintToConsole(this string self)
{
   if(self != null)
      Console.WriteLine("The string is: " + self);
   else
      Console.WriteLine("The string is NULL.");
}

// Elsewhere in the code
string myString = null;
myString.PrintToConsole();

I said I’d give some examples of where this would actually be useful (not just a gimmick as it might appear on first blush).

One such case is parameter validation. Rick Brewster has come up with a fantastic approach for parameter validation, which lets your code look something like this:

public static void Copy<T>(T[] dst, long dstOffset, T[] src, long srcOffset, long length)
{
    Validate.Begin()
     .IsNotNull(dst, "dst")
     .IsNotNull(src, "src")
     .Check()
     .IsPositive(length)
     .IsIndexInRange(dst, dstOffset, "dstOffset")
     .IsIndexInRange(dst, dstOffset + length, "dstOffset + length")
     .IsIndexInRange(src, srcOffset, "srcOffset")
     .IsIndexInRange(src, srcOffset + length, "srcOffset + length")
     .Check();

    // Further code snipped.
}

No doubt that’s beautiful syntax. But one of Rick’s main goals was this: Incur the least possible overhead if the parameters are all correct. In particular, don’t instantiate any additional objects if the parameters are correct.

And the way this is achieved depends on the fact that extension methods can be called on null references. If all the parameters are OK, the Begin() method, the IsNotNull() method, and so on, all return null. However, they still have a return type, and that return type has extension methods on it called “IsNotNull”, “IsPositive”, “Check” and so forth, so the chain keeps working even though there is no object behind it.
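A simplified sketch of that mechanism follows. This is not Rick's actual implementation: the `Validation` class, the two checks shown, and the way `Check()` reports errors are all assumptions, kept small to show only the "null means no errors so far" trick.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical holder for validation failures; only created when something is wrong.
public sealed class Validation
{
    private readonly List<Exception> errors = new List<Exception>();
    public IList<Exception> Errors { get { return errors; } }
}

public static class Validate
{
    // No object is allocated up front: null stands for "no errors so far".
    public static Validation Begin()
    {
        return null;
    }
}

public static class ValidationExtensions
{
    public static Validation IsNotNull<T>(this Validation v, T value, string name)
        where T : class
    {
        if (value != null)
            return v; // still null if nothing has failed yet
        return AddError(v, new ArgumentNullException(name));
    }

    public static Validation IsPositive(this Validation v, long value, string name)
    {
        if (value > 0)
            return v;
        return AddError(v, new ArgumentOutOfRangeException(name, "Must be positive."));
    }

    // Throws the first recorded error, or does nothing when the chain is still null.
    public static Validation Check(this Validation v)
    {
        if (v == null)
            return null;
        throw v.Errors[0];
    }

    // The Validation object is instantiated lazily, on the first failure only.
    private static Validation AddError(Validation v, Exception error)
    {
        if (v == null)
            v = new Validation();
        v.Errors.Add(error);
        return v;
    }
}
```

When every argument is valid, each call in the chain receives null, returns null, and allocates nothing. Only a failing check pays for a `Validation` instance.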

You can learn more about Rick’s approach here and here.

C# Delights: Extension methods can be called on null references (and that’s extremely useful)

Filed under: C# — Tags: , , , — charlieflowers @ 12:42 am

Did you know that C# extension methods can be called on null references?? Yes, they can. For example, the following method …

public static void PrintToConsole(this string self)
{
   if(self != null)
      Console.WriteLine("The string is: " + self);
   else
      Console.WriteLine("The string is NULL.");
}

… can be called as follows:

string someString = null;

someString.PrintToConsole();

The output would be:
The string is NULL.

Is this just a gimmick? You might think so at first blush, but it is actually remarkably useful. I’ll write another post soon giving some examples of when it is useful. Here’s one hint: imagine a case where you don’t want the overhead of instantiating objects unless you’re in an unusual situation. (OK, here’s another hint … what if you want to write code that applies business logic to non-null values, but seamlessly ignores nulls, so that the code doesn’t have to be all cluttered up with null checks).

April 1, 2009

Git on Windows and behind Firewalls

Filed under: C# — Tags: , , , , — charlieflowers @ 9:11 pm

I do .NET development on my Mac, and for source code management I use Git from the Mac Terminal window. But once I’ve developed the code, I have to get it onto the company’s development server, which is a Windows machine behind a firewall. I do this using git on Windows.

There are multiple choices for Git on Windows, but I chose the MSysGit route. I used putty to get around the firewall issues. I also had to set the whitespace setting in the MSysGit repo to match what it is by default on Unix — without that, things still worked but Git would change Windows-typical line endings to Unix line endings, causing a diff to think every single line had changed.

Once I worked through all that, everything worked beautifully (and it has for months now). I code on my machine, push to a GitHub repo, and then pull from GitHub to the Development Server (using MSysGit / Putty).

I culled through hundreds of articles while getting this set up. Here are the hand-picked articles that were the most helpful for me.

C# Delights: You can put Extension Methods onto Enums!

Filed under: C# — Tags: , , , , — charlieflowers @ 8:04 am

Often, you need to associate other information with the members of an Enum. For example, say you have the following enum:

public enum DaysOfWeek : int
{
   Sunday = 1,
   Monday = 2,
   Tuesday = 3,
   Wednesday = 4,
   Thursday = 5,
   Friday = 6,
   Saturday = 7
}

That’s all well and good. But say you need to associate additional information with each enum member. For example, say your legacy database represents the days of the week with the following 2 letter codes: “Sn”, “Mo”, “Te”, “Wn”, “Tr”, “Fi”, “St”. Notice I picked codes that are not intuitive and are not always the first 2 characters. Crazy, but we all know legacy databases can be crazy.

Also, let’s imagine that your company needs to associate a decimal hourly rate with each day, representing the fact that you charge different rates for different days.

C# has a very nice, new way you can do this. Before C# 3.0, the best way I knew of to handle this was to not use an Enum at all. Rather, I would make a class that was much like a singleton, but with more than one instance. It would have a private constructor, so that no other classes outside of it could make instances of it. However, it would expose static properties with the exact set of instances that were allowed (7 in our case, and the properties would be named “Sunday”, “Monday”, etc.). Each instance would have properties for “DatabaseCode”, “Name” and “HourlyRate”. That’s not bad, but the new way is better in many cases.
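For comparison, the older class-based approach might look roughly like the sketch below. The class name `DayOfWeekInfo` is hypothetical, and static readonly fields are used instead of static properties to keep it short. The codes and rates are the ones used in this post.

```csharp
// Hypothetical sketch of the pre-C# 3.0 "fixed set of instances" pattern.
public sealed class DayOfWeekInfo
{
    public static readonly DayOfWeekInfo Sunday = new DayOfWeekInfo("Sunday", "Sn", 2.5m);
    public static readonly DayOfWeekInfo Monday = new DayOfWeekInfo("Monday", "Mo", 3.6m);
    public static readonly DayOfWeekInfo Tuesday = new DayOfWeekInfo("Tuesday", "Te", 0m);
    public static readonly DayOfWeekInfo Wednesday = new DayOfWeekInfo("Wednesday", "Wn", 1.2m);
    public static readonly DayOfWeekInfo Thursday = new DayOfWeekInfo("Thursday", "Tr", 8.8m);
    public static readonly DayOfWeekInfo Friday = new DayOfWeekInfo("Friday", "Fi", 42m);
    public static readonly DayOfWeekInfo Saturday = new DayOfWeekInfo("Saturday", "St", 3.6m);

    private readonly string name;
    private readonly string databaseCode;
    private readonly decimal hourlyRate;

    // Private constructor: no code outside this class can create further instances.
    private DayOfWeekInfo(string name, string databaseCode, decimal hourlyRate)
    {
        this.name = name;
        this.databaseCode = databaseCode;
        this.hourlyRate = hourlyRate;
    }

    public string Name { get { return name; } }
    public string DatabaseCode { get { return databaseCode; } }
    public decimal HourlyRate { get { return hourlyRate; } }
}
```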

The new way is this: You can place extension methods onto Enums! So, in our case, we would do the following:

public static class DaysOfWeekExtensions
{
   private static string[] databaseCodes = new string[] { "Sn", "Mo", "Te",
      "Wn", "Tr", "Fi", "St" };

   private static decimal[] rates = new decimal[] { 2.5m, 3.6m, 0m, 1.2m,
      8.8m, 42m, 3.6m };

   public static string DatabaseCode(this DaysOfWeek self)
   {
      int index = (int)self - 1;
      return databaseCodes[index];
   }

   public static decimal HourlyRate(this DaysOfWeek self)
   {
       int index = (int)self - 1;
       return rates[index];
   }
}

And then you’d use it like this:

Console.WriteLine("For Tuesday, the database code is " + 
   DaysOfWeek.Tuesday.DatabaseCode() + " and we charge " +
   DaysOfWeek.Tuesday.HourlyRate() + ".");

Click here for another example.

Elegant, appealing parameter validation syntax in C# 3.0

Filed under: C# — Tags: , , , , — charlieflowers @ 7:21 am

This is a VERY cool trick that leads to much improved syntax for parameter validation in C# 3.0. (Kudos to Jon Fuller).

Often when writing methods, you need to validate that the parameters you’ve received are valid. So you might write some code like this:

public void SomeMethod(string firstName, decimal salary, int ageInYears)
{
   if(firstName == null)
      throw new WhateverException("The 'firstName' parameter cannot be null.");

   if(salary < 0)
      throw new WhateverException("The salary must not be negative.");

   // And so on ... you get the idea.
}

When you write that in several places, it is only natural to start thinking about a helper to get rid of some of the duplication. There are countless ways to go about it, and it might look something like this:


public void SomeMethod(string firstName, decimal salary, int ageInYears)
{
   ValidationHelper.NotNull("firstName", firstName);
   ValidationHelper.NotNegative("salary", salary);
   // And so on ... you get the idea
}

The thing that sucks about this is that you have to provide the parameter name as a string. You want the error message to say, “The ‘salary’ parameter was severely messed up.” Therefore, it needs “salary” as a string. This is not DRY, because you’ve already designated which parameter you mean. Also, it is not friendly for refactoring.

And C# 3.0 lets you get rid of it!!!! Here’s how:

public void SomeMethod(string firstName, decimal salary, int ageInYears)
{
   ValidationHelper.NotNull( ()=> firstName );
}

The NotNull method is defined as follows:

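public static void NotNull<T>(Expression<Func<T>> expr)
{
  // Compile and run the lambda to get the parameter's current value.
  // object.Equals is used because calling .Equals on a null value would itself throw.
  if ( ! object.Equals(expr.Compile()(), default(T)))
    return;

  // The lambda body is a member access on the compiler-generated closure class.
  var param = (MemberExpression)expr.Body;
  throw new WhateverException("The parameter '" + param.Member.Name + "' cannot be null.");
}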

Notice that the exception message does contain the string “firstName”, but that there IS NO SUCH STRING anywhere in the code! How does this work?

The NotNull method takes an Expression. Because of that, our lambda expression will be turned into an expression tree. That expression tree has one node in it, which is a member expression asking for the value of the “firstName” member. (Why is it a member instead of a parameter? Because C# is generating a closure here and capturing the local variable named “firstName” into the closure. Under the hood, this becomes a class with a member named “firstName”. Our lambda expression is then an expression which asks for the value of that field, which means it is a member expression. And then that expression is turned into an expression tree).
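A small sketch can make the closure explanation visible. The class and method names here are hypothetical. The method builds the same kind of lambda and reports what the expression tree contains.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionTreeDemo
{
    // Returns the captured member's name, its member kind, and the kind of
    // expression it is read from (the closure instance).
    public static string DescribeCapture(string firstName)
    {
        Expression<Func<string>> expr = () => firstName;

        // The body is a member access, not a parameter reference.
        var member = (MemberExpression)expr.Body;

        return member.Member.Name + " / "
            + member.Member.MemberType + " / "
            + member.Expression.NodeType;
    }
}
```

Calling `ExpressionTreeDemo.DescribeCapture("Bob")` should report a field named `firstName` that is read from a constant expression. That constant is the instance of the compiler-generated closure class.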

The upshot is that we have no strings, but the helper code can obtain and use the string. Very nice!

You can find more about this here.

June 10, 2008

Speed up your XML code with OuterXmlCachingXmlDocument

Filed under: Performance — Tags: , , — charlieflowers @ 7:09 pm

Transitioning from a string to XmlDocument and vice versa is very slow

Recently I worked on a project which involved some performance profiling. We used tools like Red Gate’s profiler, the free .NET CLR profiler from Microsoft, and the AutomatedQA profiler. These profilers made one thing very clear: transitioning from an XmlDocument representation of XML to a text representation, and vice versa, is very expensive and slow.

However, in this particular project, we had no choice. Our code was building credit reports, which means our original input was XML and our final output was XML (both the MISMO XML format). At various points scattered through the processing of a request, we had to call legacy components for specific tasks. Most of those legacy components wanted XML text (not a DOM) as input, and they also returned their output as XML text. But the rest of our code wanted to work with the XML as a DOM (i.e., an XmlDocument), so that we could navigate, set properties, use XPath, etc.

So, we had no choice but to transition from a large text string to an XmlDocument and then back again, over and over. I said before this is slow, but I want to make sure you understand I mean very slow! You’d be surprised. It is slow because a) it is a big job for the computer to do, and b) because it generates tons of little objects and therefore causes garbage collection overhead.

Caching the OuterXml representation

After thinking about this for a while, I realized that it would help tremendously if we could just make XmlDocument “cache” its textual representation. In other words, when you load an XmlDocument from a string, I wanted XmlDocument to “remember” that string. And as long as no one had made any changes to the XmlDocument in any way, the XmlDocument would merely return that string every time you call OuterXml. But the minute someone makes a change to the XmlDocument, the XmlDocument now knows it no longer has a valid string representation. The next time you call OuterXml, the XmlDocument would go through the big, expensive process of creating the textual representation … but then it would “remember” it again, until the next time that some change invalidates it.

And it turns out, this was fairly simple to build.

Presenting the “OuterXmlCachingXmlDocument”

I called it the “OuterXmlCachingXmlDocument” because it is an XmlDocument that caches the “OuterXml”.

It inherits from XmlDocument and does the following:

  • Overrides Load() and LoadXml() — these methods let you load XML into an XmlDocument. They both find a string of XML text from somewhere (a file or stream, a variable, etc.). They have been overridden to store that string in an instance variable before performing the load operation. They also then register event listeners for XmlDocument’s three “Changed” events — NodeChanged, NodeInserted, and NodeRemoved. Those change events will tell us when the string representation that we have cached becomes invalid.
  • Overrides OuterXml — this is a property that returns the string representation. In XmlDocument, its implementation performs the expensive process of walking the linked list of objects in the DOM and creating a string representation. We have overridden it to first see if we have a valid cached version of the xml string. If so, we just return it! If not, then we have to let the base class do the expensive conversion … BUT! The good news is, once that expensive process has been done, we now have a valid string representation again! So we cache it in the same instance variable again.
  • Handles event notifications for NodeChanged, NodeInserted, and NodeRemoved — if any of these events is fired, we need to dump our cached string representation. We don’t recalculate a new string representation at this time, because avoiding that is why we’re here in the first place! We simply “make a note” that we no longer have a cached string representation. Also, and this is very important, when any of these events fire, we de-register our listener from the NodeChanged, NodeInserted, and NodeRemoved events! Otherwise, all DOM operations that change the XmlDocument will incur the overhead of calling us for no reason.
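The three points above can be sketched as follows. This is a reconstruction from the description, not the original class: it overrides only LoadXml() (the Load() overloads would need the same treatment after reading their text), and it stores the string after the base load succeeds, so that any change events raised while the DOM is being built cannot disturb the cache.

```csharp
using System.Xml;

// Sketch of an XmlDocument that remembers its text until the DOM changes.
public class OuterXmlCachingXmlDocument : XmlDocument
{
    private string cachedOuterXml; // null means "no valid cached text"
    private bool listening;

    public override void LoadXml(string xml)
    {
        Invalidate();           // drop any old cache and stop listening during the load
        base.LoadXml(xml);
        cachedOuterXml = xml;   // remember the text we were loaded from
        StartListening();
    }

    public override string OuterXml
    {
        get
        {
            if (cachedOuterXml == null)
            {
                // Expensive path: serialize the DOM once, then cache the result.
                cachedOuterXml = base.OuterXml;
                StartListening();
            }
            return cachedOuterXml;
        }
    }

    private void StartListening()
    {
        if (listening) return;
        NodeChanged += OnDomChanged;
        NodeInserted += OnDomChanged;
        NodeRemoved += OnDomChanged;
        listening = true;
    }

    private void Invalidate()
    {
        cachedOuterXml = null;
        if (!listening) return;
        // De-register so later DOM edits do not pay for calling us.
        NodeChanged -= OnDomChanged;
        NodeInserted -= OnDomChanged;
        NodeRemoved -= OnDomChanged;
        listening = false;
    }

    private void OnDomChanged(object sender, XmlNodeChangedEventArgs e)
    {
        Invalidate();
    }
}
```

One consequence of this design is that, until the first change, OuterXml returns the exact string that was loaded, which may differ in whitespace from what XmlDocument itself would serialize.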

That’s really all there is to it. It is simple, and it works as a drop-in replacement for XmlDocument: because it inherits from XmlDocument, you can use it anywhere an XmlDocument is expected. It made a very noticeable improvement to our performance, and if you’re transitioning a lot between DOM and text, it will likely help you quite a bit as well.

December 13, 2006

Understanding the “Atlas” ScriptResourceHandler

Filed under: Atlas — Tags: , , — charlieflowers @ 2:48 pm

Whew! Been very busy the past several months on an Atlas consulting engagement. There has been a lot to learn with Atlas and a lot of changes from Microsoft to adapt to (yes, I know the name changed, but “Atlas” is still the clearest term to use right now).

I recently had to dig into the internals of ScriptResourceHandler.axd, and I wanted to write my findings here for others and for my own sake. Atlas now comes with a new HTTP handler that serves JavaScript code — called “ScriptResourceHandler.axd”. This is a nice, feature-packed little thing, but it also obscures some things a little bit. It is worth it, but to get the most out of it you need a lot more info than I could find anywhere else on the web.

The Big Picture

In a nutshell, the ScriptResourceHandler.axd is meant to give you the following benefits:

1. Let you embed JavaScript code into your assemblies, and have it included from those assemblies into your web pages.

2. Take care of details of serving that script efficiently for you (primarily, making sure the browser caches it, and then compressing it using GZip).

3. Help you with internationalization, by automatically reading a set of resources (i.e., strings that you want to localize), and putting them into an object in JavaScript so your code can reference them.

It is fantastic to have all of this at your fingertips. Atlas itself uses this heavily. For example, the ScriptManager control uses the ScriptResourceHandler to load the “MicrosoftAjax.js” code that defines the foundation of the Microsoft Ajax ClientFx.

Now for some under-the-hood details…

What in the world is that QueryString??

If you view source on an Atlas page that has a ScriptManager, you will see at least one <script> reference whose “src” attribute refers to the handler (it appears in the URL as ScriptResource.axd). For example, you might see the following…


<script src="/MsAjaxClientStuff/ScriptResource.axd?d=9uZzWonH9Yffv82x9JBEOboR-Z7Vtw-y48sFq7UKtpiuxzt2r1pnjhJ9asp1Z8z4gK8HT0imILHAeWKHaiW1FZB5kueEppforSccLkW4LwX0I-Yz-dsEC885R_smzSRH6U9TUvmoTDMWxfIIioA6AzzuM23jKe9-G5HkLa5GW5w1&t=633004023929547001"
    type="text/javascript">
</script>

The QueryString is obviously not meant for human consumption. It is encrypted, and it contains three key pieces of information:

1) The name of the assembly that contains an embedded resource which is the text of the JavaScript that you want to include in the page.

2) The name of the resource within that assembly that contains the JavaScript code you want to include in the page.

3) The culture to be used to select the correct bundle of resources (to over-simplify, this indicates which language you want). If this is not present, then you will get the InvariantCulture.

How is the QueryString encrypted / decrypted?

There is a pair of internal static methods on System.Web.UI.Page, called “EncryptString” and “DecryptString”. These encrypt and decrypt a string that is passed. Atlas uses reflection to call these methods since it otherwise would not be able to access them due to the internal modifier.

How does it find the resource within the assembly?

The primary usage pattern (the only one I plan to cover here) is that you have placed some ScriptResourceAttributes onto your assembly. Each ScriptResourceAttribute lets you declare one particular script resource, and it lets you specify some information that applies to that script resource. At request time, the ScriptResourceHandler then reflects upon your assembly to find the ScriptResourceAttribute that matches the script it is being asked for. In particular, the attribute has a property called “ScriptName” which must match the request’s script name (the second thing in the QueryString from above).

Once it has found the matching ScriptResourceAttribute, it uses info from its properties to proceed. In particular, there are 3 key properties of the ScriptResourceAttribute:

1) ScriptName: must match the resource name from the QueryString. In the case of the main Atlas file, the ScriptName property is “Microsoft.Web.Resources.ScriptLibrary.MicrosoftAjax.js”.

2) ScriptResourceName: this is the name of a .resources file in your assembly that contains a set of resource name/value pairs that your script needs to refer to. In the case of the main Atlas file, the ScriptResourceName property is “Microsoft.Web.Resources.ScriptLibrary.Res”.

3) TypeName: the name of this property is confusing. This is the name of a JavaScript object that will contain all of the resource values collected for your application. (This will become more clear below).

How does it construct the final JavaScript to include in the page?

It sets up an output stream that uses GZip compression, and it configures the HTTP caching policy so that the browser will properly cache the JavaScript. Then, it begins to write to the stream. First, it writes the complete contents of the JavaScript text that was in the resource named by ScriptResourceAttribute.ScriptName.

Then, it writes your resource values. To do this, it generates JavaScript that instantiates an object. The object’s name comes directly from your ScriptResourceAttribute’s “TypeName” property. For the Atlas primary file, this TypeName property is set to “Sys.Res”. Therefore, you get something like the following:


Sys.Res = {
    someResourceName: "Some value for the resource",
    // etc
};

The values that are set inside of the object (inside of Sys.Res, for example), come directly from an embedded .resources file in your assembly whose name matches the ScriptResourceAttribute’s ScriptResourceName property. Each entry in a .resources file is a name-value pair, and ScriptResourceHandler merely translates each pair into a property declaration on the object (it uses the JavaScriptSerializer to write the value of each resource).
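As a rough illustration of that translation step, the sketch below turns a set of name/value pairs into an object declaration. It is not the handler's real code: the class and method names are hypothetical, and a simple hand-written escape stands in for the JavaScriptSerializer.

```csharp
using System.Collections.Generic;
using System.Text;

public static class ResourceScriptSketch
{
    // Emits "<typeName> = { name: "value", ... };" from resource name/value pairs.
    public static string BuildResourceObject(string typeName,
        IDictionary<string, string> resources)
    {
        var script = new StringBuilder();
        script.Append(typeName).Append(" = {");

        bool first = true;
        foreach (KeyValuePair<string, string> pair in resources)
        {
            if (!first)
                script.Append(",");
            first = false;

            script.Append("\n  ").Append(pair.Key)
                  .Append(": \"").Append(Escape(pair.Value)).Append("\"");
        }

        script.Append("\n};");
        return script.ToString();
    }

    // Minimal escaping for backslashes and double quotes only (an assumption;
    // a real serializer handles many more cases).
    private static string Escape(string value)
    {
        return value.Replace("\\", "\\\\").Replace("\"", "\\\"");
    }
}
```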

So, now your client code has had the correct set of resources selected for the culture you want, and those values are “automagically” available to your client code.

One little extra thing here — if the script that was requested ends in “.debug.js”, the ScriptResourceHandler will do everything mentioned thus far, and also it will find a second ScriptResourceAttribute, for the same script without “.debug.js”. It will then get the .resources file for it, and include all of those resources in the same JavaScript object. This looks like a little hack that Microsoft added when they decided they wanted to support debug and non-debug versions of scripts. Makes sense — keeps you from having to duplicate all of the name/value pairs when providing a debug script and a non-debug script.

Finally, ScriptResourceHandler does one more nice little thing for you: it includes the “end-of-script script” that Atlas would like all script files to include. This is as follows:

if (typeof(Sys) !== 'undefined') Sys.Application.notifyScriptLoaded();

This script basically says, “If I am on an Atlas page, then let me call the Application.notifyScriptLoaded() method to tell Atlas that my script has finished loading.”

Summary

So, the ScriptResourceHandler has found your JavaScript from within your assembly, served it with compression and proper configuration for HTTP caching, and also extracted all the resource name/value pairs that your script needs and made those values available to your JavaScript on the page.

Note that circumventing the ScriptResourceHandler can be one of the causes of “Sys.Res.xxx is not an object” errors. Why? Because “Sys.Res” is the name of the Atlas client-side object that contains the resource values that Atlas wants to use. If you bypass the ScriptResourceHandler, then nothing will create the Sys.Res object for you. (Here’s a post that touches on that a little bit, because it talks about what you have to do to use the MS Ajax client-side framework (ClientFx) without using ASP.NET on the server).

July 5, 2006

Generics and XPATH — a beautiful match

Filed under: C# — Tags: , , , — charlieflowers @ 8:19 pm

Generics are sweet. Here’s a simple little example that let me cut down on repetitious code when working with XPath.
If you know a little XPath, then you know that it lets you specify a string that contains an “XPath query”, and that query will return one or more XML nodes that match it. The nodes might be XML attributes or XML elements (or, of course, any of the other types of XML nodes), depending on your query.
I found myself needing to write code that obtains a single “required” node in an Xml document. By “required”, I mean that I wanted to do an XPath query for the node, and if no match was found, I wanted to throw an exception saying “The xpath query ‘/whatever’ has no match, but exactly one match was expected.”
And of course, the kinds of Xpath queries I commonly needed were those to get either a required Element or a required Attribute. Without generics, I would have had to do something like this.

public static XmlAttribute GetRequiredXmlAttribute(XmlDocument doc, string xpath)
{
	XmlNode node = doc.SelectSingleNode(xpath);

	if (node == null)
	{
		throw new Exception("The xpath '" + xpath + "' has no matches, but exactly one match is required.");
	}

	XmlAttribute attribute = node as XmlAttribute;

	if (attribute == null)
	{
		// There was a match, but it is not an XmlAttribute.
		throw new Exception("The xpath '" + xpath + "' matches a node of type '" + node.GetType().FullName + "', which is not an XmlAttribute.");
	}

	return attribute;
}

This code is an absolute POSTER BOY for generics. More than half the battle in learning a new technology is in understanding the motivation for it. If you understand that certain XPath queries will always match an XmlAttribute and other XPath queries will always match an XmlElement, and you know that you normally have to do a lot of type-checking and casting to figure out which kind of Xml node you’ve got, then you are looking at one of the key motivations behind generics.
Here’s the generic version of the code – very nice!

public static T GetRequiredNodeFromSourceNode<T>(XmlNode sourceNode, string requiredXpath) where T : XmlNode
{
	XmlNode node = sourceNode.SelectSingleNode(requiredXpath);

	if (node == null)
	{
		throw new ArgumentException("Tried to extract the path '" + requiredXpath + "', but nothing was found for that xpath.");
	}

	T result = node as T;

	if (result == null)
	{
		throw new ArgumentException("The xpath you provided points to a node of type " + node.GetType().FullName +
		", which cannot be cast to type " + typeof(T).FullName + ".");
	}

	return result;
}

See, generics lets you express something that you always knew about, but were previously unable to express. You knew that some XPaths returned elements while others returned attributes, but .NET 1.x did not give you a way to express that in your code. Now, generics does.
Here’s some code that uses the above generic method:

XmlDocument doc = new XmlDocument();
doc.Load(@"C:\someFile.xml");   // Load reads a file; LoadXml expects the XML text itself

XmlAttribute attribute = GetRequiredNodeFromSourceNode<XmlAttribute>(doc, "/root/@someAttribute");
XmlElement element = GetRequiredNodeFromSourceNode<XmlElement>(doc, "/root/someElement");

// This will give one of our exceptions, because this xpath syntax always returns an element.
XmlAttribute attributeFail = GetRequiredNodeFromSourceNode<XmlAttribute>(doc, "/root");

// This will give one of our exceptions, because this xpath syntax always returns an attribute.
XmlElement elementFail = GetRequiredNodeFromSourceNode<XmlElement>(doc, "/root/@hello");

What it boils down to:
Some XPath queries are “typed” by nature — certain queries always return an XmlAttribute while others always return an XmlElement. However, before generics C# gave you no way to express that fact without resorting to the common base class, XmlNode. Generics addresses this exact problem. So you can write less code and have it cover more ground (for example, this code works for XmlComments, processing instructions, and whatever other kinds of Xml nodes you might need to deal with in the future).
What’s also interesting about this example is that it lets you be strongly typed even when you don’t know what your return type will be. If you’ve programmed in C# for a while, this is probably something you “felt the need for” at one time or another, but it couldn’t be achieved before generics.
