Over the years working with C#, I've seen a little bit of everything. One of my pet peeves are a set of issues I've seen way too many times involving incorrect use of casts, the is/as operators and the ToString() methods. Most of the times these things are misused in C# is clearly a matter of either a) not understanding the language features (hardly forgivable) or b) not understanding the code you're writing (utterly unforgivable).

Let's look then at some of the patterns and anti-patterns surrounding these topics.

The is operator

The is operator in C# is used to check if a given object is compatible with a given type using only reference conversions or boxing/unboxing conversions (i.e. no user-defined conversions are considered). Not much wrong can be done with this operator, fortunately, but there is one important pattern to consider performance-wise: Avoid following the use of the is operator with a cast, like this:

if ( obj is Employee )

{

   Employee employee = (Employee)obj;

   // do something

}

In the above sample, you're pretty much doing the work twice. it's less expensive to use the as operator instead followed by a proper check for null, like this:

Employee employee = obj as Employee;
if ( employee != null )
{
   // do something
}

Normally you only want to use the is operator when you want to make a type check but you do not need to do a cast to manipulate the object using the (the exception would be when testing against a value type, as this is not possible). Everytime you're tempted to use the is operator, consider carefully the context and see if using the as operator makes more sense instead.

After that, go one step further and consider whether the type check is at all required. In many cases, you can avoid it altogether by moving functionality into the object itself and rely on the type system to do the work for you.

The as operator

The as operator is a little bit more problematic in the field. It is basically a cast operation that just returns null if it fails instead of throwing an InvalidCastException (similar to dynamic_cast<> in C++ in this regard), and as such it is very useful. However, a lot of people use it incorrectly as well.

An instance of the as operator should always be followed by a check against null. If not, then you're using the wrong tool for the job. Our last code example shows a correct use of the as operator. The following one shows an incorrect use of it:

Employee employee = obj as Employee;
employee.Fire();

The problem with this code is that if the cast fail, you'll get a null reference which you'll try to use right away, causing a NullReferenceException to be thrown. NullReferenceExceptions are bad, they are usually hard to debug unless you got a full stack trace with full debugging information. In a case like this, the tool you'll want to be using is a direct cast instead:

Employee employee = (Employee)obj;
employee.Fire();

Can it still fail? Yes, for sure. However, you'll get an InvalidCastException instead of a NullReferenceException, which is not only more clear, but it is also a lot easier to diagnose in restricted scenarios (it's far easier to spot accurately a place where an InvalidCastException can occur using visual inspection of the code than it is a NullReferenceException.

Some people will argue that the use of the as operator vs. casts is a matter of style. My opinion is that's horse dodo; it's a matter of understanding the language and using the proper tool at the proper type. Saying you'll only use one to be consisten is like saying you'll only use 32-bit integers and avoiding bytes and shorts to be consistent.

Casts and ToString()

One last issue I've seen way too much is peopler confusing when a type conversion is necessary and when a simple type coercion (cast) should be enough.

The most common case of this is in accessing values in an untyped dataset. Many people seem to be under the impression that because they are dealing with an untyped datasets, the values stored in the rows have no type at all, and insist on doing unnecessary calls to ToString() to extract the values instead of using a simple type cast [1]. So, you might see a lot of code like this:

DataRow row = dataset.Tables[0].Rows[0];
string value = row["column"].ToString();

This is unnecessary if the value stored in the row already was a string! Furthermore, if the value is null you'll cause a likely unneeded NullReferenceException which would've been avoided by simply casting instead:

DataRow row = dataset.Tables[0].Rows[0];
string value = (string)row["column"];

Sometimes it goes even worse. Many people will follow the ToString() call by a completely unnecessary Parse() call, in order to extract a value type, when a simple unbox operation would've been enough:

DataRow row = dataset.Tables[0].Rows[0];
int value = Int32.Parse(row["column"].ToString());

Not only is this highly inneficient, it is also very error prone. Several things can go wrong besides the obvious:

  • The conversion process will be affected by the default culture currently in place
  • There's no guarantee that the value returned by ToString() is in a format that can be directly parsed by Parse() safely in several scenarios (roundtrip), which can lead to intermittent errors (or unexpected values).

As if this wasn't bad enough, I've seen several variations of this WTF, including:

  • Using ToString()/Parse() to convert between types of different precision, like converting a decimal to a double.
  • Not being aware of numeric type suffixes for literal values (like M, L, U and so on), and doing things like decimal.Parse("1.65") to get around it.
  • Using Enum.Parse() to convert between an integral value and the corresponding enumerated value and the other way around.

And yes, every single one of those I've seen around, as unbelievable as it may seem :-). I've had good luck using a simple regular expression search in Visual Studio to spot these, using this pattern: \.Parse\(.+\.ToString.

Conclusion

Please make sure you and your teammates understand the code and the implications of casts and conversion operators and make good use of them. You'll save yourself a lot of grief in the long run, believe me!

[1] Even if you're using an untyped dataset, most of the type the dataset columns will reflect the correct type because of the schema inference done by the data adapter when the dataset is filled; if you did things right, that is.

Technorati: ,


Tomas Restrepo

Software developer located in Colombia. Sr. PFE at Microsoft.