Flow Control Classifier Sample

June 9th, 2009

I’ve been doing my first experiments with extending Visual Studio 2010 beta 1 using the recently released SDK. It’s been lots of fun, but it can also be frustrating at times because the new extensibility model is huge and you don’t always know where to get started.

My first sample extension is a custom Classifier that changes how flow control keywords get rendered in the editor in C# and C/C++ so that they are easy to separate at a glance from other, more structural keywords. This is a common feature in other IDEs, but Visual Studio has always lumped all language keywords in a single category, which is a bit less useful.

Here’s how code might look with the Flow Control Classifier installed (C#):

[Screenshot: vs10_fcext_1]

One of the really nice things about how classifiers work in VS2010 is that they get recognized by the IDE automatically, and new entries for any custom classifications are added in the Fonts And Colors dialog so that the user can tweak the appearance. After installing the sample, you can adjust the settings for the “Flow Control Keyword” category to make the colors match your Visual Studio theme.

Much of the code in this sample is based on the Custom Classifier project template that comes with the VS2010 SDK, but I tweaked it quite a bit to make it easier to work with. You can find the code for the sample on the FlowControlClassifier repository at GitHub.

There are some interesting aspects of this extension that are worth commenting on so that I remember them later:

Controlling which Editors are Extended

The Visual Studio extensibility model provides an easy way to control which editors your custom classifier runs on, through the [ContentType] attribute on your Classifier Provider class.

The default project generated by the project template specifies the “text” content type, which causes the classifier to run in every text editor. For this sample, however, I changed that and added the attribute twice: once for C# (which uses the “CSharp” content type), and once for C++ (which uses the “C/C++” content type).
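As a rough sketch of what that looks like, assuming the release VS2010 SDK API (beta 1 may differ in details): the provider and classifier class names here are hypothetical, and the classifier is only a stub so the snippet stands on its own.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.Composition;
using Microsoft.VisualStudio.Text;
using Microsoft.VisualStudio.Text.Classification;
using Microsoft.VisualStudio.Utilities;

// Hypothetical provider name. The two [ContentType] attributes restrict
// the classifier to C# and C/C++ buffers instead of every "text" buffer.
[Export(typeof(IClassifierProvider))]
[ContentType("CSharp")]
[ContentType("C/C++")]
internal sealed class SampleClassifierProvider : IClassifierProvider
{
    public IClassifier GetClassifier(ITextBuffer buffer)
    {
        // One classifier instance per buffer, cached in the buffer's property bag.
        return buffer.Properties.GetOrCreateSingletonProperty(
            () => new StubClassifier());
    }
}

// Stub only: a real classifier would return spans for the keywords it finds.
internal sealed class StubClassifier : IClassifier
{
    public event EventHandler<ClassificationChangedEventArgs> ClassificationChanged
    {
        add { }
        remove { }
    }

    public IList<ClassificationSpan> GetClassificationSpans(SnapshotSpan span)
    {
        return new List<ClassificationSpan>();
    }
}
```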

The second aspect of this is, of course, checking at runtime which content type you’re running against. For the FlowControlClassifier, I wanted to be able to use a different list of keywords to highlight for each language. Though C# and C/C++ are largely similar, this makes it easier to support other languages later.

The way to do this is simple: when the classifier runs, I check the ContentType property of the TextBuffer instance associated with the span the classifier is currently classifying:

string[] keywords =
   GetKeywordsByContentType(span.Snapshot.TextBuffer.ContentType);
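The body of GetKeywordsByContentType() isn't shown here, so the following is only one plausible shape for it. The class name and both keyword lists are illustrative assumptions; the lists in the actual sample may well differ. IContentType.IsOfType() also matches content types derived from the one named.

```csharp
using Microsoft.VisualStudio.Utilities;

internal static class KeywordLists
{
    // Illustrative lists only, not the ones from the sample.
    private static readonly string[] CSharpKeywords =
    {
        "if", "else", "while", "do", "for", "foreach", "switch", "case",
        "default", "break", "continue", "return", "goto", "throw",
        "try", "catch", "finally"
    };

    private static readonly string[] CppKeywords =
    {
        "if", "else", "while", "do", "for", "switch", "case",
        "default", "break", "continue", "return", "goto", "throw",
        "try", "catch"
    };

    public static string[] GetKeywordsByContentType(IContentType contentType)
    {
        if (contentType.IsOfType("CSharp"))
        {
            return CSharpKeywords;
        }
        if (contentType.IsOfType("C/C++"))
        {
            return CppKeywords;
        }
        // Unknown content type: nothing to highlight.
        return new string[0];
    }
}
```

Adding another language then comes down to one more list and one more IsOfType() check.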

Overriding the Default Look

Another cool thing that the new classifier model provides is that the format specifications provided by classifications are merged together before rendering. That means that a single span in the editor might get classified two different ways by two different classifiers, and the editor can merge those together to produce a single look.

For example, one classifier might say it wants it in bold, while the other might only say it wants it in blue. The editor can then make the span bold and blue.

Fortunately, you can tell the editor how you want that merging process to work. In the FlowControlClassifier, I wanted to always override the default look already provided by the language service classification (i.e. the format specified in the Keyword option in the Fonts And Colors dialog). To do this, I tell VS that I want my classification format to be applied after the default classifier and with a high priority:
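The original snippet is not reproduced here, so this is a sketch of how such a format definition can be declared against the release VS2010 SDK API; beta 1 may spell some of this differently. The "FlowControl" classification name, the class names, and the color are assumptions made for illustration. The [Order] attribute is what controls where the format sits in the merge order.

```csharp
using System.ComponentModel.Composition;
using System.Windows.Media;
using Microsoft.VisualStudio.Text.Classification;
using Microsoft.VisualStudio.Utilities;

internal static class FlowControlClassificationDefinition
{
    // Registers the classification type itself (the name is an assumption).
    [Export(typeof(ClassificationTypeDefinition))]
    [Name("FlowControl")]
    internal static ClassificationTypeDefinition FlowControlType = null;
}

// Describes how spans of that classification type are rendered.
[Export(typeof(EditorFormatDefinition))]
[ClassificationType(ClassificationTypeNames = "FlowControl")]
[Name("FlowControl")]
[UserVisible(true)]               // show the entry in Fonts And Colors
[Order(After = Priority.High)]    // ordered after the built-in priority markers
internal sealed class FlowControlFormat : ClassificationFormatDefinition
{
    public FlowControlFormat()
    {
        DisplayName = "Flow Control Keyword";   // label in Fonts And Colors
        ForegroundColor = Colors.OrangeRed;     // arbitrary example color
        IsBold = true;
    }
}
```

Whether Priority.Default or Priority.High is the right anchor depends on which other formats are registered, so treat the ordering value as something to experiment with.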

Parsing the Content

From what I can see, the editor interacts with the classifier by calling your GetClassificationSpans() method once for each visible line in the buffer at a given time. Although the editor gives you a span covering a single line of text at a time, you can ask for the text before or beyond the given span, though there’s good reason to avoid doing that as much as possible.

Obviously you can’t rely on it working exactly that way (many things can affect when the method gets called), but it does tell us a lot about how the editor works. In particular, classifying only the visible portion of the buffer helps keep the editor snappy.

In my initial attempt at writing the classifier, I started manually parsing the text given (which can be retrieved by calling span.GetText()) to look for keywords and generate ClassificationSpans for each one I found.

This seemed to work at first, until I realized this was undesirable for a very simple reason: I couldn’t just look for the words in the text, without being aware of the context they were used in.

For example, “for” is a keyword I’m interested in, but obviously not if it appears inside a quoted string or a comment. I could work around that with a bit of work, but even that fell apart real quick, since I couldn’t handle multi-line comments or multi-line string literals when I was just seeing the text one line at a time. And parsing back through the buffer to check for context was simply not an appealing option.

Fortunately, after I wondered aloud on Twitter whether there might be a way to reuse the classifications already done by the underlying language service, Noah Richards came to the rescue and provided a fantastic sample on how to do just that: http://gist.github.com/125540

Basically, the sample shows how to use the Classifier Aggregator service provided by the IDE, which aggregates and merges all the classifications returned by the registered classifiers, and ask it to run the classifiers on the given span of text. Then all you have to do is go through the classifications returned, look for those the language service marked as keywords, and filter those down based on our own list.
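A minimal sketch of that filtering step, not the code from Noah's sample: it assumes the classifier is handed the IClassifier obtained from IClassifierAggregatorService.GetClassifier(buffer), the custom classification type, and the keyword list. The class name is hypothetical, and so is the assumption that the language service's classification names contain the word "keyword".

```csharp
using System;
using System.Collections.Generic;
using Microsoft.VisualStudio.Text;
using Microsoft.VisualStudio.Text.Classification;

internal sealed class KeywordFilterClassifier : IClassifier
{
    private readonly IClassifier aggregator;               // result of IClassifierAggregatorService.GetClassifier(buffer)
    private readonly IClassificationType flowControlType;  // our custom classification type
    private readonly HashSet<string> keywords;             // keywords we want to restyle

    public KeywordFilterClassifier(
        IClassifier aggregator,
        IClassificationType flowControlType,
        IEnumerable<string> keywords)
    {
        this.aggregator = aggregator;
        this.flowControlType = flowControlType;
        this.keywords = new HashSet<string>(keywords);
    }

    // This sketch never raises the event.
    public event EventHandler<ClassificationChangedEventArgs> ClassificationChanged
    {
        add { }
        remove { }
    }

    public IList<ClassificationSpan> GetClassificationSpans(SnapshotSpan span)
    {
        var result = new List<ClassificationSpan>();

        // Let the other classifiers (including the language service) do the parsing.
        foreach (ClassificationSpan existing in aggregator.GetClassificationSpans(span))
        {
            // Assumption: keyword classifications have "keyword" in their name.
            string name = existing.ClassificationType.Classification;
            if (name.IndexOf("keyword", StringComparison.OrdinalIgnoreCase) < 0)
            {
                continue;
            }

            // Keep only the keywords on our own list.
            if (keywords.Contains(existing.Span.GetText()))
            {
                result.Add(new ClassificationSpan(existing.Span, flowControlType));
            }
        }

        return result;
    }
}
```

Because the language service has already decided what is a keyword, words inside strings and comments never reach the keyword check.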

The only tricky part is some extra work in our classifier provider implementation so that, when we call into the aggregator ourselves, our own classifier is excluded from the process. This is very important; otherwise we’d get the editor stuck in an infinite recursion and blow it up real quick.
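One way to sketch that guard, loosely following the idea described above rather than reproducing Noah's code: a flag is set while the provider asks the aggregator for its classifier, and any re-entrant request during that time is turned away. All class names are hypothetical, the wrapper classifier is a stub, and returning null for the re-entrant call assumes the aggregator skips providers that return no classifier.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.Composition;
using Microsoft.VisualStudio.Text;
using Microsoft.VisualStudio.Text.Classification;
using Microsoft.VisualStudio.Utilities;

[Export(typeof(IClassifierProvider))]
[ContentType("CSharp")]
[ContentType("C/C++")]
internal sealed class GuardedClassifierProvider : IClassifierProvider
{
    [Import]
    internal IClassifierAggregatorService AggregatorService = null;

    // True while we are inside our own call into the aggregator.
    private static bool ignoreRequest;

    public IClassifier GetClassifier(ITextBuffer buffer)
    {
        // The aggregator asks every provider, including this one. Opt out
        // of that nested request so we don't recurse forever.
        if (ignoreRequest)
        {
            return null;
        }

        try
        {
            ignoreRequest = true;
            return buffer.Properties.GetOrCreateSingletonProperty(
                () => new WrappingClassifier(AggregatorService.GetClassifier(buffer)));
        }
        finally
        {
            ignoreRequest = false;
        }
    }
}

// Stub wrapper: the real classifier would filter the inner classifier's spans.
internal sealed class WrappingClassifier : IClassifier
{
    private readonly IClassifier inner;

    public WrappingClassifier(IClassifier inner)
    {
        this.inner = inner;
    }

    public event EventHandler<ClassificationChangedEventArgs> ClassificationChanged
    {
        add { }
        remove { }
    }

    public IList<ClassificationSpan> GetClassificationSpans(SnapshotSpan span)
    {
        return new List<ClassificationSpan>();
    }
}
```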

All the credit for this goes to Noah; I wouldn’t even have known where to start to make this happen :).

Conclusion

All in all, I’ve been having lots of fun working on this little extension, and I’m looking forward to extending it to support a number of other things as well. It’s pretty cool that you can do things like this with relative ease in the new editor, and this is one I sure plan on using day to day.

Not everything is rosy, of course. It takes a while to get used to the extensibility model, as it’s pretty big, and despite the support that MEF provides for hooking stuff together, you still have to be pretty careful about the values you provide in the attributes: if some of them don’t match, you might find that things don’t work correctly, and there’s no easy way to figure out what’s going on.

Update: I've renamed the project to Keyword Classifier and extended it to support new features. Details here.


Tomas Restrepo

Software developer located in Colombia.