Having Fun with WinDBG

I’ve been spending lots of quality time with WinDBG and the rest of the Windows Debugging Tools, and ran into something I thought would be fun to share.

For the sake of keeping it simple, let’s say I have a sample console application that looks like this:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Runtime.CompilerServices;

class Program {
  static void Main(string[] args) {
    Program p = new Program();
    for ( int i = 0; i < 10; i++ ) {
      p.RunTest("Test Run No. " + i, i);
    }
  }
  [MethodImpl(MethodImplOptions.NoInlining)]
  public void RunTest(String msg, int executionNumber) {
    Console.WriteLine("Executing test");
  }
}

Now, imagine I’m debugging such an application and I’d like to figure out what is being passed as parameters to the RunTest() method, given that the application doesn’t actually print those values. This may seem contrived, but a classic real-world case just like it is a method that throws an ArgumentException because of a bad parameter, where the exception message doesn’t say what the offending value actually was.
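For instance, something like this hypothetical method (made up for illustration; not part of the sample above) would leave you guessing at the actual value:

using System;

class OrderProcessor {
  public void Process(string orderId) {
    if (!orderId.StartsWith("ORD-"))
      // The exception tells you the argument was bad, but not what its value was.
      throw new ArgumentException("Order id has an invalid format", "orderId");
  }
}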

For the purposes of this post, I’ll be compiling using release x86 as the target and running on 32-bit Windows. Now, let’s start a debug session on this sample application. Right after running it in the debugger, it will break right at the unmanaged entry point:

Microsoft (R) Windows Debugger Version 6.11.0001.404 X86
Copyright (c) Microsoft Corporation. All rights reserved.

CommandLine: .\DbgTest.exe
Symbol search path is: *** Invalid ***
****************************************************************************
* Symbol loading may be unreliable without a symbol search path.           *
* Use .symfix to have the debugger choose a symbol path.                   *
* After setting your symbol path, use .reload to refresh symbol locations. *
****************************************************************************
Executable search path is:
ModLoad: 012f0000 012f8000   DbgTest.exe
ModLoad: 777f0000 77917000   ntdll.dll
ModLoad: 73cf0000 73d3a000   C:\Windows\system32\mscoree.dll
ModLoad: 77970000 77a4c000   C:\Windows\system32\KERNEL32.dll
(f9c.d94): Break instruction exception - code 80000003 (first chance)
eax=00000000 ebx=00000000 ecx=001cf478 edx=77855e74 esi=fffffffe edi=7783c19e
eip=77838b2e esp=001cf490 ebp=001cf4c0 iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for ntdll.dll -
ntdll!DbgBreakPoint:
77838b2e cc              int     3

Now let’s fix the symbol path and also make sure SOS is loaded at the right time:

0:000> .sympath srv*C:\symbols*http://msdl.microsoft.com/download/symbols
Symbol search path is: srv*C:\symbols*http://msdl.microsoft.com/download/symbols
Expanded Symbol search path is: srv*c:\symbols*http://msdl.microsoft.com/download/symbols
0:000> .reload
Reloading current modules
......
0:000> sxe -c ".loadby sos clr" ld:mscorlib
0:000> g
(1020.12c8): Unknown exception - code 04242420 (first chance)
ModLoad: 70b80000 71943000   C:\Windows\assembly\NativeImages_v4.0.30319_32\mscorlib\246f1a5abb686b9dcdf22d3505b08cea\mscorlib.ni.dll
eax=00000001 ebx=00000000 ecx=0014e601 edx=00000000 esi=7ffdf000 edi=20000000
eip=77855e74 esp=0014e5dc ebp=0014e630 iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246
ntdll!KiFastSystemCallRet:
77855e74 c3              ret

At this point, managed code is not executing yet, but we’ve got SOS loaded. Now, what I’d like to do is set an initial breakpoint in the RunTest() method. Because it’s a managed method, we’d normally need to wait until it is jitted to be able to grab the generated code entry point. Instead of doing all that work, I’ll just use the !BPMD command included in SOS to set a pending breakpoint [1] on it, then resume execution:

0:000> !BPMD DbgTest.exe Program.RunTest
Adding pending breakpoints...
0:000> g
(110c.121c): CLR notification exception - code e0444143 (first chance)
(110c.121c): CLR notification exception - code e0444143 (first chance)
(110c.121c): CLR notification exception - code e0444143 (first chance)
JITTED DbgTest!Program.RunTest(System.String, Int32)
Setting breakpoint: bp 001600D0 [Program.RunTest(System.String, Int32)]
Breakpoint 0 hit
eax=000d37fc ebx=0216b180 ecx=0216b180 edx=0216b814 esi=0216b18c edi=00000000
eip=001600d0 esp=002fece0 ebp=002fecf4 iopl=0         nv up ei pl nz na po nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000202
001600d0 55              push    ebp

Now the debugger has stopped execution on the first call to RunTest, so we can actually examine the values of the method arguments:

0:000> !CLRStack -p
OS Thread Id: 0x121c (0)
Child SP IP       Call Site
002fece0 001600d0 Program.RunTest(System.String, Int32)
    PARAMETERS:
        this () = 0x0216b180
        msg () = 0x0216b814
        executionNumber (0x002fece4) = 0x00000000

So the first parameter is the this pointer, since RunTest is an instance method. The msg parameter is a string, so let’s examine that as well:

0:000> !dumpobj -nofields 0x0216b814
Name:        System.String
MethodTable: 70e9f9ac
EEClass:     70bd8bb0
Size:        42(0x2a) bytes
File:        C:\Windows\Microsoft.Net\assembly\GAC_32\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
String:      Test Run No. 0

Now let’s look at this at a slightly lower level:

0:000> kbn3
 # ChildEBP RetAddr  Args to Child
WARNING: Frame IP not in any known module. Following frames may be wrong.
00 002fecdc 001600ab 00000000 00000000 004aa100 0x1600d0
01 002fecf4 727221db 002fed14 7272e021 002fed80 0x1600ab
02 002fed04 72744a2a 002fedd0 00000000 002feda0 clr!CallDescrWorker+0x33
0:000> !IP2MD 0x1600d0
MethodDesc:   000d37fc
Method Name:  Program.RunTest(System.String, Int32)
Class:        000d1410
MethodTable:  000d3810
mdToken:      06000002
Module:       000d2e9c
IsJitted:     yes
CodeAddr:     001600d0
Transparency: Critical
0:000> !IP2MD 0x1600ab
MethodDesc:   000d37f0
Method Name:  Program.Main(System.String[])
Class:        000d1410
MethodTable:  000d3810
mdToken:      06000001
Module:       000d2e9c
IsJitted:     yes
CodeAddr:     00160070
Transparency: Critical

Here we see the top 3 stack frames, including the first 3 parameters to each call, and from the !IP2MD calls you can see the first two frames are the calls to RunTest() and Main(), just as we would expect.

The parameters displayed by the kb command, however, seem a bit weird for the RunTest call: 00000000 00000000 004aa100. These are, literally, the values on the stack:

0:000> dd esp L8
002fece0  001600ab 00000000 00000000 004aa100
002fecf0  002fed20 002fed04 727221db 002fed14

Notice that at the top of the stack we have the return address to the place in Main() where the method call happened, followed by the “3 parameters” displayed by kb. However, this isn’t actually correct.

The CLR uses a calling convention that somewhat resembles the FASTCALL convention: the left-most parameter is passed in the ECX register, the next one in EDX, and the rest on the stack. In our case, this means that the value of the this pointer will go in ECX:

0:000> r ecx
ecx=0216b180
0:000> !dumpobj ecx
Name:        Program
MethodTable: 000d3810
EEClass:     000d1410
Size:        12(0xc) bytes
File:        C:\temp\DbgTest\bin\release\DbgTest.exe
Fields:
None

It also means that the msg argument will go in EDX:

0:000> r edx
edx=0216b814
0:000> !dumpobj -nofields edx
Name:        System.String
MethodTable: 70e9f9ac
EEClass:     70bd8bb0
Size:        42(0x2a) bytes
File:        C:\Windows\Microsoft.Net\assembly\GAC_32\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
String:      Test Run No. 0

So the executionNumber argument goes on the stack, and we’ll find it at [esp+4]:

0:000> dd [esp+4] L1
002fece4  00000000

We can even disassemble the small piece of code in Main that calls RunTest by backing up a bit from the current return address, and you’ll see how the value of i is pushed onto the stack from the edi register and how ecx and edx are likewise prepared for the call:

0:000> u 001600ab-12 L8
0016009b e87076b870      call    mscorlib_ni+0x2b7710 (70e37710)
001600a0 57              push    edi
001600a1 8bd0            mov     edx,eax
001600a3 8bcb            mov     ecx,ebx
001600a5 ff1504380d00    call    dword ptr ds:[0D3804h]
001600ab 47              inc     edi
001600ac 83ff0a          cmp     edi,0Ah

Knowing all this, if we wanted to print out the values of the msg and executionNumber parameters on all remaining calls to RunTest, we could replace the breakpoint setup by the !BPMD command with a regular breakpoint that executes a command and then continues execution. This would look something like this:

0:000> * remove existing breakpoint
0:000> bc 0
0:000> * check start address of RunTest
0:000> !name2ee DbgTest.exe Program.RunTest
Module:      000d2e9c
Assembly:    DbgTest.exe
Token:       06000002
MethodDesc:  000d37fc
Name:        Program.RunTest(System.String, Int32)
JITTED Code Address: 001600d0
0:000> * set breakpoint
0:000> bp 001600d0 "!dumpobj -nofields edx; dd [esp+4] L1; g"
0:000> g
Name:        System.String
MethodTable: 70e9f9ac
EEClass:     70bd8bb0
Size:        42(0x2a) bytes
File:        C:\Windows\Microsoft.Net\assembly\GAC_32\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
String:      Test Run No. 1
002fece4  00000001
Executing test
Name:        System.String
MethodTable: 70e9f9ac
EEClass:     70bd8bb0
Size:        42(0x2a) bytes
File:        C:\Windows\Microsoft.Net\assembly\GAC_32\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
String:      Test Run No. 2
002fece4  00000002
Executing test
Name:        System.String
MethodTable: 70e9f9ac
EEClass:     70bd8bb0
Size:        42(0x2a) bytes
File:        C:\Windows\Microsoft.Net\assembly\GAC_32\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
String:      Test Run No. 3
002fece4  00000003
...

As you can see, we’re indeed getting the values of both our arguments without problems in the debugger log (which we could easily write to a file using the .logopen command). This is a simple scenario, but it can still prove useful sometimes. Of course, you could argue that going through all these contortions is over the top, given that the !CLRStack -p command can give you the parameters to each function in the call stack. The answer is that !CLRStack doesn’t make it easy to dump just the first frame, nor does it combine with other commands so that you can easily use !DumpObj on the parameter values.

[1] If !BPMD doesn’t seem to work, it’s likely because the CLR debugger notifications are disabled. See this post on how to fix it (for .NET 4.0, just remember to replace mscorwks with clr).


InitializeSecurityContext() and SEC_E_INSUFFICIENT_MEMORY

I’ve been doing some work recently with the Security Support Provider Interface (SSPI) API in Windows, particularly with the Kerberos package. I had used SSPI a lot before (see my WSSPI library for stuff I did with it years ago), but mostly with NTLM and not much with Kerberos.

Overall, using Kerberos is not much different from NTLM. However, one thing that kept me chasing ghosts was getting a SEC_E_INSUFFICIENT_MEMORY error code from my calls to InitializeSecurityContext().

All the documentation says about the error is “There is not enough memory available to complete the requested action”, but I was pretty sure all my buffer handling was correct and had enough space to receive the generated token.

After beating my head against this and running into it time and again, I realized one condition that would trigger it: an incorrect Service Principal Name! In particular, in my case, it was an SPN without the Fully Qualified Domain Name of the target service host.
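To illustrate (the service and host names here are made up), the difference boiled down to something like this:

// Hypothetical example: target names passed as pszTargetName to InitializeSecurityContext().
string badSpn  = "HTTP/appserver";                  // short host name only; in my case this triggered SEC_E_INSUFFICIENT_MEMORY
string goodSpn = "HTTP/appserver.corp.example.com"; // SPN including the target host's FQDN; this one worked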

I’m writing this down just so that I have a clue what to watch out for next time it comes up!

Why GitHub?

Nick Heppleston, a fellow BizTalk blogger and user of my PipelineTesting library, left a comment on a recent post asking why I chose to put the library code on GitHub instead of CodePlex. I think it's a fair question, so let me provide some context.

As many of you are probably aware by now, there has been much talk lately about Distributed Version Control Systems (DVCS) as an alternative to the more traditional, centralized systems that have been common in the past (and still are). DVCS has gained a lot of traction lately, particularly with Open Source projects, because it really suits the already distributed nature of Open Source development.

For a long time I remained fairly skeptical of DVCS tools. To be honest, I just didn't get what the fuss was about, and centralized systems had worked just fine for me. I use CVS, Subversion and Team Foundation Server on a regular basis, and you can use all of them successfully with your projects. Obviously, each one has its strengths and issues, but they are all very usable tools.

However, during the last year I've been working on a bunch of different projects where the workflow that best suited my work style and requirements made using a centralized source control system harder than it used to be.

This made me realize that for some of the things I do, a centralized system just doesn't cut it anymore. In other words, I crossed some invisible threshold where the centralized system stopped being an asset and started becoming a liability. Instead of source control being a painless, frictionless process, it was becoming something I dreaded dealing with. And that's when I finally understood what DVCS was all about.

Why GIT?

So at that point I started looking into DVCS tools and playing a bit with them. There's a good discussion of some of the most important DVCS tools around, but in the end I settled on git, using the msysgit installation on my Windows machines.

So far, I haven't run into any really significant issues with msysgit; the core stuff seems pretty solid, at least in my experience. I know there are some issues with git-svn in current builds, but I haven't used it yet so I can't comment on that.

I'm still very much a newbie at this, but I'm slowly getting the hang of it, and so far, I'm really liking it. Some aspects of git I really like are:

  • Speed: It's pretty fast in most operations, even with large source code files (like tool-generated ones).
  • Local branches: I love being able to create and switch between branches quickly and easily. Once you realize how easy they are to use, you start taking advantage of branching a lot more than you would on regular, centralized version control systems.
  • Single work-tree: Not having to maintain N copies of your working directory at the same time when you're dealing with N branches is a real plus in many cases. Of course, you can choose to do so if you like, but it's not necessary, as it is with other tools.

Why PipelineTesting?

I've always shared the code of my PipelineTesting library through this website. However, I was only publishing snapshots of the code, and while that was fine given how few people use it, it was sometimes a drag. I really did want to share the code more broadly and make it easier to get to some of the changes I was working on even when I had not explicitly released a new official version of the library.

Last year I even commented a bit on this topic and asked for feedback about what the best place to host the code for some of my projects might be, but in the end I didn't make any decision about it.

Why not CodePlex?

CodePlex is a fine site for publishing and hosting your open source projects. I was skeptical about it at first, but it really took off and has a number of things going for it.

The greatest strength that CodePlex has is precisely that it's a Microsoft-technology oriented site. This means that it is a natural choice both when publishing projects that explicitly target the MS stack, and when you're looking for open source projects based on said technology.

I think that, overall, the CodePlex team has done a great job of keeping the site running and making sure it became a viable and valuable service to the community (and Microsoft itself).

The downside of CodePlex is, unfortunately, the technology it is based on: Team Foundation Server. TFS is a fine, robust, centralized source control tool. But it also has a few things that manage to take the fun out of using it:

  • The support for working disconnected from the centralized server is just not good enough. Sure, it has improved a bit since the initial release, but it is far from a frictionless experience.
  • The TFS Client, integrated into Visual Studio. This is supposed to be an asset, but, honestly, I don't always want my source control integrated into my IDE. It can be good sometimes, but it can also be very painful.

Just to set the record straight: Yes, I am aware of the command line tools for driving TFS, and that's certainly an option. Yes, I'm also aware of SvnBridge, which I haven't used myself yet; it is a really good option and addition to CodePlex, but it means running yet another tool.

Why GitHub?

The surest way to get proficient at something is to do it. I want to learn more about DVCS so that I can improve my workflow, and that means using my tool of choice.

For the time being, I'm choosing to stick with git for my personal projects (and some of my work). Given this choice, GitHub was the natural place to host my public stuff.

There are several aspects of GitHub that I like, but most of all, it's that it is very simple overall, easy to get started with, and mostly stays out of my way. I also find the social aspects of it very intriguing, though naturally I'm not using those yet.

Of course, not everything is perfect in GitHub-land. Some will argue that it doesn't offer as many features as CodePlex in some respects (there are no forums, for example), but that doesn't bother me at this point, as I don't really need those for now.

A bigger issue, however, could be that GitHub is not yet a very visible site among the .NET/BizTalk communities. Heck, I'm pretty sure PipelineTesting is the only BizTalk-related project on it :-). I think that anyone looking for my library is probably going to find it through this weblog first, so I'm not that worried about it, and the BizTalk community itself isn't all that large (it has grown enormously, but it's still small by comparison).

What's next?

I plan to continue working on PipelineTesting and I have a few features in mind for the next release. If anyone wants to contribute or has suggestions/bugs, please let me know about it!

I will continue to offer a copy of the library for download that includes a snapshot of the source code and a pre-compiled copy of the library, like I've been doing so far. People shouldn't have to install git just to get a copy of the library and use it, unless they need something in the code that's not yet in an "official" build. Of course, I'm a nice guy, so if you really really need it, just ask :-).

I also plan to start taking advantage of some GitHub features. In particular, I want to migrate some of the "documentation" that I've written over time as blog posts to a more appropriate format that's easier to maintain and to use. For this, I want to put the GitHub Wiki to use and also add a proper readme file to make it easier to get started with the library.

Windows SDK Features

The Windows SDK Team blog has posted an open question about some potential features of future SDK releases and is gathering our opinions about it. I'd like to comment a bit on them and provide some feedback.

There are 9 areas the team is looking for feedback on. Let's consider each one:

1. A new, small download of “Core SDK” components is made available to customers, with only the basics.  What components should be included in this product in addition to the Windows headers and libraries?

I think the idea of offering smaller download options is a good one overall, and one that a lot of people (myself included) have asked for a number of times in the past. It's true that the special web downloader/installer we had in the past allowed you to download just parts of the SDK, but sometimes it just didn't work at all, and it was too rigid in enforcing dependencies that might, or might not, apply to your specific needs.

However, I disagree on what the contents of the "Core SDK" should be. To me, the most important part of the SDK nowadays is not the Windows headers/libraries, but the SDK tools. Hence, I want the minimum download option to be just the tools. The headers/libraries package should be built on top of the tools package, not the other way around (which is what has happened at times in the past).

My justification for this is that the SDK tools package today is very relevant to all SDK users, both .NET and Win32/64 developers. The headers/libraries package, however, is most of the time only relevant to the second camp.

2. PowerShell Build environment. The Windows SDK would include a script similar to SetEnv that users could run, which will set up an SDK build environment under PowerShell.

I've already said in the past I think this is an excellent idea, which I fully support. For now, I make do with my own custom version tailored to some particular needs of my dev environment.

3. Quick method to install “only .NET” or “only Win32” components.  For example, Win32 developers could quickly choose to receive only Win32 resources in the documents, tools, samples, etc.  Developers focused on managed code could choose to receive only .NET Framework resources in the documents, tools, samples, etc.

I guess this might be a good choice for many people, but to be honest, it's not one I'd particularly use much.

4. Improvements to documentation: better Table of Contents, better filters, search, etc.

Improved documentation is always welcome. The table of contents is the main entry point into it for a lot of people, so organizing it better would definitely be beneficial.

As for the rest, search and filtering really depend a lot on the underlying medium that hosts the documentation itself (the offline help viewer or the online MSDN site), so I'm not sure what the SDK team can actually do to address those issues.

On the face of this, I'd say just focus on improving the content itself.

This does remind me of something: the current local help viewer that MSDN (and many of the dev-side tools) uses is not something I'm particularly a fan of. I was around when the original HtmlHelp engine was introduced and replaced the old InfoViewer that MSDN/TechNet used, and I was one of those that complained a lot about it.

I now long for the good old days of HtmlHelp. For example, I've seen the BizTalk SDK docs (which usually ship in the current help technology) in the old HtmlHelp format, and damn, it was way better/faster than what we have now. That sucks.

5. Improvements to documentation: integrate the SDK docs with MSDN docs.

We've had the SDK docs integrated into the MSDN docs in the past. In fact, I'm somewhat surprised that isn't the case now (I haven't installed either MSDN or the SDK docs in years!).

I used to think this was a good idea; I'm not so sure anymore. Two things come to mind:

  • MSDN is already huge, and somewhat bloated. Adding the SDK docs on top isn't going to improve the situation. In fact, considering point 4 above, integrating them is going to make searching and filtering them even harder.
  • The MSDN/SDK docs integration we had in the past had one significant flaw, in my humble opinion: it tried to integrate the SDK table of contents into separate sections of the MSDN TOC. So if you opened the standalone SDK help, you'd see all the SDK content grouped together, but if you launched the MSDN docs, there wasn't a single place in the TOC where the entire SDK content hung off of. Annoying as hell.

The second point, in particular, brings up one question: Do we really want them integrated, or merely accessible from a single place?

6. Ship a new “Tools Explorer” to group the tools and provide a more friendly and efficient way for users to search and use the SDK tools.

Not interested. Keep the SDK lean and mean and get it back into shape.

7. Ship a new “Samples Explorer” to group the samples and provide a more friendly and efficient way for users to search and use the SDK samples.

Not interested in an offline tool for this. I think there might be some use for it as an online tool where I could look for a sample and quickly get the code for just that specific sample, instead of having to download the entire SDK samples package (which is huge) to find it.

8. Windows SDK in non-Visual Studio IDEs. Provide additional support for other IDEs, such as Windows SDK integration with non-Microsoft development environments, links to Windows SDK documentation from within other IDEs (Eclipse, IBM VisualAge for C++, Borland’s C++ Builder), among other possible integration scenarios.

I haven't been using those at all, so I'm not qualified to really talk about it. I do believe that, for the most part, the SDK should be completely independent of VS and should be usable in a standalone fashion.

9. Create a new Download portal with SDK ‘nuggets’ so that you can download small packages – perhaps a popular tool or file that shipped broken or was missing from a released SDK.

Might be a good idea, but if you're going to do it, do it right and stick with it. The SDK has seen as many changes in installers/downloaders as name changes, and it's a drag.

Don't get me wrong, it's OK to change installers if they bring a significant improvement to the SDK user, but I've never really found that obvious during the last several installer changes. I'm sure each one might have improved something for the SDK team, but from where I'm sitting, it's just been an annoyance without bringing any benefits.

It does bring up the question of just what exactly the SDK needs an installer for, besides just registering environment variables and creating the start menu entries...


Programming with Large Fonts

Every once in a while someone voices their surprise when they try out one of my Visual Studio color schemes and find all those very large fonts I leave on by default in most of them [1].

I've been using Damien Guard's excellent Envy Code R font, usually at 15pt size, which I'm sure looks huge on some screens. It took some time for me to get used to it, as well, but I wouldn't change it now.

[Screenshot: the Visual Studio editor using Envy Code R at 15pt, with 11pt Tahoma tooltips]

The reason I started using large fonts for writing code was readability. I've been mostly a laptop user for a number of years now, but a couple of years back I started using one with a high-resolution (1920x1200) screen.

Having that much screen real estate is fantastic, particularly on a large screen, but it can be a bit hard on a 15.4" laptop screen because most things will look pretty tiny. I have good eyesight, but looking at really small text for prolonged hours can really make your eyes (and your brain!) sore.

Instead of just fiddling with the Visual Studio fonts, the ideal solution would probably have been to adjust the screen's DPI setting accordingly (120 DPI), but the truth is that, right now, that's not really an option on Windows if you don't want a lot of applications to (a) look like crap or (b) become unusable because not all the controls fit on the display. Unfortunately, most Windows applications just aren't ready to work with high DPI settings and/or large system fonts.

However, it turns out that you don't really spend most of your time looking carefully and closely at every application where text might be small. Instead, you spend most of your time in a few key applications, and simply increasing the font size in those can really make a difference in how you feel at the end of the day.

In my case, those applications are Visual Studio, Vim, Console and Firefox. The first three are easy to set up, and that's why most of my VS settings files include large font sizes for all the key elements (the editor and tool windows).

Tip: If you like/need large fonts in VS, increasing the font size of your tooltips can also make a big difference. This is something I used to do but always forgot to make sure my theme files included. The screenshot above uses 11pt Tahoma for the tooltips.

Firefox is a bit harder to set up right, because changing the default fonts used can break a bunch of sites. Fortunately most of the web sites I visit regularly are very friendly to the use of Ctrl-+ to increase the font size.

The downside of using large fonts

There's obviously a downside to using large fonts when programming: you can fit fewer lines on a screen. Usually, the increased width of text doesn't matter much, particularly on a wide-format screen where there's so much horizontal space. Vertical space, however, is a different matter, because it means you can display a lot fewer lines at a time.

For example, on a full screen VS session using Envy Code R, I can view 55 lines at a time at size 10pt, but only 40 at 15pt. That's 27% fewer lines!

It's not that I write methods with hundreds of lines, mind you. It's just that even if all your methods are short, sometimes you do have to deal with longer code files. It might be that you have classes with a lot of short methods, or maybe you're dealing with legacy code that's not so nicely factored.

Having a large screen with smallish fonts makes it a lot easier to work in these kinds of scenarios, because navigating and making certain changes/refactorings is simpler (even with automated tools for that). For example, one scenario where I really wish I had a larger screen is when dealing with source files that have several shorter classes defined in them (not everyone is a fan of the "one class per source file" convention).

The solution, of course, is to get a larger screen. For this reason, one of my goals for this year is to get a new desktop machine (after several years desktop-less) with a decent, large LCD. I'm aiming for a 24" display so that I can keep using the 1920x1200 resolution. The downside, of course, is that they are pretty expensive around here, so I still need to save some money before I can afford one alongside with a decent desktop machine.

[1] If you were also surprised about the "15pt Courier New" font, look closely and make sure you install the right font, or replace it with your favorite :-)


Windows Server 2008 on MSDN

I had seen the RTM announcement but no mention of when it might be on MSDN. Took a look yesterday, and it was already available, cool :-). At least the Datacenter/Enterprise/Standard DVD for x86 and x64, Itanium (those are still around?) and Web Server seem to be available.

I'll need to spend some time installing it on a VM to take a good look.


User Home Folders

Jeff Atwood has an interesting piece about applications polluting the user space, forcing their own folder and file structures and other crap into the user's home folder. He's right, but I think he stops too early.

Jeff's right that the classical Unix practice is a bit better than what Windows has always provided (even in Vista). In a lot of ways, Unix is in a better position because of things that are not directly related but end up helping in the end, like a better (simpler!) overall file system layout, and not having to deal with drive letters (or the stupid registry, for that matter).

However, even Unix suffers from user space pollution quite a bit, though in a different way than Windows users suffer. On Unix, the issue is mostly applications storing temporary and configuration data all over the user's home folder in hidden directories.

Different Kinds of Data

There are several problems that affect how Windows and Windows applications handle the user's directory. For me, however, the lack of good, clear guidelines and the way application developers abuse the user's home folder are secondary issues.

The real problem to me is that very few developers (or the OS providers, for that matter) realize that there are different kinds of user-specific data, each one with different requirements and storage needs.

I see at least three different kinds of data you might want to store in some user-specific location:

  1. The user's own data (documents, videos, music, projects, etc.). Applications will produce and manipulate this data but they should not dictate where and how it has to be stored; that's the user's problem.

    I hate it when applications insist on forcing a specific folder structure on me, or on creating their own folders in my documents folder. It's exactly because of this that I've completely given up on using the "My Documents" folder on Windows; I consider it pretty much storage for temporary or throw-away data I don't care about (too many applications putting their crap there).

    I should point out that this category, for me, is mostly about data the user is explicitly aware he is creating, not about other crap the application might produce as a side effect. In other words, if you're thinking about saving a new file without asking me for a location, don't even dream about storing it here.

    Everything in this location is stuff I want to backup easily, without having to go around excluding folders created by tyrant applications.

  2. Application configuration data: This is basically your typical user-specific application settings. To me, data belonging here should be machine-independent unless something extraordinary is required. These are the settings I want to simply take with me when I switch machines, or keep synchronized between machines so that my applications work the same everywhere. Again, this is stuff I want to be able to back up easily without carrying extra loads of crap, but at the same time I don't want to have to deal with it (like navigating through it) unless I explicitly want to modify something.

    I very much prefer that applications store their settings in the Unix fashion: using simple text files (whatever the format). Even though Windows stores the user registry hive file inside the user's profile directory, backing it up is mostly an exercise in futility, and it is non-portable for the most part.

  3. Other stuff: This will be mostly temporary or cache data used by applications, which I should never have to back up or care about. If it's gone, the application should be able to recreate it as needed or simply work without it. Classical example: a browser's cache and history files.

    Some Unix applications, for example, intermingle this data with their application settings, which is a big no-no. You don't want to have to worry about backing up your browser settings and ending up with 200MB of its cache in it.

There are probably other kinds of data I'm missing, but the key point is that they are different and should be stored separately. I really wish more application and OS developers started using something like the above distinction. It would make our lives oh, so much simpler.
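For what it's worth, .NET already lets you target each of these categories separately on Windows. Here's a rough sketch of how I'd map them (my mapping, not an official guideline):

using System;

class UserDataPaths {
  static void Main() {
    // 1. The user's own data: the user decides; MyDocuments is merely the default suggestion.
    string documents = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments);

    // 2. Machine-independent settings: the roaming application data folder.
    string settings = Environment.GetFolderPath(Environment.SpecialFolder.ApplicationData);

    // 3. Temporary/cache data: the local (non-roaming) application data folder.
    string cache = Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData);

    Console.WriteLine("{0}\n{1}\n{2}", documents, settings, cache);
  }
}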


What Irks Me About Visual DSLs

There's a lot of talk about Domain Specific Languages lately. The exact definition of what a DSL is, however, might change depending on who you ask. Microsoft itself tends to significantly favor visual DSLs, that is, domain specific languages made of visual components (as opposed to text-based DSLs, built on some kind of textual representation).

Frankly, I don't expect MS to change their direction, nor am I sure it would be the wisest decision given their target audience, but I do tend to favor text-based DSLs myself, for several reasons:

  1. Text-based DSLs work best during development. We have a significant amount of experience and a rich set of tools available to deal with text in an effective fashion: Source control and comparison tools, good editors, diff'ing and merging, and so on.
  2. A text-based DSL is illustrative in and of itself. Anyone with a text editor can look at it, so it only requires special tooling during execution, unlike its visual counterparts.
  3. If you're spending significant time using a DSL to create new things (versus, say, simply visualizing existing stuff), then a textual DSL is usually more effective.

I should say at this point that XML-based languages don't necessarily fit this description. XML can be clunky at times, and a lot of people hate having to manually crank out XML to do something. For example, many people dislike manually editing NAnt or MSBuild build files.

What's not to like about Visual DSLs

Many Visual DSLs are very appealing at first to create new things when you're unfamiliar with the language, as they can be very didactic. But once you're familiar with the language, Visual DSLs, as implemented by most tooling out there, will usually get in the way instead of boosting your productivity.

Don't get me wrong; there are a lot of things to like about visual languages. In particular, they can be great tools for visualizing things. In some cases, they are great tools for editing existing things and, occasionally, even for creating new things.

The last one, however, is pretty rare. I've been thinking a lot about this, and I've started to think that one of the reasons is that there's a fundamental disconnect in how we usually think about visual languages and tools.

The disconnect is that we tend to assume that the visual representation of the underlying domain that is best for visualizing and describing the language is actually an acceptable choice for "writing" in that language.

For example, let's consider Windows Workflow Foundation workflows or BizTalk orchestrations. Both could be considered DSLs for building processes and workflows, and they are actually pretty effective at that. Both use a visual representation that feels more or less natural to people used to working with processes (or state machines, in the case of WF). Both of those representations are great for working with existing processes, as they allow the reader to quickly grasp the flow, and they even work very well when debugging a running process.

But, to be honest, both leave a lot to be desired when you're actually sitting down to create and define a new process, and both tend to get in the way a lot. I personally feel that WF is a lot worse in this respect.

XAML

I should mention that I do not consider XAML a text-based DSL (even if it is "just text"). Fundamentally, XAML is a serialization format, and that shows in a number of places. It is built to be created and consumed by software tools, not by the human developer (though it is possible to write it by hand, as many people found out with WPF in the early days).

More importantly, these kinds of XML/XAML languages that are aimed at tools don't necessarily work great with the tooling we have for dealing with text (see the all-important point 1 above). For example, a lot of people have found out the hard way that trying to diff or merge two copies of a tool-generated XML/XAML file can be nearly impossible at times.

It's pretty evident that Microsoft is working on a lot more tools based on XAML, so it's here to stay, but it remains to be seen how that is going to work out. I'm sure there's going to be good visual tooling around it, but, as usual, the problem is that it just isn't enough.

What about Oslo?

A lot of my fellow MVPs and a bunch of people that attended the recent SOA and Business Process Conference have mentioned Microsoft's Oslo initiative that was announced at the conference.

From what little I know of it, it is a far-reaching initiative, touching multiple key products in the Microsoft development platform. A significant component of this effort is an investment in models and, you guessed it, modeling tools around them. I think it's obvious to everyone by now that a substantial set of those tools will be built around visual DSLs and visual designer tools (that XAML's in there somewhere is probably also a safe bet). Some people will think this is a key advantage; others will probably hate it.

The one conclusion I've reached so far regarding Oslo is that it will likely mean a significant shift in how we do development on the MS platform (at least for those of us involved in connected systems). However, I'm holding my thoughts on what will be good or bad about those changes until we know more precisely what the Oslo technologies will deliver and we have a clearer picture of how we will interact with them. Also, it's clear that this is an initiative that will be gradually rolled out, so there will probably be a long transition period around it (which is both good and bad in itself).

As customers and users of those technologies, however, we have a big opportunity, and a big responsibility, in letting Microsoft know what kind of tooling we want/need to have around the modeling tools and other technologies. Like I told someone at MS recently: "I don't expect MS to shift its position on visual tooling and visual DSLs, but I do wish the hooks and infrastructure were there for us in the community to create our own, non-visual DSL tools around it that allow us to work more effectively with the technology". Hopefully, that little thought will not be forgotten.


Functional Programming

I've been trying to learn more about functional programming in general lately, and there's a lot of good stuff around to read on it. Unfortunately for me, a lot of the content on the web on the topic uses Haskell to explain the concepts, and I find the Haskell syntax to be somewhat intrusive when trying to get a grip on the core concepts. Still, I probably should make a bigger effort in understanding Haskell :-).

While reading up on this topic, I ran across the Functional Javascript library, which I think is pretty cool. I'm constantly surprised just by how flexible and powerful JS is, and it's a real pity that such a great programming language has been so closely associated with the browser throughout its history. I still suck at JS programming, but that doesn't mean I can't appreciate the elegance and power some of the libraries and constructs for the language have to offer.

By the way, I've also become a great fan of programming.reddit.com; it's been a very valuable source of interesting links, blogs and articles focusing on dynamic and functional programming, among other topics.

(While on the topic of functional programming, it's good to know I wasn't the only one confused by Wikipedia's entry on currying.)
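For what it's worth, the concept is simpler than that entry makes it sound; here's a minimal sketch of it in C# 3.0 terms:

using System;

class CurryDemo {
  static void Main() {
    // Addition in curried form: a function of one argument that returns another function of one argument.
    Func<int, Func<int, int>> add = x => y => x + y;
    Func<int, int> add5 = add(5);  // fix the first argument
    Console.WriteLine(add5(3));    // prints 8
  }
}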

Functional Programming on .NET

Now that .NET 3.5 has hit RTM, some of the constructs in C# 3.0 make it possible to adopt a more functional style of programming in C#. Some of it was possible in C# 2.0, but the anonymous delegate syntax was still a bit too cumbersome. The new lambda syntax makes it a lot nicer, easier to read and easier to write (though because of the statically typed nature of the language it can still be a bit cumbersome at times). Eric White has a lot of interesting tutorials on functional programming with C# 3.0 and LINQ on his blog, in case you haven't run across them yet.
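To see the difference, compare filtering a list with a C# 2.0 anonymous delegate against the C# 3.0 lambda equivalent (a trivial sketch):

using System;
using System.Collections.Generic;

class SyntaxDemo {
  static void Main() {
    List<int> numbers = new List<int> { 1, 2, 3, 4, 5 };

    // C# 2.0: anonymous delegate; it works, but the syntax is noisy.
    List<int> evensOld = numbers.FindAll(delegate(int n) { return n % 2 == 0; });

    // C# 3.0: a lambda expression; the same predicate, much easier to read.
    List<int> evensNew = numbers.FindAll(n => n % 2 == 0);

    Console.WriteLine(evensOld.Count);  // 2
    Console.WriteLine(evensNew.Count);  // 2
  }
}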

Obviously F# is the functional programming language on .NET for now, and it has tons of really good stuff, though personally I'm still trying to wrap my head around it.

I think that, overall, there's a ton of good things to learn from other programming paradigms that can help make you a better developer, even if you're writing your code in a language favoring a more traditional approach. It's certainly very valuable to understand and learn different approaches to problem-solving and incorporate them into your arsenal.


Where do you put your braces?

If you're writing code in a programming language derived from C (i.e. one of those with pesky curly braces), where do you like putting your braces?

Some people like putting the opening brace on a line by itself right after the declaration/statement:

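Something like this (a trivial made-up snippet):

static int Sum(int[] values)
{
    int total = 0;
    foreach (int v in values)
    {
        total += v;
    }
    return total;
}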

I'm one of those that usually does this. It was just something I got used to from my C/C++ days, and it was the common convention in .NET back when I started fiddling around with it in 2000/2001. So this is usually what I use when coding in C#.

However, other people like to put the opening brace inline with the declaration/statement:

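The same snippet, inline style:

static int Sum(int[] values) {
    int total = 0;
    foreach (int v in values) {
        total += v;
    }
    return total;
}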

This question is brought to you courtesy of me noticing that IronRuby favors the "inline opening brace" style.

Actually, I use the inline-opening-brace style myself when I'm working with Java. I don't mind it, but it somehow seems wrong sometimes. I mean, using different styles when working across two languages so similar? I already have enough trouble remembering stuff like string vs. String and ToLower() vs. toLowerCase() to also worry about how the code looks ;-).

Historical Note: A few years back I used to be weirder and used a mixed style where I used braces-on-next-line for some things and inline-opening-brace for others depending on what it was and how many lines were in between the opening and closing braces. Yes, I was (am?) anal like that. I finally gave up on it when I grew tired of fighting the auto-formatting rules in IDEs.

What style do you favor, and why?
