Archives for category: WCF

I spent some time yesterday evening tracking down an issue with a custom WCF transport channel. This particular channel would support the IInputChannel channel shape, but with a twist: it would return a custom implementation of System.ServiceModel.Channels.Message instead of one of the built-in WCF implementations.

There’s nothing wrong with that and it was working just fine in most cases, until I ran into a few scenarios where the OnClose() method of my custom Message class wasn’t being called at all.

After some digging, I discovered that the specific messages this was happening to were not being processed normally by the ServiceHost infrastructure. In particular, they were not being dispatched because no method in the service contract matched the action in the messages, so instead the UnknownMessageReceived event in the ServiceHost instance was being raised.

Our UnknownMessageReceived implementation wasn’t closing messages explicitly, but that was easily corrected, so no problem, right? Wrong. It turns out that in WCF 3.0/3.5 (I haven’t checked 4.0 yet), if you do not have an event handler for UnknownMessageReceived, the messages won’t get closed either.

This seems like a bad bug to me: since Message implements IDisposable, there’s obviously the expectation that it may hold resources that should be released as soon as possible, so not calling Dispose() or Close() will leak resources and potentially cause trouble.

Attaching an event handler to UnknownMessageReceived just isn’t always an option, and even if it were, there’s no reason why WCF itself shouldn’t be guaranteeing that messages are closed as soon as they aren’t needed.
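In the meantime, if you can attach a handler, at least make sure it always closes the messages handed to it. A minimal sketch (assuming host is your ServiceHost instance and System.ServiceModel.Channels is imported):

host.UnknownMessageReceived += (sender, e) => {
   // inspect/log the unmatched message if needed, then make sure it gets closed
   using ( Message message = e.Message ) {
      Console.WriteLine("Unmatched action: {0}", message.Headers.Action);
   } // Dispose() closes the message and releases whatever resources it holds
};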

A WCF question came up recently on StackOverflow about using a binding for a custom transport through configuration instead of code. My initial reply was that you needed all the right configuration classes associated with your binding so that you could register the right binding extensions.
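For reference, the shape of those configuration classes is roughly this (just a sketch; MyBinding and the member bodies are placeholders for your own binding):

using System;
using System.ServiceModel.Channels;
using System.ServiceModel.Configuration;

// The collection element is the type you register under <bindingExtensions>.
public class MyBindingConfigurationElement : StandardBindingElement {
   protected override Type BindingElementType {
      get { return typeof(MyBinding); }
   }

   protected override void OnApplyConfiguration(Binding binding) {
      // copy any configuration attributes onto the MyBinding instance here
   }
}

public class MyBindingCollectionElement
   : StandardBindingCollectionElement<MyBinding, MyBindingConfigurationElement> {
}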

The question from there evolved into something more specific: once I’ve got my binding collection element created and registered as an extension, what if I want to use it on an endpoint by specifying only the binding and not a bindingConfiguration (like you can do with the standard bindings)? In other words, do this:

<system.serviceModel>
   <extensions>
      <bindingExtensions>
         <add name="myBinding"
              type="MyNamespace.MyBindingCollectionElement, MyBinding" />
      </bindingExtensions>
   </extensions>
   <services>
      <service name="server">
         <endpoint address="my://localhost:8484" binding="myBinding">
         </endpoint>
      </service>
   </services>
</system.serviceModel>

In theory this should just work, but it didn’t for the original poster. I had a sample app set up for one of my custom WCF transport channels and had tested this scenario, so I knew something was amiss, but I didn’t spot what it was at first. That is, not until the original poster uploaded a sample app and I could reproduce the problem.

The solution was simple, but unexpected: it turns out that WCF won’t recognize your custom binding name unless you’ve got a <bindings> section in your configuration file. Mind you, it doesn’t require you to actually add a configuration for your custom binding there at all; an empty bindings section is enough. In other words, just adding this would fix the problem:

<bindings/>

A question I ran into recently asked about comparing the message sizes produced by the default Text Message Encoder used by most bindings in WCF and the Binary Message Encoder used by the NetTcpBinding, without going through the entire WCF stack and having to use a protocol analyzer.

Here’s a sample application that just serializes an object using the DataContractSerializer and then runs it through both message encoders and presents the results:


using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Runtime.Serialization;
using System.ServiceModel;
using System.ServiceModel.Channels;
using System.Text;

namespace BinSerializationTest {
   [DataContract]
   class SampleDataContract {
      [DataMember]
      public string Member1;
      [DataMember]
      public int Member2;
      [DataMember]
      public string Member3;
   }

   class Program {
      static void Main(string[] args) {
         try {
            SampleDataContract dc = new SampleDataContract {
               Member1 = "Sample string 1",
               Member2 = 142234,
               Member3 = "Different data here"
            };
            byte[] xml = SerializeToXml(dc);
            Console.WriteLine("DC serialized to XML resulted in {0} bytes", xml.Length);
            byte[] bin = SerializeToBin(dc);
            Console.WriteLine("DC serialized to binary resulted in {0} bytes", bin.Length);
         } catch ( Exception ex ) {
            Console.WriteLine(ex);
         }
      }

      static byte[] SerializeToXml<T>(T obj) {
         Message msg = ObjectToMessage(obj);
         MessageEncodingBindingElement mebe = new TextMessageEncodingBindingElement();
         mebe.MessageVersion = MessageVersion.Soap12;
         return Serialize(msg, mebe.CreateMessageEncoderFactory());
      }
      static byte[] SerializeToBin<T>(T obj) {
         Message msg = ObjectToMessage(obj);
         MessageEncodingBindingElement mebe = new BinaryMessageEncodingBindingElement();
         mebe.MessageVersion = MessageVersion.Soap12;
         return Serialize(msg, mebe.CreateMessageEncoderFactory());
      }
      static byte[] Serialize(Message msg, MessageEncoderFactory factory) {
         MessageEncoder enc = factory.Encoder;
         MemoryStream stream = new MemoryStream();
         enc.WriteMessage(msg, stream);
         return stream.ToArray();
      }
      static Message ObjectToMessage<T>(T obj) {
         DataContractSerializer ser = new DataContractSerializer(typeof(T));
         return Message.CreateMessage(MessageVersion.Soap12, "", obj, ser);
      }
   }
}

Hope it’s useful!

I’m going over some WCF custom transport channels I had worked on recently and noticed something that made me suspect right away that something was wrong: using my custom channel listener, only a single message would be processed at a time, even though several messages were being fed to WCF.

That something was wrong became even more evident when I started messing around with the concurrency and instancing modes and realized that, while the settings were being honored, they didn’t change the message processing behavior at all.

Now, WCF has some pretty important options for throttling message processing on the service side, which are configurable through the ServiceThrottling behavior. Wanting to confirm my fears, I started tweaking the MaxConcurrentInstances setting and noticed that indeed, it didn’t seem to make any difference.
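If you want to tweak the throttling from code rather than configuration while testing, a quick sketch looks something like this (host is assumed to be your ServiceHost instance, with System.ServiceModel.Description imported; the values are arbitrary):

ServiceThrottlingBehavior throttling =
   host.Description.Behaviors.Find<ServiceThrottlingBehavior>();
if ( throttling == null ) {
   // no throttling behavior configured yet, so add one with our own values
   throttling = new ServiceThrottlingBehavior();
   host.Description.Behaviors.Add(throttling);
}
throttling.MaxConcurrentCalls = 16;
throttling.MaxConcurrentInstances = 16;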

Something interesting happened then: I rebuilt my test service class implementation to use the Async pattern instead, and what happened? Multiple messages started being processed concurrently, in accordance with the settings I had put into the InstanceContextMode and the ServiceThrottlingBehavior. Obviously, something was wrong with my channel listener implementation!

Digging further, and after a bunch of different tests and lots of printf debugging, I found out what the problem was:

In my tests, I was using a normal ServiceHost to run my custom channel and the test service implementation. The way the host interacts with the channels is such that WCF was eventually using the BeginTryReceive() and EndTryReceive() methods to receive the messages that were queued in the channels. All that implementation was working perfectly, except for a single detail:

The code that handled dispatching any waiters that had called BeginTryReceive() was firing all the notifications on the same thread. That is, the dispatcher was waking up all waiters by going through the waiter list and calling the corresponding AsyncCallback for each one synchronously.

This, it turns out, is a big mistake: it essentially causes WCF to invoke your service implementation on a single thread, which is exactly why only one message was being processed at a time. Changing the service to use the async pattern pretty much meant that WCF would pick up a message, call the service, and be able to pick up another message right away, making the issue go away (at least up to a point). That’s why I was seeing different behaviors between the two modes.

The solution was simply to make sure the dispatcher would delegate notifying each waiter to a thread pool task, thus making each one independent.
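In other words, instead of walking the waiter list and invoking each callback inline, the dispatcher now does something along these lines (just a sketch; the waiter list, its Complete() method and the message variable are from my own implementation, not WCF types):

foreach ( var waiter in pendingWaiters ) {
   var w = waiter; // capture a local copy for the closure
   ThreadPool.QueueUserWorkItem(delegate {
      // completing the async result invokes the AsyncCallback that the
      // ServiceHost passed to BeginTryReceive(), but each waiter is now
      // notified on its own thread pool thread instead of serially
      w.Complete(message);
   });
}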

Thinking about this after the fact, I can see it should’ve been obvious from the start, and it was a really stupid mistake on my part. However, I can’t help but wonder if the WCF model gives way too much responsibility to the underlying channel. Just look at how a simple mistake in the transport channel can completely change the WCF execution model and throw all the throttling off.

I’ve been working a lot with low-level contracts in Windows Communication Foundation that send and receive raw System.ServiceModel.Channels.Message objects. These are fairly flexible, though the API can be a bit confusing.

One of the things I was doing was making sure I was doing proper streaming of large messages, which in WCF relies on the XmlReader and XmlWriter classes.

What is not very obvious at first is that it is possible to receive a Message object and still do streaming: basically, when the received message is created, it takes an XmlReader that is only consumed when you ask WCF to write the message to an XmlWriter.

Internally, the message will read each node from the XmlReader and write it to the XmlWriter, which works very well for streaming. But it can also trip you up sometimes, like it did to me recently:

Basically, my WCF service was receiving a streamed XML file, which went up to the Message-based contract, and then I streamed the message out to a file. With some message payloads, I’d sometimes get the following error:

System.InvalidOperationException: Token Text in state EndRootElement would result in an invalid XML document. Make sure that the ConformanceLevel setting is set to ConformanceLevel.Fragment or ConformanceLevel.Auto if you want to write an XML fragment.

I opened up the original message and, as far as I could see, it was perfectly well-formed XML. That is, until I opened the file in a hex editor and realized it had a Line Feed (\n) character right at the end, after the closing tag!

You can, of course, fix the error by making sure the source doesn’t have any characters after the closing tag. However, the exception also suggests another workaround, which can be more useful when you don’t control the input, as in a WCF service: set the XmlWriter to the Fragment conformance level, which won’t complain about the extra \n at the end.

Here's a simple console application that exhibits the problem:

using System;
using System.IO;
using System.ServiceModel;
using System.ServiceModel.Channels;
using System.Xml;

class Program {
   public static void Main() {
      string xml = "<doc><part1>Start</part1><part2>End</part2></doc>\n";
      MessageEncoder encoder = CreateEncoder();
      Message msg = encoder.ReadMessage(ToStream(xml), Int32.MaxValue);

      try {
         XmlWriterSettings set = new XmlWriterSettings();
         // Uncomment this line to avoid the exception:
         //set.ConformanceLevel = ConformanceLevel.Fragment;
         XmlWriter xmlWriter = XmlWriter.Create(Console.Out, set);
         XmlDictionaryWriter xmlDictWriter =
            XmlDictionaryWriter.CreateDictionaryWriter(xmlWriter);
         msg.WriteBodyContents(xmlDictWriter);
         xmlDictWriter.Flush();
      } catch ( Exception ex ) {
         Console.WriteLine("\n****************");
         Console.WriteLine(ex.ToString());
         Console.WriteLine("****************");
      }
   }

   static Stream ToStream(String text) {
      MemoryStream stream = new MemoryStream();
      StreamWriter writer = new StreamWriter(stream);
      writer.Write(text);
      writer.Flush();
      stream.Position = 0;
      return stream;
   }

   static MessageEncoder CreateEncoder() {
      TextMessageEncodingBindingElement tmebe =
         new TextMessageEncodingBindingElement();
      tmebe.MessageVersion = MessageVersion.None;

      MessageEncoderFactory factory = tmebe.CreateMessageEncoderFactory();
      return factory.Encoder;
   }
}

While working on a custom WCF transport channel implementation I ran into a problem that baffled me for a while. As usual, the problem ended up being my own fault, but it was hard to figure out what was actually going wrong :-).

The problem revolved around the ChannelListener for the service side of the transport, which was getting executed through a normal WCF-provided ServiceHost (as opposed to using the channel model directly). The puzzling behavior I was seeing, after testing and lots of stepping around in the debugger, was like this:

  1. The ServiceHost would open my ChannelListener implementation, which in turn would mean my OnOpen() method would get executed.
  2. The host would then do a BeginAcceptChannel() call on my listener with an infinite timeout. The listener gladly processed the request and returned the corresponding IAsyncResult object.
  3. Meanwhile, the channel listener was busy doing its thing, and, within a few hundred milliseconds, would produce a new IInputChannel and signal the WaitHandle returned in the BeginAcceptChannel() call.
  4. The host would wake up after the wait handle was signaled and call the listener’s EndAcceptChannel(), which promptly returned the new queued input channel instance.
  5. … And that was it. Nothing else happened. The host would not call any method on the IInputChannel returned. Not Open(), much less posting a wait or a receive for a message. Obviously, the whole thing would grind to a halt at this point, but no errors would get raised, the service host did not fault and there wasn’t anything suspicious logged to the ServiceModel activity traces.

What was going on? Well, it seems that the OnOpen() implementation in my ChannelListener was actually blocking, instead of returning control right away. I’m not sure if this was an effect of how the async variant of the call was working, but the end result was that, as far as the service host was concerned, the listener had not completed initializing, so it just didn’t do anything else.

What really confused me while troubleshooting this problem was that, because BeginAcceptChannel() was getting called, it didn’t even occur to me to look into the opening of the listener; I just assumed that if the host was trying to accept a channel, the listener had successfully opened. Looks like it isn’t as clear-cut as that.

Once I fixed the blocking open of the channel listener, the host started calling my input channel methods again and everything went back to normal. Lesson learned.
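For what it’s worth, the shape of the fix was essentially to keep OnOpen() short and kick off the long-running work in the background; roughly like this (just a fragment of the listener class, where StartAcceptLoop() stands in for my own background accept logic):

protected override void OnOpen(TimeSpan timeout) {
   // do only quick initialization here (bind resources, validate settings)
   // and return control to the ServiceHost right away
   StartAcceptLoop();
   // do NOT block here waiting for the first connection or message;
   // that is what BeginAcceptChannel()/EndAcceptChannel() are for
}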

I’ve been working the past week on some custom WCF transport channels. It’s been pretty instructive, though it can be a bit of a frustrating experience at times.

The past couple of days, I’ve been busy getting metadata/policy export/import working correctly, and ran into an error when using SvcUtil.exe to import my service metadata:

Attempting to download metadata from 'http://localhost:6466/Metadata/' using WS-Metadata Exchange or DISCO.
Error: An error occurred in the tool.

Error: The binding on the service endpoint cannot be configured.
Parameter name: endpoint.Binding

This is a pretty generic error message that, well, doesn’t really tell you anything. SvcUtil.exe (and its related VS feature) is a pretty key piece of the WCF developer toolkit, but, unfortunately, it’s (a) full of bugs and (b) has horrible error handling, which makes diagnosing problems with metadata import incredibly frustrating.

As usual, the problem turned out to be mine: I had registered my IPolicyImportExtension/IWsdlImportExtension implementation in a custom app.config file that I was feeding to SvcUtil using the /svcutilConfig argument. This worked, and the importer was getting correctly invoked and executed just fine.

However, I had forgotten to register the bindingExtension and bindingElementExtension entries for my custom binding and custom transport binding element, which prevented SvcUtil from serializing the custom binding to the configuration XML.

Once I fixed that, it worked right away. I just wish that SvcUtil could have generated a clearer error message pointing out what the problem actually was.

I’ve been working this week on implementing some custom WCF transport channels. The one I was working on initially is a two-way send channel, so, naturally, I decided to begin by having my channel implement the IRequestChannel shape.

This was working fine until I decided to try a one-way operation on top of my two-way channel, which was interesting because I had not implemented an equivalent part of the transport supporting IOutputChannel.

I was expecting it to fail right from the start, but, surprisingly enough, it almost worked. That is, the transport channel was called, executed, and only bombed when the WCF stack ran into the unexpected response message. The error message I was seeing was something like this:

System.ServiceModel.ProtocolException: The one-way operation returned a non-null message with Action=''.

Looking around, I realized a couple of things: First of all, the built-in HTTP transports in WCF also implement only IRequestChannel, and yet, somehow, they still manage to work just fine in this case.

The second thing that came to mind was that I was using a custom MessageEncoder. Granted, it was a very rough and crappy encoder, but it was mostly enough for me to test a few things about my custom transport.

That last part sparked something deep in my memory: I seemed to vaguely remember something about encoders that might be related, so I turned right away to the best source for everything WCF related: Nicholas Allen’s blog.

Fortunately, Dr. Nick did not disappoint! In one of his articles about writing a custom POX Message Encoder, he included this little gem:

“One of the properties of the standard message encoders is that they only surface up messages that have a non-empty message body.”

That was exactly what I was seeing: my custom transport was indeed returning a message from IRequestChannel.Request(), but it had no content at all (something particular to the kind of request I was making over the one-way channel).

Once I changed my custom encoder to ignore the empty message, it worked just fine.
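The change boiled down to something like this in the encoder (just a fragment of the MessageEncoder-derived class, where DecodeMessage() stands in for the existing decoding code):

public override Message ReadMessage(ArraySegment<byte> buffer,
                                    BufferManager bufferManager,
                                    string contentType) {
   if ( buffer.Count == 0 ) {
      // mimic the standard encoders: an empty body means there is no
      // message to surface up to the rest of the WCF stack
      bufferManager.ReturnBuffer(buffer.Array);
      return null;
   }
   return DecodeMessage(buffer, bufferManager, contentType);
}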

The Net.MSMQ Binding in Windows Communication Foundation has some built-in support for detecting and handling "Poison Messages".

One of the most useful ways of handling Poison Messages in queuing systems is to get them out of the way so that following messages can still be processed (if you're not doing in-order processing), but still keep the messages around so that they are not lost.

Unfortunately, this behavior is only supported by WCF in MSMQ 4.0 with the use of a poison sub-queue, which is new in the Vista and WinServer2k8 versions of MSMQ. It's really a shame because, honestly, this could've been implemented on MSMQ 3.0 just by using a separate queue instead of sub-queues.

Instead, we're stuck with the PoisonErrorHandler sample included in the SDK. The sample provides a way to mimic this behavior by introducing an IErrorHandler implementation that detects the poison message and moves it to another queue. To be able to do this, you need to set the binding properties so that the ServiceHost listening to the queue faults when the poison message is detected. The error handler then moves the message to a new queue and starts a new service host instance to resume processing of following messages.

The sample itself isn't all that bad, but it does highlight a rather significant omission in the way the Net.MSMQ Binding reports poison messages: all it provides the error handler with is the MSMQ LookupId of the poison message. Unfortunately, the LookupId is queue-specific, and the exception does not say which queue, or at least which endpoint, received the poison message, which is a huge gaping hole in this feature.

Because of that, the LookupId is useless without more information provided out-of-band [1]. The sample works around this limitation by adding an entry in the <appSettings/> section of the app.config file with the path to the queue where messages are being received and the path to the queue to move poison messages to.

This works OK, until you need a single service host listening on multiple endpoints, and then the sample won't really work anymore because you don't know which of the endpoints caused the error.

Working around the limitation

I spent some time recently playing around with WCF and ran into this same problem, so I tried to find a way of working around this limitation and still use the basic functionality the sample provided.

I found one possible workaround, which seems to work so far in my limited tests. I'm not sure yet how well it will hold, as it seems to me there's always the possibility of some race conditions here, but let me illustrate at least the basic mechanics.

First we keep the basic code of the PoisonErrorHandler class. However, instead of using appSettings to keep track of the queue we're listening to, we just try to find out dynamically which endpoint configured for the service is causing the error, and extract the path to the queue from there based on the URI of the endpoint address.

To do this, in my implementation of IErrorHandler.HandleError() I grab the current OperationContext, reach out to the ServiceHost associated with it, and then iterate over the ChannelDispatchers attached to it. The one that's in the Faulted state is very likely the one that caused the MsmqPoisonMessageException in the first place.

Code could be something like this:

private String FindFaultedQueue() {
   ServiceHostBase host = OperationContext.Current.Host;
   Uri faultedQueue = null;
   // the queue that fired the exception will be the faulted one
   foreach ( ChannelDispatcher chd in host.ChannelDispatchers ) {
      if ( chd.State == CommunicationState.Faulted ) {
         faultedQueue = chd.Listener.Uri;
      }
   }
   // translate URI into a regular MSMQ format name 
   StringBuilder fn = new StringBuilder("FormatName:DIRECT=OS:");
   fn.Append(faultedQueue.Host).Append("\\");
   string path = faultedQueue.PathAndQuery;
   if ( path.StartsWith("/private/") ) {
      fn.Append("private$\\");
      path = path.Substring("/private".Length);
   }
   path = path.Substring(1);
   fn.Append(path);
   return fn.ToString();
}

This lacks some error handling and a few other things, but it illustrates the point. Now that we have the FormatName of the queue where the poison message came from, we can try to move it to the poison message queue. We could handle this by convention: for queue Q1, the corresponding poison message queue could be Q1Poison. The rest of the sample is pretty much the same.
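Putting it together, HandleError() could look roughly like this (a sketch only, following the Q1/Q1Poison naming convention; error handling and the service host restart from the original sample are omitted):

public bool HandleError(Exception error) {
   MsmqPoisonMessageException poisonEx = error as MsmqPoisonMessageException;
   if ( poisonEx == null )
      return false;

   string sourceQueue = FindFaultedQueue();
   string poisonQueue = sourceQueue + "Poison";

   using ( MessageQueue source = new MessageQueue(sourceQueue) )
   using ( MessageQueue target = new MessageQueue(poisonQueue) ) {
      // pull the poison message out by its lookup id and move it aside
      // so that the messages behind it can be processed
      System.Messaging.Message msg =
         source.ReceiveByLookupId(poisonEx.MessageLookupId);
      target.Send(msg, MessageQueueTransactionType.Single);
   }
   return true;
}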

I probably wouldn't use this in production, as it doesn't feel very reliable given the basic limitations imposed by the WCF implementation, but it was interesting to take a stab at. YMMV.

[1] I'll leave it to you, dear reader, to wonder about the sense of providing a sample that so clearly points out such a giant hole in a product feature rather than actually fixing it. Oh wait, guess they did fix it by changing MSMQ for Vista and WinServer2k8.


BizTalk Server has some pretty strong facilities for making a "best effort" at ensuring message delivery, by using the automatic retry facilities available to BizTalk Send Ports.

This functionality is available when calling Web Services as well, with either the WCF or SOAP adapters, and it's very useful. However, it's not always enough by itself.

One such scenario comes up when the Web Service you're calling does not return SOAP Faults when it fails, but instead returns some kind of status/result indicator as part of a normal response message. In that case, the SOAP/WCF adapter can't tell on its own that the service call failed and retry it as necessary.

There are a few ways you can deal with this.

For example, you can handle retries manually from an orchestration, using a Loop shape to call the service, check the result and retry the call as necessary. It's not complex (though a bit cumbersome) and in many scenarios works very well.

Unfortunately, this isn't an option if you are in a messaging-only scenario, or have complex routing rules leading to the service invocation (common with direct-binding scenarios).

One possible workaround I've been experimenting with lately, if you're using the WCF Adapter in R2, is to use the WCF-Custom adapter and configure a custom Endpoint Behavior that adds an IClientMessageInspector implementation that checks the response message and transforms it into a fault if it indicates a failure (a rough sketch of such an inspector follows after the list of caveats below).


A few things to keep in mind if you decide to try this:

  1. For your behavior to be listed on the BizTalk Administration console, you'll need to implement a BehaviorExtensionElement for it, deploy it to the GAC and register it in machine.config (or at least this was the way I did it).
  2. Inside your IClientMessageInspector.AfterReceiveReply() implementation, you might need to create a buffered copy of the message and return a new Message instance built from it to the WCF Adapter runtime; otherwise BizTalk might not be able to read the message because you've already consumed it and it will be in the closed state by then.
  3. Make sure you leave the "Propagate Fault Message" option unchecked; otherwise your fault will just be submitted to BizTalk as a regular message and won't cause the send port to retry the call.
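Here's a rough sketch of what such an inspector could look like (IsFailure() is just a placeholder for whatever application-specific check your response format needs; the fault code name and namespace are made up):

using System.ServiceModel;
using System.ServiceModel.Channels;
using System.ServiceModel.Dispatcher;

public class ResultCheckingInspector : IClientMessageInspector {
   public object BeforeSendRequest(ref Message request, IClientChannel channel) {
      return null; // nothing to do on the way out
   }

   public void AfterReceiveReply(ref Message reply, object correlationState) {
      // capture what we need before buffering, then buffer the reply so we can
      // both inspect it and hand back an unconsumed copy to the adapter runtime
      MessageVersion version = reply.Version;
      string action = reply.Headers.Action;
      MessageBuffer buffer = reply.CreateBufferedCopy(int.MaxValue);

      if ( IsFailure(buffer.CreateMessage()) ) {
         // turn the "successful" response into a fault so the send port
         // sees a failed exchange and its retry logic kicks in
         FaultCode code = FaultCode.CreateSenderFaultCode("ServiceFailure", "urn:sample");
         reply = Message.CreateMessage(version, code, "The service reported a failure", action);
      } else {
         reply = buffer.CreateMessage();
      }
   }

   static bool IsFailure(Message reply) {
      // application-specific: read the status/result element from the body
      return false;
   }
}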

My friend Carlos suggested that another way to deal with this issue (particularly if you're not using the WCF adapter) might be to build a custom pipeline component and use it in a receive pipeline configured in the Send Port. The component would then read the response from the Web Service and throw an exception to force a failure during the response processing if necessary, thus causing the entire exchange to fail. I haven't tried this myself, but it sounds like another useful idea!