I've just posted a new (and very simple) sample custom pipeline component for BizTalk Server: the FixEncodingComponent. It lets you tell the XmlDisassembler exactly which encoding (charset) to use when interpreting a message.
This helps with the occasional incoming XML message on which the XmlDisassembler barfs because it cannot figure out on its own what the message's encoding is. According to the BizTalk documentation, here are the rules the XmlDisassembler uses to figure out a message's encoding:
- If a byte order mark exists in the data, encoding information is determined from it.
- Otherwise, if the IBaseMessagePart.Charset property is set, the encoding specified there is used.
- Otherwise, if the XML declaration is present in the XML document, the encoding specified there is used, provided the XML declaration is ANSI.
- Otherwise, UTF-8 encoding is used.
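To make the precedence concrete, here is a rough sketch of those four rules in Python. It only illustrates the order of the checks and is not the XmlDisassembler's actual implementation; the function name `pick_encoding`, the `charset` argument (standing in for IBaseMessagePart.Charset) and the regular expression are all assumptions made for the example.

```python
import codecs
import re
from typing import Optional

# Byte order marks, with UTF-32 checked before UTF-16 because the
# UTF-32 LE mark starts with the same two bytes as the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Matches an XML declaration only when its bytes are ASCII-compatible,
# which is a stand-in for "provided the XML declaration is ANSI".
_XML_DECL = re.compile(
    rb"^<\?xml[^>]*?encoding\s*=\s*[\"']([A-Za-z0-9._-]+)[\"']"
)


def pick_encoding(data: bytes, charset: Optional[str] = None) -> str:
    """Return the encoding name chosen by the four rules, in order."""
    # Rule 1: a byte order mark wins over everything else.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # Rule 2: an explicitly set charset (hypothetical stand-in for
    # IBaseMessagePart.Charset).
    if charset:
        return charset
    # Rule 3: the encoding named in a readable XML declaration.
    match = _XML_DECL.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    # Rule 4: fall back to UTF-8.
    return "utf-8"


# An EBCDIC message: its declaration is unreadable as ASCII, so without
# an explicit charset the rules fall through to the UTF-8 default.
ebcdic = '<?xml version="1.0" encoding="IBM500"?><a/>'.encode("cp500")
print(pick_encoding(ebcdic))
print(pick_encoding(ebcdic, charset="cp500"))
```

With an EBCDIC message, rules 1 and 3 find nothing they recognize, so only an explicit charset keeps the fallback from picking UTF-8.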
The FixEncodingComponent relies on the second rule, setting the Charset property on the message part. It is by far the simplest option and doesn't require fiddling around with the message stream, which is both slower and error-prone. Here's how to use it:
- Create a new custom receive pipeline.
- Add the XmlDisassembler to the pipeline and configure it to your needs.
- Add the FixEncodingComponent to the pipeline in the Decode stage, so that it is run before the XmlDisassembler.
- Set the Encoding parameter of the component in the Property Explorer to the encoding you want to use. To make this easier, the component implements a custom UITypeEditor that pops up a list of all the encodings supported by the .NET Framework.
I've included a test project in the sample that uses routing between ports, along with a test XML message encoded in IBM EBCDIC (International) (the File -> Advanced Save Options command in Visual Studio is pretty helpful for creating one). To see the pipeline component in action, configure the receive location first with the standard XmlReceive pipeline and then with the custom pipeline included in the sample, and notice how the first one fails and the second one works :)
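If you want to see why the EBCDIC message trips up the default handling without setting up BizTalk, the small Python sketch below shows the same effect on raw bytes. It is only an analogy, not part of the sample: the message content and the helper `decodes_correctly` are made up for the illustration, and Python's `cp500` codec is used as the equivalent of IBM EBCDIC (International).

```python
def decodes_correctly(raw: bytes, encoding: str, expected: str) -> bool:
    """Return True if raw decodes to the expected text with this encoding."""
    try:
        return raw.decode(encoding) == expected
    except UnicodeDecodeError:
        # The bytes are not even valid in this encoding.
        return False


# Hypothetical test message, encoded the way the sample's test file is.
message = "<Order><Id>42</Id></Order>"
raw = message.encode("cp500")

# Guessing UTF-8 (the last-resort default) does not recover the message;
# naming the real encoding does.
print(decodes_correctly(raw, "utf-8", message))
print(decodes_correctly(raw, "cp500", message))
```

The bytes never change between the two calls; only the encoding used to read them does, which is all the component has to supply.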