The XML Specification allows attribute values to be wrapped with either double quotes (") or single quotes ('). Normally, BizTalk Server, as a good standard-abiding citizen, has no problem with this. However, I had been seeing messages on the BizTalk newsgroup claiming that BizTalk would refuse to process XML documents using single quotes when the XML Disassembler component tried to parse the incoming message.

I spent a few minutes researching this claim and found that there is some truth to it. There does appear to be a bug in the XML Disassembler when it runs into documents using single quotes, but it only affects some documents.

The first thing the XML Disassembler component does when it receives a message is probe it to see whether it is indeed an XML message and try to "guess" its encoding. As part of that, it looks not only for a BOM (Byte Order Mark), but also for an XML declaration (<?xml ... ?>) containing an encoding attribute.
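To make the BOM half of that probing concrete, here is a minimal sketch of how the first few bytes of a message can be checked for a byte order mark. This is not BizTalk's code: the BomSketch class, the DetectBom method and the labels it returns are names I made up for illustration, and the sketch only covers the UTF-8 and UTF-16 marks.

```csharp
using System;

// Hypothetical helper, not part of BizTalk: checks the leading bytes
// of a message for the well-known UTF-8 and UTF-16 byte order marks.
public static class BomSketch
{
    // Returns an illustrative label for the detected BOM, or null when none is found.
    public static string DetectBom(byte[] head)
    {
        if (head.Length >= 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF)
            return "utf-8";
        if (head.Length >= 2 && head[0] == 0xFF && head[1] == 0xFE)
            return "utf-16LE";
        if (head.Length >= 2 && head[0] == 0xFE && head[1] == 0xFF)
            return "utf-16BE";
        return null; // no BOM: fall back to the encoding named in the XML declaration
    }

    public static void Main()
    {
        byte[] sample = { 0xEF, 0xBB, 0xBF, (byte)'<', (byte)'?' };
        Console.WriteLine(DetectBom(sample) ?? "(no BOM)");
    }
}
```

When no BOM is present, the encoding attribute of the XML declaration is the only remaining hint, which is why the way that attribute gets parsed matters so much.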

Here's the problem: if the encoding attribute in the XML declaration is wrapped in single quotes, the parsing fails. In other words, BizTalk has a real issue with a document that begins like this, for example:

<?xml version='1.0' encoding='utf-8'?>

It doesn't have a problem with this, however (again, as an example):

<?xml version="1.0" encoding="utf-8"?>

In fact, as long as the encoding attribute is missing or is wrapped with double quotes, BizTalk will happily accept and correctly parse the message. If not, BizTalk will fail with the following error:

There was a failure executing the receive pipeline: "Microsoft.BizTalk.DefaultPipelines.XMLReceive, Microsoft.BizTalk.DefaultPipelines, Version=3.0.1.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35" Source: "XML disassembler" Receive Port: "ReceivePort1" URI: "C:\temp\BizTalk\TestIn\*.xml" Reason: Length cannot be less than zero.
Parameter name: length

Looking through the XML Disassembler code with Reflector, it becomes very clear that the Probe() method of the XmlDasmComp class (i.e. the disassembler component) is fatally flawed: it attempts to parse the XML declaration manually and explicitly expects the encoding attribute value to be wrapped in double quotes. Here's the relevant bit of code:

if (text2.Contains("encoding"))
{
    text1 = text2.Substring(text2.IndexOf("encoding"));
    text1 = text1.Substring(text1.IndexOf('"') + 1);
    text1 = text1.Substring(0, text1.IndexOf('"'));
}
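To see why this produces the "Length cannot be less than zero" error, here is a small standalone sketch that mirrors the quoted logic outside of BizTalk. With a single-quoted value, IndexOf('"') returns -1, so the last call becomes Substring(0, -1) and throws an ArgumentOutOfRangeException for the length parameter. The ProbeSketch class and both method names are my own, and the second method is only one possible way to tolerate either quote character, not a fix that ships with the product.

```csharp
using System;

public static class ProbeSketch
{
    // Mirrors the decompiled logic quoted above: only double quotes are considered.
    public static string ExtractEncodingDoubleQuotesOnly(string declaration)
    {
        string encoding = null;
        if (declaration.Contains("encoding"))
        {
            encoding = declaration.Substring(declaration.IndexOf("encoding"));
            // With single quotes IndexOf returns -1, so this keeps the whole string...
            encoding = encoding.Substring(encoding.IndexOf('"') + 1);
            // ...and this call becomes Substring(0, -1), which throws.
            encoding = encoding.Substring(0, encoding.IndexOf('"'));
        }
        return encoding;
    }

    // Hypothetical tolerant variant: accepts whichever quote character opens the value.
    public static string ExtractEncodingEitherQuote(string declaration)
    {
        int start = declaration.IndexOf("encoding");
        if (start < 0) return null;
        int open = declaration.IndexOfAny(new[] { '"', '\'' }, start);
        if (open < 0) return null;
        int close = declaration.IndexOf(declaration[open], open + 1);
        if (close < 0) return null;
        return declaration.Substring(open + 1, close - open - 1);
    }

    public static void Main()
    {
        string singleQuoted = "<?xml version='1.0' encoding='utf-8'?>";
        Console.WriteLine(ExtractEncodingEitherQuote(singleQuoted));
        try
        {
            ExtractEncodingDoubleQuotesOnly(singleQuoted);
        }
        catch (ArgumentOutOfRangeException ex)
        {
            Console.WriteLine("Failed on parameter: " + ex.ParamName);
        }
    }
}
```

Note that the failure does not depend on how the version attribute is quoted: everything before the word "encoding" is discarded first, so only the quotes around the encoding value matter.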

Unfortunately, there's no easy workaround short of modifying the incoming message, either by hand or through a custom decoding pipeline component that fixes the declaration.
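For what it's worth, the core of such a fix is small. The following is only a sketch of the string manipulation a custom decoding component could apply before the disassembler sees the message; it deliberately leaves out the BizTalk pipeline plumbing and stream handling, and it assumes the start of the message has already been decoded into a string with any BOM removed. The DeclarationFixSketch class and NormalizeDeclarationQuotes method are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DeclarationFixSketch
{
    // Matches a single-quoted value such as '1.0' or 'utf-8'.
    private static readonly Regex SingleQuoted = new Regex("'([^'\"]*)'");

    // Rewrites single-quoted values inside the XML declaration to use double quotes.
    // Everything after the declaration is returned untouched.
    public static string NormalizeDeclarationQuotes(string xml)
    {
        // Require whitespace after "<?xml" so that processing instructions
        // like <?xml-stylesheet ...?> are not mistaken for the declaration.
        if (xml.Length < 6 || !xml.StartsWith("<?xml", StringComparison.Ordinal)
            || !char.IsWhiteSpace(xml[5]))
            return xml;

        int end = xml.IndexOf("?>", StringComparison.Ordinal);
        if (end < 0) return xml;

        string declaration = xml.Substring(0, end + 2);
        string rest = xml.Substring(end + 2);
        return SingleQuoted.Replace(declaration, "\"$1\"") + rest;
    }

    public static void Main()
    {
        string input = "<?xml version='1.0' encoding='utf-8'?><root a='1'/>";
        Console.WriteLine(NormalizeDeclarationQuotes(input));
    }
}
```

Restricting the rewrite to the declaration keeps the change as small as possible: single-quoted attributes in the body of the document are perfectly legal and are left alone.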



Tomas Restrepo

Software developer located in Colombia.