A few months ago I wrote about an issue with writing fully streaming custom decoding components for BizTalk pipelines, which originated in limitations imposed by the disassembler components (the XmlDisassembler component in particular) on the streams it received from the decoding stage.
While doing some further work on custom decoding components, I ran into what seems to be another reason why you'll need to do in-memory buffering in some scenarios.
I think that for most purposes, all the components care about is that the position of the stream can be queried; however, because of the mixed COM/.NET nature of the BizTalk Messaging engine and some of the existing components, custom streams need to be "partially" seekable. This is not so much because the components are actually going to move the position of the stream, but because they call the stream's Seek() method to discover its current position (instead of using the Position property), as described here.
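To make the idea of a "partially" seekable stream concrete, here is a minimal sketch of a forward-only wrapper that answers position queries made through Seek() but refuses to actually move. The class name PositionTrackingStream is hypothetical, and returning true from CanSeek is my assumption about what callers check before calling Seek(); I haven't verified this exact class against the BizTalk components. Length is deliberately left unimplemented here.

```csharp
using System;
using System.IO;

// Hypothetical wrapper (not a BizTalk type): reads forward only, but answers
// "where am I?" queries that arrive through Seek() instead of Position.
public class PositionTrackingStream : Stream
{
    private readonly Stream inner;
    private long position;

    public PositionTrackingStream(Stream inner)
    {
        if (inner == null) throw new ArgumentNullException("inner");
        this.inner = inner;
    }

    public override bool CanRead { get { return true; } }
    // Assumption: callers look at CanSeek before calling Seek().
    public override bool CanSeek { get { return true; } }
    public override bool CanWrite { get { return false; } }

    public override long Position
    {
        get { return position; }
        set
        {
            if (value != position)
                throw new NotSupportedException("This stream cannot be repositioned.");
        }
    }

    // Not addressed in this sketch: the length is unknown without reading everything.
    public override long Length
    {
        get { throw new NotSupportedException(); }
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int read = inner.Read(buffer, offset, count);
        position += read; // track how far we have advanced
        return read;
    }

    public override long Seek(long offset, SeekOrigin origin)
    {
        long target;
        switch (origin)
        {
            case SeekOrigin.Begin: target = offset; break;
            case SeekOrigin.Current: target = position + offset; break;
            default:
                throw new NotSupportedException("SeekOrigin.End needs a known length.");
        }
        // Only a seek that lands on the current position is allowed,
        // which covers the Seek(0, SeekOrigin.Current) position query.
        if (target != position)
            throw new NotSupportedException("This stream cannot be repositioned.");
        return position;
    }

    public override void Flush() { }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }

    protected override void Dispose(bool disposing)
    {
        if (disposing) inner.Dispose();
        base.Dispose(disposing);
    }
}
```

With this wrapper, Seek(0, SeekOrigin.Current) returns the number of bytes read so far, while any seek that would really move the stream throws NotSupportedException.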
It does appear, unfortunately, that the disassembler will also require that your custom stream class be able to determine the stream length; that is, that the Length property is implemented and returns the correct value. If you don't implement it, then an error will be thrown during disassembly. Furthermore, it appears that at some point during the disassembling process something like this is tried:
int length = (int)stream.Length;
byte[] buffer = new byte[length];
stream.Read(buffer, 0, length);
Something that hints at this is that if you just return 0 from your Length implementation, the disassembler tries to read 0 bytes from your stream (I haven't verified this 100%, but it might be something to watch out for).
Anyway, I just wanted to mention that it appears to be important that the custom streams returned from your decoding components can accurately report the stream length as the disassembler will see it. This can be hard to do without fully buffering the message when the length of the decoded data differs from that of the encoded data, because the component cannot predict the resulting length before actually processing the entire stream.
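As an illustration of the buffering fallback, here is a minimal sketch that copies a decoding stream into memory so that Length (and Seek()) become available. The DecodedStreamBuffer class and its BufferFully method are hypothetical names of my own, and this is simply the plain in-memory approach, not something BizTalk provides; for large messages it gives up the benefits of streaming.

```csharp
using System;
using System.IO;

// Hypothetical helper (not a BizTalk type): trades memory for a known Length.
public static class DecodedStreamBuffer
{
    // Reads the decoded stream to the end and returns a seekable,
    // in-memory copy whose Length is the real decoded length.
    public static MemoryStream BufferFully(Stream decoded)
    {
        if (decoded == null) throw new ArgumentNullException("decoded");

        MemoryStream buffer = new MemoryStream();
        decoded.CopyTo(buffer); // processes the entire stream up front
        buffer.Position = 0;    // rewind so downstream readers start at the beginning
        return buffer;
    }
}
```

For example, Base64 text such as "aGVsbG8=" is 8 bytes encoded but only 5 bytes ("hello") once decoded. Wrapping it in a CryptoStream with a FromBase64Transform gives you a stream whose Length throws NotSupportedException; passing that stream through BufferFully yields a MemoryStream that reports the decoded length.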