I've been spending some time this week coding some changes to a custom PowerShell PSHost for an application. One of the changes I wanted to experiment with was changing the code that executed commands so that it used Pipeline.InvokeAsync() instead of Pipeline.Invoke().

There are a couple of things that need to be handled different in this case: How you process the results from the pipeline and how you handle errors. I'll concentrate on the first one, as it is the one that caused me a bit of trouble to get right.

To process the results from an asynchronous pipeline invocation, you need to use a PipelineReader object, which is what the Pipeline.Output property returns. This allows you to read objects generated by the pipeline execution as they are coming out (i.e. as soon as they are available) instead of waiting until the entire pipeline has executed to grab the results, so the idea is pretty nifty.

Unfortunately, the documentation on how to use this object correctly isn't very good. For example, you can't rely on the Count or IsOpen properties as boundary checks to detect how many items to attempt to read. In particular, the Count property isn't reliable if you're using Pipeline.InvokeAsync() because it only represents how many objects are currently available in the reader, not the total count of objects returned by the pipeline (this is natural once you realize it, but still).

Instead, you should really rely on the EndOfPipeline property of PipelineReader to detect when you've reached the end of the object stream generated by the pipeline execution.

The second issue that's not very obvious is that when you use Pipeline.Invoke(), but you don't need to feed inputs to the command, then the pipeline won't really start executing until you close the PipelineWriter object returned by Pipeline.Input. If you don't do this, then PipelineReader.Read() will simply block forever.


Phantom Objects

The one nasty issue I did run is what appears to be a synchronization issue inside PipelineReader itself. In my original attempt to use Pipeline.InvokeAsync(), I started getting some weird results: Ghost objects were coming out of the reader.

Ghost objects?

Pretty much, yes. Let's say I executed an "ls" command on my pipeline that should return 8 items. Sometimes, I'd indeed get the expected 8 items out of the pipeline before EndOfPipeline changed to true. Other times, however, I'd see 9 items come out of it.

The last item was a "ghost" object that was empty: a PSCustomObject with no properties at all. Where was it coming from?

The only good thing about this was that if it appeared at all, it always did it as the last place in the pipeline. This gave me a clue: Could this be a marker object inserted internally by PowerShell into the object stream to mark the end of the pipeline? It sure looked like some kind of null value.

. It is, in fact, AutomationNull.Value, which, although defined in System.Management.Automation.Internal, is a public type/property.

The reason I say this problem is a synchronization issue is that, for the user of PipelineReader, the use of this marker object should've been transparent. Instead, it is a leaking abstraction that sometimes (and just sometimes!) gets exposed and returned from PipelineReader.Read() when it should never happen!

In the end, I ended up rewriting my code like this to work around this problem:

PipelineReader results = pipeline.Output;
while ( !results.EndOfPipeline ) {
   PSObject obj = results.Read();
   // check that the object returned isn't
   // $null, signaling the end of the pipeline
   if ( obj != AutomationNull.Value )
      // do something with the object
}
Technorati tags: ,


Tomas Restrepo

Software developer located in Colombia.