Updated 2015/07/23: I've updated this post with some up-to-date information on Stream Analytics and some additional comments.
This blog post continues my series of posts about ETW logging in IIS 8.5. One of the things that I wanted to do from the start was to use this as an excuse to dig into some Azure services. In the last entry, I updated the sample code to push ETW events collected from IIS in near real-time to an Azure Event Hub.
Now, I’d like to take that a bit further and actually process the events that make it to the hub. Naturally, I could write my own Event Hub listener that processes said events, but the Azure Stream Analytics service sounded like a pretty interesting way to do this, so I decided to give it a try!
The collector code serializes each ETW event into a flat JSON structure, using the header_ and event_ prefixes as necessary. Note, however, that Stream Analytics now supports hierarchical JSON structures (i.e. with nested objects), so you could use a different representation if you wanted.
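To make the flat layout concrete, here is a minimal sketch of that kind of flattening. It is written in Python purely for illustration (the actual collector is not), and the field names are hypothetical rather than the ones the collector really emits.

```python
import json


def flatten_event(header, payload):
    """Merge header and payload fields into one flat dict, prefixing each key."""
    flat = {}
    for key, value in header.items():
        flat["header_" + key] = value
    for key, value in payload.items():
        flat["event_" + key] = value
    return flat


# Hypothetical field names, for illustration only.
header = {"timestamp": "2015-07-23T10:00:00+00:00", "provider": "IIS-Logging"}
payload = {"cs_uri_stem": "/index.html", "sc_status": 200}

# Each event becomes a single JSON object with no nesting.
print(json.dumps(flatten_event(header, payload), sort_keys=True))
```

The prefixes keep header and payload fields from colliding when both happen to use the same name.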
Configuring the Stream Analytics job
The first step in using Stream Analytics is to create a job in the management portal:
Once your Stream Analytics job is created, we will want to connect it to our existing Event Hub that is receiving events from the collector service.
Creating an Input
Before doing this, we need to add a new SAS policy that Stream Analytics can use to read the events from the hub.
To do this, go to your Event Hub configuration page, look for the “shared access policies” section and add a new one with Manage permissions:
Then click on the Save button to persist the changes.
Now go back to the Stream Analytics instance and select the Add Inputs option. Here we will want to connect it to our existing Event Hub instance:
The input type should be Data Stream, with Event Hub as the source. Then enter an alias for the input and select the existing event hub and the right shared access policy:
Then select the serialization format and encoding:
Creating a Query
Now that we have a way to get data into our Stream Analytics job, we can define one or more queries on top of the data. For example, let’s say we want to create a summary of the successful hits within a 5-minute window:
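The original query is not reproduced here. In Stream Analytics this kind of summary is written in its SQL-like query language using a tumbling window; the Python sketch below only simulates the same grouping locally so the logic is easy to follow. It assumes hypothetical field names (header_timestamp, event_cs_uri_stem, event_sc_status), treats any 2xx status as "successful", and assigns an event that falls exactly on a boundary to the next window, which is a choice of this sketch rather than a statement about the service.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(minutes=5)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def window_end(ts):
    """Return the end of the 5-minute tumbling window that contains ts."""
    elapsed = ts - EPOCH
    return EPOCH + (elapsed // WINDOW + 1) * WINDOW


def summarize_successful_hits(events):
    """Count events with a 2xx status per (window end, URI) pair."""
    counts = Counter()
    for event in events:
        if 200 <= event["event_sc_status"] < 300:
            ts = datetime.fromisoformat(event["header_timestamp"])
            counts[(window_end(ts), event["event_cs_uri_stem"])] += 1
    return counts


# Hypothetical sample events in the flat JSON layout.
events = [
    {"header_timestamp": "2015-07-23T10:01:00+00:00", "event_cs_uri_stem": "/", "event_sc_status": 200},
    {"header_timestamp": "2015-07-23T10:04:59+00:00", "event_cs_uri_stem": "/", "event_sc_status": 200},
    {"header_timestamp": "2015-07-23T10:05:00+00:00", "event_cs_uri_stem": "/", "event_sc_status": 404},
    {"header_timestamp": "2015-07-23T10:06:00+00:00", "event_cs_uri_stem": "/about", "event_sc_status": 200},
]

for (end, uri), hits in sorted(summarize_successful_hits(events).items()):
    print(end.isoformat(), uri, hits)
```

Because the windows do not overlap, every event is counted in exactly one window, which is what makes the per-window totals safe to add up later.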
Creating an Output
Now let’s define an output for our query. In this case, I want to store the job output in an Azure SQL Database, which I have already created. Notice that you will want your DB to be in the same region as the Stream Analytics job! In this database, we’ll create a simple table to store the results:
Go to the Output section of your Stream Analytics job and select the Add an Output option. Then select “SQL Database” as the destination, and enter your database details:
Note: If you use a database that has ‘-’ characters as part of the name, trying to create the output will result in an error, at least when I tried it.
Running the job
Now we can start our Stream Analytics job and test the results!
After the job has been running for a while, and we’re feeding data to the Event Hub, we start seeing the results in our database table:
Overall, getting this working was surprisingly easy. Since I did my original tests, Stream Analytics and its documentation have improved substantially, making it even easier!
Supporting other event types
While my original idea was to handle IIS log information, there is no reason the same technique cannot be used to collect and process other ETW events. I've recently updated the code to make the core independent of the IIS log provider, so you can now add or replace the ETW providers and parsers with your own using MEF. This was very easy since the collector is built on the excellent TraceEvent library for collecting ETW events.