This post continues my series on ETW logging in IIS 8.5. One of the things I wanted to do from the start was to use this as an excuse to dig into some Azure services. In the last entry, I updated the sample code to push ETW events collected from IIS in near real-time to an Azure Event Hub.
Now, I’d like to take that a bit further and actually process the events that make it to the hub. Naturally, I could write my own Event Hub listener that processes said events, but the Azure Stream Analytics service (in preview) sounded like a pretty interesting way to do this, so I decided to give it a try!
Before I detail how to make this happen: I first had to make a change to the collector code. My original design used a “nice” JSON representation of the collected ETW events. That is, an event was represented by a complex JSON object containing 3 properties (header, event, payload), of which the first two were objects with properties of their own.
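To give you an idea, an event looked roughly like this (the specific field names here are made up for illustration; what matters is the nested shape):

```json
{
  "header": {
    "timestamp": "2014-11-01T10:30:00Z",
    "providerName": "Microsoft-Windows-IIS-Logging"
  },
  "event": {
    "id": 6200,
    "level": "Informational"
  },
  "payload": {
    "cs-method": "GET",
    "sc-status": 200
  }
}
```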
Unfortunately, this does not appear to be supported by Stream Analytics. Though the documentation doesn’t state it clearly, Stream Analytics requires that input JSON objects be “flat” (that is, simple name-value dictionaries). This is an understandable requirement, and makes sense given that the other supported format is CSV, but it is something to keep in mind.
Based on this, I modified the collector code to use essentially the same format, but flattening the header and event members into top-level properties with header_ and event_ prefixes as necessary.
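The same hypothetical event, flattened, would now come out something like this:

```json
{
  "header_timestamp": "2014-11-01T10:30:00Z",
  "header_providerName": "Microsoft-Windows-IIS-Logging",
  "event_id": 6200,
  "event_level": "Informational",
  "cs-method": "GET",
  "sc-status": 200
}
```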
Configuring the Stream Analytics job
The first step to use Stream Analytics is creating a job in the management portal:
Once the Stream Analytics job is created, we will want to connect it to our existing Event Hub that is receiving events from the collector service.
Creating an Input
Before doing this, we need to add a new SAS policy that Stream Analytics can use to read the events from the hub.
To do this, go to your Event Hub configuration page, look for the “shared access policies” section and add a new one with Manage permissions:
Then click on the Save button to persist the changes.
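The Stream Analytics input configuration asks for the namespace, policy name, and policy key separately, but for reference, these are the same credentials you would use in a standard Event Hub connection string (the namespace and policy name here are hypothetical):

```
Endpoint=sb://mynamespace.servicebus.windows.net/;SharedAccessKeyName=streamanalytics;SharedAccessKey=<policy-key>
```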
Now go back to the Stream Analytics instance and select the Add Inputs option. Here we will want to connect it to our existing Event Hub instance:
Select Data Stream as the input type and Event Hub as the source, then enter an alias for the input and pick the existing Event Hub and the shared access policy we just created:
Then select the serialization format and encoding:
Creating a Query
Now that we have a way to get data into our Stream Analytics job, we can define one or more queries on top of the data. For example, let’s say we want to create a summary of the successful hits within a 5-minute window:
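I won’t claim this is exactly the query I ended up with, but a sketch of it, reusing the hypothetical field names from the JSON sample earlier and assuming an input alias of etwinput, looks like this:

```sql
-- Count successful requests (status < 400) over fixed 5-minute windows
SELECT
    System.Timestamp AS WindowEnd,
    COUNT(*) AS SuccessfulHits
FROM etwinput TIMESTAMP BY header_timestamp
WHERE [sc-status] < 400
GROUP BY TumblingWindow(minute, 5)
```

TumblingWindow(minute, 5) buckets events into fixed, non-overlapping 5-minute windows, and TIMESTAMP BY tells the job to window on the event’s own timestamp rather than its arrival time at the hub.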
Creating an Output
Now let’s define an output for our query. In this case, I want to store the job output in a SQL Database on Azure, which I have already created. Notice that you will want your DB to be in the same region as the Stream Analytics job! In this database, we’ll create a simple table to store the results:
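Something minimal along these lines works; the table and column names here are my own choice, they just need to line up with what the query outputs:

```sql
CREATE TABLE dbo.HitsSummary (
    WindowEnd      DATETIME NOT NULL,
    SuccessfulHits BIGINT   NOT NULL
);
```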
Go to the Output section of your Stream Analytics job and select the Add an Output option. Then select “SQL Database” as the destination, and enter your database details:
Note: If you use a database that has ‘-’ characters in its name, trying to create the output will result in an error; at least, it did when I tried it.
Running the job
Now we can start our Stream Analytics job and test the results!
After the job has been running for a while and we’re feeding data to the Event Hub, we start seeing the results in our database table:
Overall, this was easier than I expected, but at the same time it took me longer than it should have to get it working. That was mostly because I wasted a lot of time getting it to accept my output database and understanding some of the limitations and the query syntax. However, I expect this will get a lot better once the documentation is improved!