Microsoft’s Cortana Intelligence Suite provides a seamless transition from raw data to intelligence: Real, meaningful data for real, meaningful business decisions.
With the rise of the Internet of Things (IoT), the need for real-time processing and data analytics has become paramount. As a part of the Cortana Intelligence Suite, Microsoft offers Azure Stream Analytics (ASA) as a fully managed cloud service for analyzing complex event and data streams in near real time.
What is Stream Analytics?
Let’s start off simple. Think of some type of electronic sensor, such as a thermometer. Say that it’s measuring the temperature every second (creating a data stream) and then recording that data somewhere in a database. At the end of the day, you might want to know the number of times the temperature was over 85 degrees, so you run a count query against it.
But maybe you want to query the stream in real time. Stream processing allows you to get in the middle, between the sensor and the database, and run that same query. The query stays the same but is being run in real time on that data stream. Now you can be alerted the very second the temperature is reported as over 85 degrees.
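To make the contrast concrete, here is a minimal Python sketch of the same over-85 check run both ways: once as a count over stored data, and once against readings as they arrive. This is plain Python, not Stream Analytics code; the function names and the sample readings are made up for illustration.

```python
from typing import Iterable, Iterator, List, Tuple

THRESHOLD = 85  # degrees, as in the thermometer example


def count_over_threshold(stored_readings: List[float]) -> int:
    """Batch style: run the count once, after the data has been stored."""
    return sum(1 for temp in stored_readings if temp > THRESHOLD)


def alert_on_stream(readings: Iterable[float]) -> Iterator[Tuple[int, float]]:
    """Streaming style: apply the same condition to each reading as it arrives.

    Yields (position, temperature) as soon as a reading exceeds the threshold,
    without waiting for the rest of the stream.
    """
    for position, temp in enumerate(readings):
        if temp > THRESHOLD:
            yield position, temp


if __name__ == "__main__":
    sample = [82.0, 84.5, 86.1, 85.0, 90.3]  # hypothetical sensor values
    print("end-of-day count:", count_over_threshold(sample))
    for position, temp in alert_on_stream(iter(sample)):
        print(f"alert: reading #{position} was {temp} degrees")
```

The condition is identical in both functions. The only difference is when it runs: after everything is stored, or once per reading as the data flows past.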
Now let’s dive right into the analytics. Azure Stream Analytics is the event processing service that takes in events and data in real time. These events can come from a number of sources, including the sensor mentioned above, as well as devices, websites, social media, and applications. Realistically, event data from almost any source can be consumed and transformed. Stream Analytics can handle data streams on the order of gigabytes of event data per second.
Here is a diagram showing where Stream Analytics fits into the real-time analytics suite of services.
How Does Stream Analytics Work?
To set up a Stream Analytics job within Azure, you must have input data. Data to be consumed by Stream Analytics can come from two sources: Azure Event Hubs or Azure Blob storage. Event Hubs is an Azure service that can combine many different sources of streaming data into a single output. Blob storage is another Azure service, used for storing and retrieving binary large objects (blobs). The data in both of these sources can be formatted in either JSON or CSV.
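As a rough illustration of what "either JSON or CSV" means for the events themselves, the Python sketch below reads the same two temperature events from each format and normalizes them into one shape. It is not part of any Azure SDK. The field names `sensor_id` and `temperature`, the one-JSON-object-per-line layout, and the `parse_events` helper are all assumptions made for this example.

```python
import csv
import io
import json
from typing import Dict, List, Union

Event = Dict[str, Union[str, float]]


def parse_events(payload: str, fmt: str) -> List[Event]:
    """Parse a text payload of events in 'json' or 'csv' format.

    Assumes one JSON object per line, or CSV with a header row.
    """
    if fmt == "json":
        rows = [json.loads(line) for line in payload.splitlines() if line.strip()]
    elif fmt == "csv":
        rows = list(csv.DictReader(io.StringIO(payload)))
    else:
        raise ValueError(f"unsupported format: {fmt!r}")

    # CSV values arrive as strings, so convert both formats to the same types.
    return [
        {"sensor_id": str(row["sensor_id"]), "temperature": float(row["temperature"])}
        for row in rows
    ]


if __name__ == "__main__":
    json_payload = (
        '{"sensor_id": "s1", "temperature": 86.1}\n'
        '{"sensor_id": "s2", "temperature": 79}'
    )
    csv_payload = "sensor_id,temperature\ns1,86.1\ns2,79\n"
    print(parse_events(json_payload, "json"))
    print(parse_events(csv_payload, "csv"))
```

Either payload yields the same list of events, which is why a downstream query does not need to care which format the input arrived in.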
Once your job has input data, you have the ability to transform that data. To do this, you set up real-time queries using a SQL-like language, including all of your favorite functions and operators, allowing for the easy aggregation of the data. The language’s similarity to T-SQL makes it straightforward for any developer who already knows SQL to write these transformation queries.
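The kind of aggregation such a query performs can be illustrated outside the service. The sketch below is Python, not the Stream Analytics query language: it groups timestamped readings into fixed, non-overlapping time windows and reports a count and an average for each. The 10-second window, the `(timestamp, temperature)` layout, and the function name are assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def aggregate_by_window(
    events: Iterable[Tuple[int, float]], window_seconds: int = 10
) -> List[Dict[str, float]]:
    """Group (timestamp_seconds, temperature) pairs into fixed windows.

    Returns one summary per window, ordered by window start time. This version
    collects all events before summarizing; a real streaming engine would emit
    each summary as soon as its window closes.
    """
    buckets: Dict[int, List[float]] = defaultdict(list)
    for timestamp, temperature in events:
        # Round the timestamp down to the start of its window.
        window_start = timestamp - (timestamp % window_seconds)
        buckets[window_start].append(temperature)

    return [
        {
            "window_start": start,
            "count": len(temps),
            "avg_temperature": sum(temps) / len(temps),
        }
        for start, temps in sorted(buckets.items())
    ]


if __name__ == "__main__":
    readings = [(0, 80.0), (3, 90.0), (10, 70.0), (19, 72.0), (25, 88.0)]
    for summary in aggregate_by_window(readings):
        print(summary)
```

In a SQL-like language the same idea would typically be expressed as a grouped aggregate over a time window rather than as an explicit loop.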
Now that the data has been transformed, you can set it up to be fed to another service. It can be dumped straight into a SQL database or Blob storage. It can also be sent to another Event Hub if more data processing is needed in a next step. However, the really neat part here is that the data can be sent to a real-time data visualization service, such as Microsoft’s Power BI platform. (Power BI will be discussed in more detail in an upcoming blog post.)
Like most Azure cloud service offerings, Stream Analytics is billed based on usage. This makes it easy for both small and large operations to use and scale this service. Stream Analytics is priced by volume of data processed and the number of streaming units required to process it:
- Volume of data processed by the streaming job: $0.001/GB
- Streaming unit (blended measure of CPU, memory, throughput): $0.31/hr
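Those two rates combine into a simple estimate. The Python sketch below only does the arithmetic on the prices quoted in this post, which may since have changed. The function name and the example workload (100 GB processed by one streaming unit running for a 30-day month) are hypothetical.

```python
PRICE_PER_GB = 0.001               # USD per GB processed, as quoted above
PRICE_PER_STREAMING_UNIT_HOUR = 0.31  # USD per streaming unit per hour, as quoted above


def estimate_cost(gb_processed: float, streaming_units: int, hours: float) -> float:
    """Estimate the job cost in USD from data volume and streaming unit hours."""
    data_cost = gb_processed * PRICE_PER_GB
    compute_cost = streaming_units * hours * PRICE_PER_STREAMING_UNIT_HOUR
    return data_cost + compute_cost


if __name__ == "__main__":
    # Hypothetical workload: 100 GB, 1 streaming unit, 30 days around the clock.
    monthly = estimate_cost(gb_processed=100, streaming_units=1, hours=24 * 30)
    print(f"estimated monthly cost: ${monthly:.2f}")
```

For that workload the estimate comes to about $223.30, almost all of it streaming unit hours: the data volume charge is only $0.10.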
However, Stream Analytics cannot function on its own. It requires other cloud services on either side of the processing, such as Event Hubs, HDInsight, Blob storage, Power BI, or Azure Machine Learning.
Stream Analytics, in conjunction with the other services mentioned, forms Microsoft’s Cortana Intelligence Suite, a platform that we are extremely impressed with and will continue covering in upcoming blog posts.
The Cortana Intelligence Suite: