What is Microsoft Azure Stream Analytics?
Azure Stream Analytics is a managed event-processing engine set up to perform real-time analytic computations on streaming data. The data can come from devices, sensors, websites, social media feeds, applications, infrastructure systems, and more.
Use Stream Analytics to examine high volumes of data streaming from devices or processes, extract information from that data stream, and identify patterns, trends, and relationships. Use those patterns to trigger other processes or actions, such as raising alerts, starting automation workflows, feeding information to a reporting tool, or storing data for later investigation. Example scenarios include:
Stock-trading analysis and alerts.
Fraud detection, data protection, and identity protection.
Embedded sensor and actuator analysis.
Web clickstream analytics.
How does Stream Analytics work?
This diagram illustrates the Stream Analytics pipeline, showing how data is ingested, analyzed, and then sent for presentation or action.
Stream Analytics starts with a source of streaming data. The data can be ingested into Azure from a device using Azure Event Hubs or Azure IoT Hub. The data can also be pulled from a data store like Azure Blob storage.
To examine the stream, you create a Stream Analytics job that specifies where the data comes from. The job also specifies a transformation: how to look for data, patterns, or relationships. For this task, Stream Analytics supports a SQL-like query language to filter, sort, aggregate, and join streaming data over a time period.
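To make "aggregate streaming data over a time period" concrete, here is a minimal plain-Python sketch of a fixed-size (tumbling) time window that averages readings per device. It is only an illustration of the idea, not the Stream Analytics query language or any Azure API; the function name, the event layout (timestamp in seconds, device ID, value), and the sample data are all assumptions made for this example.

```python
from collections import defaultdict


def tumbling_window_average(events, window_seconds):
    """Average values per (window start, device) over fixed, non-overlapping windows.

    `events` is an iterable of (timestamp_seconds, device_id, value) tuples.
    This layout is hypothetical and chosen only for the illustration.
    """
    sums = defaultdict(lambda: [0.0, 0])  # (window_start, device_id) -> [total, count]
    for timestamp, device_id, value in events:
        # Every event belongs to exactly one window, identified by its start time.
        window_start = (timestamp // window_seconds) * window_seconds
        accumulator = sums[(window_start, device_id)]
        accumulator[0] += value
        accumulator[1] += 1
    return {key: total / count for key, (total, count) in sorted(sums.items())}


# Hypothetical sample readings.
events = [
    (0, "sensor-a", 20.0),
    (4, "sensor-a", 22.0),
    (7, "sensor-b", 30.0),
    (11, "sensor-a", 26.0),
]
averages = tumbling_window_average(events, window_seconds=10)
```

With a 10-second window, the first three readings fall into the window starting at 0 and the last reading into the window starting at 10, so each device gets one average per window in which it reported.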
Finally, the job specifies an output for that transformed data. You control what to do in response to the information you’ve analyzed. For example, in response to the analysis, you might:
Send a command to change device settings.
Send data to a monitored queue for further action based on findings.
Send data to a Power BI dashboard.
Send data to storage like Data Lake Store, Azure SQL Database, or Azure Blob storage.
You can adjust the number of events processed per second while the job is running. You can also produce diagnostic logs for troubleshooting.
Capabilities and benefits of Azure Stream Analytics
Connect inputs and outputs
Stream Analytics connects directly to Azure Event Hubs and Azure IoT Hub for stream ingestion, and to the Azure Blob storage service to ingest historical data. You can combine event hub data processed by Stream Analytics with other data sources and processing engines. Job input can also include reference data (static or slow-changing data). You can join streaming data to this reference data to perform lookup operations the same way you would with database queries.
Route Stream Analytics job output in many directions. Write to storage like Azure Blob storage, Azure SQL Database, Azure Data Lake Store, or Azure Cosmos DB. From there, you could run batch analytics with Azure HDInsight. Or send the output to another service for consumption by another process, such as Event Hubs, Azure Service Bus, queues, or Power BI for visualization.
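The reference-data lookup can be pictured as an inner join between each incoming event and a small lookup table. The following plain-Python sketch shows that idea only; it does not use any Stream Analytics API, and the function name, field names, and sample values are hypothetical.

```python
def enrich_with_reference(stream, reference, key):
    """Inner-join each streaming event to a reference row that shares its key.

    `stream` is an iterable of dicts and `reference` maps key values to dicts of
    extra attributes. Events without a matching reference row are dropped, as in
    a database inner join.
    """
    for event in stream:
        reference_row = reference.get(event[key])
        if reference_row is not None:
            # Merge the static attributes into a copy of the event.
            yield {**event, **reference_row}


# Hypothetical slow-changing reference data: where each device is installed.
reference = {
    "sensor-a": {"location": "Building 1"},
    "sensor-b": {"location": "Building 2"},
}
# Hypothetical streaming events; "sensor-x" has no reference entry.
stream = [
    {"device_id": "sensor-a", "temperature": 21.5},
    {"device_id": "sensor-x", "temperature": 19.0},
    {"device_id": "sensor-b", "temperature": 30.1},
]
enriched = list(enrich_with_reference(stream, reference, "device_id"))
```

Because the reference table is small and changes rarely, it can be held in memory and consulted for every event, which is what makes this kind of lookup cheap compared with joining two live streams.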
Easy to use
To define transformations, you use a simple, declarative Stream Analytics query language that lets you create sophisticated analyses with no programming. The query language takes streaming data as its input. You can then filter and sort the data, aggregate values, perform calculations, join data (within a stream or to reference data), and use geospatial functions. You can edit queries in the portal, using IntelliSense and syntax checking, and you can test queries using sample data that you can extract from the live stream.
Extensible query language
You can extend the query language with custom functions, such as JavaScript user-defined functions (UDFs) and Azure Machine Learning function calls.
Scalable
Stream Analytics can handle up to 1 GB of incoming data per second. Integration with Azure Event Hubs and Azure IoT Hub allows jobs to ingest millions of events per second coming from connected devices, clickstreams, and log files, to name a few. Using the partition feature of event hubs, you can partition computations into logical steps, each of which can be further partitioned to increase scalability.
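As a rough illustration of why partitioning helps scalability, the plain-Python sketch below assigns events to partitions by key, processes each partition independently, and then merges the partial results. It is a conceptual sketch under stated assumptions, not how Event Hubs or Stream Analytics implement partitioning; the function names, the CRC32-based assignment, and the sample data are hypothetical.

```python
import zlib
from collections import Counter, defaultdict


def partition_events(events, num_partitions):
    """Assign each (key, value) event to a partition based on a stable hash of its key."""
    partitions = defaultdict(list)
    for key, value in events:
        # crc32 is deterministic across runs, unlike Python's built-in hash() for str.
        index = zlib.crc32(key.encode("utf-8")) % num_partitions
        partitions[index].append((key, value))
    return partitions


def count_per_key(partition):
    """Work done on one partition; it never needs to see the other partitions."""
    return Counter(key for key, _ in partition)


# Hypothetical events keyed by device.
events = [("device-1", 1), ("device-2", 5), ("device-1", 3), ("device-3", 2)]
partitions = partition_events(events, num_partitions=2)

# Each partition could be handled by a separate worker; here they run in a loop.
merged = Counter()
for partition in partitions.values():
    merged.update(count_per_key(partition))
```

Because all events for a given key land in the same partition, per-key computations need no coordination between partitions, so adding partitions adds capacity.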
Low cost
As a cloud service, Stream Analytics is optimized to let you get going at a low cost. You pay as you go based on streaming-unit usage and the amount of data processed by the system. Usage is calculated from the volume of events processed and the amount of computing power provisioned within the cluster to handle Stream Analytics jobs.
Reliability, quick recovery, and repeatability
As a managed service in the cloud, Stream Analytics helps prevent data loss and provides business continuity. If failures occur, the service provides built-in recovery capabilities. Because the service maintains state internally, it provides repeatable results: you can archive events and reapply processing later, and always get the same results. This enables you to go back in time and investigate computations when doing root-cause analysis, what-if analysis, and so on.
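The repeatability property can be illustrated with a small plain-Python sketch: if processing is a deterministic function of the archived events, replaying the archive reproduces the original results. This is an illustration of the principle only, not the service's recovery mechanism; the function name, the alert rule, and the sample archive are hypothetical.

```python
def running_alerts(events, threshold):
    """Stateful step: alert when a device's running total reaches the threshold.

    `events` is an iterable of (timestamp, device_id, value) tuples. The state
    (per-device totals) is rebuilt from the events alone, so the output depends
    only on the input sequence.
    """
    totals = {}
    alerts = []
    for timestamp, device_id, value in events:
        totals[device_id] = totals.get(device_id, 0) + value
        if totals[device_id] >= threshold:
            alerts.append((timestamp, device_id, totals[device_id]))
            totals[device_id] = 0  # reset after alerting
    return alerts


# Hypothetical archived events.
archive = [(1, "a", 4), (2, "b", 7), (3, "a", 6), (4, "b", 5)]

first_run = running_alerts(archive, threshold=10)
replay = running_alerts(archive, threshold=10)  # reprocess the same archive later
```

Both runs produce identical alerts, which is what makes it possible to revisit an earlier computation during root-cause analysis, or to rerun it with a different threshold for a what-if analysis.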
Things you should know about Azure Data Lake Analytics
Start in seconds, scale instantly, pay per job
Process big data jobs in seconds with Azure Data Lake Analytics. There is no infrastructure to worry about because there are no servers, virtual machines, or clusters to wait for, manage or tune. Instantly scale the processing power, measured in Azure Data Lake Analytics Units (AU), from one to thousands for each job. You only pay for the processing that you use per job.
Develop massively parallel programs with simplicity
U-SQL is a simple, expressive, and extensible language that allows you to write code once and have it automatically parallelized for the scale you need. Process petabytes of data for diverse workload categories such as querying, ETL, analytics, machine learning, machine translation, image processing, and sentiment analysis by leveraging existing libraries written in .NET languages, R, or Python. Watch the U-SQL query execution for Azure Data Lake video to see how we detect the type of objects in one million images using a U-SQL built-in cognitive library.
Debug and optimize your big data programs with ease
Debug failures in cloud distributed programs as easily as debugging a program in your personal environment. Our execution environment actively analyzes your programs as they run and gives you recommendations to improve performance and reduce cost. For example, if you request 1,000 AUs for your program and only 50 AUs are needed, the system recommends that you only use 50 AUs, reducing the cost by 95%.
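The 95% figure follows from simple arithmetic, sketched below in plain Python. The sketch assumes that cost is directly proportional to the number of AUs and that the job's running time stays the same; the function name is hypothetical.

```python
def au_savings_percent(requested_aus, needed_aus):
    """Percentage cost reduction from right-sizing, assuming cost is proportional to AUs."""
    if requested_aus <= 0:
        raise ValueError("requested_aus must be positive")
    return 100 * (requested_aus - needed_aus) / requested_aus


savings = au_savings_percent(requested_aus=1000, needed_aus=50)
```

Here 950 of the 1,000 requested AUs are unnecessary, and 950 / 1,000 is 95%.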
Virtualize your analytics
Act on all of your data with optimized data virtualization of your relational sources such as Azure SQL Database and Azure SQL Data Warehouse. Your queries are automatically optimized by moving processing close to the source data without data movement, which maximizes performance and minimizes latency.
Enterprise-grade security, auditing, and support
Extend your on-premises security and governance controls to the cloud, and meet your security and regulatory compliance needs. Single sign-on (SSO), multi-factor authentication, and seamless management of millions of identities are built in through Azure Active Directory. Role-based access control and the ability to audit all processing and management operations are on by default. We guarantee a 99.9% enterprise-grade SLA and 24/7 support for your big data solution.