What is data streaming?
Data streaming is the continuous processing of data as it is generated. It is also known as event processing or stream processing. The benefit of data streaming is that analytics, insights and actions can occur immediately. Data streaming is used across industries like financial services, retail, media and entertainment, and telecommunications.
 
Data streaming explained
Data streams are never-ending flows of data that can be transformed, analyzed or otherwise processed in real time while in motion. Data streams are generated by all sorts of events and systems, from network elements to financial transactions, website clicks and more.
Overview of stream data processing
Data streams can come from many sources and can be processed in different ways. Some example sources and forms of processing are listed below.
Sources
Sources for data streams can include:
- IoT sensors
- Server logs
- Applications
- Third-party services
- Network elements
Processing
Processing performed on data streams can include:
- Ingestion
- Transformation
- Aggregation
- Combining data streams
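The four forms of processing listed above can be sketched with plain Python generators. This is a minimal illustration rather than a production design: the `Reading` type, the `timestamp,sensor,value` line format, the Celsius-to-Fahrenheit conversion and the 60-second window are all assumptions made for the example, and a real deployment would normally rely on a streaming platform instead of in-memory lists.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple, Tuple


class Reading(NamedTuple):
    # Hypothetical event shape, assumed for this example only.
    timestamp: int  # seconds
    sensor: str
    value: float


def ingest(lines: Iterable[str]) -> Iterator[Reading]:
    """Ingestion: parse raw 'timestamp,sensor,value' lines one at a time."""
    for line in lines:
        ts, sensor, value = line.strip().split(",")
        yield Reading(int(ts), sensor, float(value))


def transform(readings: Iterable[Reading]) -> Iterator[Reading]:
    """Transformation: convert each value from Celsius to Fahrenheit."""
    for r in readings:
        yield r._replace(value=r.value * 9 / 5 + 32)


def combine(*streams: Iterable[Reading]) -> Iterator[Reading]:
    """Combining: merge time-ordered streams into one time-ordered stream."""
    return heapq.merge(*streams, key=lambda r: r.timestamp)


def aggregate(readings: Iterable[Reading],
              window_seconds: int) -> Iterator[Tuple[int, float]]:
    """Aggregation: yield (window_start, average value) per fixed time window."""
    current_window = None
    total, count = 0.0, 0
    for r in readings:
        window = r.timestamp // window_seconds
        if current_window is not None and window != current_window:
            # A new window has started, so emit the finished one.
            yield current_window * window_seconds, total / count
            total, count = 0.0, 0
        current_window = window
        total += r.value
        count += 1
    if count:
        # Only reached if the input ends; an endless stream never gets here.
        yield current_window * window_seconds, total / count


stream_a = ["0,a,10.0", "70,a,20.0"]
stream_b = ["30,b,30.0", "90,b,40.0"]
merged = combine(transform(ingest(stream_a)), transform(ingest(stream_b)))
print(list(aggregate(merged, window_seconds=60)))  # [(0, 68.0), (60, 86.0)]
```

Because every stage is a generator, each reading flows through the whole pipeline as soon as it is read, and no stage waits for the full input to be available.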
Batch processing versus data streaming
Batch processing handles data in bounded sets. Batch jobs typically run on a schedule, for example once an hour or once a week. Data streaming differs in that processing typically takes place continuously as data arrives.
Batch processing can be an efficient way to process data in bulk, but its results may be less timely than those of data streaming.
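The difference can be illustrated with a small Python sketch that computes the same average in both styles. The function names and the sample values are hypothetical, chosen only for this example.

```python
from typing import Iterable, Iterator, List


def batch_average(values: List[float]) -> float:
    """Batch: runs once, after the whole data set has been collected."""
    return sum(values) / len(values)


def streaming_average(values: Iterable[float]) -> Iterator[float]:
    """Streaming: emits an updated average as each value arrives."""
    total, count = 0.0, 0
    for value in values:
        total += value
        count += 1
        yield total / count


events = [4.0, 8.0, 6.0]
print(batch_average(events))            # 6.0
print(list(streaming_average(events)))  # [4.0, 6.0, 6.0]
```

The batch function produces a single result once all the data is available, whereas the streaming version has a usable result after the first event and refines it with every subsequent one.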
Streaming use cases
Data streaming is widely used wherever immediate actions or insights are needed. Use cases are often found in online retail, financial fraud analysis, advertising and investment.
 
- Real-time advertising auctions
- High-frequency trading
- Credit card fraud detection
- Network intrusion detection and prevention
- Security orchestration, automation and response (SOAR)
- Predictive maintenance
- Machine learning model optimization
Canonical solutions for data streaming
Questions? Get answers
Do you have a data streaming project in mind and want to get advice on implementing Kafka or Spark? Contact us now to discuss your needs.
Apache®, Apache Kafka, Kafka®, and the Kafka logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.
Apache®, Apache Spark, Spark®, and the Spark logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.