A Practical Introduction to Stream Processing Training Course
Stream Processing is the real-time analysis of "data in motion": computations run on data as it arrives rather than after it has been stored. The data streams continuously from sources such as sensor events, user activity on websites, financial transactions, credit card swipes, and click streams. Stream Processing frameworks are designed to handle large volumes of incoming data and deliver insights with low latency.
In this instructor-led, live training session (available onsite or remotely), participants will learn how to set up and integrate various Stream Processing frameworks with existing big data storage systems, software applications, and microservices.
By the end of this training, participants will be able to:
- Install and configure different Stream Processing frameworks, such as Spark Streaming and Kafka Streams.
- Understand and select the most suitable framework for specific tasks.
- Process data continuously, concurrently, and on a record-by-record basis.
- Integrate Stream Processing solutions with existing databases, data warehouses, data lakes, and more.
- Integrate the most appropriate stream processing library with enterprise applications and microservices.
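The record-by-record model named in the objectives above can be illustrated without any framework. The following is a minimal plain-Python sketch, not Spark Streaming or Kafka Streams code; the `sensor_events` source, the `running_average` function, and the sample readings are hypothetical names and data invented for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

Event = Tuple[str, float]  # (sensor_id, reading); hypothetical event shape


def sensor_events() -> Iterator[Event]:
    """Stand-in for an unbounded source; yields one event at a time."""
    yield from [("a", 10.0), ("b", 4.0), ("a", 20.0), ("b", 6.0)]


def running_average(events: Iterable[Event]) -> Iterator[Tuple[str, float]]:
    """Emit an updated per-sensor average after every single record."""
    count: Dict[str, int] = defaultdict(int)
    total: Dict[str, float] = defaultdict(float)
    for sensor_id, reading in events:
        count[sensor_id] += 1
        total[sensor_id] += reading
        # A result is available immediately, without waiting for a full batch.
        yield sensor_id, total[sensor_id] / count[sensor_id]


if __name__ == "__main__":
    for sensor_id, avg in running_average(sensor_events()):
        print(sensor_id, avg)
```

Because `running_average` is a generator, it keeps only a small amount of state per sensor and never needs the whole data set in memory, which is the property that separates this style from batch processing.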
Audience
- Developers
- Software architects
Format of the Course
- A mix of lectures, discussions, exercises, and extensive hands-on practice
Course Outline
Introduction
- Stream processing vs batch processing
- Analytics-focused stream processing
Overview of Frameworks and Programming Languages
- Spark Streaming (Scala)
- Kafka Streams (Java)
- Flink
- Storm
- Comparison of Features and Strengths of Each Framework
Overview of Data Sources
- Live data as a series of events over time
- Historical data sources
Deployment Options
- In the cloud (AWS, etc.)
- On premise (private cloud, etc.)
Getting Started
- Setting up the Development Environment
- Installing and Configuring
- Assessing Your Data Analysis Needs
Operating a Streaming Framework
- Integrating the Streaming Framework with Big Data Tools
- Event Stream Processing (ESP) vs Complex Event Processing (CEP)
- Transforming the Input Data
- Inspecting the Output Data
- Integrating the Stream Processing Framework with Existing Applications and Microservices
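One common way to draw the ESP vs CEP distinction listed above is that ESP handles each event on its own, while CEP looks for a pattern across several events. The sketch below is a plain-Python illustration under that assumption, not the API of any framework in this outline; the event shape, the function names, the sample data, and the "three failures within 60 seconds" rule are all hypothetical.

```python
from collections import deque
from typing import Deque, Dict, Iterable, Iterator, Tuple

Event = Tuple[int, str, str]  # (timestamp_seconds, user, action); hypothetical

SAMPLE_EVENTS = [
    (0, "ann", "login_failed"),
    (10, "bob", "login_ok"),
    (20, "ann", "login_failed"),
    (30, "ann", "login_failed"),
    (100, "bob", "login_failed"),
]


def esp_failed_logins(events: Iterable[Event]) -> Iterator[Event]:
    """ESP-style step: inspect each event in isolation and filter it."""
    for event in events:
        if event[2] == "login_failed":
            yield event


def cep_repeated_failures(
    events: Iterable[Event], threshold: int = 3, window: int = 60
) -> Iterator[Tuple[str, int]]:
    """CEP-style step: alert when one user accumulates `threshold`
    failures within `window` seconds, i.e. a pattern across events."""
    recent: Dict[str, Deque[int]] = {}
    for ts, user, _ in esp_failed_logins(events):
        times = recent.setdefault(user, deque())
        times.append(ts)
        # Drop failures that have fallen out of the time window.
        while ts - times[0] >= window:
            times.popleft()
        if len(times) >= threshold:
            yield user, ts
            times.clear()  # start counting afresh after an alert


if __name__ == "__main__":
    for user, ts in cep_repeated_failures(SAMPLE_EVENTS):
        print(f"alert: {user} at t={ts}")
```

The CEP step is built on top of the ESP step here, which mirrors how pattern detection usually consumes an already filtered or transformed stream.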
Troubleshooting
Summary and Conclusion
Requirements
- Programming experience in any language
- A basic understanding of Big Data concepts (e.g., Hadoop)
Testimonials (1)
Sufficient hands-on practice, and the trainer is knowledgeable
Chris Tan
Course - A Practical Introduction to Stream Processing
Related Courses
Administration of Confluent Apache Kafka
21 Hours
Confluent Apache Kafka is a distributed event streaming platform built for high-throughput, fault-tolerant data pipelines and real-time analytics.
This instructor-led live training (available online or on-site) targets intermediate-level system administrators and DevOps professionals who want to learn how to install, configure, monitor, and troubleshoot Confluent Apache Kafka clusters.
By the end of this training, participants will be able to:
- Grasp the components and architecture of Confluent Kafka.
- Deploy and manage Kafka brokers, ZooKeeper quorums, and essential services.
- Configure advanced features such as security, replication, and performance tuning.
- Utilize management tools to monitor and maintain Kafka clusters.
Course Format
- Interactive lectures and discussions.
- Extensive exercises and practice sessions.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request customized training for this course, please contact us to arrange it.
Apache Kafka Connect
7 Hours
This instructor-led, live training in Uzbekistan (available online or on-site) is intended for developers who wish to integrate Apache Kafka with existing databases and applications for processing, analysis, and other use cases.
By the end of this training, participants will be able to:
- Use Kafka Connect to ingest large amounts of data from a database into Kafka topics.
- Ingest log data generated by application servers into Kafka topics.
- Make all collected data available for stream processing.
- Export data from Kafka topics into secondary systems for storage and analysis.
Big Data Streaming for Developers
14 Hours
Master the implementation of complete big data streaming scenarios. Gain skills in real-time data preparation and maintenance using Informatica, Edge, Kafka, and Spark. This training addresses software versions 10.2.1 and later.
Confluent Apache Kafka: Cluster Operations and Configuration
16 Hours
Confluent Apache Kafka is an enterprise-grade distributed event streaming platform built on Apache Kafka. It supports high-throughput, fault-tolerant data pipelines and real-time streaming applications.
This instructor-led, live training (online or onsite) is aimed at intermediate-level engineers and administrators who wish to deploy, configure, and optimize Confluent Kafka clusters in production environments.
By the end of this training, participants will be able to:
- Install, configure, and operate Confluent Kafka clusters with multiple brokers.
- Design high-availability setups using ZooKeeper and replication techniques.
- Tune performance, monitor metrics, and apply recovery strategies.
- Secure, scale, and integrate Kafka with enterprise environments.
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request customized training for this course, please contact us to arrange it.
Building Kafka Solutions with Confluent
14 Hours
This instructor-led, live training (online or onsite) is designed for engineers who wish to use Confluent (a distribution of Kafka) to build and manage a real-time data processing platform for their applications.
By the end of this training, participants will be able to:
- Install and configure Confluent Platform.
- Leverage Confluent's management tools and services to operate Kafka more efficiently.
- Store and process incoming stream data.
- Optimize and manage Kafka clusters.
- Secure data streams.
Course Format
- Interactive lectures and discussions.
- Extensive exercises and hands-on practice.
- Real-world implementation in a live-lab environment.
Course Customization Options
- This course is based on the open source version of Confluent: Confluent Open Source.
- To request a customized training session for this course, please contact us to arrange it.
Building Data Pipelines with Apache Kafka
7 Hours
Apache Kafka is a distributed streaming platform. It has become the de facto standard for building data pipelines and addresses a wide range of data processing use cases: it can function as a message queue, distributed log, stream processor, and more.
We will begin with foundational concepts of data pipelines in general, then delve into the core principles of Kafka. Additionally, we will explore key components such as Kafka Streams and Kafka Connect.
Distributed Messaging with Apache Kafka
14 Hours
This course is designed for enterprise architects, developers, system administrators, and other professionals who want to master high-throughput distributed messaging with Apache Kafka. If you have a specific focus, such as system administration only, the curriculum can be customized to your requirements.
Kafka for Administrators
21 Hours
This instructor-led, live training in Uzbekistan (online or onsite) is aimed at system administrators and operations engineers at any level, from beginner to advanced, who wish to deploy, secure, monitor, and troubleshoot Apache Kafka clusters.
By the end of this training, participants will be able to: explain Kafka architecture and KRaft mode, operate and secure Kafka clusters, monitor performance and reliability, and resolve common production issues.
Apache Kafka for Developers
21 Hours
This instructor-led, live training in Uzbekistan (online or onsite) is aimed at intermediate-level developers who wish to build big data applications with Apache Kafka.
Upon completion of this course, participants will have the ability to:
- Create Kafka producers and consumers to send and retrieve data.
- Connect Kafka with external systems via Kafka Connect.
- Build streaming applications utilizing Kafka Streams & ksqlDB.
- Link a Kafka client application to Confluent Cloud for cloud-based Kafka deployments.
- Acquire practical skills through hands-on exercises and real-world use cases.
Apache Kafka for Python Programmers
7 Hours
This instructor-led, live training in Uzbekistan (online or onsite) is designed for data engineers, data scientists, and programmers who wish to utilize Apache Kafka features for data streaming with Python.
By the end of this training, participants will be able to use Apache Kafka to monitor and manage conditions in continuous data streams using Python programming.
Kafka Fundamentals for Java Developers
14 Hours
This instructor-led, live training in Uzbekistan (online or onsite) targets intermediate-level Java developers who wish to integrate Apache Kafka into their applications for reliable, scalable, and high-throughput messaging.
Upon completion of this training, participants will be able to:
- Comprehend Kafka's architecture and its core components.
- Provision and configure a Kafka cluster.
- Produce and consume messages using Java.
- Deploy Kafka Streams for real-time data processing.
- Guarantee fault tolerance and scalability within Kafka applications.
Python and Spark for Big Data for Banking (PySpark)
14 Hours
Renowned for its clear syntax and code readability, Python is a high-level programming language. Spark serves as a powerful engine for processing big data, enabling efficient querying, analysis, and transformation. PySpark bridges the two, allowing users to interface Spark with Python.
Target Audience: This course is designed for intermediate-level banking professionals who are already familiar with Python and Spark and wish to enhance their expertise in big data processing and machine learning.
PySpark and Machine Learning
21 Hours
This training offers a hands-on introduction to developing scalable data processing and Machine Learning workflows with PySpark. Participants will learn how Apache Spark functions within contemporary Big Data ecosystems and how to efficiently manage large datasets using distributed computing principles.
Python and Spark for Big Data (PySpark)
21 Hours
In this instructor-led live training in Uzbekistan, participants will learn how to leverage Python and Spark together to analyze big data while completing hands-on exercises.
By the end of this training, participants will be able to:
- Master the use of Spark with Python to analyze Big Data.
- Complete exercises that simulate real-world scenarios.
- Apply various tools and techniques for big data analysis using PySpark.
Stratio: Rocket and Intelligence Modules with PySpark
14 Hours
Stratio is a data-centric platform that integrates big data, AI, and governance into a single solution. Its Rocket and Intelligence modules enable rapid data exploration, transformation, and advanced analytics in enterprise environments.
This instructor-led, live training (online or onsite) is aimed at intermediate-level data professionals who wish to use the Rocket and Intelligence modules in Stratio effectively with PySpark, focusing on looping structures, user-defined functions, and advanced data logic.
By the end of this training, participants will be able to:
- Navigate and work within the Stratio platform using Rocket and Intelligence modules.
- Apply PySpark in the context of data ingestion, transformation, and analysis.
- Use loops and conditional logic to control data workflows and feature engineering tasks.
- Create and manage user-defined functions (UDFs) for reusable data operations in PySpark.
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request customized training for this course, please contact us to arrange it.