Course Outline
Overview of Big Data:
- Defining Big Data
- Reasons behind the rising popularity of Big Data
- Case studies involving Big Data
- Key characteristics of Big Data
- Solutions for managing Big Data
Hadoop and Its Components:
- Definition of Hadoop and its core components
- Hadoop architecture and its capability to handle specific data types
- A brief history of Hadoop, including the companies that use it and the reasons for adoption
- Detailed explanation of the Hadoop framework and its components
- Explanation of HDFS and the processes for reading from and writing to the Hadoop Distributed File System
- Procedures for setting up a Hadoop cluster in various modes: standalone, pseudo-distributed, and multi-node
(This section covers establishing a Hadoop cluster using VirtualBox, KVM, or VMware, addressing critical network configurations, starting the Hadoop daemons, and performing cluster testing.)
- Explanation of the MapReduce framework and its operational mechanics
- Executing MapReduce jobs on a Hadoop cluster
- Understanding replication, mirroring, and rack awareness within Hadoop clusters
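As a taste of the MapReduce mechanics covered in this module, the map, shuffle, and reduce phases can be sketched in a few lines of plain Python. This is only an illustrative, single-process sketch and not Hadoop's actual API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a real job would run distributed across the cluster.

```python
from collections import defaultdict


def map_phase(line):
    # Map: emit a (word, 1) pair for every word in one input line.
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    # Shuffle: group all emitted values by key, as the framework
    # would do between the map and reduce phases.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    # Reduce: collapse the values for one key into a single count.
    return key, sum(values)


def word_count(lines):
    # Run the three phases in sequence over an in-memory list of lines.
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big clusters"]))
```

On a real cluster, the map and reduce steps would run in parallel on different nodes, with the framework handling the shuffle and any task failures.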
Planning a Hadoop Cluster:
- Strategies for planning your Hadoop cluster
- Evaluating hardware and software requirements for cluster planning
- Analyzing workloads so that the cluster is planned to avoid failures and deliver optimal performance
Introduction to MapR and Its Value:
- An overview of MapR and its architecture
- Understanding and utilizing the MapR Control System, MapR Volumes, snapshots, and mirrors
- Planning clusters specifically for MapR
- Comparing MapR with other distributions and Apache Hadoop
- Process of MapR installation and cluster deployment
Cluster Setup and Administration:
- Managing services, nodes, snapshots, mirror volumes, and remote clusters
- Understanding and managing nodes
- Understanding Hadoop components and installing them alongside MapR services
- Accessing data on the cluster, including via NFS
- Managing data through volumes, and handling users and groups
- Assigning roles to nodes, and commissioning and decommissioning nodes
- Cluster administration and performance monitoring, including configuring and analyzing metrics
- Administering MapR security
- Understanding and working with M7, the native storage for MapR tables
- Configuring and tuning the cluster for optimum performance
Cluster Upgrades and Integration with Other Systems:
- Upgrading the MapR software version, and the types of upgrade available
- Configuring the MapR cluster to access an HDFS cluster
- Setting up a MapR cluster on Amazon Elastic MapReduce
All topics include demonstrations and practice sessions to provide learners with hands-on experience.
Requirements
- Foundational knowledge of the Linux file system
- Basic understanding of Java
- Familiarity with Apache Hadoop (recommended)
28 Hours
Testimonials (1)
practical things of doing, also theory was served good by Ajay