Course Outline

Introduction

This section introduces what machine learning means, when to apply it, and what to consider before doing so, along with its advantages and limitations. Topics include data types (structured, unstructured, static, and streamed); data validity and volume; data-driven versus user-driven analytics; statistical models versus machine learning models; the challenges of unsupervised learning; the bias-variance trade-off; iteration and evaluation; cross-validation approaches; and the distinctions among supervised, unsupervised, and reinforcement learning.

MAJOR TOPICS

1. Understanding Naive Bayes

  • Basic concepts of Bayesian methods
  • Probability
  • Joint probability
  • Conditional probability using Bayes' theorem
  • The Naive Bayes algorithm
  • Naive Bayes classification
  • The Laplace estimator
  • Applying numeric features with Naive Bayes
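As a taste of this module, the sketch below shows one way Naive Bayes classification with a Laplace estimator could look for categorical features. It is a minimal illustration in plain Python, not the course's own material; the function names `train_naive_bayes` and `predict_proba` and the toy spam/ham data are assumptions made for this example.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels, laplace=1):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (feature index, class) -> value counts
    feature_values = defaultdict(set)     # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values, laplace


def predict_proba(model, row):
    """Return normalized posterior probabilities for each class."""
    class_counts, value_counts, feature_values, laplace = model
    total = sum(class_counts.values())
    scores = {}
    for label, n_label in class_counts.items():
        score = n_label / total  # prior P(class)
        for i, value in enumerate(row):
            count = value_counts[(i, label)][value]
            n_values = len(feature_values[i])
            # Laplace estimator: add a small count to every value so that an
            # unseen combination does not force the whole product to zero.
            score *= (count + laplace) / (n_label + laplace * n_values)
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}


# Hypothetical toy data: two yes/no features per message.
rows = [("yes", "no"), ("yes", "yes"), ("no", "no"), ("no", "yes")]
labels = ["spam", "spam", "ham", "ham"]
model = train_naive_bayes(rows, labels, laplace=1)
probabilities = predict_proba(model, ("yes", "no"))
```

With `laplace=0` the "ham" class in this toy data receives probability zero, because the first feature never takes the value "yes" for that class; the Laplace estimator avoids that collapse.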

2. Understanding Decision Trees

  • Divide and conquer
  • The C5.0 decision tree algorithm
  • Selecting the best split
  • Pruning the decision tree

3. Understanding Neural Networks

  • From biological to artificial neurons
  • Activation functions
  • Network topology
    • The number of layers
    • The direction of information flow
    • The number of nodes in each layer
  • Training neural networks using backpropagation
  • Deep Learning

4. Understanding Support Vector Machines

  • Classification with hyperplanes
  • Finding the maximum margin
  • Linearly separable data
  • Non-linearly separable data
  • Using kernels for non-linear spaces

5. Understanding Clustering

  • Clustering as a machine learning task
  • The k-means algorithm for clustering
  • Using distance to assign and update clusters
  • Selecting the appropriate number of clusters
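To illustrate how distance is used to assign points and update clusters, here is a minimal k-means sketch in plain Python. The function name `k_means`, the fixed starting centers, and the sample points are assumptions for this example; in practice the starting centers are usually chosen at random.

```python
import math


def k_means(points, centers, iterations=10):
    """Alternate between assigning points and updating centers."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its members.
        new_centers = []
        for center, members in zip(centers, clusters):
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centers.append(mean)
            else:
                new_centers.append(center)  # keep a center that has no members
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return centers, clusters


# Hypothetical data: two well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
final_centers, final_clusters = k_means(points, centers=[(0, 0), (10, 10)])
```

The sketch requires Python 3.8 or later for `math.dist`.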

6. Measuring Performance for Classification

  • Working with classification prediction data
  • A closer look at confusion matrices
  • Using confusion matrices to evaluate performance
  • Beyond accuracy – other performance metrics
  • The kappa statistic
  • Sensitivity and specificity
  • Precision and recall
  • The F-measure
  • Visualizing performance trade-offs
  • ROC curves
  • Estimating future performance
  • The holdout method
  • Cross-validation
  • Bootstrap sampling
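The metrics listed in this module can all be derived from the four cells of a binary confusion matrix. The sketch below is one possible illustration in plain Python; the function name `classification_metrics` and the example counts are made up for this purpose.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common metrics from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # also called recall or true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # positive predictive value
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    # Kappa adjusts accuracy for the agreement expected by chance alone.
    p_yes = ((tp + fp) / total) * ((tp + fn) / total)
    p_no = ((fn + tn) / total) * ((fp + tn) / total)
    expected = p_yes + p_no
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
        "kappa": kappa,
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
```

The sketch assumes every denominator is non-zero, which holds whenever each actual and each predicted class appears at least once.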

7. Tuning Stock Models for Better Performance

  • Using caret for automated parameter tuning
  • Creating a simple tuned model
  • Customizing the tuning process
  • Improving model performance through meta-learning
  • Understanding ensembles
  • Bagging
  • Boosting
  • Random forests
  • Training random forests
  • Evaluating random forest performance

MINOR TOPICS

8. Understanding Classification Using Nearest Neighbors

  • The kNN algorithm
  • Calculating distance
  • Selecting an appropriate k
  • Preparing data for use with kNN
  • Why the kNN algorithm is considered lazy

9. Understanding Classification Rules

  • Separate and conquer
  • The One Rule algorithm
  • The RIPPER algorithm
  • Rules derived from decision trees

10. Understanding Regression

  • Simple linear regression
  • Ordinary least squares estimation
  • Correlations
  • Multiple linear regression

11. Understanding Regression Trees and Model Trees

  • Integrating regression into tree models

12. Understanding Association Rules

  • The Apriori algorithm for association rule learning
  • Measuring rule interest – support and confidence
  • Building a set of rules using the Apriori principle

Extras

  • Spark, PySpark, MLlib, and multi-armed bandits

Requirements

Knowledge of Python

Duration

21 hours
