Applied Supervised Learning with R

Course Description Overview

Course Number:
035463
Course Length:
3 days
Course Description Overview:
Applied Supervised Learning with R will make you a pro at identifying your business problem, selecting the best supervised machine learning algorithm to solve it, and fine-tuning your model to meet your needs without overfitting.

R provides excellent visualization features that are essential for exploring data before using it in any automated learning.


Applied Supervised Learning with R covers the complete process of using R to develop applications with supervised machine learning algorithms that address your business needs. You start by developing the analytical thinking needed to create a problem statement from business inputs or domain research. You will learn a range of evaluation metrics for comparing algorithms, and then use these metrics to select the best algorithm for your problem. After settling on the algorithm you want to use, you will study hyperparameter optimization techniques to fine-tune its parameters. You will also be shown how to add regularization terms to avoid overfitting your model.
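
To give a flavor of the model-selection step described above, here is a minimal sketch in base R. It is an illustration only, not course material: the built-in mtcars dataset, the two candidate formulas, the choice of 5 folds, RMSE as the metric, and the helper name cv_rmse are all assumptions made for this example.

```r
# Minimal k-fold cross-validation sketch in base R (no extra packages).
# Assumptions: built-in mtcars data, k = 5, RMSE as the evaluation metric.
set.seed(42)                                       # reproducible fold assignment
k <- 5
n <- nrow(mtcars)
folds <- sample(rep(seq_len(k), length.out = n))   # random fold label per row

# Hypothetical helper: cross-validated RMSE for one model formula
cv_rmse <- function(model_formula, data, folds) {
  response <- all.vars(model_formula)[1]           # name of the target column
  fold_errors <- sapply(sort(unique(folds)), function(i) {
    train <- data[folds != i, ]                    # fit on all other folds
    test  <- data[folds == i, ]                    # evaluate on the held-out fold
    fit   <- lm(model_formula, data = train)
    pred  <- predict(fit, newdata = test)
    sqrt(mean((test[[response]] - pred)^2))        # RMSE on the held-out fold
  })
  mean(fold_errors)                                # average over the k folds
}

# Two candidate models to compare (illustrative choices)
candidates <- list(
  simple = mpg ~ wt,
  larger = mpg ~ wt + hp + qsec
)

scores <- sapply(candidates, cv_rmse, data = mtcars, folds = folds)
print(scores)                                      # lower RMSE is better
best <- names(which.min(scores))
```

The candidate with the lowest cross-validated error would be the one carried forward to hyperparameter tuning; packages such as caret or mlr wrap the same idea with more tooling.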

Course Objectives:

By the end of the course, you will be confident building supervised machine learning models that meet your business need. You will be able to:

  • Develop analytical thinking to precisely identify a business problem
  • Wrangle data with dplyr, tidyr, and reshape2
  • Visualize data with ggplot2
  • Validate your supervised machine learning model using k-fold cross-validation
  • Optimize hyperparameters with grid search, random search, and Bayesian optimization
  • Deploy your model on AWS Lambda with plumber
  • Improve a model's performance with feature selection and dimensionality reduction
Target Student:
Applied Supervised Learning with R perfectly balances theory and exercises. Each module is designed to build on the learnings of the previous module. The course contains multiple activities that use real-life business scenarios for you to practice and apply your new skills in a highly relevant context.
This course is designed for novice and intermediate data analysts, data scientists, and data engineers who want to explore the methods of supervised machine learning and their use cases. Some background in statistics, probability, calculus, linear algebra, and programming will help you follow the content of this course thoroughly.
Prerequisites:
-
Course-specific Technical Requirements Software:

You'll need the following software installed in advance:

  • Windows 7, 8.1, or 10; Ubuntu 14.04 or later; or macOS Sierra or later
  • Browser: Google Chrome or Mozilla Firefox
  • RStudio
  • RStudio Cloud
Course-specific Technical Requirements Hardware:

For the optimal student experience, we recommend the following hardware configuration:

  • Processor: Intel or AMD 4-core or better
  • Memory: 8 GB RAM
  • Hard disk: 20 GB available space
Certification reference (where applicable)
-
Course Content:

Lesson 1: R for Advanced Analytics

  • Working with Real-World Datasets
  • Reading Data from Various Data Formats
  • Data Structures in R
  • Data Processing and Transformation
  • The Apply Family of Functions
  • Data Visualization

Lesson 2: Exploratory Analysis of Data

  • Univariate Analysis
  • Bivariate Analysis
  • Multivariate Analysis
  • Categorical Dependent and Numeric/Continuous Independent Variables
  • Categorical Dependent and Categorical Independent Variables

Lesson 3: Introduction to Supervised Learning

  • Regression and Classification Problems
  • Machine Learning Workflow
  • Regression
  • Classification
  • Evaluation Metrics

Lesson 4: Regression

  • Linear Regression
  • Model Diagnostics
  • Quantile Regression
  • Polynomial Regression
  • Ridge Regression
  • Lasso Regression
  • Elastic Net Regression
  • Poisson Regression
  • Cox Proportional-Hazards Regression Model

Lesson 5: Classification

  • Classification
  • Techniques for Supervised Learning
  • Logistic Regression
  • Evaluating Classification Models
  • Evaluating Logistic Regression
  • Decision Trees
  • XGBoost
  • Deep Neural Networks

Lesson 6: Feature Selection and Dimensionality Reduction

  • Feature Engineering
  • One-Hot Encoding
  • Feature Selection
  • Feature Reduction
  • Variable Clustering
  • Linear Discriminant Analysis for Feature Reduction

Lesson 7: Model Improvements

  • Bias-Variance Trade-off
  • Underfitting and Overfitting
  • Cross-Validation
  • K-Fold Cross-Validation
  • Hold-One-Out Validation
  • Hyperparameter Optimization
  • Grid Search Optimization
  • Random Search Optimization
  • Bayesian Optimization

Lesson 8: Model Deployment

  • Introduction to plumber
  • Docker
  • Amazon Web Services
  • Introducing AWS SageMaker
  • What is Amazon Lambda?
  • What is Amazon API Gateway?
  • Building Serverless ML Applications


Lesson 9: Capstone Project - Based on Research Papers

  • The mlr Package
  • Implementing Multilabel Classifier using the mlr and OpenML Packages
  • Constructing a Learner
  • Predictions