Applied Supervised Learning with R
R provides excellent visualization features, which are essential for exploring data before using it in any automated learning.
Applied Supervised Learning with R covers the complete process of using R to develop applications with supervised machine learning algorithms that cater to your business needs. You will start by developing the analytical thinking needed to turn business inputs or domain research into a problem statement. You will learn many evaluation metrics for comparing algorithms, and you can then use these metrics to select the best algorithm for your problem. After finalizing the algorithm you want to use, you will study hyperparameter optimization techniques to fine-tune its parameters. To avoid overfitting your model, you will also be shown how to add various regularization terms.
When you complete the course, you will be adept at building supervised machine learning models that meet your business needs. After completing this course, you will be able to:
- Develop analytical thinking to precisely identify a business problem
- Wrangle data with dplyr, tidyr, and reshape2
- Visualize data with ggplot2
- Validate your supervised machine learning model using k-fold cross-validation
- Optimize hyperparameters with grid search, random search, and Bayesian optimization
- Deploy your model on AWS Lambda with Plumber
- Improve a model's performance with feature selection and dimensionality reduction
You'll need the following software installed in advance:
- Windows 7, 8.1, or 10, Ubuntu 14.04 or later, or macOS Sierra or later
- Browser: Google Chrome or Mozilla Firefox
- RStudio
- RStudio Cloud
For the optimal student experience, we recommend the following hardware configuration:
- Processor: Intel or AMD 4-core or better
- Memory: 8 GB RAM
- Hard disk: 20 GB available space
Lesson 1: R for Advanced Analytics
Working with Real-World Datasets
Reading Data from Various Data Formats
Data Structures in R
Data Processing and Transformation
The Apply Family of Functions
Data Visualization
Lesson 2: Exploratory Analysis of Data
Univariate Analysis
Bivariate Analysis
Multivariate Analysis
Categorical Dependent and Numeric/Continuous Independent Variables
Categorical Dependent and Categorical Independent Variable
Lesson 3: Introduction to Supervised Learning
Regression and Classification Problems
Machine Learning Workflow
Regression
Classification
Evaluation Metrics
Lesson 4: Regression
Linear Regression
Model Diagnostics
Quantile Regression
Polynomial Regression
Ridge Regression
Lasso Regression
Elastic Net Regression
Poisson Regression
Cox Proportional-Hazards Regression Model
Lesson 5: Classification
Classification
Techniques for Supervised Learning
Logistic Regression
Evaluating Classification Models
Evaluating Logistic Regression
Decision Trees
XGBoost
Deep Neural Networks
Lesson 6: Feature Selection and Dimensionality Reduction
Feature Engineering
One-Hot Encoding
Feature Selection
Feature Reduction
Variable Clustering
Linear Discriminant Analysis for Feature Reduction
Lesson 7: Model Improvements
Bias-Variance Trade-off
Underfitting and Overfitting
Cross-Validation
K-Fold Cross-Validation
Hold-One-Out Validation
Hyperparameter Optimization
Grid Search Optimization
Random Search Optimization
Bayesian Optimization
Lesson 8: Model Deployment
Introduction to plumber
Docker
Amazon Web Services
Introducing AWS SageMaker
What is Amazon Lambda?
What is Amazon API Gateway?
Building Serverless ML Applications
Lesson 9: Capstone Project - Based on Research Papers
The mlr Package
Implementing Multilabel Classifier using the mlr and OpenML Packages
Constructing a Learner
Predictions