Applied Supervised Learning with R
Course Description Overview
R provides excellent visualization features that are essential for exploring data before it is used in any automated learning.
Applied Supervised Learning with R covers the complete process of using R to develop applications with supervised machine learning algorithms that cater to your business needs. You start by developing the analytical thinking needed to turn business inputs or domain research into a problem statement. You will learn a range of evaluation metrics for comparing algorithms, and you can then use these metrics to select the best algorithm for your problem. After finalizing the algorithm you want to use, you will study hyperparameter optimization techniques to fine-tune its parameters. You will also be shown how to add regularization terms to avoid overfitting your model.
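The overview above mentions evaluation metrics, hyperparameter tuning, and regularization. The following is a minimal sketch of how the three can fit together, using only base R and the built-in mtcars dataset. The closed-form ridge solver, the choice of predictors, the penalty grid, and all function names are illustrative assumptions, not material taken from the course itself.

```r
# Sketch: pick a ridge penalty (lambda) by k-fold cross-validation, base R only.
set.seed(42)

X <- as.matrix(mtcars[, c("wt", "hp", "disp")])  # predictors (assumed choice)
y <- mtcars$mpg                                  # response

# Closed-form ridge fit: solve (X'X + lambda * I) beta = X'y
ridge_fit <- function(X, y, lambda) {
  solve(crossprod(X) + lambda * diag(ncol(X)), crossprod(X, y))
}

# Assign every row to one of k folds once, so all penalties see the same splits.
k <- 5
folds <- sample(rep(seq_len(k), length.out = nrow(X)))

# Mean RMSE across the k held-out folds for a given penalty.
cv_rmse <- function(lambda) {
  errs <- sapply(seq_len(k), function(i) {
    test <- folds == i
    # Standardize with training-fold statistics only, to avoid leakage.
    Xtr <- scale(X[!test, , drop = FALSE])
    Xte <- scale(X[test, , drop = FALSE],
                 center = attr(Xtr, "scaled:center"),
                 scale  = attr(Xtr, "scaled:scale"))
    y_mean <- mean(y[!test])                 # intercept handled by centering y
    beta <- ridge_fit(Xtr, y[!test] - y_mean, lambda)
    pred <- y_mean + Xte %*% beta
    sqrt(mean((y[test] - pred)^2))           # RMSE on the held-out fold
  })
  mean(errs)
}

# Grid search: evaluate each candidate penalty and keep the best one.
grid <- c(0, 0.01, 0.1, 1, 10, 100)
scores <- sapply(grid, cv_rmse)
best_lambda <- grid[which.min(scores)]

print(data.frame(lambda = grid, rmse = round(scores, 3)))
print(best_lambda)
```

In practice a dedicated package would usually replace the hand-written solver; the sketch only shows the shape of the workflow: one metric, one validation scheme, one search over a regularization parameter.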
By the end of the course, you will be proficient at building supervised machine learning models that address your business needs. After completing this course, you will be able to:
- Develop analytical thinking to precisely identify a business problem
- Wrangle data with dplyr, tidyr, and reshape2
- Visualize data with ggplot2
- Validate your supervised machine learning model using k-fold cross-validation
- Optimize hyperparameters with grid search, random search, and Bayesian optimization
- Deploy your model on AWS Lambda with Plumber
- Improve a model's performance with feature selection and dimensionality reduction
You'll need the following software installed in advance:
- Windows 7, 8.1, or 10, Ubuntu 14.04 or later, or macOS Sierra or later
- Browser: Google Chrome or Mozilla Firefox
- RStudio
- RStudio Cloud
For the optimal student experience, we recommend the following hardware configuration:
- Processor: Intel or AMD 4-core or better
- Memory: 8 GB RAM
- Hard disk: 20 GB available space
Lesson 1: R for Advanced Analytics
- Working with Real-World Datasets
- Reading Data from Various Data Formats
- Data Structures in R
- Data Processing and Transformation
- The Apply Family of Functions
- Data Visualization
Lesson 2: Exploratory Analysis of Data
- Univariate Analysis
- Bivariate Analysis
- Multivariate Analysis
- Categorical Dependent and Numeric/Continuous Independent Variables
- Categorical Dependent and Categorical Independent Variables
Lesson 3: Introduction to Supervised Learning
- Regression and Classification Problems
- Machine Learning Workflow
- Regression
- Classification
- Evaluation Metrics
Lesson 4: Regression
- Linear Regression
- Model Diagnostics
- Quantile Regression
- Polynomial Regression
- Ridge Regression
- Lasso Regression
- Elastic Net Regression
- Poisson Regression
- Cox Proportional-Hazards Regression Model
Lesson 5: Classification
- Classification
- Techniques for Supervised Learning
- Logistic Regression
- Evaluating Classification Models
- Evaluating Logistic Regression
- Decision Trees
- XGBoost
- Deep Neural Networks
Lesson 6: Feature Selection and Dimensionality Reduction
- Feature Engineering
- One-Hot Encoding
- Feature Selection
- Feature Reduction
- Variable Clustering
- Linear Discriminant Analysis for Feature Reduction
Lesson 7: Model Improvements
- Bias-Variance Trade-off
- Underfitting and Overfitting
- Cross-Validation
- K-Fold Cross-Validation
- Hold-One-Out Validation
- Hyperparameter Optimization
- Grid Search Optimization
- Random Search Optimization
- Bayesian Optimization
Lesson 8: Model Deployment
- Introduction to plumber
- Docker
- Amazon Web Services
- Introducing AWS SageMaker
- What is Amazon Lambda?
- What is Amazon API Gateway?
- Building Serverless ML Applications
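As a rough illustration of the plumber topic in Lesson 8, the sketch below shows how a fitted model might be exposed as an HTTP endpoint. The file name, the toy lm model, the function name predict_mpg, and the /predict route are all hypothetical; the Docker and AWS steps listed above are not shown. The `#*` lines are plumber annotations, which plain R treats as comments, so the function can also be called directly without a server.

```r
# plumber.R (hypothetical file name)

# Toy model standing in for a trained supervised model (assumption).
model <- lm(mpg ~ wt, data = mtcars)

#* Predict miles per gallon from car weight (in 1000 lbs)
#* @param wt Car weight; query parameters arrive as strings
#* @get /predict
predict_mpg <- function(wt) {
  wt <- as.numeric(wt)                      # convert the query string to a number
  if (is.na(wt)) stop("wt must be numeric")
  list(mpg = unname(predict(model, newdata = data.frame(wt = wt))))
}

# Plain function call, no server needed:
print(predict_mpg("3.0"))

# To serve the file over HTTP (requires the plumber package, version 1.0 or later):
# plumber::pr_run(plumber::pr("plumber.R"), port = 8000)
```

Naming the function is a convenience for calling it directly; plumber examples more commonly place an anonymous function under the annotations.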
Lesson 9: Capstone Project - Based on Research Papers
- The mlr Package
- Implementing Multilabel Classifier using the mlr and OpenML Packages
- Constructing a Learner
- Predictions