Practical Machine Learning with R
Practical Machine Learning with R teaches you how to solve business problems with machine learning: forming a good problem statement, selecting the most appropriate model, and ensuring that you do not over-train (overfit) the model. With huge amounts of data being generated every moment, businesses need applications that apply complex mathematical calculations to data repeatedly and at speed. Machine learning techniques and R let you develop these kinds of applications efficiently.
Practical Machine Learning with R begins by helping you grasp the basics of machine learning methods, while also highlighting how and why they work. The focus is on getting these algorithms to work in practice rather than on mathematical derivations. As you move from one lesson to the next, you will gain hands-on experience of building a machine learning solution in R. Using R packages such as rpart, randomForest, and mice (multiple imputation by chained equations), you will learn to implement algorithms including neural network classifiers, decision trees, and linear and non-linear regression. You will also delve into machine learning techniques for both supervised and unsupervised learning. In addition, you will learn how to partition datasets, evaluate the results from each model, and compare the models against one another.
By the end of this course, you will have gained expertise in solving your business problems, starting by forming a good problem statement, selecting the most appropriate model to solve your problem, and then ensuring that you do not over-train it.
After completing this course, you will be able to:
- Define a problem that can be solved by training a machine learning model
- Obtain, verify and clean data before transforming it into the correct format for use
- Perform exploratory analysis and extract features from data
- Build models for regression, classification and clustering
- Evaluate the performance of a model with the right metrics
- Solve a classification problem using the neuralnet package
- Implement decision tree models using the randomForest package
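As a rough illustration of the partition, fit, and evaluate workflow these objectives describe, here is a minimal sketch in R. It is not taken from the course material: the built-in iris dataset, the 70/30 split, and the choice of rpart for the tree are all assumptions made for the example.

```r
# Minimal sketch: partition data, fit a decision tree, and evaluate it.
# Assumptions (not from the course): iris as the dataset, a 70/30 split,
# and rpart as the modelling package.
library(rpart)

set.seed(42)  # make the random split reproducible

# Hold out 30% of the rows as a test set
n <- nrow(iris)
train_idx <- sample(n, size = round(0.7 * n))
train <- iris[train_idx, ]
test  <- iris[-train_idx, ]

# Fit a classification tree predicting Species from all other columns
fit <- rpart(Species ~ ., data = train, method = "class")

# Predict class labels for the held-out rows
pred <- predict(fit, newdata = test, type = "class")

# Confusion matrix and overall accuracy on the test set
conf_mat <- table(predicted = pred, actual = test$Species)
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)

print(conf_mat)
print(accuracy)
```

Evaluating on rows the model never saw during training is what guards against over-training: a tree that merely memorised the training rows would score well on them but poorly on the held-out set.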
We also recommend that you have the following software installed in advance:
- OS: Windows 7 SP1 64-bit, Windows 8.1 64-bit or Windows 10 64-bit, Ubuntu Linux, or the latest version of OS X
- Browser
- RStudio
- R version 3.6 or later
- R libraries as needed (mice, caret, rpart, groupdata2, cvms, neuralnet, NeuralNetTools, rPref, mlbench, knitr, interplot, doParallel, car, and so on)
For the optimal student experience, we recommend the following hardware configuration:
- Processor: Intel Core i5 or equivalent
- Memory: 4 GB RAM (8 GB preferred)
- Storage: 16 GB available space
Lesson 1: An Introduction to Machine Learning
The Machine Learning Process
Introduction to R
Machine Learning Models
Regression
Lesson 2: Data Cleaning and Pre-processing
Advanced Operations on Data Frames
Identifying the Input and Output Variables
Identifying the Category of Prediction
Handling Missing Values, Duplicates, and Outliers
Handling Outliers
Lesson 3: Feature Engineering
Types of Features
Time Series Features
Handling Categorical Variables
Derived Features or Domain-Specific Features
Adding Features to a Data Frame
Handling Redundant Features
Feature Selection
Lesson 4: Introduction to neuralnet and Evaluation Methods
Classification
Model Selection
Multiclass Classification Overview
Lesson 5: Linear and Logistic Regression Models
Regression
Linear Regression
Logistic Regression
Regression and Classification with Decision Trees
Model Selection by Multiple Disagreeing Metrics
Lesson 6: Unsupervised Learning
Overview of Unsupervised Learning (Clustering)
DIANA
Applications of Clustering
k-means Clustering