Machine Learning Fundamentals
As machine learning algorithms become more widely used to solve problems across industries, new tools for streamlining the programming of such algorithms are being developed as well. This course explains the API of scikit-learn, a package created to facilitate building machine learning applications. By explaining the difference between supervised and unsupervised models and by applying algorithms to real-life datasets, this course helps beginners start programming machine learning algorithms.
Software:
- Sublime Text (latest version), Atom IDE (latest version), or other similar text editor applications.
- Python 3 installed
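- The following Python libraries installed: NumPy, SciPy, scikit-learn, Matplotlib, pandas, Jupyter, and seaborn (pickle, which is also used in this course, is part of the Python standard library and needs no separate installation)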
Installation and Setup
Before you start this course, you'll need to install Python 3.6, pip, scikit-learn, and the other libraries used in this course. You will find the steps to install these here:
Installing Python
- Install Python 3.6 by following the instructions at this link: https://realpython.com/installing-python/.
Installing pip
- To install pip, go to the following link and download the get-pip.py file: https://pip.pypa.io/en/stable/installing/.
- Then, use the following command to install it: python get-pip.py
You might need to use the python3 get-pip.py command instead if the python command on your computer is already used by a previous version of Python.
Installing libraries
Using the pip command, install the following libraries:
- python -m pip install --user numpy scipy matplotlib jupyter pandas seaborn
Installing scikit-learn
- Install scikit-learn using the following command: pip install -U scikit-learn
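Once the commands above have run, a quick way to confirm the setup is to import each library and print its version. The following is a minimal sketch rather than part of the course material: the check_libraries helper and the REQUIRED list are hypothetical names, and it assumes that Jupyter can be detected through its jupyter_core module.

```python
import importlib

# Import names for the libraries installed above. Note that scikit-learn is
# imported as "sklearn". Using "jupyter_core" to detect Jupyter is an assumption.
REQUIRED = ["numpy", "scipy", "sklearn", "matplotlib",
            "pandas", "seaborn", "jupyter_core", "pickle"]


def check_libraries(names):
    """Return a dict mapping each module name to its version, or None if missing."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The library is not installed for this Python interpreter.
            report[name] = None
            continue
        # Some modules do not expose __version__, so fall back to a plain label.
        report[name] = getattr(module, "__version__", "installed")
    return report


if __name__ == "__main__":
    for name, version in check_libraries(REQUIRED).items():
        print(f"{name:14s} {version or 'MISSING'}")
```

Any library reported as MISSING can be installed again with the pip commands above, making sure to use the same interpreter (python or python3) that runs this script.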
Hardware:
- Processor: Intel Core i5 or equivalent
- Memory: 4 GB RAM or higher
Lesson 1: Introduction to scikit-learn
- scikit-learn
- Data Representation
- Data Preprocessing
- scikit-learn API
- Supervised and Unsupervised Learning
Lesson 2: Unsupervised Learning: Real-life Applications
- Clustering
- Exploring a Dataset: Wholesale Customers Dataset
- Data Visualization
- k-means Algorithm
- Mean-Shift Algorithm
- DBSCAN Algorithm
- Evaluating the Performance of Clusters
Lesson 3: Supervised Learning: Key Steps
- Model Validation and Testing
- Evaluation Metrics
- Error Analysis
Lesson 4: Supervised Learning Algorithms: Predict Annual Income
- Exploring the Dataset
- Naïve Bayes Algorithm
- Decision Tree Algorithm
- Support Vector Machine Algorithm
- Error Analysis
Lesson 5: Artificial Neural Networks: Predict Annual Income
- Artificial Neural Networks
- Applying an Artificial Neural Network
- Performance Analysis
Lesson 6: Building Your Own Program
- Program Definition
- Saving and Loading a Trained Model
- Interacting with a Trained Model