Applied Unsupervised Learning with Python
Course Description Overview
Unsupervised learning is a useful and practical solution in situations where labeled data is not available.
Applied Unsupervised Learning with Python guides you through best practices for using unsupervised learning techniques with Python libraries to extract meaningful information from unstructured data. The course begins by explaining how basic clustering works to find similar data points in a set. Once you are well versed in the k-means algorithm and how it operates, you’ll learn what dimensionality reduction is and where to apply it. As you progress, you’ll learn various neural network techniques and how they can improve your model. While studying the applications of unsupervised learning, you will also learn how to mine topics that are trending on Twitter and Facebook and build a news recommendation engine for users. You will complete the course by challenging yourself with activities such as performing a Market Basket Analysis and identifying relationships between different items of merchandise.
By the end of this course, you will have the skills you need to confidently build your own models using Python.
After completing this course, you will be able to:
- Understand the basics and importance of clustering
- Build k-means, hierarchical, and DBSCAN clustering algorithms both from scratch and with built-in packages
- Explore dimensionality reduction and its applications
- Use scikit-learn (sklearn) to implement and analyse principal component analysis (PCA) on the Iris dataset
- Employ Keras to build autoencoder models for the CIFAR-10 dataset
- Apply the Apriori algorithm with machine learning extensions (Mlxtend) to study transaction data
This course requires the following software:
- OS: Windows 7 SP1 64-bit, Windows 8.1 64-bit, or Windows 10 64-bit; Linux (Ubuntu, Debian, Red Hat, or SUSE); or the latest version of OS X
- Python (3.6.5 or later, preferably 3.7; available through https://www.python.org/downloads/release/python-371/)
- Anaconda (needed for the basemap module of mpl_toolkits; go to https://www.anaconda.com/distribution/, download the 3.7 version, and follow the instructions to install it)
For the optimal student experience, we recommend the following hardware configuration:
- Processor: Intel Core i5 or equivalent
- Memory: 4 GB RAM
- Storage: 5 GB available space
- An internet connection
Lesson 1: Introduction to Clustering
- Introduction
- Unsupervised Learning versus Supervised Learning
- Clustering
- Introduction to k-means Clustering
Lesson 2: Hierarchical Clustering
- Introduction
- Clustering Refresher
- The Organization of Hierarchy
- Introduction to Hierarchical Clustering
- Linkage
- Agglomerative versus Divisive Clustering
- k-means versus Hierarchical Clustering
Lesson 3: Neighborhood Approaches and DBSCAN
- Introduction
- Introduction to DBSCAN
- DBSCAN versus k-means and Hierarchical Clustering
Lesson 4: Dimension Reduction and PCA
- Introduction
- Overview of Dimensionality Reduction Techniques
- PCA
Lesson 5: Autoencoders
- Introduction
- Fundamentals of Artificial Neural Networks
- Autoencoders
Lesson 6: t-Distributed Stochastic Neighbor Embedding (t-SNE)
- Introduction
- Stochastic Neighbor Embedding (SNE)
- Interpreting t-SNE Plots
Lesson 7: Topic Modeling
- Introduction
- Cleaning Text Data
- Latent Dirichlet Allocation
- Non-Negative Matrix Factorization
Lesson 8: Market Basket Analysis
- Introduction
- Market Basket Analysis
- Characteristics of Transaction Data
- Apriori Algorithm
- Association Rules
Lesson 9: Hotspot Analysis
- Introduction
- Kernel Density Estimation
- Hotspot Analysis