Need Help Getting Started with Applied Machine Learning?
These are the Step-by-Step Guides that You’ve Been Looking For!
What do you want help with?
Foundations
Beginner
Intermediate
Advanced
How Do I Get Started?
The most common question I’m asked is: “how do I get started?”
My best advice for getting started in machine learning is broken down into a 5-step process:
- Step 1: Adjust Mindset. Believe you can practice and apply machine learning.
- Step 2: Pick a Process. Use a systematic process to work through problems.
- Step 3: Pick a Tool. Select a tool for your level and map it onto your process.
  - Beginners: Weka Workbench.
  - Intermediate: Python Ecosystem.
  - Advanced: R Platform.
  - Best Programming Language for Machine Learning
- Step 4: Practice on Datasets. Select datasets to work on and practice the process.
- Step 5: Build a Portfolio. Gather results and demonstrate your skills.
For more on this top-down approach, see:
Many of my students have used this approach to go on and do well in Kaggle competitions and get jobs as Machine Learning Engineers and Data Scientists.
Applied Machine Learning Process
The benefit of machine learning is the predictions and the models that make them.
To have skill at applied machine learning means knowing how to consistently and reliably deliver high-quality predictions on problem after problem. You need to follow a systematic process.
Below is a 5-step process that you can follow to consistently achieve above average results on predictive modeling problems:
- Step 1: Define your problem.
- Step 2: Prepare your data.
- Step 3: Spot-check algorithms.
- Step 4: Improve results.
- Step 5: Present results.
For a good summary of this process, see the posts:
- Applied Machine Learning Process
- How to Use a Machine Learning Checklist to Get Accurate Predictions
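As a rough illustration of Step 3 (spot-check algorithms), here is a minimal sketch that runs two candidate models through the same k-fold test harness. The dataset is made up, and the two "algorithms" (a majority-class baseline and a 1-nearest-neighbor rule) and all function names are assumptions chosen to keep the example small. They are not the methods used in the posts above.

```python
import random
from collections import Counter

def majority_class(train, test_row):
    # Baseline: always predict the most common label in the training data.
    return Counter(label for _, label in train).most_common(1)[0][0]

def nearest_neighbor(train, test_row):
    # 1-NN: predict the label of the closest training point
    # (squared Euclidean distance).
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(train, key=lambda item: dist(item[0], test_row))[1]

def cross_val_accuracy(data, predict, k=5):
    # Split the data into k folds; each fold is held out once for testing.
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i, test in enumerate(folds):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        correct = sum(predict(train, x) == y for x, y in test)
        scores.append(correct / len(test))
    return sum(scores) / k

# Steps 1-2: a hypothetical, well-separated two-class dataset.
rng = random.Random(1)
data = [((rng.gauss(0, 1), rng.gauss(0, 1)), 0) for _ in range(50)]
data += [((rng.gauss(6, 1), rng.gauss(6, 1)), 1) for _ in range(50)]
rng.shuffle(data)

# Step 3: evaluate every candidate with the same harness and compare.
candidates = [("majority", majority_class), ("1-NN", nearest_neighbor)]
results = {name: cross_val_accuracy(data, fn) for name, fn in candidates}
for name, acc in results.items():
    print(f"{name}: {acc:.2f}")
```

Using one harness for every candidate keeps the comparison fair, and the baseline gives a floor that any real model should beat.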
Probability for Machine Learning
Probability is the mathematics of quantifying and harnessing uncertainty. It is the bedrock of many fields of mathematics (like statistics) and is critical for applied machine learning.
Below is a 3-step process that you can use to get up to speed with probability for machine learning, fast.
- Step 1: Discover what Probability is.
- Step 2: Discover why Probability is so important for machine learning.
- Step 3: Dive into Probability topics.
You can see all of the tutorials on probability here. Below is a selection of some of the most popular tutorials.
Probability Foundations
- Introduction to Joint, Marginal, and Conditional Probability
- Intuition for Joint, Marginal, and Conditional Probability
- Worked Examples of Different Types of Probability
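To make the three terms concrete, here is a small sketch that derives a marginal and a conditional probability from a joint distribution. The weather and traffic table is invented for illustration, and the function names are assumptions.

```python
from fractions import Fraction

# Hypothetical joint distribution P(weather, traffic), as exact fractions.
joint = {
    ("rain", "heavy"): Fraction(3, 10),
    ("rain", "light"): Fraction(1, 10),
    ("dry", "heavy"): Fraction(2, 10),
    ("dry", "light"): Fraction(4, 10),
}

def marginal_weather(weather):
    # Marginal: sum the joint probability over the other variable.
    return sum(p for (w, _), p in joint.items() if w == weather)

def traffic_given_weather(traffic, weather):
    # Conditional: P(traffic | weather) = P(weather, traffic) / P(weather).
    return joint[(weather, traffic)] / marginal_weather(weather)

p_rain = marginal_weather("rain")
p_heavy_given_rain = traffic_given_weather("heavy", "rain")
print(p_rain, p_heavy_given_rain)  # prints: 2/5 3/4
```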
Bayes Theorem
Probability Distributions
- A Gentle Introduction to Probability Distributions
- Discrete Probability Distributions for Machine Learning
- Continuous Probability Distributions for Machine Learning
Information Theory
Statistics for Machine Learning
Statistical methods are an important foundation area of mathematics required for achieving a deeper understanding of the behavior of machine learning algorithms.
Below is a 3-step process that you can use to get up to speed with statistical methods for machine learning, fast.
- Step 1: Discover what Statistical Methods are.
- Step 2: Discover why Statistical Methods are important for machine learning.
- Step 3: Dive into the topics of Statistical Methods.
You can see all of the statistical methods posts here. Below is a selection of some of the most popular tutorials.
Summary Statistics
- Introduction to the 5 Number Summary
- Introduction to Data Visualization
- Correlation to Understand the Relationship Between Variables
- Introduction to Calculating Normal Summary Statistics
Statistical Hypothesis Tests
Resampling Methods
- Introduction to Statistical Sampling and Resampling
- Introduction to the Bootstrap
- Introduction to Cross-Validation
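As one example of a resampling method, the sketch below uses the bootstrap to estimate a percentile interval for a sample mean. The data values, the number of resamples, and the function name are assumptions for illustration only.

```python
import random
import statistics

def bootstrap_means(sample, n_resamples=1000, seed=0):
    # Draw resamples of the same size, with replacement,
    # and record the mean of each one.
    rng = random.Random(seed)
    n = len(sample)
    return [statistics.mean(rng.choices(sample, k=n)) for _ in range(n_resamples)]

# Hypothetical measurements.
sample = [2.1, 2.5, 1.9, 3.2, 2.8, 2.4, 3.0, 2.2, 2.7, 2.6]

means = sorted(bootstrap_means(sample))
# Approximate 95% percentile interval: cut 2.5% from each tail.
lower, upper = means[25], means[974]
print(f"mean={statistics.mean(sample):.2f}, interval=({lower:.2f}, {upper:.2f})")
```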
Estimation Statistics
Linear Algebra for Machine Learning
Linear algebra is an important foundation area of mathematics required for achieving a deeper understanding of machine learning algorithms.
Below is a 3-step process that you can use to get up to speed with linear algebra for machine learning, fast.
- Step 1: Discover what Linear Algebra is.
- Step 2: Discover why Linear Algebra is important for machine learning.
- Step 3: Dive into Linear Algebra topics.
You can see all linear algebra posts here. Below is a selection of some of the most popular tutorials.
Linear Algebra in Python
Matrices
Vectors
Matrix Factorization
Optimization for Machine Learning
Optimization is at the core of all machine learning algorithms. When we train a machine learning model, we are solving an optimization problem on the given dataset.
You can get familiar with optimization for machine learning in 3 steps, fast.
- Step 1: Discover what Optimization is.
- Step 2: Discover the Optimization Algorithms.
- Step 3: Dive into Optimization Topics.
You can see all optimization posts here. Below is a selection of some of the most popular tutorials.
Local Optimization
- Function Optimization With SciPy
- Basin Hopping Optimization in Python
- Stochastic Hill Climbing in Python from Scratch
Global Optimization
Gradient Descent
- How to Implement Gradient Descent Optimization from Scratch
- Code Adam Optimization Algorithm From Scratch
- Gradient Descent Optimization With Nadam From Scratch
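For orientation, here is a minimal sketch of plain gradient descent on a one-dimensional function. The objective f(x) = (x - 3)^2, the learning rate, and the step count are assumptions picked so the result is easy to check. Adam and Nadam add per-parameter step-size adaptation on top of this basic loop.

```python
def gradient_descent(grad, x0, learning_rate=0.1, n_steps=100):
    # Repeatedly step against the gradient to move downhill.
    x = x0
    for _ in range(n_steps):
        x = x - learning_rate * grad(x)
    return x

# Objective f(x) = (x - 3)^2 has gradient f'(x) = 2 * (x - 3)
# and its minimum at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(round(x_min, 6))
```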
Applications of Optimization
Calculus for Machine Learning
Calculus is the hidden driver for the success of many machine learning algorithms. When we talk about the gradient descent optimization part of a machine learning algorithm, the gradient is found using calculus.
You can get familiar with calculus for machine learning in 3 steps.
- Step 1: Discover what Calculus is about.
- Step 2: Discover the rules of differentiation.
- Step 3: Dive into Calculus Topics.
You can see all calculus posts here. Below is a selection of some of the most popular tutorials.
Basic Calculus
- Calculus in Machine Learning: Why it Works
- Key Concepts in Calculus: Rate of Change
- A Gentle Introduction to Slopes and Tangents
- The Power, Product, and Quotient Rules
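A quick way to build trust in a differentiation rule is to compare it against a numerical estimate. The sketch below checks the power rule for f(x) = x^3 at x = 2 using a central finite difference. The function name and step size `h` are assumptions.

```python
def numerical_derivative(f, x, h=1e-5):
    # Central difference: slope of the secant through x - h and x + h.
    return (f(x + h) - f(x - h)) / (2 * h)

def cube(x):
    return x ** 3

x = 2.0
analytic = 3 * x ** 2          # power rule: d/dx x^3 = 3x^2
numeric = numerical_derivative(cube, x)
print(analytic, round(numeric, 6))
```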
Multivariate Calculus
Calculus for Optimization
- A Gentle Introduction to Optimization / Mathematical Programming
- A Gentle Introduction to Method of Lagrange Multipliers
Applications of Calculus
Python for Machine Learning
Python is the lingua franca of machine learning projects. Not only are many machine learning libraries written for Python, but the language also helps us finish machine learning projects quickly and neatly. Good Python programming skills let you get more done in less time!
You can get familiar with Python for machine learning in 3 steps.
- Step 1: Learn the language.
- Step 2: Learn how to work with the language.
- Step 3: Learn what you can do in the Python ecosystem.
You can see all Python posts here. But don’t miss Python for Machine Learning (my book). Below is a selection of some of the most popular tutorials.
Basic Language
- Some Language Features in Python
- More Special Features in Python
- Python Classes and Their Use in Keras
Troubleshooting
Language Techniques
- Command Line Arguments for Your Python Script
- A Gentle Introduction to Decorators in Python
- Techniques to Write Better Python Code
Libraries
Understand Machine Learning Algorithms
Machine learning is about machine learning algorithms.
You need to know what algorithms are available for a given problem, how they work, and how to get the most out of them.
Here’s how to get started with machine learning algorithms:
- Step 1: Discover the different types of machine learning algorithms.
- Step 2: Discover the foundations of machine learning algorithms.
- Step 3: Discover how top machine learning algorithms work.
You can see all machine learning algorithm posts here. Below is a selection of some of the most popular tutorials.
Linear Algorithms
Nonlinear Algorithms
Ensemble Algorithms
How to Study/Learn ML Algorithms
- 5 Ways To Understand Machine Learning Algorithms
- How to Learn a Machine Learning Algorithm
- How to Study Machine Learning Algorithms
- How to Research a Machine Learning Algorithm
- How To Investigate Machine Learning Algorithm Behavior
- Take Control By Creating Lists of Machine Learning Algorithms
- 6 Questions To Understand Any Machine Learning Algorithm
Weka Machine Learning (no code)
Weka is a platform that you can use to get started in applied machine learning.
It has a graphical user interface, meaning that no programming is required, and it offers a suite of state-of-the-art algorithms.
Here’s how you can get started with Weka:
- Step 1: Discover the features of the Weka platform.
- Step 2: Discover how to get around the Weka platform.
- Step 3: Discover how to deliver results with Weka.
You can see all Weka machine learning posts here. Below is a selection of some of the most popular tutorials.
Prepare Data in Weka
- How To Load CSV Machine Learning Data in Weka
- How to Better Understand Your Machine Learning Data in Weka
- How to Normalize and Standardize Your Machine Learning Data in Weka
- How To Handle Missing Values In Machine Learning Data With Weka
- How to Perform Feature Selection With Machine Learning Data in Weka
Weka Algorithm Tutorials
Python Machine Learning (scikit-learn)
Python is one of the fastest growing platforms for applied machine learning.
You can use the same tools, such as pandas and scikit-learn, in both the development and the operational deployment of your model.
Below are the steps that you can use to get started with Python machine learning:
- Step 1: Discover Python for machine learning.
- Step 2: Discover the ecosystem for Python machine learning.
- Step 3: Discover how to work through problems using machine learning in Python.
You can see all Python machine learning posts here. Below is a selection of some of the most popular tutorials.
Prepare Data in Python
Machine Learning in Python
- Evaluate the Performance of Machine Learning Algorithms
- Metrics To Evaluate Machine Learning Algorithms in Python
- Spot-Check Classification Machine Learning Algorithms in Python with scikit-learn
- Spot-Check Regression Machine Learning Algorithms in Python with scikit-learn
- How To Compare Machine Learning Algorithms in Python with scikit-learn
R Machine Learning (caret)
R is a platform for statistical computing and is the most popular platform among professional data scientists.
It’s popular because of the large number of techniques available, and because of excellent interfaces to these methods such as the powerful caret package.
Here’s how to get started with R machine learning:
- Step 1: Discover the R platform and why it is so popular.
- Step 2: Discover machine learning algorithms in R.
- Step 3: Discover how to work through problems using machine learning in R.
You can see all R machine learning posts here. Below is a selection of some of the most popular tutorials.
Data Preparation in R
Applied Machine Learning in R
Code Algorithm from Scratch (Python)
You can learn a lot about machine learning algorithms by coding them from scratch.
Learning via coding is the preferred learning style for many developers and engineers.
Here’s how to get started with machine learning by coding everything from scratch.
- Step 1: Discover the benefits of coding algorithms from scratch.
- Step 2: Discover that coding algorithms from scratch is a learning tool only.
- Step 3: Discover how to code machine learning algorithms from scratch in Python.
- Machine Learning Algorithms From Scratch (my book)
You can see all of the Code Algorithms from Scratch posts here. Below is a selection of some of the most popular tutorials.
Prepare Data
Linear Algorithms
Algorithm Evaluation
Nonlinear Algorithms
Introduction to Time Series Forecasting (Python)
Time series forecasting is an important topic in business applications.
Many datasets contain a time component, but the topic of time series is rarely covered in much depth from a machine learning perspective.
Here’s how to get started with Time Series Forecasting:
- Step 1: Discover Time Series Forecasting.
- Step 2: Discover Time Series as Supervised Learning.
- Step 3: Discover how to get good at delivering results with Time Series Forecasting.
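To illustrate Step 2, a time series can be reframed as a supervised learning problem by using a sliding window of past values as inputs and the next value as the output. The sketch below is one simple way to do that. The function name, the lag count, and the toy series are assumptions.

```python
def series_to_supervised(series, n_lags=3):
    # Each row pairs the previous n_lags observations (inputs)
    # with the observation that follows them (output).
    rows = []
    for i in range(n_lags, len(series)):
        rows.append((series[i - n_lags:i], series[i]))
    return rows

series = [10, 20, 30, 40, 50, 60]
rows = series_to_supervised(series)
for inputs, target in rows:
    print(inputs, "->", target)
```

Once the data is in this form, any standard regression algorithm can be fit to it, as long as the evaluation respects the time ordering of the rows.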
You can see all Time Series Forecasting posts here. Below is a selection of some of the most popular tutorials.
Data Preparation Tutorials
Forecasting Tutorials
- How to Make Baseline Predictions for Time Series Forecasting with Python
- How to Check if Time Series Data is Stationary with Python
- How to Create an ARIMA Model for Time Series Forecasting with Python
- How to Grid Search ARIMA Model Hyperparameters with Python
- How to Work Through a Time Series Forecast Project
Data Preparation for Machine Learning (Python)
The performance of your predictive model is only as good as the data that you use to train it.
As such, data preparation may be the most important part of your applied machine learning project.
Here’s how to get started with Data Preparation for machine learning:
- Step 1: Discover the importance of data preparation.
- Step 2: Discover data preparation techniques.
- Step 3: Discover how to get good at delivering results with data preparation.
You can see all Data Preparation tutorials here. Below is a selection of some of the most popular tutorials.
Data Cleaning
- How to Delete Duplicate Rows and Useless Features
- How to Identify and Delete Outliers
- How to Impute Missing Values
Feature Selection
Data Transforms
- How to Use Normalization and Standardization
- How to Use Ordinal and One Hot Encoding
- How to Use Power Transforms
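As a small illustration of the first transform pair above, the sketch below rescales a single numeric column two ways: normalization to the range [0, 1] and standardization to zero mean and unit standard deviation. The function names and sample values are assumptions, and both functions assume the column is not constant.

```python
import statistics

def normalize(values):
    # Min-max scaling: map the smallest value to 0 and the largest to 1.
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    # Subtract the mean and divide by the population standard deviation.
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

values = [50.0, 60.0, 70.0, 80.0, 90.0]
normalized = normalize(values)
standardized = standardize(values)
print(normalized)
print([round(v, 3) for v in standardized])
```

In a real project the minimum, maximum, mean, and standard deviation should be computed on the training data only and then reused on the test data, to avoid leaking information.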
Dimensionality Reduction
XGBoost in Python (Stochastic Gradient Boosting)
XGBoost is a highly optimized implementation of gradient boosted decision trees.
It is popular because it is being used by some of the best data scientists in the world to win machine learning competitions.
Here’s how to get started with XGBoost:
- Step 1: Discover the Gradient Boosting Algorithm.
- Step 2: Discover XGBoost.
- Step 3: Discover how to get good at delivering results with XGBoost.
You can see all XGBoost posts here. Below is a selection of some of the most popular tutorials.
XGBoost Basics
XGBoost Tuning
- How to Configure the Gradient Boosting Algorithm
- Tune Learning Rate for Gradient Boosting with XGBoost in Python
- Stochastic Gradient Boosting with XGBoost and scikit-learn in Python
- How to Tune the Number and Size of Decision Trees with XGBoost in Python
- How to Best Tune Multithreading Support for XGBoost in Python
Imbalanced Classification
Imbalanced classification refers to classification tasks where there are many more examples for one class than another class.
These types of problems often require the use of specialized performance metrics and learning algorithms as the standard metrics and methods are unreliable or fail completely.
Here’s how you can get started with Imbalanced Classification:
- Step 1: Discover the challenge of imbalanced classification.
- Step 2: Discover the intuition for skewed class distributions.
- Step 3: Discover how to solve imbalanced classification problems.
You can see all Imbalanced Classification posts here. Below is a selection of some of the most popular tutorials.
Performance Measures
- Tour of Evaluation Metrics for Imbalanced Classification
- Failure of Classification Accuracy
- How to Calculate Precision, Recall, and F-Measure
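The sketch below shows why accuracy can mislead on a skewed dataset. A model that always predicts the majority class scores 95% accuracy on a hypothetical 95:5 dataset while finding none of the minority cases, which precision, recall, and F-measure expose. The data and function names are assumptions, and returning 0.0 for an undefined ratio is a convention chosen here.

```python
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision_recall_f1(y_true, y_pred, positive=1):
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    # Report 0.0 when a ratio is undefined (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical dataset: 5 positive cases, 95 negative cases.
y_true = [1] * 5 + [0] * 95
always_negative = [0] * 100

acc = accuracy(y_true, always_negative)
scores = precision_recall_f1(y_true, always_negative)
print(acc, scores)  # prints: 0.95 (0.0, 0.0, 0.0)
```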
Cost-Sensitive Algorithms
Data Sampling
- Tour of Data Sampling Methods for Imbalanced Classification
- Random Oversampling and Undersampling
- SMOTE Oversampling for Imbalanced Classification
Advanced Methods
Deep Learning (Keras)
Deep learning is a fascinating and powerful field.
State-of-the-art results are coming from the field of deep learning and it is a sub-field of machine learning that cannot be ignored.
Here’s how to get started with deep learning:
- Step 1: Discover what deep learning is all about.
- Step 2: Discover the best tools and libraries.
- Step 3: Discover how to work through problems and deliver results.
You can see all deep learning posts here. Below is a selection of some of the most popular tutorials.
Background
- Crash Course On Multi-Layer Perceptron Neural Networks
- Crash Course in Convolutional Neural Networks for Machine Learning
- Crash Course in Recurrent Neural Networks for Deep Learning
Multilayer Perceptrons
Convolutional Neural Networks
- Handwritten Digit Recognition using Convolutional Neural Networks in Python with Keras
- Object Recognition with Convolutional Neural Networks in the Keras Deep Learning Library
- Predict Sentiment From Movie Reviews Using Deep Learning
Recurrent Neural Networks
Deep Learning (PyTorch)
Besides Keras, PyTorch is another deep learning library with a huge market share. It is important to know about PyTorch and become familiar with its syntax.
Here’s how to get started with deep learning in PyTorch:
- Step 1: Discover what deep learning is all about.
- Step 2: Discover PyTorch.
- Step 3: Discover how to work through problems and deliver results.
You can see all PyTorch deep learning posts here. Below is a selection of some of the most popular tutorials.
Background
Multilayer Perceptrons
Model Building Techniques
- Use PyTorch Deep Learning Models with scikit-learn
- Save and Load Your PyTorch Models
- Using Dropout Regularization in PyTorch Models
Advanced Networks
Machine Learning in OpenCV
OpenCV is the most popular library for image processing but its machine learning module is less well-known.
If you are already using OpenCV, adding machine learning to your project should come at no additional cost. You can draw on the experience you gained with scikit-learn or Keras to bring your image processing project to the next level.
Below are the steps that you can use to get started with machine learning in OpenCV:
- Step 1: Refresh your knowledge of what OpenCV offers.
- Step 2: Discover how to present images for consumption by machine learning models.
- Step 3: Discover how to use machine learning in OpenCV.
You can see all OpenCV machine learning posts here. Below is a selection of some of the most popular tutorials.
Foundations on OpenCV and Image Processing
- How to Read, Write, Display Images in OpenCV and Converting Color Spaces
- How to Transform Images and Create Video with OpenCV
- Image Datasets for Practicing Machine Learning in OpenCV
- Image Feature Extraction in OpenCV: Keypoints and Description Vectors
- Image Vector Representation for Machine Learning Using OpenCV
Machine Learning in Python
Better Deep Learning Performance
Although it is easy to define and fit a deep learning neural network model, it can be challenging to get good performance on a specific predictive modeling problem.
There are standard techniques that you can use to improve the learning, reduce overfitting, and make better predictions with your deep learning model.
Here’s how to get started with getting better deep learning performance:
- Step 1: Discover the challenge of deep learning.
- Step 2: Discover frameworks for diagnosing and improving model performance.
- Step 3: Discover techniques that you can use to improve performance.
You can see all better deep learning posts here. Below is a selection of some of the most popular tutorials.
Better Learning (fix training)
- How to Control Model Capacity With Nodes and Layers
- How to Choose Loss Functions When Training Neural Networks
- Understand the Impact of Learning Rate on Model Performance
- How to Fix Vanishing Gradients Using the ReLU
Better Generalization (fix overfitting)
Better Predictions (ensembles)
- Ensemble Methods for Deep Learning Neural Networks
- How to Develop Model Averaging Ensembles
- How to Develop a Cross-Validation and Bagging Ensembles
- How to Develop a Stacking Deep Learning Ensemble
Tips, Tricks, and Resources
Ensemble Learning
Predictive performance is the most important concern on many classification and regression problems. Ensemble learning algorithms combine the predictions from multiple models and are designed to perform better than any contributing ensemble member.
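One of the simplest ways to combine predictions is hard voting, where each model votes and the most common label wins. The sketch below uses three hypothetical models whose single mistakes fall on different examples, so the vote corrects all of them. The labels and predictions are made up, and real ensembles are not guaranteed to beat their members.

```python
from collections import Counter

def majority_vote(predictions_per_model):
    # For each example, collect one vote per model and keep the most common label.
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*predictions_per_model)]

y_true  = [1, 0, 1, 1, 0]
model_a = [0, 0, 1, 1, 0]  # wrong on example 0
model_b = [1, 1, 1, 1, 0]  # wrong on example 1
model_c = [1, 0, 0, 1, 0]  # wrong on example 2

ensemble = majority_vote([model_a, model_b, model_c])
print(ensemble)  # prints: [1, 0, 1, 1, 0]
```

The vote only helps when the members make different errors. Three copies of the same model would repeat the same mistake.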
Here’s how to get started with getting better ensemble learning performance:
- Step 1: Discover ensemble learning.
- Step 2: Discover ensemble learning algorithms.
- Step 3: Discover techniques that you can use to improve performance.
You can see all ensemble learning posts here. Below is a selection of some of the most popular tutorials.
Ensemble Basics
Stacking Ensembles
Bagging Ensembles
Boosting Ensembles
Long Short-Term Memory Networks (LSTMs)
Long Short-Term Memory (LSTM) Recurrent Neural Networks are designed for sequence prediction problems and are a state-of-the-art deep learning technique for challenging prediction problems.
Here’s how to get started with LSTMs in Python:
- Step 1: Discover the promise of LSTMs.
- Step 2: Discover where LSTMs are useful.
- Step 3: Discover how to use LSTMs on your project.
You can see all LSTM posts here. Below is a selection of some of the most popular tutorials using LSTMs in Python with the Keras deep learning library.
Data Preparation for LSTMs
- How to Reshape Input Data for Long Short-Term Memory Networks
- How to One Hot Encode Sequence Data
- How to Remove Trends and Seasonality with a Difference Transform
- How to Scale Data for Long Short-Term Memory Networks
- How to Prepare Sequence Prediction for Truncated BPTT
- How to Handle Missing Timesteps in Sequence Prediction Problems
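LSTM layers in Keras expect three-dimensional input organized as [samples, timesteps, features]. The sketch below shows that layout using plain nested lists so the structure is visible. In practice this is normally done with a NumPy array and `reshape`. The function name and toy series are assumptions.

```python
def to_samples_timesteps_features(series, timesteps):
    # Cut a univariate series into non-overlapping windows of length
    # `timesteps`; each time step holds a single feature.
    n_samples = len(series) // timesteps
    return [[[series[s * timesteps + t]] for t in range(timesteps)]
            for s in range(n_samples)]

series = list(range(1, 11))          # 10 observations
shaped = to_samples_timesteps_features(series, timesteps=5)

shape = (len(shaped), len(shaped[0]), len(shaped[0][0]))
print(shape)  # prints: (2, 5, 1)
```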
LSTM Behaviour
- A Gentle Introduction to Backpropagation Through Time
- Demonstration of Memory with a Long Short-Term Memory Network
- How to Use the TimeDistributed Layer for Long Short-Term Memory Networks
- How to use an Encoder-Decoder LSTM to Echo Sequences of Random Integers
- Attention in Long Short-Term Memory Recurrent Neural Networks
Modeling with LSTMs
- Generative Long Short-Term Memory Networks
- Stacked Long Short-Term Memory Networks
- Encoder-Decoder Long Short-Term Memory Networks
- CNN Long Short-Term Memory Networks
- Diagnose Overfitting and Underfitting of LSTM Models
- How to Make Predictions with Long Short-Term Memory Models
LSTM for Time Series
Deep Learning for Natural Language Processing (NLP)
Working with text data is hard because of the messy nature of natural language.
Text is not “solved”, but to get state-of-the-art results on challenging NLP problems, you need to adopt deep learning methods.
Here’s how to get started with deep learning for natural language processing:
- Step 1: Discover what deep learning for NLP is all about.
- Step 2: Discover standard datasets for NLP.
- Step 3: Discover how to work through problems and deliver results.
You can see all deep learning for NLP posts here. Below is a selection of some of the most popular tutorials.
Bag-of-Words Model
- What is the Bag-of-Words Model?
- How to Prepare Text Data for Machine Learning with scikit-learn
- How to Develop a Bag-of-Words Model for Predicting Sentiment
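As a quick illustration of the idea, a bag-of-words model represents each document by how often each vocabulary word occurs, ignoring word order. The sketch below builds such vectors by hand. The two documents and the whitespace tokenization are simplifying assumptions.

```python
from collections import Counter

docs = ["the cat sat", "the dog sat on the cat"]

# Vocabulary: every distinct word across all documents, in a fixed order.
vocab = sorted({word for doc in docs for word in doc.split()})

# One count vector per document, aligned with the vocabulary.
vectors = []
for doc in docs:
    counts = Counter(doc.split())
    vectors.append([counts[word] for word in vocab])

print(vocab)    # prints: ['cat', 'dog', 'on', 'sat', 'the']
print(vectors)  # prints: [[1, 0, 0, 1, 1], [1, 1, 1, 1, 2]]
```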
Language Modeling
- Gentle Introduction to Statistical Language Modeling and Neural Language Models
- How to Develop a Character-Based Neural Language Model in Keras
- How to Develop a Word-Level Neural Language Model and Use it to Generate Text
Text Summarization
- A Gentle Introduction to Text Summarization
- How to Prepare News Articles for Text Summarization
- Encoder-Decoder Models for Text Summarization in Keras
Text Classification
Word Embeddings
- What are Word Embeddings?
- How to Develop Word Embeddings in Python with Gensim
- How to Use Word Embedding Layers for Deep Learning with Keras
Photo Captioning
- How to Automatically Generate Textual Descriptions for Photographs with Deep Learning
- A Gentle Introduction to Deep Learning Caption Generation Models
- How to Develop a Deep Learning Photo Caption Generator from Scratch
Text Translation
Deep Learning for Computer Vision
Working with image data is hard because of the gulf between raw pixels and the meaning in the images.
Computer vision is not solved, but to get state-of-the-art results on challenging computer vision tasks like object detection and face recognition, you need deep learning methods.
Here’s how to get started with deep learning for computer vision:
- Step 1: Discover what deep learning for Computer Vision is all about.
- Step 2: Discover standard tasks and datasets for Computer Vision.
- Step 3: Discover how to work through problems and deliver results.
You can see all deep learning for Computer Vision posts here. Below is a selection of some of the most popular tutorials.
Image Data Handling
- How to Load and Manipulate Images With PIL/Pillow
- How to Load, Convert, and Save Images With the Keras API
- Introduction to Channels First and Channels Last Image Formats
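The two formats differ only in where the channel axis sits: channels last is [rows, columns, channels] and channels first is [channels, rows, columns]. The sketch below converts a tiny made-up image between them with nested lists. With NumPy arrays the same thing is usually done by moving or transposing axes. The function name is an assumption.

```python
def channels_last_to_first(image):
    # image[row][col][channel] -> result[channel][row][col]
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    return [[[image[r][c][ch] for c in range(width)]
             for r in range(height)]
            for ch in range(channels)]

# Hypothetical 2x2 image with 3 channels (channels last).
image = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]

converted = channels_last_to_first(image)
print(converted[0])  # first channel, prints: [[1, 4], [7, 10]]
```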
Image Data Augmentation
- How to Load Large Datasets From Directories
- How to Configure and Use Image Data Augmentation
- Introduction to Test-Time Data Augmentation
Image Classification
Image Data Preparation
- How to Manually Scale Image Pixel Data for Deep Learning
- How to Evaluate Pixel Scaling Methods for Image Classification
- How to Normalize, Center, and Standardize Images in Keras
Basics of Convolutional Neural Networks
- Gentle Introduction to Convolutional Layers in CNNs
- Gentle Introduction to Padding and Stride in CNNs
- Gentle Introduction to Pooling Layers in CNNs
Object Recognition
Deep Learning for Time Series Forecasting
Deep learning neural networks are able to automatically learn arbitrarily complex mappings from inputs to outputs and support multiple inputs and outputs.
Methods such as MLPs, CNNs, and LSTMs offer a lot of promise for time series forecasting.
Here’s how to get started with deep learning for time series forecasting:
- Step 1: Discover the promise (and limitations) of deep learning for time series.
- Step 2: Discover how to develop robust baseline and defensible forecasting models.
- Step 3: Discover how to build deep learning models for time series forecasting.
You can see all deep learning for time series forecasting posts here. Below is a selection of some of the most popular tutorials.
Forecast Trends and Seasonality (univariate)
- Grid Search SARIMA Models for Time Series Forecasting
- Grid Search Exponential Smoothing for Time Series Forecasting
- Develop Deep Learning Models for Univariate Forecasting
Human Activity Recognition (multivariate classification)
- How to Model Human Activity From Smartphone Data
- How to Develop CNN Models for Human Activity Recognition
- How to Develop RNN Models for Human Activity Recognition
Forecast Electricity Usage (multivariate, multi-step)
Model Types
- How to Develop MLPs for Time Series Forecasting
- How to Develop CNNs for Time Series Forecasting
- How to Develop LSTMs for Time Series Forecasting
Time Series Case Studies
- Indoor Movement Time Series Classification
- Probabilistic Forecasting Model to Predict Air Pollution Days
- Predict Room Occupancy Based on Environmental Factors
- Predict Whether Eyes are Open or Closed Using Brain Waves
Forecast Air Pollution (multivariate, multi-step)
Generative Adversarial Networks (GANs)
Generative Adversarial Networks, or GANs for short, are an approach to generative modeling using deep learning methods, such as convolutional neural networks.
GANs are an exciting and rapidly changing field, delivering on the promise of generative models in their ability to generate realistic examples across a range of problem domains, most notably in image-to-image translation tasks.
Here’s how to get started with deep learning for Generative Adversarial Networks:
- Step 1: Discover the promise of GANs for generative modeling.
- Step 2: Discover the GAN architecture and different GAN models.
- Step 3: Discover how to develop GAN models in Python with Keras.
You can see all Generative Adversarial Network tutorials listed here. Below is a selection of some of the most popular tutorials.
GAN Fundamentals
- How to Code the GAN Training Algorithm and Loss Functions
- How to use the UpSampling2D and Conv2DTranspose Layers
- How to Implement GAN Hacks in Keras to Train Stable Models
GAN Loss Functions
Develop Simple GAN Models
- How to Develop a 1D GAN From Scratch
- How to Develop a GAN for Generating MNIST Digits
- How to Develop a GAN to Generate CIFAR10 Photos
GANs for Image Translation
Attention and Transformers
Attention mechanisms were invented to mitigate a weakness of recurrent neural networks: they do not work well with long input sequences. It later turned out that the attention mechanism can itself be used as a building block of neural networks, which gave us the transformer architecture.
Attention mechanisms and transformer models have been shown to deliver impressive results, especially in natural language processing. Transformer models are used, in one form or another, to make computers understand human language and perform tasks such as translation or summarizing a paragraph with human-like quality.
Here’s how to get started with understanding attention mechanisms and transformers:
- Step 1: Learn about what attention is and what it can do.
- Step 2: Discover how to use attention in a neural network model.
- Step 3: Learn how the transformer model is built from the attention mechanism.
You can see all Attention and Transformer tutorials listed here. Below is a selection of some of the most popular tutorials.
Attention Fundamentals
- What Is Attention?
- A Bird’s Eye View of Research on Attention
- A Tour of Attention-Based Architectures
- The Luong Attention Mechanism
Transformer Fundamentals
Building a Transformer Model from Scratch
- How to Implement Scaled Dot-Product Attention from Scratch in TensorFlow and Keras
- How to Implement Multi-Head Attention from Scratch in TensorFlow and Keras
- Implementing the Transformer Encoder from Scratch in TensorFlow and Keras
- Implementing the Transformer Decoder from Scratch in TensorFlow and Keras
- Joining the Transformer Encoder and Decoder Plus Masking
- Training the Transformer Model
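For a feel of the core operation before working through the tutorials above, here is a minimal sketch of scaled dot-product attention, softmax(QK^T / sqrt(d_k))V, written with plain Python lists instead of TensorFlow and Keras. It omits masking and batching, and the function names and toy inputs are assumptions.

```python
import math

def softmax(xs):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(queries, keys, values):
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]

outputs, weights = scaled_dot_product_attention(queries, keys, values)
print(weights[0], outputs[0])
```

Because the query matches the first key more closely, the first value vector receives the larger weight in the output.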
Need More Help?
I’m here to help you become awesome at applied machine learning.
If you still have questions and need help, you have some options:
- Ebooks: I sell a catalog of Ebooks that show you how to get results with machine learning, fast.
- Blog: I write a lot about applied machine learning on the blog. Try the search feature.
- Frequently Asked Questions: The most common questions I get and their answers.
- Contact: You can contact me with your question, but one question at a time please.