Archive | Data Preparation

Why One-Hot Encode Data in Machine Learning?

Getting started in applied machine learning can be difficult, especially when working with real-world data. Often, machine learning tutorials will recommend or require that you prepare your data in specific ways before fitting a machine learning model. One good example is to use a one-hot encoding on categorical data. Why is a one-hot encoding required? […]
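As a rough illustration of what a one-hot encoding does, the sketch below maps each category to a binary vector containing a single 1. It is a minimal standard-library example, not the method used in the linked post; the function name `one_hot_encode` and the color data are hypothetical.

```python
def one_hot_encode(values):
    """Map each categorical value to a binary vector with a single 1."""
    # Sort the distinct categories so the column order is deterministic.
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}

    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # mark the column for this value's category
        encoded.append(row)
    return categories, encoded


# Hypothetical example data.
colors = ["red", "green", "blue", "green"]
categories, encoded = one_hot_encode(colors)

print(categories)
for color, row in zip(colors, encoded):
    print(color, row)
```

Each category gets its own column, so no ordering between categories is implied, unlike an integer encoding such as red=1, green=2, blue=3.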

How to Handle Missing Data with Python

Real-world data often has missing values. Data can have missing values for a number of reasons, such as observations that were not recorded and data corruption. Handling missing data is important because many machine learning algorithms do not support data with missing values. In this tutorial, you will discover how to handle missing data for […]
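Two common ways to deal with missing values are dropping incomplete rows and filling gaps with a column statistic. The sketch below shows both using only the standard library. It assumes missing values are marked as `None` and that every column has at least one observed value; the function names and the data are hypothetical and are not taken from the linked tutorial.

```python
def drop_missing(rows):
    """Keep only the rows that contain no missing (None) values."""
    return [row for row in rows if None not in row]


def impute_mean(rows):
    """Replace each missing value with the mean of its column.

    Assumes every column has at least one observed value.
    """
    n_cols = len(rows[0])
    means = []
    for col in range(n_cols):
        observed = [row[col] for row in rows if row[col] is not None]
        means.append(sum(observed) / len(observed))
    return [
        [means[col] if value is None else value for col, value in enumerate(row)]
        for row in rows
    ]


# Hypothetical dataset with two missing entries.
data = [[1.0, 10.0], [None, 20.0], [3.0, None]]

dropped = drop_missing(data)
imputed = impute_mean(data)
print(dropped)
print(imputed)
```

Dropping rows is simple but discards data; imputation keeps every row at the cost of introducing estimated values.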

Data Leakage in Machine Learning

Data leakage is a serious problem in machine learning when developing predictive models. Data leakage is when information from outside the training dataset is used to create the model. In this post, you will discover the problem of data leakage in predictive modeling. After reading this post, you will know: What data leakage is […]
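One frequently cited way leakage can creep in is data preparation: if scaling statistics are computed on the full dataset, information from the test rows influences the training inputs. The sketch below contrasts that with computing the statistics on the training data only. It is an illustrative assumption about one form of leakage, not the linked post's own example; the helper names and the numbers are hypothetical.

```python
def mean_and_std(values):
    """Return the mean and population standard deviation of a list."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance ** 0.5


def standardize(values, mean, std):
    """Rescale values using the supplied mean and standard deviation."""
    return [(v - mean) / std for v in values]


# Hypothetical split of a single feature.
train = [1.0, 2.0, 3.0]
test = [10.0]

# Leaky: the statistics see the test value before the model is built.
leaky_mean, leaky_std = mean_and_std(train + test)

# Safer: statistics come from the training data only and are then
# reused, unchanged, to transform the test data.
train_mean, train_std = mean_and_std(train)
train_scaled = standardize(train, train_mean, train_std)
test_scaled = standardize(test, train_mean, train_std)

print("leaky mean:", leaky_mean, "train-only mean:", train_mean)
```

The two means differ because the test value shifted the leaky estimate, which is exactly the outside information the training process should not see.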

An Introduction to Feature Selection

Which features should you use to create a predictive model? This is a difficult question that may require deep knowledge of the problem domain. It is possible to automatically select those features in your data that are most useful or most relevant for the problem you are working on. This is a process called feature […]
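As one simple illustration of automatic feature selection, the sketch below scores each feature by the absolute Pearson correlation with the target and keeps the top `k`. This is only one of many possible scoring approaches and is not necessarily the one covered in the linked post; the function names, feature names, and data are hypothetical.

```python
def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (var_x * var_y) ** 0.5


def select_top_k(features, target, k):
    """Return the names of the k features most correlated with the target."""
    scores = {name: abs(pearson(column, target)) for name, column in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical feature columns and target.
features = {
    "signal": [1.0, 2.0, 3.0, 4.0],
    "noise": [2.0, 1.0, 2.0, 1.0],
    "constant": [5.0, 5.0, 5.0, 5.0],
}
target = [2.0, 4.0, 6.0, 8.0]

selected = select_top_k(features, target, k=1)
print(selected)
```

A univariate score like this ignores interactions between features, so it is best read as a first filter rather than a complete selection procedure.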
