Search results for "Value At Risk"

How to Develop a Horizontal Voting Deep Learning Ensemble to Reduce Variance

By Jason Brownlee on August 25, 2020 in Deep Learning Performance 10

Predictive modeling problems where the training dataset is small relative to the number of unlabeled examples are challenging. Neural networks can perform well on these types of problems, although they can suffer from high variance in model performance as measured on a training or hold-out validation dataset. This makes choosing which model to use as […]

A Gentle Introduction to Dropout for Regularizing Deep Neural Networks

By Jason Brownlee on August 6, 2019 in Deep Learning Performance 48

Deep learning neural networks are likely to quickly overfit a training dataset with few examples. Ensembles of neural networks with different model configurations are known to reduce overfitting, but require the additional computational expense of training and maintaining multiple models. A single model can be used to simulate having a large number of different network […]

Use Weight Regularization to Reduce Overfitting of Deep Learning Models

By Jason Brownlee on August 6, 2019 in Deep Learning Performance 24

Neural networks learn a set of weights that best map inputs to outputs. A network with large network weights can be a sign of an unstable network where small changes in the input can lead to large changes in the output. This can be a sign that the network has overfit the training dataset and […]

How to Develop Multivariate Multi-Step Time Series Forecasting Models for Air Pollution

By Jason Brownlee on August 28, 2020 in Deep Learning for Time Series 156

Real-world time series forecasting is challenging for a whole host of reasons not limited to problem features such as having multiple input variables, the requirement to predict multiple time steps, and the need to perform the same type of prediction for multiple physical sites. The EMC Data Science Global Hackathon dataset, or the ‘Air Quality […]

How to Develop Baseline Forecasts for Multi-Site Multivariate Air Pollution Time Series Forecasting

By Jason Brownlee on August 28, 2020 in Deep Learning for Time Series 11

Deep Learning Models for Human Activity Recognition

By Jason Brownlee on August 5, 2019 in Deep Learning for Time Series 78

Human activity recognition, or HAR, is a challenging time series classification task. It involves predicting the movement of a person based on sensor data and traditionally involves deep domain expertise and methods from signal processing to correctly engineer features from the raw data in order to fit a machine learning model. Recently, deep learning methods […]

The Role of Randomization to Address Confounding Variables in Machine Learning

By Jason Brownlee on July 31, 2020 in Statistics 8

A large part of applied machine learning is about running controlled experiments to discover what algorithm or algorithm configuration to use on a predictive modeling problem. A challenge is that there are aspects of the problem and the algorithm, called confounding variables, that cannot be controlled (held constant) and must be controlled for. An example is […]

A Gentle Introduction to Statistical Power and Power Analysis in Python

By Jason Brownlee on April 24, 2020 in Statistics 70

The statistical power of a hypothesis test is the probability of detecting an effect, if there is a true effect present to detect. Power can be calculated and reported for a completed experiment to comment on the confidence one might have in the conclusions drawn from the results of the study. It can also be […]

A Gentle Introduction to Effect Size Measures in Python

By Jason Brownlee on August 8, 2019 in Statistics 14

Statistical hypothesis tests report on the likelihood of the observed results given an assumption, such as no association between variables or no difference between groups. Hypothesis tests do not comment on the size of the effect if the association or difference is statistically significant. This highlights the need for standard ways of calculating and reporting […]

Statistical Significance Tests for Comparing Machine Learning Algorithms

By Jason Brownlee on August 8, 2019 in Statistics 160

Comparing machine learning methods and selecting a final model is a common operation in applied machine learning. Models are commonly evaluated using resampling methods like k-fold cross-validation from which mean skill scores are calculated and compared directly. Although simple, this approach can be misleading as it is hard to know whether the difference between mean […]
