It is really important to have a performance baseline on your machine learning problem.

It will give you a point of reference to which you can compare all other models that you construct.

In this post you will discover how to develop a baseline of performance for a machine learning problem using Weka.

After reading this post you will know:

- The importance in establishing a baseline of performance for your machine learning problem.
- How to calculate a baseline performance using the Zero Rule method on a regression problem.
- How to calculate a baseline performance using the Zero Rule method on a classification problem.

**Kick-start your project** with my new book Machine Learning Mastery With Weka, including *step-by-step tutorials* and clear *screenshots* for all examples.

Let’s get started.

## Importance of Baseline Results

You cannot know which algorithm will perform the best for your problem before hand so you must try a suite of algorithms and see what works best, then double down on it.

As such, it is critically important to develop a baseline of performance when working on a machine learning problem.

A baseline provides a point of reference from which to compare other machine learning algorithms.

You can get an idea of both the absolute performance increases you can achieve over the baseline as well as lift ratios that show you relatively how much better you are doing.

Without a baseline you do not know how well you are doing on your problem. You have no point of reference to consider whether or not you have or are continuing to add value. The baseline defines the hurdle that all other machine learning algorithms must cross to demonstrate “skill” on the problem.

### Need more help with Weka for Machine Learning?

Take my free 14-day email course and discover how to use the platform step-by-step.

Click to sign-up and also get a free PDF Ebook version of the course.

## Zero Rule For Baseline Performance

The baseline for both classification and regression problems is called the Zero Rule algorithm. Also called ZeroR or 0-R.

Let’s take a closer look at how the Zero Rule algorithm can be used on classification and regression problems with some examples.

### Baseline Performance For Regression Problems

For a regression predictive modeling problem where a numeric value is predicted, the Zero Rule algorithm predicts the mean of the training dataset.

For example, let’s demonstrate the Zero Rule algorithm on the Boston House Price prediction problem. You can download the ARFF for the Boston House Price prediction dataset from the Weka datasets webpage. It is located in the *datasets-numeric.jar* package in the file *housing.arff*.

- Start the Weka GUI Chooser.
- Click the “Explorer” button to open the Weka Explorer interface.
- Load the Boston house price dataset
*housing.arff*file. - Click the “Classify” tab to open the classification tab.
- Select the ZeroR algorithm (it should be selected by default).
- Select the “Cross-validation” Test options (it should be selected by default).
- Click the “Start” button to evaluate the algorithm on the dataset.

The ZeroR algorithm predicts the mean Boston House price value of 22.5 (in thousands of dollars) and achieves a RMSE of 9.21.

For any machine learning algorithm to demonstrate that it has skill on this problem, it must achieve an RMSE better than this value.

### Baseline Performance for Classification Problems

For a classification predictive modeling problem where a categorical value is predicted, the Zero Rule algorithm predicts the class value that has the most observations in the training dataset.

For example, let’s demonstrate the Zero Rule algorithm on the Pima Indians onset of diabetes problem. This dataset should be located in your *data/* directory of your Weka installation. If not, you can download the default Weka installation from the Weka Download webpage targeted for “Other platforms” with a .zip extension, unzip it and locate the *diabetes.arff* file.

- Start the Weka GUI Chooser.
- Click the “Explorer” button to open the Weka Explorer interface.
- Load the Pima Indians dataset
*diabetes.arff*file. - Click the “Classify” tab to open the classification tab.
- Select the ZeroR algorithm (it should be selected by default).
- Select the “Cross-validation” Test options (it should be selected by default).
- Click the “Start” button to evaluate the algorithm on the dataset.

The ZeroR algorithm predicts the *tested_negative* value for all instances as it is the majority class, and achieves an accuracy of 65.1%.

For any machine learning algorithm to demonstrate that it has skill on this problem, it must achieve an accuracy better than this value.

## Summary

In this post you have discovered how to calculate a baseline performance for your machine learning problems using Weka.

Specifically, you learned:

- The importance of calculating a baseline of performance on your problem.
- How to calculate a baseline performance for a regression problem using the Zero Rule algorithm.
- How to calculate a baseline performance for a classification problem using the Zero Rule algorithm.

Do you have any questions about calculating a baseline of performance or about this post? Ask your questions in the comments and I will do my best to answer them.

does it always have to be ZeroR as the classifier? what about other point of reference like NaiveBayes?

and which Test Options needs to be choose for baseline? does it always have to be Cross-Validation?

Great questions. I like to use ZeroR, but you can baseline off whatever you like.

I would advise using the same test harness/test options as you use to evaluate all methods on your problem.

thank you so much Dr. Jason Brownlee

my qustion is can I use ZeroR algorithm in my resarch to predicts bankruptcy?

what is the benefit over other algorithms?

thank u

ZeroR is a baseline method to which all other methods can be compared.

I’m confused. If I’m using your guidance in this post here: https://machinelearningmastery.com/dont-use-random-guessing-as-your-baseline-classifier/

….then my baseline performance or ZeroR would be 54.6% using the following calculations:

P(class is No) * P(you guess No) + P(class is Yes) * P(you guess Yes) = =0.651*0.651+0.349*0.349 = 0.546.

What am I missing here?

-Thanks-

Zero Rule is not random guessing.

The equation is for random guessing.

Zero Rule will guess the mode for classification and the mean for regression.

Hi Jason, is there a way to look at the program generated by Weka based on our selections or actions in the weka interface?

Thanks

If you are a Java programmer, you can use the API to develop a program.

Hi Jason, another question, I see these responses from running a dataset against ZeroR algorithm, but not sure against what exact data it is coming up with these predictions:

ZeroR predicts class value: yes

ZeroR predicts class value: FALSE

— for what data class value is false?

ZeroR predicts class value: sunny

— when is it sunny?

ZeroR predicts class value: 22.532806324110698

Perhaps double check the data file you are using?

Hi. Thanks for a great guide to ZeroR.

I am having trouple interpreting the results of my ZeroR.

I have run a ZeroR on my dataset for regression, and I get -0 in correlation coefficient, what does that mean? Is that a mistake? I get the RMSE to 85029231,6432 – that is also very high, right? What does that mean?

It suggests the variables are not correlated.

Perhaps try other algorithms?

Can I get baseline results using WEKA, for a dataset that has lot of missing values? I already tried this in WEKA, and I got results. I just want to know if I’m making any mistake here or do I have to handle the missing values to get an appropriate baseline performance?

Thanks

San

Yes, impute using mean or median (if needed) and calculate a baseline using ZeroR.

I tried running Weka Classifiers and all the classifiers yield same results as the baseline classifier. In this sense, which classifier is better?

It may be that the model requires further tuning of its configuration.

It may be that you you need to try an alternate model.

It may be that your problem is not predictable.

When you run the ZeroR model and it correctly classifies 68%, then you run the J48 for a classification tree and you only have 67% accuracy, what would you do next?

Hi Brandi…You may want to consider deep learning models to compare the performance:

https://machinelearningmastery.com/multi-label-classification-with-deep-learning/