What is the Weka Machine Learning Workbench

Machine learning is an iterative process rather than a linear one: each step is revisited as more is learned about the problem under investigation. This iterative process can require many different tools, programs and scripts, one for each step.

A machine learning workbench is a platform or environment that supports and facilitates a range of machine learning activities, reducing or removing the need for multiple tools.

Some statistical and machine learning workbenches, like R, provide very advanced tools but require a lot of manual configuration in the form of scripts and programming. The tools can also be fragile, written by and for academics rather than built to be robust and used in production environments.

What is Weka 

The Weka machine learning workbench is a modern platform for applied machine learning. Weka is an acronym that stands for Waikato Environment for Knowledge Analysis. It is also the name of a New Zealand bird, the weka.

Five features of Weka that I like to promote are:

  • Open Source: It is released as open source software under the GNU GPL. It is dual licensed and Pentaho Corporation owns the exclusive license to use the platform for business intelligence in their own product.
  • Graphical Interface: It has a Graphical User Interface (GUI). This allows you to complete your machine learning projects without programming.
  • Command Line Interface: All features of the software can be used from the command line. This can be very useful for scripting large jobs.
  • Java API: It is written in Java and provides an API that is well documented and promotes integration into your own applications. Note that the GNU GPL means that in turn your software would also have to be released as GPL.
  • Documentation: There are books, manuals, wikis and MOOCs that can teach you how to use the platform effectively.

The main reason I promote Weka is that a beginner can go through the process of applied machine learning using the graphical interface without having to do any programming. This is a big deal because getting a handle on the process, handling data and experimenting with algorithms are what a beginner should be learning, not yet another scripting language.

Introduction to the Weka GUI

Now I want to show off the graphical user interface a bit and encourage you to download and have a play with Weka. The workbench provides three main ways to work on your problem: the Explorer for playing around and trying things out, the Experimenter for controlled experiments, and the KnowledgeFlow for graphically designing a pipeline for your problem.

Weka Loader Interface

Weka Explorer

The Explorer is where you play around with your data and think about what transforms to apply to it and what algorithms you want to run in experiments.

Weka Explorer Interface

The Explorer interface is divided into six tabs:

  • Preprocess: Load a dataset and manipulate the data into a form that you want to work with.
  • Classify: Select and run classification and regression algorithms to operate on your data.
  • Cluster: Select and run clustering algorithms on your dataset.
  • Associate: Run association algorithms to extract insights from your data.
  • Select Attributes: Run attribute selection algorithms on your data to select those attributes that are relevant to the feature you want to predict.
  • Visualize: Visualize the relationship between attributes.

Weka Experimenter

This interface is for designing experiments with your selection of algorithms and datasets, running experiments and analyzing the results.

Weka Experimenter Interface

The tools for analyzing results are very powerful, allowing you to consider and compare results that are statistically significant over multiple runs.
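To make the idea of comparing algorithms over multiple runs a little more concrete, here is a minimal sketch of a plain paired t statistic in Java. It is not Weka's implementation (the Experimenter ships its own testers and may apply corrections that this sketch does not). The class name PairedTTestSketch, the method pairedT and the accuracy values are all hypothetical and exist only for illustration.

```java
import java.util.Locale;

// Hypothetical illustration only: a textbook paired t statistic,
// not the test that the Weka Experimenter runs internally.
class PairedTTestSketch {

    // Returns the paired t statistic for two equal-length arrays of
    // per-run scores (for example, accuracy of two algorithms on the same runs).
    // If every paired difference is identical, the standard deviation is zero
    // and the result is Infinity or NaN.
    static double pairedT(double[] a, double[] b) {
        if (a.length != b.length || a.length < 2) {
            throw new IllegalArgumentException("need two arrays of equal length >= 2");
        }
        int n = a.length;

        // Mean of the paired differences.
        double mean = 0.0;
        for (int i = 0; i < n; i++) {
            mean += a[i] - b[i];
        }
        mean /= n;

        // Sample standard deviation of the differences (n - 1 in the denominator).
        double sumSquares = 0.0;
        for (int i = 0; i < n; i++) {
            double d = (a[i] - b[i]) - mean;
            sumSquares += d * d;
        }
        double sd = Math.sqrt(sumSquares / (n - 1));

        // t = mean difference divided by its standard error.
        return mean / (sd / Math.sqrt(n));
    }

    public static void main(String[] args) {
        // Made-up accuracies from five runs of two algorithms.
        double[] algorithmA = {0.90, 0.92, 0.91, 0.93, 0.89};
        double[] algorithmB = {0.88, 0.91, 0.87, 0.90, 0.88};

        double t = pairedT(algorithmA, algorithmB);
        System.out.printf(Locale.ROOT, "t = %.3f with %d degrees of freedom%n",
                t, algorithmA.length - 1);
    }
}
```

A large absolute t value, relative to the critical value for n - 1 degrees of freedom, suggests that the difference between the two algorithms is unlikely to be down to chance alone.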

Knowledge Flow

Applied machine learning is a process and the Knowledge Flow interface allows you to graphically design that process and run the designs that you create. This includes the loading and transforming of input data, running of algorithms and the presentation of results.

Weka Knowledge Flow Interface

It’s a powerful interface and metaphor for solving complex problems graphically.
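If it helps to think of that process in code, the same load, transform, run and present chain can be sketched as composed stages in plain Java. This is only an analogy and does not use the Knowledge Flow or any Weka API. The class PipelineSketch, its stage methods and the tiny in-memory dataset are hypothetical stand-ins.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.function.Function;

// Hypothetical analogy for a graphical pipeline: each stage is a function
// and the flow is their composition. No Weka classes are involved.
class PipelineSketch {

    // Stage 1: "load" a tiny in-memory dataset (a real flow would read a file).
    static double[] load() {
        return new double[]{2.0, 4.0, 6.0, 8.0};
    }

    // Stage 2: transform the data by rescaling values into the range [0, 1].
    static double[] normalize(double[] values) {
        double min = Arrays.stream(values).min().orElse(0.0);
        double max = Arrays.stream(values).max().orElse(0.0);
        double range = max - min;
        return Arrays.stream(values)
                .map(v -> range == 0.0 ? 0.0 : (v - min) / range)
                .toArray();
    }

    // Stage 3: stand-in for "running an algorithm": compute the mean.
    static double mean(double[] values) {
        return Arrays.stream(values).average().orElse(Double.NaN);
    }

    // Stage 4: present the result as text.
    static String present(double result) {
        return String.format(Locale.ROOT, "mean of normalized values = %.3f", result);
    }

    // Wire the stages together, much as you would connect boxes in a flow diagram.
    static String run(double[] input) {
        Function<double[], double[]> transform = PipelineSketch::normalize;
        Function<double[], Double> algorithm = PipelineSketch::mean;
        Function<Double, String> report = PipelineSketch::present;
        return transform.andThen(algorithm).andThen(report).apply(input);
    }

    public static void main(String[] args) {
        System.out.println(run(load()));
    }
}
```

Swapping a stage, say a different transform, means changing one link in the chain. That is roughly what dragging a different component into a graphical flow does for you.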

Tips for Getting Started

Here are some tips for getting up and running fast:

Download Weka Right Now

It supports the three main platforms: Windows, OS X and Linux. Find the distribution for your platform, download it, install it and start it up. You might have to install Java first. The installation includes many standard experimental datasets (in the data directory) that you can load and practice on.

Read the Weka Documentation

The download includes a PDF manual (WekaManual.pdf) that can get you up to speed very quickly. It is very detailed and comprehensive, with screenshots. There is also plenty of supplementary documentation online.

Don’t forget the book. If you get into Weka, then buy the book. It provides an introduction to applied machine learning as well as an introduction to the Weka platform itself. Highly recommended.

Extensions and Plugins for Weka

There are a lot of plugin algorithms, extensions and even platforms that build on Weka.

Online Courses on Weka

There are two online courses that teach data mining with Weka:

Rushdi Shams has an amazing channel of YouTube videos showing you how to do lots of specific tasks in Weka. Check out his Weka YouTube channel.

Have you used Weka? Leave a comment and share your experiences.

15 Responses to What is the Weka Machine Learning Workbench

  1. Swapnil April 9, 2016 at 4:41 pm

    Thanks Jason for the detailed information! I am a beginner in ML and have just gone through the 4 steps that you mentioned to get started with ML. I am going to use Weka as you suggested and also bought the ‘Data Mining’ book you’ve recommended. Looking forward to gaining expertise in ML. Thanks for posting the wonderful articles on your blog and motivating beginners like me to build a career in ML.

  2. Anne March 17, 2017 at 10:15 am

    Hi Jason, I am also a beginner in ML. I have started to use R to implement some ML for classification. I am not really good at R; I just took an intro course online. I usually just copy the code from textbooks to analyze my data (e.g., Introduction to Statistical Learning and Applied Predictive Modeling). I am wondering if it would be much easier for me to use Weka instead. What are the advantages and disadvantages of Weka over R?

    • Jason Brownlee March 18, 2017 at 7:44 am

      Hi Anne,

      Your approach of copy-pasting code is a great way to get started!

      Weka is excellent for one-off projects and does not require any code to work through problems and run experiments. R requires you to code in a strange language but offers some of the most powerful/flexible tools available for analysis.

      I hope that helps as a start.

  3. V April 30, 2017 at 4:51 am

    Hey Jason,

    How do filtered classifiers in Weka work? I understand the more basic ones, but not the idea behind any of these.

    V

  4. nomi August 6, 2017 at 8:28 pm

    hello,

    I have started a PhD in data mining on the project “data mining in healthcare”. I am a beginner and quite confused about which way to go: Weka or Python/R. Can you suggest which will be more helpful for achieving my goals on the data mining in healthcare project?

  5. Rakesh Agrawal September 15, 2017 at 1:42 am

    Hey Jason, I started ML with R. Now, after a little time, what should I do next with R? I mean, what strategy would you suggest for making a good leap?

    Is it good to work with R or Python?

    Is there a Weka-like GUI for R?

  6. Biswa October 10, 2017 at 1:49 am

    Hi Jason,

    Hope you are fine. Many thanks for the good notes. I am a Java developer planning to learn machine learning. Do I need to stick with Java, or do I need to start learning Python for ML? I know Java but don't have much knowledge of Python. As Java 8 introduced functional programming, can we continue with Java itself? Whenever I search for ML jobs, they ask for R or Python knowledge.

  7. Jesús Martínez February 26, 2018 at 11:57 pm

    Wow! Weka seems fun and useful. I’ve got a question: Does it only support classical machine learning algorithms such as decision trees and regressions? Or does it allow the creation of bigger deep learning ones like CNNs? If not, do you know about a similar GUI for creating deep learning workflows?

    Thanks in advance for your time and attention! Keep up the good work!

    • Jason Brownlee February 27, 2018 at 6:32 am

      Only small in-memory methods, as far as I know. There may be third-party plugins to do deep learning, but I’m not across them.

  8. Aurelio March 22, 2018 at 12:51 pm

    Hello Jason, I’m trying to make sense of some of the text and have a question: what is referred to as the “classification problem” and the “regression problem”? I understand that classification is for nominal data and regression is for continuous data, but I’m stumped as to how this becomes a problem. Secondly, how does a user arrive at defining a problem?

    -aj
