Using OpenCV, Python and Template Matching to play “Where’s Waldo?”

This is a guest post by Adrian Rosebrock from PyImageSearch, a blog all about computer vision, image processing, and building image search engines.

Figure 1: How long does it take you to find Waldo in this puzzle?

Take a look at the Where’s Waldo puzzle above. How long does it take you to find Waldo? 10 seconds? 30 seconds? Over a minute?

Waldo is the ultimate game of hide and seek for the human eye. He’s actually “hiding” in plain sight — but due to all the noise and distraction, we can’t pick him out immediately!

At the core, Waldo is just a visual pattern. He wears glasses. A hat. And his classic white and red horizontally striped shirt. It might take us a little bit of time to scan up and down and left to right across the page, but our brain is able to pick out this pattern, even amongst all the distraction.

The question is, can computers do better? Can we create a program that can automatically find Waldo?

In fact, we can.

Using computer vision techniques we can find Waldo in under a second, much faster than any of us could!

In this blog post I’ll show you how to use OpenCV and its template matching functions to find that pesky Waldo who is always hiding in plain sight.

Here’s a quick overview of what we’re going to do:

  • What we’re going to do: Build a Python script using OpenCV that can find Waldo in a “Where’s Waldo?” puzzle.
  • What you’ll learn: How to utilize Python, OpenCV, and template matching using cv2.matchTemplate and cv2.minMaxLoc. Using these functions we will be able to find Waldo in our puzzle image.
  • What you need: Python, NumPy, and OpenCV. A little knowledge of basic image processing concepts would help, but is definitely not a requirement. This how-to guide is meant to be hands on and show you how to apply template matching using OpenCV. Don’t have these libraries installed? No problem. I created a pre-configured virtual machine with all the necessary computer vision, image processing, and machine learning packages pre-installed. Click here to learn more.
  • Assumptions: I’ll assume that you have NumPy and OpenCV installed in either the python2.6 or python2.7 environment. Again, you can download a pre-configured virtual machine with all the necessary packages installed here.

The Goal:

So what’s the overall goal of the Python script we are going to create?

The goal, given a query image of Waldo and the puzzle image, is to find Waldo in the puzzle image and highlight his location.

As you’ll see later in this post, we’ll be able to accomplish this in only two lines of Python code. The rest of the code simply handles logic such as argument parsing and displaying the solved puzzle to our screen.

Our Puzzle and Query Image

We require two images to build our Python script to perform template matching.

The first image is the Where’s Waldo puzzle that we are going to solve. You can see our puzzle image in Figure 1 at the top of this post.

The second image is our query image of Waldo:

Figure 2: Our Waldo query image

Using our Waldo query image we are going to find him in the original puzzle.

Unfortunately, here is where the practicality of our approach breaks down.

In order to find Waldo in our puzzle image, we first need the image of Waldo himself. And you may be asking, if I already have the image of Waldo, why am I playing the puzzle?

Good point.

Using computer vision and image processing techniques to find Waldo in an image without a query image is certainly possible.

However, it requires some slightly more advanced techniques such as:

  1. Filtering out colors that are not red.
  2. Calculating the correlation of a striped pattern to match the red and white transitions of Waldo’s shirt.
  3. Binarization of the regions of the image that have high correlation with a striped pattern.
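To give a feel for the first of these steps, here is a rough sketch of a red filter written with plain NumPy. The function names and the threshold values (`min_red`, `max_other`) are assumptions chosen for illustration, not tuned values, and the input is assumed to be a color image in OpenCV's BGR channel order.

```python
import numpy as np


def red_mask(image_bgr, min_red=150, max_other=100):
    # OpenCV stores color images in BGR channel order.
    blue = image_bgr[:, :, 0]
    green = image_bgr[:, :, 1]
    red = image_bgr[:, :, 2]
    # A pixel counts as "red" when its red channel is strong and the
    # other two channels are weak. The thresholds are illustrative only.
    return (red >= min_red) & (green <= max_other) & (blue <= max_other)


def keep_red(image_bgr):
    # Black out every pixel that does not pass the red test.
    output = np.zeros_like(image_bgr)
    mask = red_mask(image_bgr)
    output[mask] = image_bgr[mask]
    return output


# Three pixels in BGR order: pure red, white, pure blue.
pixels = np.array([[[0, 0, 255], [255, 255, 255], [255, 0, 0]]], dtype=np.uint8)
filtered = keep_red(pixels)
```

Only the red pixel survives. The striped-pattern correlation and binarization steps would then operate on this filtered image.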

This post is meant to be an introduction to basic computer vision techniques such as template matching. Later on we can dive into more advanced techniques. Where’s Waldo was just a cool and simple way to demonstrate template matching that I just had to share with you!

Getting Our Hands Dirty

Ready to see some code? Alright, let’s do this:

Lines 1-13 simply import the packages we are going to use and configure our argument parser. We’ll use NumPy for array manipulations, argparse to parse our command line arguments, and cv2 for our OpenCV bindings. The package imutils is actually a set of convenience functions to handle basic image manipulations such as rotation, resizing, and translation. You can read more about these types of basic image operations here.

From there, we need to set up our two command line arguments. The first, --puzzle, is the path to our Where’s Waldo puzzle image, and the second, --waldo, is the path to the Waldo query image.

Again, our goal here is to find the query image in the puzzle image using template matching.
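A minimal sketch of such an argument parser is shown below. The long flag names come from the description above; the short flags (`-p`, `-w`), the help strings, the `build_parser` helper, and the placeholder file names are assumptions added for illustration.

```python
import argparse


def build_parser():
    # Two required arguments: the puzzle image and the Waldo query image.
    ap = argparse.ArgumentParser(description="Find Waldo with template matching")
    ap.add_argument("-p", "--puzzle", required=True,
                    help="path to the puzzle image")
    ap.add_argument("-w", "--waldo", required=True,
                    help="path to the Waldo query image")
    return ap


# An explicit list stands in for sys.argv[1:] here so the sketch can be
# run without a real command line; a script would call parse_args().
args = vars(build_parser().parse_args(
    ["--puzzle", "puzzle.png", "--waldo", "waldo.png"]))
```

Wrapping the result in `vars` turns it into a plain dictionary, so the paths are available as `args["puzzle"]` and `args["waldo"]`.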

Now that we have the paths to our images, we load them off disk on Lines 16 and 17 using the cv2.imread function. This function simply reads the image from disk and stores it as a multi-dimensional NumPy array.

Since images are represented as NumPy arrays in OpenCV, we can easily access the dimensions of the image. On Line 18 we grab the height and the width of the Waldo query image, respectively.

We are now ready to perform our template matching:

We accomplish our template matching on Line 21 by using the cv2.matchTemplate function. This method requires three parameters. The first is our puzzle image, the image that contains what we are searching for. The second is our query image, waldo. This image is contained within the puzzle image and we are looking to pinpoint its location. Finally, the third argument is our template matching method. There are a variety of methods to perform template matching, but in this case we are using the correlation coefficient which is specified by the flag cv2.TM_CCOEFF.

So what exactly is the cv2.matchTemplate function doing?

Essentially, this function takes a “sliding window” of our waldo query image and slides it across our puzzle image from left to right and top to bottom, one pixel at a time. Then, for each of these locations, we compute the correlation coefficient to determine how “good” or “bad” the match is. Regions with sufficiently high correlation can be considered “matches” for our waldo template.
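To make the sliding window concrete, here is a deliberately slow, pure NumPy sketch of the same idea for grayscale images. It is an illustration of the correlation coefficient score, not OpenCV's actual implementation, and the function name `naive_ccoeff` and the tiny synthetic images are made up for this example.

```python
import numpy as np


def naive_ccoeff(image, template):
    # Grayscale-only illustration of the score behind cv2.TM_CCOEFF:
    # subtract the mean from the template and from each window, then sum
    # the element-wise products.
    image = image.astype(np.float64)
    template = template.astype(np.float64)
    (tH, tW) = template.shape
    (iH, iW) = image.shape
    t = template - template.mean()

    # One score per position where the template fits entirely in the image.
    result = np.zeros((iH - tH + 1, iW - tW + 1), dtype=np.float64)
    for y in range(result.shape[0]):        # top to bottom
        for x in range(result.shape[1]):    # left to right
            window = image[y:y + tH, x:x + tW]
            result[y, x] = np.sum(t * (window - window.mean()))
    return result


template = np.array([[10, 200, 10],
                     [200, 10, 200],
                     [10, 200, 10]], dtype=np.uint8)
image = np.zeros((10, 10), dtype=np.uint8)
image[4:7, 2:5] = template  # hide the template at row 4, column 2

scores = naive_ccoeff(image, template)
(bestY, bestX) = np.unravel_index(np.argmax(scores), scores.shape)
```

The highest score lands where the template was hidden. Note that the result is smaller than the image, because the window can only be placed where it fits completely.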

From there, all we need is a call to cv2.minMaxLoc on Line 22 to find where our “good” matches are.

That’s really all there is to template matching!

And realistically, it only took us two lines of code.

The rest of our source code involves extracting the region that contains Waldo and then highlighting him in the original puzzle image:

Line 26 grabs the top-left (x, y) coordinates of the image that contains the best match based on our sliding window. Then, we compute the bottom-right (x, y) coordinates based on the width and height of our waldo image on Line 27. Finally we extract this roi (Region of Interest) on Line 28.
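A sketch of this step, assuming `maxLoc` is the (x, y) point returned by cv2.minMaxLoc; the helper name `extract_roi` and the sample values are hypothetical.

```python
import numpy as np


def extract_roi(puzzle, maxLoc, waldoWidth, waldoHeight):
    # The best match location is the top-left corner of the region.
    (startX, startY) = maxLoc
    # The bottom-right corner follows from the template's size.
    endX = startX + waldoWidth
    endY = startY + waldoHeight
    # Arrays are indexed rows first, so y comes before x when slicing.
    # The copy keeps the ROI independent of later changes to the puzzle.
    roi = puzzle[startY:endY, startX:endX].copy()
    return (roi, (startX, startY, endX, endY))


sample = np.arange(100, dtype=np.uint8).reshape(10, 10)
(roi, box) = extract_roi(sample, (2, 4), waldoWidth=3, waldoHeight=2)
```

The swap between (x, y) points and [row, column] indexing is the detail most likely to trip you up here.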

The next step is to construct a transparent layer that darkens everything in the image but Waldo. We do this by first initializing a mask on Line 32 with the same shape as our puzzle filled with zeros. By filling the image with zeros we are creating an image filled with black.

In order to create the transparent effect, we use the cv2.addWeighted function on Line 33. The first parameter is our puzzle image, and the second parameter indicates that we want it to contribute 25% of our output image. We then supply our mask as the third parameter and give it a weight of 75% with the fourth. By utilizing the cv2.addWeighted function we have been able to create the transparency effect.

However, we still need to highlight the Waldo region! That’s simple enough:

Here we are just placing the Waldo ROI back into the original image using some NumPy array slicing techniques on Line 37. Nothing to it.
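As a sketch, the slicing might look like the following. The helper name `highlight` and the synthetic arrays are hypothetical, and `topLeft` is assumed to be the (x, y) point of the best match.

```python
import numpy as np


def highlight(darkened, roi, topLeft):
    (startX, startY) = topLeft
    (roiHeight, roiWidth) = roi.shape[:2]
    # Work on a copy so the darkened input is left untouched.
    output = darkened.copy()
    # Overwrite the darkened pixels with the original, full-brightness ROI.
    output[startY:startY + roiHeight, startX:startX + roiWidth] = roi
    return output


darkened = np.full((10, 10, 3), 50, dtype=np.uint8)
bright_roi = np.full((2, 3, 3), 200, dtype=np.uint8)
solved = highlight(darkened, bright_roi, (2, 4))
```

Everything outside the pasted rectangle stays dark, which is what makes Waldo stand out.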

Finally, Lines 40-42 display the results of our work by displaying our Waldo query and puzzle image on screen and waiting for a key press.

To run our script, fire up your shell and execute it with Python, passing the path to the puzzle image via --puzzle and the path to the Waldo query image via --waldo.

When your script is finished executing you should see something like this on your screen:

Figure 3: We have successfully found Waldo!

We have found Waldo at the bottom-left corner of the image!

So there you have it!

Template matching using Python and OpenCV is actually quite simple. To start, you just need two images: an image of the object you want to match and an image that contains the object. From there, you just need to make calls to cv2.matchTemplate and cv2.minMaxLoc. The rest is just wrapper code to glue the output of these functions together.

Learn Computer Vision In A Single Weekend

Of course, we are only scratching the surface of computer vision and image processing. Template matching is just the start.

Luckily, I can teach you the basics of computer vision in a single weekend.

I know, it sounds crazy.

But my method really works.

See, I just finished writing my new book, Practical Python and OpenCV. I wanted this book to be as hands-on as possible. I wanted something that you could easily learn from, without all the rigor and details associated with a college level computer vision and image processing course.

The bottom line is that Practical Python and OpenCV is the best, guaranteed quick start guide to learning the fundamentals of computer vision and image processing.

Plus, I have created a downloadable Ubuntu VirtualBox virtual machine with OpenCV, PIL, mahotas, scikit-image, scikit-learn, and many other computer vision and image processing libraries pre-configured and pre-installed.

So go ahead, jump start your computer vision education. Don’t waste time installing packages…invest your time learning!

To learn more about my new book and downloadable virtual machine, just click here.

UPDATE: Continue the discussion on Reddit.

29 Responses to Using OpenCV, Python and Template Matching to play “Where’s Waldo?”

  1. Marcel May 18, 2014 at 10:46 pm #

    Congratulations for your work! This example is fantastic for beginners in machine learning and practical image processing.

    By the way, I think you forgot to put at the article the code for the lines 14-20 ! Please check it again, I really would like to use this example at my classes.

    Are the code and photos available for download ?

    One more time,
    Keep going with this work!

    Marcel

    • jasonb May 21, 2014 at 7:55 am #

      I have added the missing lines, thanks for pointing that out Marcel. I’ll talk to Adrian about getting the code as a project on github or something.

  2. jzkunlun May 23, 2014 at 12:11 pm #

    Does this work for all the rest of Waldo pages? If ‘Waldo’ image’s size is different than search picture’s one, does it matter? If Waldo’s layout in search picture is different than template one, can this method still find him?
    Thanks,

  3. Johnny August 20, 2014 at 10:22 pm #

    Nice. I am brand new to ubuntu, opencv, ML, and cv. I can’t wait to give this a try. Thank you for sharing.

  4. Josh September 3, 2014 at 10:10 am #

    I have been trying really hard to understand this, but I am not quite getting it. I just want to create a function that returns true or false for a match, and gives me the x,y coodinates of the match. Ievery example i find wants to draw boxes around the match i just need to verify the match! The min/max val makes no sense. Please explain this! There must be a simple answer i just don’t see it! Thank you! Any help would be appreciated.

  5. sahar September 30, 2014 at 8:55 pm #

    Thanks for your awesome work 😉
    I have a problem here,
    “ImportError: No module named imutils”
    I’ve searched a lot I couldn’t find this package to download, can you hel me with this please?

    • Mehul B May 18, 2017 at 6:51 pm #

      In your virtual environment of python simply type
      pip install imutils

  6. John December 4, 2014 at 7:41 am #

    what a chump. Creates an unfinished guide including the package he made “imutils” thayt he doesnt even provide info for or even a link to download it. Just a big scheme to get your to sign up to his other site for a monthly fee

  7. Chakku December 12, 2014 at 10:50 pm #

    I cannot find the package “imutils”. Can you please share the link.

  8. Alan February 17, 2015 at 1:47 am #

    link for the “imutils” package:

    https://github.com/jrosebr1/imutils

  9. mohamad July 6, 2015 at 5:16 pm #

    Hi Dear.
    where is the path “imutils” after installation in raspberry pi?
    my install command “pip install imutils”

  10. Prasanna December 29, 2015 at 3:24 pm #

    This one is too good Adrian! I’m going to use this example.

  11. kishan January 27, 2016 at 9:44 pm #

    Hi ,

    really great Post,
    After seeing this post i tried something by myself , I took an image with a lot of color full boxes present in the image and then i tried to find a blue color box in the image using a blue color box template. The output i got was completely wrong as all the boxes were getting marked in my image . Initially when i tried passing the color image of the template and the main image ,it gave me error . I then saw the opencv matchTemplate() method in opencv official site,There the images were converted to gray scale images and then passed to the matchTemplate method . Any suggestion as what should i do to find the blue color box in the main image using template matching .

    • karthik November 8, 2016 at 11:27 pm #

      Hi Kishan,
      Not sure if this is even required for you now. But, may help others visiting the site…. you can just use color filters to filter everything other than blue and us that as a mask to “find”. If you are using c++ version of opencv, you can use the inrange function to achieve this.

  12. Anton J Szilasi October 19, 2016 at 5:18 am #

    Fortunately this didnt work for me, instead it returns an image of the basket ball hoop…

    • Yair February 6, 2017 at 8:12 pm #

      I also got the basketball hoop…any ideas?

  13. nada January 6, 2017 at 7:13 am #

    Hi,

    Can you post a link to the complete code?

    Thanks a lot!

  14. Clinton Chard February 3, 2017 at 5:15 am #

    I have been over this example like a fine toothed comb, but I cannot find the links to download the ‘puzzle.png’ nor the ‘waldo.png’ files to test this code on… Am I blind and missing it in an obvious place on this page? Appreciate the help.

  15. David March 23, 2017 at 6:19 am #

    Hi Nada, Clinton,

    The link is also in the article, but you can download the code and images at https://s3-us-west-2.amazonaws.com/static.pyimagesearch.com/wheres-waldo/waldo.zip

  16. robloxhack.1gamz.com June 14, 2017 at 6:35 am #

    I do not know whether it’s just me or if perhaps everybody
    else encountering issues with your blog. It looks like some
    of the written text in your content are running off the screen. Can someone else please
    comment and let me know if this is happening to them too?

    This might be a issue with my web browser because
    I’ve had this happen previously. Many thanks

    • Jason Brownlee June 14, 2017 at 8:52 am #

      I do not see this.

      What browser and what size window are you using? Tablet or phone for example?

      My blog is best read on a workstation where you can copy-paste code into your IDE.

  17. Vin September 24, 2017 at 9:25 am #

    How to recognize combination of two different objects like .if priority sign with left turn sign, instead of detecting each sign individually… Is there any possibility to say what is the meaning of both signs together ?? Combination of different signs were detected .. I just want to say what is stuation of that road like …how intersection is combination of different signs

    • Jason Brownlee September 25, 2017 at 5:34 am #

      I don’t see why not, but you would need to train your model on these types of examples.

  18. olivia January 5, 2018 at 1:20 am #

    hai jason
    thank you for this lesson
    can i use this code
    for recognizing road sign in edge mode?
    and i have multiple template on my direktori
    thank you..

  19. Jesús Martínez April 4, 2018 at 12:49 am #

    This is a very easy to follow and fun tutorial. Computer vision is really exciting and, for me, the most appealing subfield of AI nowadays!

    Thanks for sharing!

  20. Juan L. April 11, 2018 at 2:30 am #

    Hi Jason,

    First of all, thanks for the tutorial.

    I’m specifically looking for a method to do template matching, where the template is rotated (the angle is unknown). The scale is not modified: I just need to find the rigid body motion.
    The template is in fact a crop of the original image, rotated by an unknown angle.

    Could you suggest an approach to this problem?

    Thanks!

  21. amrith June 27, 2018 at 11:04 pm #

    i have tried the example but the region gets highlighted even if there is no match does it mean the code doesnt work?
