Computer vision is a subfield of artificial intelligence concerned with understanding the content of digital images, such as photographs and videos.
Deep learning has made impressive inroads on challenging computer vision tasks and makes the promise of further advances.
Before diving into the application of deep learning techniques to computer vision, it may be helpful to develop a foundation in computer vision more broadly. This may include the foundational and classical techniques, theory, and even basic data handling with standard APIs.
One of the best ways to get up to speed quickly with the field is to get a book on the topic.
In this post, you will discover the top textbooks and programmer books on computer vision.
Kick-start your project with my new book Deep Learning for Computer Vision, including step-by-step tutorials and the Python source code files for all examples.
Let’s get started.
Overview
This post is divided into three parts; they are:
- Top 5 Computer Vision Textbooks
- Top 3 Computer Vision Programmer Books
- Recommendations
Top 5 Computer Vision Textbooks
Textbooks are those books written by experts, often academics, and are designed to be used as a reference for students and practitioners.
They focus mainly on general methods and theory (math), not on the practical concerns of problems and the application of methods (code).
I gathered a list of the top five textbooks based on their usage in university courses at top schools (e.g. MIT, etc.) and recommendations on discussion websites (e.g. Quora, etc.).
The top five textbooks on computer vision are as follows (in no particular order):
- Computer Vision: Algorithms and Applications, 2010.
- Computer Vision: Models, Learning, and Inference, 2012.
- Computer Vision: A Modern Approach, 2002.
- Introductory Techniques for 3-D Computer Vision, 1998.
- Multiple View Geometry in Computer Vision, 2004.
Let’s take a closer look at each in turn, including the target audience and table of contents for each book.
Computer Vision: Algorithms and Applications
This book was written by Richard Szeliski and published in 2010.
A draft version of the book in PDF format is available from the book’s homepage.
I like this book. It provides a strong foundation for beginners (undergraduates) in computer vision techniques for a wide range of standard computer vision problems. The book was developed by Richard based on his years of experience teaching the topic at the University of Washington.
This book also reflects my 20 years’ experience doing computer vision research in corporate research labs […] I have mostly focused on problems and solution techniques (algorithms) that have practical real-world applications and that work well in practice. Thus, this book has more emphasis on basic techniques that work under real-world conditions and less on more esoteric mathematics that has intrinsic elegance but less practical applicability.
— Page ix, Computer Vision: Algorithms and Applications, 2010.
The table of contents for this book is as follows:
- 1. Introduction
- 2. Image formation
- 3. Image processing
- 3. Feature detection and matching
- 5. Segmentation
- 6. Feature-based alignment
- 7. Structure from motion
- 8. Dense motion estimation
- 9. Image stitching
- 10. Computational photography
- 11. Stereo correspondence
- 12. 3D reconstruction
- 13. Image-based rendering
- 14. Recognition
Computer Vision: Models, Learning, and Inference
This book was written by Simon Prince and published in 2012.
A draft version of the book is available on the book’s website in PDF format.
This is a great introductory book (for students) and covers a wide range of computer vision techniques and problems. The book takes more time to introduce computer vision and spends useful time on foundational topics related to probabilistic modeling.
This modern treatment of computer vision focuses on learning and inference in probabilistic models as a unifying theme. It shows how to use training data to learn the relationships between the observed image data and the aspects of the world that we wish to estimate, such as the 3D structure or the object class, and how to exploit these relationships to make inferences about the world from new image data.
— Computer Vision: Models, Learning, and Inference, 2012.
The table of contents for this book is as follows:
- 1. Introduction
- 2. Introduction to probability
- 3. Common probability distributions
- 4. Fitting probability models
- 5. The normal distribution
- 6. Learning and inference in vision
- 7. Modeling complex data densities
- 8. Regression models
- 9. Classification models
- 10. Graphical models
- 11. Models for chains and trees
- 12. Models for grids
- 13. Image preprocessing and feature extraction
- 14. The pinhole camera
- 15. Models for transformations
- 16. Multiple cameras
- 17. Models for shape
- 18. Models for style and identity
- 19. Temporal models
- 20. Models for visual words
Computer Vision: A Modern Approach
This book was written by David Forsyth and Jean Ponce and published in 2011.
This is an introductory textbook on computer vision and is perhaps more broad in the topics covered than many of the other textbooks. Although broad, it may be less loved (popular) than some of the other introductory text as it can be challenging to read: it dives right in.
… vision relies on a solid understanding of cameras and of the physical process of image formation (Part I of this book) to obtain simple inferences from individual pixel values (Part II), combine the information available in multiple images into a coherent whole (Part III), impose some order on groups of pixels to separate them from each other or infer shape information (Part IV), and recognize objects using geometric information or probabilistic techniques (Part V).
— xvii, Computer Vision: A Modern Approach, 2002.
The table of contents for this book is as follows:
- Part I. Image Formation
- 1. Radiometry – Measuring Light
- 2. Sources, Shadows and Shading
- 3. Colour
- Part II. Image Models
- 4. Geometric Image Features
- 5. Analytical Image Features
- 6. An introduction to Probability
- Part III. Early Vision: One Image
- 7. Linear Filters
- 8. Edge Detection
- 9. Filters and Features
- 10. Texture
- Part IV. Early Vision: Multiple Images
- 11. The Geometry of Multiple Views
- 12. Stereopsis
- 13. Affine Structure from Motion
- 14. Projective Structure from Motion
- Part V. Mid-Level Vision
- 15. Segmentation Using Clustering Methods
- 16. Fitting
- 17. Segmentation and Fitting Using Probabilistic Methods
- 18. Tracking
- Part VI. High-Level Vision
- 19. Correspondence and Pose Consistency
- 20. Finding Templates Using Classifiers
- 21. Recognition by Relations Between Templates
- 22. Aspect Graphs
- Part VII. Applications and Topics
- 23. Range Data
- 24. Applications: Finding in Digital Libraries
- 25. Application: Image-Based Rendering
Introductory Techniques for 3-D Computer Vision
This book was written by Emanuele Trucco and Alessandro Verri and was published in 1998.
This is an older book that focuses on computer vision in general with some focus on techniques related to 3D problems in vision. It’s a great starting point, intended for undergraduate rather than graduate-level readers.
This book is meant to be: […] an applied introduction to the problems and solutions of modern computer vision.
– xiii, Introductory Techniques for 3-D Computer Vision, 1998.
The table of contents for this book is as follows:
- 1. Introduction
- 2. Digital snapshots
- 3. Dealing with Image Noise
- 4. Image Features
- 5. More Image Features
- 6. Camera Calibration
- 7. Stereopsis
- 8. Motion
- 9. Shape from Single-image Cues
- 10. Recognition
- 11. Locating Objects in Space
- A. Appendix
Multiple View Geometry in Computer Vision
This book was written by Richard Hartley and Andrew Zisserman and was published in 2004.
Samples of some of the chapters are available in PDF format from the book’s webpage.
It is a reasonably advanced book (graduate level) on a specialized topic in computer vision, specifically on the problem and methods related to inferring geometry from multiple images.
The book is divided into six parts and there are seven short appendices. Each part introduces a new geometric relation: the homography for background, the camera matrix for single view, the fundamental matrix for two views, the trifocal tensor for three views, and the quadrifocal tensor for four views.
— Page xiv, Multiple View Geometry in Computer Vision, 2004.
The table of contents for this book is as follows:
- 1. Introduction
- PART 0. The Background: Projective Geometry, Transformations and Estimation
- 2. Projective Geometry and Transformations of 2D
- 3. Projective Geometry and Transformations of 3D
- 4. Estimation – 2D Projective Transformations
- 5. Algorithm Evaluation and Error Analysis
- PART I. Camera Geometry and Single View Geometry
- 6. Camera Models
- 7. Computation of the Camera Matrix P
- 8. More Single View Geometry
- PART II. Two-View Geometry
- 9. Epipolar Geometry and the Fundamental Matrix
- 10. 3D Reconstruction of Cameras and Structure
- 11. Computation of the Fundamental Matrix F
- 12. Structure Computation
- 13. Scene Planes and Homographies
- 14. Affine Epipolar Geometry
- PART III. Three-View Geometry
- 15. The Trifocal Tensor
- 16. Computation of the Trifocal Tensor T
- PART IV. N-View Geometry
- 17. N-Linearities and Multiple View Tensors
- 18. N-View Computational Methods
- 19. Auto-Calibration
- 20. Duality
- 21. Cheirality
- 22. Degenerate Configurations
- PART V. Appendices
Want Results with Deep Learning for Computer Vision?
Take my free 7-day email crash course now (with sample code).
Click to sign-up and also get a free PDF Ebook version of the course.
Top 3 Computer Vision Programmer Books
Programmer books are playbooks (e.g. O’Reilly books) written by experts, often developers and engineers, and are designed to be used as a reference by practitioners.
They focus mainly on techniques and the practical concerns of problem solving with a focus on example code and standard libraries. Techniques may be described briefly with relevant theory (math) but should probably not be used as a primary reference.
I’ve gathered a list of the top three playbooks based on their rank ordering in lists of top computer vision books and on recommendations on discussion websites.
The top three textbooks on computer vision are as follows (in no particular order):
- Learning OpenCV 3, 2017.
- Programming Computer Vision with Python, 2012.
- Practical Computer Vision with SimpleCV, 2012.
Let’s take a closer look at each in turn, including the target audience and table of contents for each book.
Learning OpenCV 3
This book was written by Adrian Kaehler and Gary Bradski and published in 2017. The subtitle of the book is “Computer Vision in C++ with the OpenCV Library.”
The book focuses on teaching you how to use the OpenCV library, perhaps the premiere open source computer vision library. All code examples are in C++, suggesting that the target audience are professional developers looking to learn how to incorporate computer vision into their applications. Importantly, the authors are board members and founders of OpenCV.
It is a technical book and perhaps more an elaborated API documentation than a playbook.
This book provides a working guide to the C++ Open Source Computer Vision Library (OpenCV) version 3.x and gives a general background on the field of computer vision sufficient to help readers use OpenCV effectively.
— Learning OpenCV 3, 2017.
This book may be considered an updated version of the older (2008) book Learning “OpenCV: Computer Vision with the OpenCV Library” by the same authors.
The table of contents for this book is as follows:
- 1. Overview
- 2. Introduction to OpenCV
- 3. Getting to Know OpenCV Data Types
- 4. Images and Large Array Types
- 5. Array Operations
- 6. Drawing and Annotating
- 7. Functions in OpenCV
- 8. Image, Video and Data Files
- 9. Cross-Platform and Native Windows
- 10. Filters and Convolutions
- 11. General Image Transforms
- 12. Image Analysis
- 13. Histograms and Templates
- 14. Contours
- 15. Background Subtraction
- 16. Keypoints and Descriptors
- 17. Tracking
- 18. Camera Models and CAlibration
- 19. Projection and Three-Dimensional Vision
- 20. The Basics of Machine Learning in OpenCV
- 21. StatModel: The Standard Model for Learning in OpenCV
- 22. Object Detection
- 23. Future of OpenCV
Programming Computer Vision with Python
This book was written by Jan Erik Solem and published in 2012. The subtitle for the book is “Tools and algorithms for analyzing images.”
A final draft version of the book is available from the book’s website in PDF format.
This is a hands-on book that focuses on teaching you how to perform basic computer vision tasks in Python, mostly with PIL, although with a basic introduction to OpenCV as well. I’m a fan of this book, although minor modifications are required to use updated libraries (e.g. Pillow). An update to this book is due!
The idea behind this book is to give an easily accessible entry point to hands-on computer vision with enough understanding of the underlying theory and algorithms to be a foundation for students, researchers, and enthusiasts.
— Page vii, Programming Computer Vision with Python, 2012.
The table of contents for this book is as follows:
- 1. Basic Image Handling and Processing
- 2. Local Image Descriptors
- 3. Image to Image Mappings
- 4. Camera Models and Augmented Reality
- 5. Multiple View Geometry
- 6. Clustering Images
- 7. Searching Images
- 8. Classifying Image Content
- 9. Image Segmentation
- 10. OpenCV
Practical Computer Vision With SimpleCV
This book was written by Kurt DeMaagd, Anthony Oliver, Nathan Oostendorp, and Katherine Scott, and was published in 2012. The subtitle of the book is “The Simple Way to Make Technology See.”
This book teaches you how to perform basic computer vision operations using the SimpleCV library in Python. This provides a nice alternative to working with PIL (Pillow) or OpenCV, although I’m not convinced that SimpleCV has been broadly adopted (I’m happy to be proven wrong).
Learn how to build your own computer vision (CV) applications quickly and easily with SimpleCV, an open source framework written in Python. Through examples of real-world applications, this hands-on guide introduces you to basic CV techniques for collecting, processing, and analyzing streaming digital images.
— Practical Computer Vision with SimpleCV, 2012.
The table of contents for this book is as follows:
- 1. Introduction
- 2. Getting to Know the SimpleCV Framework
- 3. Image Sources
- 4. Pixels and Images
- 5. The Impact of Light
- 6. Image Arithmetic
- 7. Drawing on Images
- 8. Basic Feature Detection
- 9. FeatureSet Manipulation
- 10. Advanced Features
Recommendations
I love books and am reading a few different books at any one time. As such, I own all of the books listed in this post.
Nevertheless, if I was forced to recommend one textbook and one playbook, my recommendations would be as follows:
Recommended Textbook
- Computer Vision: Algorithms and Applications, Richard Szeliski, 2010.
I recommend this book because it provides a short, focused, and very readable introduction to computer vision with relevant theory, without getting too bogged down. Straight to the point and a useful reference text.
Recommended Programmer Book
- Programming Computer Vision with Python, Jan Erik Solem, 2012.
I recommend this book because it focuses on real computer vision techniques with standard (or close enough) Python libraries. It’s an excellent starting point for getting your hands dirty in computer vision.
Further Reading
This section provides more resources on the topic if you are looking to go deeper.
- Popular Computer Vision Books, GoodReads.
- Amazon: Best Sellers in Computer Vision.
- Ask HN: What are the best resources to learn computer vision?
- Vision Related Books including Online Books and Book Support Sites
Summary
In this post, you discovered the top textbooks and playbooks on computer vision.
Did I miss your favorite book or books on computer vision?
Let me know in the comments below.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.
Hi Jason,
What do you think of the book deep learning for computer vision with Python by Adrian Rosebrock https://www.pyimagesearch.com/deep-learning-computer-vision-python-book/ ?
Best,
Elie
I have not read it, sorry.
Jason, You should read Adrian’s books. They are really good and very practical, I own them all and they are always the ones people want to borrow if they don’t want the one I have of yours. You and he both have a code-first approach that works well, but a different tone and layout that resonate with different people. I find both useful and recommend them whenever possible. I keep waiting for you two to partner up on a killer project.
Thanks Harvey!
Hi, I am looking for any kind of resource that teaches CV Web Development. I want to combine them, but there aren’t many resources out there. There is only one course but costs $1000. Is there anything you can help me with please?
Hi Harvey…can you share your mail ID for dropping you a message. My mail ID would be senorhimanshu@gmail.com
Thanks.
Hi, I am looking for any kind of resource that teaches CV Web Development. I want to combine them, but there aren’t many resources out there. There is only one course but costs $1000. Is there anything you can help me with please?
Hi Mikheil…We currently don’t have web development content.
Hey Elie — Adrian here from PyImageSearch.com. I actually wrote Deep Learning for Computer Vision with Python. I can share a number of reviews on the book but I don’t want to do that on Jason’s blog as that could come across as rude.
Please send me an email or use my contact form (https://www.pyimagesearch.com/contact/) and we can chat there. Thanks!
Thanks Adrian.
Hi, I have no programming experience will these books help me with learning cv without knowing phyton. Will they help me learn phyton or what do you suggest for me as a beginner in the cv field
Thanks for this review of CV books and for all the very helpful content you’ve posted over the years, Jason.
I have gone through a number of the tutorials posted on Adrian’s site (pyimagesearch) and I’m lobbying for my employer to purchase his book for me. It’s expensive to get the full version but from what I can tell it will be worth it based on the thoroughness of the tutorials. If you like Jason’s thorough and well thought out style on this site then you’ll find the same but with a focus on computer vision on Adrian’s site.
Thanks for the tip James.
Its good. Adrian knows his stuff.
Thanks.
Thanks for these recommendations. They were mighty helpful.
I’m happy to hear that.
A must read before dwelling into computer vision is
Digital Image Processing, 3rd Ed. by Gonzalez and Woods
Thanks for the recommendation!
What do you like about it?
Hi,
Im considering getting Computer Vision: Principles, Algorithms, Applications, Learning 5th Edition by E.R Davies. https://www.amazon.com/gp/product/012809284X/ref=ox_sc_act_title_2?smid=A1C79WJQJ5SBBJ&psc=1
Main reason is because he also talks about deep learning. Would be interesting to see if anyone has any review on it
Intersting, thanks for sharing.
Thanks a lot for this valuable information !!!
I have Learning OpenCV3 and it’s a amazing book !!!
I’ve been trying to make a project in my university…. I’d like to do something like 360° replay (true view vision) of Intel … This is a great challenge for me but I never give up (Y)
Again , thanks for this post …
Best !
Thanks.
hi
thank you for recommendation
can you recommend a book that use python 3.X instead of “Programming Computer Vision with Python” ?
I have a nice book that focuses of deep learning for computer vision that might interest you:
https://machinelearningmastery.com/deep-learning-for-computer-vision/
Hello Jason,
Thanks for your recommendations. I’d like to ask if possible, of all books you’ve mentioned up there, what you would say most suitable for traditional computer vision practitioners, mostly for image preprocessing?
Say we’d like to know average brightness or darkness of the image? Or topics centered around pixel intensity/color twisting tips/histogram etc..
I know there are books about edge detection and contours, but quite often if it comes to the contrast/color/brightness. I am not sure if there is a primer book explaining the basic color theory and tell us how to use opencv to adjust the images.
Many thanks for your help.
I think the books listed in the recommendation section are a good place to start.
Perhaps flick through a few books/table of contents on amazon/google books to see if they are a good fit?
Hi Jason,
Like we have “An Introduction to Statistical Learning: With Applications in R” for machine learning (in depth knowledge), similarly do we have a book on Computer vision which will give in-depth knowledge?