Learning OpenCV book review

By | January 31, 2009

A few days ago, I received my copy of a new book on implementing computer vision systems with the OpenCV library. OpenCV is the most comprehensive open source, cross-platform computer vision library, and it has been available and under development for many years; only recently, however, has a book on learning to program with it become available. Learning OpenCV: Computer Vision with the OpenCV Library is published by O’Reilly and written by Gary Bradski and Adrian Kaehler, both veterans in machine vision with extensive academic and industry experience.

The origins of the OpenCV library go back many years. Intel first introduced the image processing library in 1999. At the time, the library was being developed for real-time image processing on Intel CPUs; in fact, having a copy of Intel’s Integrated Performance Primitives (IPP) library installed on your computer can still provide huge speedups in image processing today. As an open source project, several beta versions of OpenCV were published on a yearly schedule, with the official version 1.0 released in 2006. The project then remained dormant until a year ago, when robotics startup Willow Garage decided to take over its development as an open source computer vision library that would be of use to scientists, industry developers, and hobbyists alike. Version 1.1 was released in October 2008 to coincide with the publication of the Learning OpenCV book.

I have used OpenCV over the years, and I am glad that someone has finally written a book describing how to program with it. The online documentation is not a good starting point for newcomers; it is only useful to those who already know computer vision and just want to look up how to use the API. The new book sets out to fix the documentation problem, helping experts and amateurs alike get started with programming basic and advanced image processing algorithms. I have spent a couple of days reading parts of the book, and so far I find it to be very well written and more useful than the online documentation. The book does a great job of describing both the API and the mathematics behind the implemented algorithms. Although some sections tend to be a bit heavy on the math, the authors often suggest that readers can skip over these descriptions and jump straight to the section that describes the API. I would encourage anyone who wants to take full advantage of the library to read through the entire book, including the mathematical descriptions; you don’t need to know how an internal combustion engine works to drive a car, but if you want to build a better car, then you do.

The book starts by discussing basic image processing (image smoothing, resizing, thresholding, etc.) and eventually moves on to gradients, transforms (Hough, Discrete Fourier, and Discrete Cosine Transforms), integral images, and histograms. These basic tools can be used to implement advanced algorithms for contour extraction and matching as well as image segmentation and tracking. In the later chapters, the focus changes to camera calibration and 3D vision. There is also a brief introduction to machine learning for classification and clustering (K-means, naïve Bayes classifier, binary decision trees, and boosting). The authors conclude the book with an overview of where they hope to take OpenCV over the next few years: implementing many state-of-the-art algorithms and providing better documentation. Since this is an open source project, everyone is encouraged to contribute by implementing new algorithms, fixing bugs, or optimizing the current implementation.
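To give a flavor of one of the basic tools mentioned above, here is a minimal sketch of an integral image (also called a summed-area table) in plain Python. It is an illustration only: it is not code from the book and it does not use OpenCV's API. The function names `integral_image` and `box_sum` and the tiny 3x3 test image are assumptions made up for this example.

```python
def integral_image(img):
    """Build a summed-area table for a 2D list of numbers.

    The table has one extra row and column of zeros, so that
    sat[r][c] holds the sum of all pixels above and to the left
    of position (r, c), exclusive.
    """
    rows, cols = len(img), len(img[0])
    sat = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0  # running sum of the current image row
        for c in range(cols):
            row_sum += img[r][c]
            sat[r + 1][c + 1] = sat[r][c + 1] + row_sum
    return sat


def box_sum(sat, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1.

    Uses four table lookups, regardless of the size of the box.
    """
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]


# Hypothetical 3x3 "image" used only for illustration.
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
sat = integral_image(image)
whole = box_sum(sat, 0, 0, 3, 3)         # sum over the whole image
lower_right = box_sum(sat, 1, 1, 3, 3)   # sum over the 2x2 block 5, 6, 8, 9
```

Once the table is built, the sum over any rectangle costs the same four lookups, which is why integral images are a handy building block for fast region-based computations.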

Learning OpenCV is a must-have book for everyone working on computer vision.