Machine Learning for OpenCV
A practical introduction to the world of machine learning and
image processing using OpenCV and Python
Michael Beyeler
BIRMINGHAM - MUMBAI
Machine Learning for OpenCV
Copyright © 2017 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a
retrieval system, or transmitted in any form or by any means, without the
prior written permission of the publisher, except in the case of brief
quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the
accuracy of the information presented. However, the information contained in
this book is sold without warranty, either express or implied. Neither the
author, nor Packt Publishing, nor its dealers and distributors, will be held
liable for any damages caused or alleged to be caused directly or indirectly by
this book.
Packt Publishing has endeavored to provide trademark information about all
of the companies and products mentioned in this book by the appropriate use
of capitals. However, Packt Publishing cannot guarantee the accuracy of this
information.
First published: July 2017
Production reference: 1130717
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-78398-028-4
www.packtpub.com
Credits
Author: Michael Beyeler
Reviewers: Vipul Sharma, Rahul Kavi
Commissioning Editor: Veena Pagare
Acquisition Editor: Varsha Shetty
Content Development Editor: Jagruti Babaria
Technical Editor: Sagar Sawant
Copy Editor: Manisha Sinha
Project Coordinator: Manthan Patel
Proofreader: Safis Editing
Indexer: Tejal Daruwale Soni
Graphics: Tania Dutta
Production Coordinator: Deepika Naik
Foreword
Over the last few years, our machines have slowly but surely learned how to
see for themselves. We now take it for granted that our cameras detect our
faces in pictures that we take, and that social media apps can even recognize
us and our friends in the photos that we upload from these cameras. Over the
next few years we will experience even more radical transformation. Before
long, cars will be driving themselves, our cellphones will be able to read and
translate a sign in any language for us, and our x-rays and other medical
images will be read and analyzed by powerful algorithms that will be able to
accurately suggest a medical diagnosis, and even recommend effective
treatments.
These transformations are driven by an explosive combination of increased
computing power, masses of image data, and a set of clever ideas taken from
math, statistics, and computer science. This rapidly growing intersection that
is machine learning has taken off, affecting many of our day-to-day
interactions with the world, and with each other. One of the most remarkable
features of the current machine learning paradigm shift in computer vision is
that it relies to a large extent on software tools that are freely available and
developed by large groups of volunteers, hobbyists, scientists, and engineers
in open source communities. This means that, in principle, the barriers to
entry are also lower than ever: anyone who is interested in putting their mind
to it can harness machine learning for image processing.
However, just like in a garden with many forking paths, the wealth of tools
and ideas, and the rapid development of these ideas, underscores the need for
a guide who can show you the way, and orient you in the right direction. I
have some good news for you: having picked up this book, you are in the
good hands of my colleague and collaborator Dr. Michael Beyeler as your
guide. With his broad range of expertise, Michael is a hard-nosed
engineer, computer scientist, and neuroscientist, as well as a prolific open
source software developer. He has not only taught robots how to see and
navigate through complex environments, and computers how to model brain
activity, but he also regularly teaches humans how to use programming to
solve a variety of different machine learning and image processing problems.
This means that you will benefit not only from the sure-handed rigor of
his expertise and experience, but also from his thoughtfulness in teaching
the ideas in his book, as well as from a good dose of his sense of humor.
The second piece of good news is that this is going to be an exhilarating trip.
There's nothing that matches the thrill of understanding that comes from
putting together the pieces of the puzzle that go into solving a problem in
computer vision and machine learning with code and data. As Richard
Feynman put it: "What I cannot create, I do not understand". So, get ready to
get your hands dirty (so to speak) with the code and data in the (open source!)
code examples that accompany this book, and to get creative. Understanding
will surely follow.
Ariel Rokem
Data Scientist, The University of Washington eScience Institute
About the Author
Michael Beyeler is a Postdoctoral Fellow in Neuroengineering and Data
Science at the University of Washington, where he is working on
computational models of bionic vision in order to improve the perceptual
experience of blind patients implanted with a retinal prosthesis (bionic eye).
His work lies at the intersection of neuroscience, computer engineering,
computer vision, and machine learning. Michael is the author of OpenCV
with Python Blueprints by Packt Publishing, 2015, a practical guide for
building advanced computer vision projects. He is also an active contributor
to several open source software projects, and has professional programming
experience in Python, C/C++, CUDA, MATLAB, and Android.
Michael received a PhD in computer science from the University of
California, Irvine, as well as an MSc in biomedical engineering and a BSc in
electrical engineering from ETH Zurich, Switzerland. When he is not
"nerding out" on brains, he can be found on top of a snowy mountain, in front
of a live band, or behind the piano.
About the Reviewers
Vipul Sharma is a Software Engineer at a startup in Bangalore, India. He
studied engineering in Information Technology at Jabalpur Engineering
College (2016). He is an ardent Python fan and loves building projects on
computer vision in his spare time. He is an open source enthusiast and hunts
for interesting projects to contribute to. He is passionate about learning and
strives to better himself as a developer. He writes blog posts about his side
projects and publishes his code online.
Rahul Kavi works as a research scientist in Silicon Valley. He holds
Master's and PhD degrees in computer science from West Virginia University.
Rahul has worked on researching and optimizing computer vision
applications for a wide variety of platforms and applications. He has also
contributed to the machine learning module in OpenCV. He has written
computer vision and machine learning software for prize-winning robots for
NASA's 2015 and 2016 Centennial Challenges: Sample Return Robot (1st
prize). Rahul's research has been published in conference papers and journals.
www.PacktPub.com
For support files and downloads related to your book, please visit
www.PacktPub.com.
Did you know that Packt offers eBook versions of every book published, with
PDF and ePub files available? You can upgrade to the eBook version at
www.PacktPub.com and, as a print book customer, you are entitled to a discount
on the eBook copy. Get in touch with us for more details.
At www.PacktPub.com, you can also read a collection of free technical articles,
sign up for a range of free newsletters and receive exclusive discounts and
offers on Packt books and eBooks.
Get the most in-demand software skills with Mapt. Mapt gives you full
access to all Packt books and video courses, as well as industry-leading tools
to help you plan your personal development and advance your career.
Why subscribe?
Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser
Customer Feedback
Thanks for purchasing this Packt book. At Packt, quality is at the heart of our
editorial process. To help us improve, please leave us an honest review on
this book's Amazon page. If you'd like to join our team of regular reviewers,
you can e-mail us. We reward our regular reviewers with free eBooks
and videos in exchange for their valuable feedback. Help us be relentless in
improving our products!
To my loving wife, who continues to support me in all my
endeavors -- no matter how grand,
silly, or nerdy they may be.
Table of Contents
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. A Taste of Machine Learning
Getting started with machine learning
Problems that machine learning can solve
Getting started with Python
Getting started with OpenCV
Installation
Getting the latest code for this book
Getting to grips with Python's Anaconda distribution
Installing OpenCV in a conda environment
Verifying the installation
Getting a glimpse of OpenCV's ML module
Summary
2. Working with Data in OpenCV and Python
Understanding the machine learning workflow
Dealing with data using OpenCV and Python
Starting a new IPython or Jupyter session
Dealing with data using Python's NumPy package
Importing NumPy
Understanding NumPy arrays
Accessing single array elements by indexing
Creating multidimensional arrays
Loading external datasets in Python
Visualizing the data using Matplotlib
Importing Matplotlib
Producing a simple plot
Visualizing data from an external dataset
Dealing with data using OpenCV's TrainData container in C++
Summary
3. First Steps in Supervised Learning
Understanding supervised learning
Having a look at supervised learning in OpenCV
Measuring model performance with scoring functions
Scoring classifiers using accuracy, precision, and recall
Scoring regressors using mean squared error, explained variance, and R squared
Using classification models to predict class labels
Understanding the k-NN algorithm
Implementing k-NN in OpenCV
Generating the training data
Training the classifier
Predicting the label of a new data point
Using regression models to predict continuous outcomes
Understanding linear regression
Using linear regression to predict Boston housing prices
Loading the dataset
Training the model
Testing the model
Applying Lasso and ridge regression
Classifying iris species using logistic regression
Understanding logistic regression
Loading the training data
Making it a binary classification problem
Inspecting the data
Splitting the data into training and test sets
Training the classifier
Testing the classifier
Summary
4. Representing Data and Engineering Features
Understanding feature engineering
Preprocessing data
Standardizing features
Normalizing features
Scaling features to a range
Binarizing features
Handling the missing data
Understanding dimensionality reduction
Implementing Principal Component Analysis (PCA) in OpenCV
Implementing Independent Component Analysis (ICA)
Implementing Non-negative Matrix Factorization (NMF)
Representing categorical variables
Representing text features
Representing images
Using color spaces
Encoding images in RGB space
Encoding images in HSV and HLS space
Detecting corners in images
Using the Scale-Invariant Feature Transform (SIFT)
Using Speeded Up Robust Features (SURF)
Summary
5. Using Decision Trees to Make a Medical Diagnosis
Understanding decision trees
Building our first decision tree
Understanding the task by understanding the data
Preprocessing the data
Constructing the tree
Visualizing a trained decision tree
Investigating the inner workings of a decision tree
Rating the importance of features
Understanding the decision rules
Controlling the complexity of decision trees
Using decision trees to diagnose breast cancer
Loading the dataset
Building the decision tree
Using decision trees for regression
Summary
6. Detecting Pedestrians with Support Vector Machines
Understanding linear support vector machines
Learning optimal decision boundaries
Implementing our first support vector machine
Generating the dataset
Visualizing the dataset
Preprocessing the dataset
Building the support vector machine
Visualizing the decision boundary
Dealing with nonlinear decision boundaries
Understanding the kernel trick
Knowing our kernels
Implementing nonlinear support vector machines
Detecting pedestrians in the wild
Obtaining the dataset
Taking a glimpse at the histogram of oriented gradients (HOG)
Generating negatives
Implementing the support vector machine
Bootstrapping the model
Detecting pedestrians in a larger image
Further improving the model
Summary
7. Implementing a Spam Filter with Bayesian Learning
Understanding Bayesian inference
Taking a short detour on probability theory
Understanding Bayes' theorem
Understanding the naive Bayes classifier
Implementing your first Bayesian classifier
Creating a toy dataset
Classifying the data with a normal Bayes classifier
Classifying the data with a naive Bayes classifier
Visualizing conditional probabilities
Classifying emails using the naive Bayes classifier
Loading the dataset
Building a data matrix using Pandas
Preprocessing the data
Training a normal Bayes classifier
Training on the full dataset
Using n-grams to improve the result
Using tf-idf to improve the result
Summary
8. Discovering Hidden Structures with Unsupervised Learning
Understanding unsupervised learning
Understanding k-means clustering
Implementing our first k-means example
Understanding expectation-maximization
Implementing our own expectation-maximization solution
Knowing the limitations of expectation-maximization
First caveat: No guarantee of finding the global optimum
Second caveat: We must select the number of clusters beforehand
Third caveat: Cluster boundaries are linear
Fourth caveat: k-means is slow for a large number of samples
Compressing color spaces using k-means
Visualizing the true-color palette
Reducing the color palette using k-means
Classifying handwritten digits using k-means
Loading the dataset
Running k-means
Organizing clusters as a hierarchical tree
Understanding hierarchical clustering
Implementing agglomerative hierarchical clustering
Summary
9. Using Deep Learning to Classify Handwritten Digits
Understanding the McCulloch-Pitts neuron
Understanding the perceptron
Implementing your first perceptron
Generating a toy dataset
Fitting the perceptron to data
Evaluating the perceptron classifier
Applying the perceptron to data that is not linearly separable
Understanding multilayer perceptrons
Understanding gradient descent
Training multi-layer perceptrons with backpropagation
Implementing a multilayer perceptron in OpenCV
Preprocessing the data
Creating an MLP classifier in OpenCV
Customizing the MLP classifier
Training and testing the MLP classifier
Getting acquainted with deep learning
Getting acquainted with Keras
Classifying handwritten digits
Loading the MNIST dataset
Preprocessing the MNIST dataset
Training an MLP using OpenCV
Training a deep neural net using Keras
Preprocessing the MNIST dataset
Creating a convolutional neural network
Fitting the model
Summary
10. Combining Different Algorithms into an Ensemble
Understanding ensemble methods
Understanding averaging ensembles
Implementing a bagging classifier
Implementing a bagging regressor
Understanding boosting ensembles
Implementing a boosting classifier
Implementing a boosting regressor
Understanding stacking ensembles
Combining decision trees into a random forest
Understanding the shortcomings of decision trees
Implementing our first random forest
Implementing a random forest with scikit-learn
Implementing extremely randomized trees
Using random forests for face recognition
Loading the dataset
Preprocessing the dataset
Training and testing the random forest
Implementing AdaBoost
Implementing AdaBoost in OpenCV
Implementing AdaBoost in scikit-learn
Combining different models into a voting classifier
Understanding different voting schemes
Implementing a voting classifier
Summary
11. Selecting the Right Model with Hyperparameter Tuning
Evaluating a model
Evaluating a model the wrong way
Evaluating a model in the right way
Selecting the best model
Understanding cross-validation
Manually implementing cross-validation in OpenCV
Using scikit-learn for k-fold cross-validation
Implementing leave-one-out cross-validation
Estimating robustness using bootstrapping
Manually implementing bootstrapping in OpenCV
Assessing the significance of our results
Implementing Student's t-test
Implementing McNemar's test
Tuning hyperparameters with grid search
Implementing a simple grid search
Understanding the value of a validation set
Combining grid search with cross-validation
Combining grid search with nested cross-validation
Scoring models using different evaluation metrics
Choosing the right classification metric
Choosing the right regression metric
Chaining algorithms together to form a pipeline
Implementing pipelines in scikit-learn
Using pipelines in grid searches
Summary
12. Wrapping Up
Approaching a machine learning problem
Building your own estimator
Writing your own OpenCV-based classifier in C++
Writing your own scikit-learn-based classifier in Python
Where to go from here?
Summary
Preface
I'm glad you're here. It's about time we talked about machine learning.
Machine learning is no longer just a buzzword; it is all around us: from
protecting your email, to automatically tagging friends in pictures, to
predicting what movies you like. As a subfield of data science, machine
learning enables computers to learn through experience: to make predictions
about the future using collected data from the past.
And the amount of data to be analyzed is enormous! Current estimates put the
daily amount of newly produced data at 2.5 exabytes (roughly 2.5 billion
gigabytes). Can you believe it? That is enough data to fill about 100 million
single-layer Blu-ray discs every single day. In order to deal
with this vast amount of data, companies such as Google, Amazon,
Microsoft, and Facebook have been heavily investing in the development of
data science platforms that allow us to benefit from machine learning
wherever we go--scaling from your mobile phone application all the way to
supercomputers connected through the cloud.
In other words: this is the time to invest in machine learning. And if it is your
wish to become a machine learning practitioner, too--then this book is for
you!
But fret not: your application does not need to be as large-scale or influential
as the above examples in order to benefit from machine learning. Everyone
starts small. Thus, the first step of this book is to introduce you to the
essential concepts of statistical learning, such as classification and regression,
with the help of simple and intuitive examples. If you have already studied
machine learning theory in detail, this book will show you how to put your
knowledge into practice. Oh, and don't worry if you are completely new to
the field of machine learning--all you need is the willingness to learn.
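To give you a first taste of the two essential concepts just mentioned, here is a minimal sketch (assuming NumPy and scikit-learn are installed, as they will be used throughout this book) contrasting classification, where we predict a discrete label, with regression, where we predict a continuous value. The tiny datasets are made up purely for illustration:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LinearRegression

# Classification: predict a discrete label (0 or 1) for a new point.
# Points below 0.5 belong to class 0, points above to class 1.
X_clf = np.array([[0.1], [0.2], [0.8], [0.9]])
y_clf = np.array([0, 0, 1, 1])
clf = KNeighborsClassifier(n_neighbors=1).fit(X_clf, y_clf)
print(clf.predict([[0.15]]))  # -> [0], nearest neighbor is 0.1

# Regression: predict a continuous value. The data follows y = 2x,
# so the fitted line should predict 8.0 for x = 4.
X_reg = np.array([[1.0], [2.0], [3.0]])
y_reg = np.array([2.0, 4.0, 6.0])
reg = LinearRegression().fit(X_reg, y_reg)
print(reg.predict([[4.0]])[0])  # -> 8.0
```

Both tasks share the same recipe -- fit a model to known data, then predict on unseen data -- which is the pattern we will see again and again in the chapters ahead.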
Once we have covered all the basic concepts, we will start exploring various