OpenCV with Python Blueprints: Design and develop advanced computer vision projects using OpenCV with Python
About this ebook
Michael Beyeler
Michael Beyeler is a postdoctoral fellow in neuroengineering and data science at the University of Washington, where he is working on computational models of bionic vision in order to improve the perceptual experience of blind patients implanted with a retinal prosthesis (bionic eye). His work lies at the intersection of neuroscience, computer engineering, computer vision, and machine learning. He is also an active contributor to several open source software projects, and has professional programming experience in Python, C/C++, CUDA, MATLAB, and Android. Michael received a PhD in computer science from the University of California, Irvine, and an MSc in biomedical engineering and a BSc in electrical engineering from ETH Zurich, Switzerland.
OpenCV with Python Blueprints - Michael Beyeler
Table of Contents
OpenCV with Python Blueprints
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. Fun with Filters
Planning the app
Creating a black-and-white pencil sketch
Implementing dodging and burning in OpenCV
Pencil sketch transformation
Generating a warming/cooling filter
Color manipulation via curve shifting
Implementing a curve filter by using lookup tables
Designing the warming/cooling effect
Cartoonizing an image
Using a bilateral filter for edge-aware smoothing
Detecting and emphasizing prominent edges
Combining colors and outlines to produce a cartoon
Putting it all together
Running the app
The GUI base class
The GUI constructor
Handling video streams
A basic GUI layout
A custom filter layout
Summary
2. Hand Gesture Recognition Using a Kinect Depth Sensor
Planning the app
Setting up the app
Accessing the Kinect 3D sensor
Running the app
The Kinect GUI
Tracking hand gestures in real time
Hand region segmentation
Finding the most prominent depth of the image center region
Applying morphological closing to smoothen the segmentation mask
Finding connected components in a segmentation mask
Hand shape analysis
Determining the contour of the segmented hand region
Finding the convex hull of a contour area
Finding the convexity defects of a convex hull
Hand gesture recognition
Distinguishing between different causes of convexity defects
Classifying hand gestures based on the number of extended fingers
Summary
3. Finding Objects via Feature Matching and Perspective Transforms
Tasks performed by the app
Planning the app
Setting up the app
Running the app
The FeatureMatching GUI
The process flow
Feature extraction
Feature detection
Detecting features in an image with SURF
Feature matching
Matching features across images with FLANN
The ratio test for outlier removal
Visualizing feature matches
Homography estimation
Warping the image
Feature tracking
Early outlier detection and rejection
Seeing the algorithm in action
Summary
4. 3D Scene Reconstruction Using Structure from Motion
Planning the app
Camera calibration
The pinhole camera model
Estimating the intrinsic camera parameters
The camera calibration GUI
Initializing the algorithm
Collecting image and object points
Finding the camera matrix
Setting up the app
The main function routine
The SceneReconstruction3D class
Estimating the camera motion from a pair of images
Point matching using rich feature descriptors
Point matching using optic flow
Finding the camera matrices
Image rectification
Reconstructing the scene
3D point cloud visualization
Summary
5. Tracking Visually Salient Objects
Planning the app
Setting up the app
The main function routine
The Saliency class
The MultiObjectTracker class
Visual saliency
Fourier analysis
Natural scene statistics
Generating a Saliency map with the spectral residual approach
Detecting proto-objects in a scene
Mean-shift tracking
Automatically tracking all players on a soccer field
Extracting bounding boxes for proto-objects
Setting up the necessary bookkeeping for mean-shift tracking
Tracking objects with the mean-shift algorithm
Putting it all together
Summary
6. Learning to Recognize Traffic Signs
Planning the app
Supervised learning
The training procedure
The testing procedure
A classifier base class
The GTSRB dataset
Parsing the dataset
Feature extraction
Common preprocessing
Grayscale features
Color spaces
Speeded Up Robust Features
Histogram of Oriented Gradients
Support Vector Machine
Using SVMs for Multi-class classification
Training the SVM
Testing the SVM
Confusion matrix
Accuracy
Precision
Recall
Putting it all together
Summary
7. Learning to Recognize Emotions on Faces
Planning the app
Face detection
Haar-based cascade classifiers
Pre-trained cascade classifiers
Using a pre-trained cascade classifier
The FaceDetector class
Detecting faces in grayscale images
Preprocessing detected faces
Facial expression recognition
Assembling a training set
Running the screen capture
The GUI constructor
The GUI layout
Processing the current frame
Adding a training sample to the training set
Dumping the complete training set to a file
Feature extraction
Preprocessing the dataset
Principal component analysis
Multi-layer perceptrons
The perceptron
Deep architectures
An MLP for facial expression recognition
Training the MLP
Testing the MLP
Running the script
Putting it all together
Summary
Index
OpenCV with Python Blueprints
Copyright © 2015 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: October 2015
Production reference: 1141015
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78528-269-0
www.packtpub.com
Credits
Author
Michael Beyeler
Reviewers
Jia-Shen Boon
Florian LE BOURDAIS
Steve Goldsmith
Rahul Kavi
Scott Lobdell
Vipul Sharma
Commissioning Editor
Akram Hussain
Acquisition Editor
Divya Poojari
Content Development Editor
Zeeyan Pinheiro
Technical Editor
Namrata Patil
Copy Editor
Vikrant Phadke
Project Coordinator
Suzanne Coutinho
Proofreader
Safis Editing
Indexer
Rekha Nair
Production Coordinator
Melwyn D'sa
Cover Work
Melwyn D'sa
About the Author
Michael Beyeler is a PhD candidate in the department of computer science at the University of California, Irvine, where he is working on computational models of the brain as well as their integration into autonomous brain-inspired robots. His work on vision-based navigation, learning, and cognition has been presented at IEEE conferences and published in international journals. Currently, he is one of the main developers of CARLsim, an open source GPGPU spiking neural network simulator.
This is his first technical book that, in contrast to his (or any) dissertation, might actually be read.
Michael has professional programming experience in Python, C/C++, CUDA, MATLAB, and Android. Born and raised in Switzerland, he received a BSc degree in electrical engineering and information technology, as well as an MSc degree in biomedical engineering from ETH Zurich. When he is not nerding out on robots, he can be found on top of a snowy mountain, in front of a live band, or behind the piano.
I would like to thank Packt Publishing for this great opportunity and their support, my girlfriend for putting up with my late-night writing sessions, as well as the technical reviewers, who have spotted (hopefully) all my glaring errors and helped make this book a success.
About the Reviewers
Jia-Shen Boon is a researcher in robotics at the University of Wisconsin-Madison, supervised by Professor Michael Coen. He is a proud son of the sunny city-state of Singapore. Before coming to Wisconsin, he was a research engineer at DSO National Labs, where he worked on autonomous underwater vehicles and other unspeakable things. During his free time, Jia-Shen likes to study the Japanese language and write about himself in the third person.
Florian LE BOURDAIS hails from France and Germany. While he was growing up in the lazy south of France, an encounter with music from The Beatles gave him an early grasp of the English language. One of his earliest childhood memories has him watching his older German cousin, Dominik, coding a Tetris clone in the family basement using QBasic. High school and the advent of hand-held calculators led him to write his first Snake program using the TI-Basic language. After having acquired a solid background in mathematics and physics, Florian was admitted to one of the top French engineering schools. He studied mechanical engineering, but interned as an index-arbitrage trader in Japan during the financial crisis. Keen to come back to a country he much liked, he specialized in nuclear engineering and was doing an internship in a Japanese fast-breeder reactor during the Fukushima nuclear crisis.
Coming back to France, Florian was happy to start an engineering job in non-destructive testing. He specializes in ultrasound inspection methods, with a focus on phased array transducers, guided waves, and EMATs. He has published more than 10 international conference proceedings. At night, he's a hacker who likes to play with 3D printers, fermented Korean cabbage, the Raspberry Pi, Japanese characters, and guitars. He regularly writes a blog about his side projects at http://flothesof.github.io.
I would like to thank my friends and family for supporting me throughout this project. Special thanks goes to my favorite machine learning specialists at the Geeks d'Orléans, as well as Coloc du 1000.
Rahul Kavi is a PhD student at West Virginia University. He holds a master's degree in computer science. He is pursuing a PhD in the area of distributed machine learning and computer vision. He is a computer vision and robotics enthusiast. Rahul has worked on developing prototypes, optimizing computer vision, and machine learning applications for desktops, mobile devices, and autonomous robots. He writes blogs on his research interests and part-time projects at www.developerstation.org. He is a source code contributor to OpenCV.
Vipul Sharma is an engineering undergraduate from Jabalpur Engineering College. He is an ardent Python enthusiast and was one of the students selected for Google Summer of Code 2015 under the Python Software Foundation. He has been actively involved in Python and OpenCV since 2012. A few of his projects on OpenCV include a motion sensing surveillance camera, hand-gesture recognition, and solving a Rubik's cube by reading images of its faces in real time. Vipul loves contributing to open source software and is currently working on Optical Character Recognition (OCR) using OpenCV. You can check out his projects at https://github.com/vipul-sharma20.
www.PacktPub.com
Support files, eBooks, discount offers, and more
For support files and downloads related to your book, please visit www.PacktPub.com.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and, as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
Do you need instant solutions to your IT questions? PacktLib (https://www2.packtpub.com/books/subscription/packtlib) is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.
Why subscribe?
Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser
Free access for Packt account holders
If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view 9 entirely free books. Simply use your login credentials for immediate access.
Preface
OpenCV is a native, cross-platform C++ library for computer vision, machine learning, and image processing, and it is increasingly being used for development in Python. OpenCV has C++/C, Python, and Java interfaces, with support for Windows, Linux, Mac, iOS, and Android. Developers who use OpenCV build applications to process visual data; this can include still photographs or live video streams from a device such as a camera. However, as developers move beyond their first computer vision applications, they might find it difficult to come up with solutions that are well optimized, robust, and scalable for real-world scenarios.
This book demonstrates how to develop a series of intermediate to advanced projects using OpenCV and Python, rather than teaching the core concepts of OpenCV in theoretical lessons. The working projects developed in this book teach you how to apply your theoretical knowledge to topics such as image manipulation, augmented reality, object tracking, 3D scene reconstruction, statistical learning, and object categorization.
By the end of this book, you will be an OpenCV expert, and your newly gained experience will allow you to develop your own advanced computer vision applications.
What this book covers
Chapter 1, Fun with Filters, explores a number of interesting image filters (such as a black-and-white pencil sketch, warming/cooling filters, and a cartoonizer effect), and we apply them to the video stream of a webcam in real time.
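To give a flavor of what this chapter builds, here is a minimal sketch of the dodge-based pencil sketch effect; the function name pencil_sketch, the kernel size, and the scale factor are illustrative choices, not the book's exact implementation:

import cv2

def pencil_sketch(img_bgr):
    # grayscale version of the frame
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
    # blur the inverted grayscale image
    blur = cv2.GaussianBlur(255 - gray, (21, 21), 0)
    # "dodge": divide the grayscale image by the inverted blur
    return cv2.divide(gray, 255 - blur, scale=256)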
Chapter 2, Hand Gesture Recognition Using a Kinect Depth Sensor, helps you develop an app to detect and track simple hand gestures in real time using the output of a depth sensor, such as a Microsoft Kinect 3D Sensor or Asus Xtion.
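As a rough illustration of how such depth data can be read, the following snippet grabs a single frame via the freenect Python bindings from the libfreenect project; the helper name read_depth_frame and the 8-bit scaling are assumptions made for display purposes, not the book's exact code:

import numpy as np
import freenect  # Python bindings from the libfreenect project

def read_depth_frame():
    # sync_get_depth() returns an (11-bit depth map, timestamp) tuple
    # from the first Kinect that libfreenect finds
    depth, _ = freenect.sync_get_depth()
    # clip to 10 bits and scale down to an 8-bit image for display with OpenCV
    depth = np.clip(depth, 0, 1023) >> 2
    return depth.astype(np.uint8)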
Chapter 3, Finding Objects via Feature Matching and Perspective Transforms, is where you develop an app to detect an arbitrary object of interest in the video stream of a webcam, even if the object is viewed from different angles or distances, or under partial occlusion.
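The core idea, keypoint matching followed by a ratio test, can be sketched as follows; ORB stands in here for the SURF detector used in the chapter, because SURF is patented and not included in every OpenCV build, and cv2.ORB_create assumes OpenCV 3+ naming (OpenCV 2.4 uses cv2.ORB):

import cv2

def match_features(query, scene):
    orb = cv2.ORB_create(nfeatures=1000)
    kp1, des1 = orb.detectAndCompute(query, None)
    kp2, des2 = orb.detectAndCompute(scene, None)
    # brute-force matching with the Hamming norm suits ORB's binary descriptors
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    matches = matcher.knnMatch(des1, des2, k=2)
    # Lowe's ratio test discards ambiguous matches (simple outlier removal)
    return [m[0] for m in matches
            if len(m) == 2 and m[0].distance < 0.7 * m[1].distance]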
Chapter 4, 3D Scene Reconstruction Using Structure from Motion, shows you how to reconstruct and visualize a scene in 3D by inferring its geometrical features from camera motion.
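One central step, recovering the epipolar geometry between two views, can be hinted at in a few lines; the helper name and the RANSAC thresholds below are illustrative assumptions, not the chapter's exact code:

import cv2

def essential_from_matches(pts1, pts2, K):
    # pts1, pts2: Nx2 float32 arrays of matched image points; K: 3x3 intrinsics
    # robustly estimate the fundamental matrix with RANSAC
    F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 0.5, 0.99)
    # with known intrinsics, the essential matrix follows as E = K^T F K
    E = K.T.dot(F).dot(K)
    return E, mask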
Chapter 5, Tracking Visually Salient Objects, helps you develop an app to track multiple visually salient objects in a video sequence (such as all the players on the field during a soccer match) at once.
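The tracking step relies on OpenCV's mean-shift implementation; a minimal sketch, assuming the object is described by a hue histogram roi_hist and a search window window, might look like this (the helper name track_object is illustrative):

import cv2

# stop after 10 iterations or when the window center moves by less than 1 pixel
term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)

def track_object(frame_hsv, roi_hist, window):
    # back-project the object's hue histogram onto the current frame
    back_proj = cv2.calcBackProject([frame_hsv], [0], roi_hist, [0, 180], 1)
    # shift the search window toward the densest region of the back-projection
    _, window = cv2.meanShift(back_proj, window, term_crit)
    return window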
Chapter 6, Learning to Recognize Traffic Signs, shows you how to train a support vector machine to recognize traffic signs from the German Traffic Sign Recognition Benchmark (GTSRB) dataset.
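In outline, training such a classifier with OpenCV looks like the following sketch; the random arrays are toy stand-ins for the real GTSRB features, and the cv2.ml names assume OpenCV 3+ (OpenCV 2.4 exposes cv2.SVM instead):

import cv2
import numpy as np

# toy stand-in data: 200 feature vectors (e.g. HOG descriptors) with 4 class labels
X = np.random.rand(200, 32).astype(np.float32)
y = np.random.randint(0, 4, 200).astype(np.int32)

svm = cv2.ml.SVM_create()
svm.setType(cv2.ml.SVM_C_SVC)    # C-support vector classification
svm.setKernel(cv2.ml.SVM_LINEAR)
svm.train(X, cv2.ml.ROW_SAMPLE, y)

_, y_pred = svm.predict(X)
print('training accuracy:', np.mean(y_pred.ravel() == y))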
Chapter 7, Learning to Recognize Emotions on Faces, is where you develop an app that is able to both detect faces and recognize their emotional expressions in the video stream of a webcam in real time.
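The face detection half of the pipeline can be previewed with OpenCV's pre-trained Haar cascades; the XML path below is an assumption (the cascade files ship in OpenCV's data/haarcascades directory), and the helper name is illustrative:

import cv2

face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

def detect_faces(frame_bgr):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    # returns a list of (x, y, w, h) rectangles, one per detected face
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        cv2.rectangle(frame_bgr, (x, y), (x + w, y + h), (0, 255, 0), 2)
    return frame_bgr, faces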
What you need for this book
This book supports several operating systems as development environments, including Windows XP or a later version, Mac OS X 10.6 or a later version, and Ubuntu 12.04 or a later version. The only hardware requirement is a webcam (or camera device), except for Chapter 2, Hand Gesture Recognition Using a Kinect Depth Sensor, which instead requires access to a Microsoft Kinect 3D Sensor or an Asus Xtion.
The book contains seven projects, with the following requirements.
All projects can run on any of Windows, Mac, or Linux, and they require the following software packages:
OpenCV 2.4.9 or later: Recent 32-bit and 64-bit versions as well as installation instructions are available at http://opencv.org/downloads.html. Platform-specific installation instructions can be found at http://docs.opencv.org/doc/tutorials/introduction/table_of_content_introduction/table_of_content_introduction.html.
Python 2.7 or later: Recent 32-bit and 64-bit installers are available at https://www.python.org/downloads. The installation instructions can be found at https://wiki.python.org/moin/BeginnersGuide/Download.
NumPy 1.9.2 or later: This package for scientific computing officially comes in 32-bit format only, and can be obtained from http://www.scipy.org/scipylib/download.html. The installation instructions can be found at http://www.scipy.org/scipylib/building/index.html#building.
wxPython 2.8 or later: This GUI programming toolkit can be obtained from http://www.wxpython.org/download.php. Its installation instructions are given at http://wxpython.org/builddoc.php.
In addition, some chapters require the following free Python modules:
SciPy 0.16.0 or later (Chapter 1): This scientific Python library officially comes in 32-bit only, and can be obtained from http://www.scipy.org/scipylib/download.html. The installation instructions can be found at http://www.scipy.org/scipylib/building/index.html#building.
matplotlib 1.4.3 or later (Chapters 4 to 7): This 2D plotting library can be obtained from http://matplotlib.org/downloads.html. Its installation instructions can be found by going to http://matplotlib.org/faq/installing_faq.html#how-to-install.
libfreenect 0.5.2 or later (Chapter 2): The libfreenect module by the OpenKinect project (http://www.openkinect.org) provides drivers and libraries for the Microsoft Kinect hardware, and can be obtained from https://github.com/OpenKinect/libfreenect. Its installation instructions can be found at http://openkinect.org/wiki/Getting_Started.
Furthermore, the use of IPython (http://ipython.org/install.html) is highly recommended as it provides a flexible, interactive console interface.
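A quick way to verify that the environment is in order is to print the installed versions; this short check script is only a suggestion and is not part of the book's code:

from __future__ import print_function  # keeps the script valid on Python 2.7

import cv2
import numpy
import wx          # wxPython
import scipy       # needed for Chapter 1
import matplotlib  # needed for Chapters 4 to 7

print('OpenCV:    ', cv2.__version__)
print('NumPy:     ', numpy.__version__)
print('wxPython:  ', wx.__version__)
print('SciPy:     ', scipy.__version__)
print('matplotlib:', matplotlib.__version__)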
Finally, if you are looking for help or get stuck along the way, you can go to several websites that provide excellent help, documentation, and tutorials:
The official OpenCV API reference, user guide, and tutorials: http://docs.opencv.org
The official OpenCV forum: http://www.answers.opencv.org/questions
OpenCV-Python tutorials by Alexander Mordvintsev and Abid Rahman K: http://opencv-python-tutroals.readthedocs.org/en/latest
Who this book is for
This book is for intermediate users of OpenCV who aim to master their skills by developing advanced practical applications. You should already have some experience in building simple applications, and you are expected to be familiar with OpenCV's concepts and Python libraries. Basic knowledge of Python programming is assumed.
Conventions
In this book, you will find a number of styles of text that distinguish between different kinds of information. Here are some examples of these styles, and an explanation of their meaning.
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: In OpenCV, a webcam can be accessed with a call to cv2.VideoCapture.
A block of code is set as follows:
def main():
    capture = cv2.VideoCapture(0)
    if not(capture.isOpened()):
        capture.open()

    capture.set(cv2.cv.CV_CAP_PROP_FRAME_WIDTH, 640)
    capture.set(cv2.cv.CV_CAP_PROP_FRAME_HEIGHT, 480)
New terms and important words are shown in bold. Words that you see on the screen, in menus or dialog boxes