A Guide to Convolutional Neural Networks for Computer Vision
Synthesis Lectures on Computer Vision
Editors
Gérard Medioni, University of Southern California
Sven Dickinson, University of Toronto
Synthesis Lectures on Computer Vision is edited by Gérard Medioni of the University of Southern California and Sven Dickinson of the University of Toronto. The series publishes 50- to 150-page publications on topics pertaining to computer vision and pattern recognition. The scope will largely follow the purview of premier computer science conferences, such as ICCV, CVPR, and ECCV. Potential topics include, but are not limited to:
• Applications and Case Studies for Computer Vision
• Color, Illumination, and Texture
• Computational Photography and Video
• Early and Biologically-inspired Vision
• Face and Gesture Analysis
• Illumination and Reflectance Modeling
• Image-Based Modeling
• Image and Video Retrieval
• Medical Image Analysis
• Motion and Tracking
• Object Detection, Recognition, and Categorization
• Segmentation and Grouping
• Sensors
• Shape-from-X
• Stereo and Structure from Motion
• Shape Representation and Matching
• Statistical Methods and Learning
• Performance Evaluation
• Video Analysis and Event Recognition
A Guide to Convolutional Neural Networks for Computer Vision
Salman Khan, Hossein Rahmani, Syed Afaq Ali Shah, and Mohammed Bennamoun
2018
Covariances in Computer Vision and Machine Learning
Hà Quang Minh and Vittorio Murino
2017
Elastic Shape Analysis of Three-Dimensional Objects
Ian H. Jermyn, Sebastian Kurtek, Hamid Laga, and Anuj Srivastava
2017
The Maximum Consensus Problem: Recent Algorithmic Advances
Tat-Jun Chin and David Suter
2017
Extreme Value Theory-Based Methods for Visual Recognition
Walter J. Scheirer
2017
Data Association for Multi-Object Visual Tracking
Margrit Betke and Zheng Wu
2016
Ellipse Fitting for Computer Vision: Implementation and Applications
Kenichi Kanatani, Yasuyuki Sugaya, and Yasushi Kanazawa
2016
Computational Methods for Integrating Vision and Language
Kobus Barnard
2016
Background Subtraction: Theory and Practice
Ahmed Elgammal
2014
Vision-Based Interaction
Matthew Turk and Gang Hua
2013
Camera Networks: The Acquisition and Analysis of Videos over Wide Areas
Amit K. Roy-Chowdhury and Bi Song
2012
Deformable Surface 3D Reconstruction from Monocular Images
Mathieu Salzmann and Pascal Fua
2010
Boosting-Based Face Detection and Adaptation
Cha Zhang and Zhengyou Zhang
2010
Image-Based Modeling of Plants and Trees
Sing Bing Kang and Long Quan
2009
Copyright © 2018 by Morgan & Claypool
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopy, recording, or any other), except for brief quotations in printed reviews, without the prior permission of the publisher.
A Guide to Convolutional Neural Networks for Computer Vision
Salman Khan, Hossein Rahmani, Syed Afaq Ali Shah, and Mohammed Bennamoun
www.morganclaypool.com
ISBN: 9781681730219 paperback
ISBN: 9781681730226 ebook
ISBN: 9781681732787 hardcover
DOI 10.2200/S00822ED1V01Y201712COV015
A Publication in the Morgan & Claypool Publishers series
SYNTHESIS LECTURES ON COMPUTER VISION
Lecture #15
Series Editors: Gérard Medioni, University of Southern California
Sven Dickinson, University of Toronto
Series ISSN: Print 2153-1056, Electronic 2153-1064
A Guide to Convolutional Neural Networks for Computer Vision
Salman Khan
Data61-CSIRO and Australian National University
Hossein Rahmani
The University of Western Australia, Crawley, WA
Syed Afaq Ali Shah
The University of Western Australia, Crawley, WA
Mohammed Bennamoun
The University of Western Australia, Crawley, WA
SYNTHESIS LECTURES ON COMPUTER VISION #15
ABSTRACT
Computer vision has become increasingly important and effective in recent years due to its wide-ranging applications in areas as diverse as smart surveillance and monitoring, health and medicine, sports and recreation, robotics, drones, and self-driving cars. Visual recognition tasks, such as image classification, localization, and detection, are the core building blocks of many of these applications, and recent developments in Convolutional Neural Networks (CNNs) have led to state-of-the-art performance on these tasks and in the systems built on them. As a result, CNNs now form the crux of deep learning algorithms in computer vision.
This self-contained guide will benefit those who seek both to understand the theory behind CNNs and to gain hands-on experience in applying CNNs to computer vision. It provides a comprehensive introduction to CNNs, starting with the essential concepts behind neural networks: training, regularization, and optimization of CNNs. The book also discusses a wide range of loss functions, network layers, and popular CNN architectures, reviews the different techniques for the evaluation of CNNs, and presents some popular CNN tools