AL3502: Deep Learning for Vision

Syllabus

Course Objectives

Introduce basic computer vision concepts.
Understand methods and terminologies used in deep neural networks.
Impart knowledge on Convolutional Neural Networks (CNNs).
Introduce Recurrent Neural Networks (RNNs) and Deep Generative Models.
Apply deep learning to real-world computer vision applications.

Unit I: Computer Vision Basics

Unit II: Introduction to Deep Learning

Unit III: Visualization and Understanding CNN

Convolutional Neural Networks (CNNs): Introduction to CNNs
Evolution of CNN Architectures: AlexNet, ZFNet, VGG
Visualization of Kernels
Backprop-to-Image/Deconvolution Methods
Deep Dream, Hallucination, Neural Style Transfer
Class Activation Mapping (CAM), Grad-CAM

Unit IV: CNN and RNN for Image and Video Processing

CNNs for Recognition, Verification, Detection, Segmentation
- CNNs for Recognition and Verification (Siamese Networks, Triplet Loss, Contrastive Loss, Ranking Loss)
- CNNs for Detection: Object Detection Background, R-CNN, Fast R-CNN
- CNNs for Segmentation: FCN, SegNet
Recurrent Neural Networks (RNNs): Review of RNNs
CNN + RNN Models for Video Understanding: Spatio-Temporal Models, Action/Activity Recognition

Unit V: Deep Generative Models

Review of Popular Deep Generative Models: GANs, VAEs
Variants and Applications of Generative Models in Vision
- Applications: Image Editing, Inpainting, Super-Resolution, 3D Object Generation, Security
Recent Trends: Self-Supervised Learning, Reinforcement Learning in Vision

Practical Exercises (30 Periods)

Course Outcomes

Upon successful completion, students will be able to:

Implement basic image processing operations
Understand deep learning concepts
Design and implement CNN and RNN models
Understand the role of deep learning in computer vision
Design and implement deep generative models

Textbooks

Ian Goodfellow, Yoshua Bengio, Aaron Courville, “Deep Learning”, MIT Press, 2017
Ragav Venkatesan, Baoxin Li, “Convolutional Neural Networks in Visual Computing”, CRC Press, 2018

References

Rajalingappaa Shanmugamani, Deep Learning for Computer Vision, Packt Publishing, 2018
David Forsyth, Jean Ponce, Computer Vision: A Modern Approach, 2002
V. Kishore Ayyadevara, Yeshwanth Reddy, Modern Computer Vision with PyTorch, Packt Publishing, 2020
Richard Szeliski, Computer Vision: Algorithms and Applications, 2010
Simon Prince, Computer Vision: Models, Learning, and Inference, 2012
NPTEL Course Materials

Structure

Lectures: 45 Periods
Practical Exercises: 30 Periods
Total: 75 Periods