COMPUTER VISION
- Teaching in Italian
- COMPUTER VISION
- Teaching
- COMPUTER VISION
- Subject area
- ING-INF/03
- Reference degree course
- COMPUTER ENGINEERING
- Course type
- Master's Degree
- Credits
- 9.0
- Teaching hours
- Frontal Hours: 81.0
- Academic year
- 2021/2022
- Year taught
- 2021/2022
- Course year
- 1
- Language
- ENGLISH
- Curriculum
- PERCORSO COMUNE (common track)
- Reference professor for teaching
- DISTANTE Cosimo
- Location
- Lecce
Teaching description
No prior experience with computer vision is assumed, although previous knowledge of visual computing or signal processing will be helpful. The following skills are necessary for this class:
- Math: Linear algebra, vector calculus, and probability. Linear algebra is the most important.
- Data structures: Students will write code that represents images as features and geometric constructions.
- Programming: A good working knowledge is required. All lecture code and project starter code will be in Python, with PyTorch for deep learning; students familiar with other frameworks such as TensorFlow are also welcome.
Computer vision is everywhere in today's society and images have become pervasive, with applications in many sectors: apps, drones, healthcare and precision medicine, precision agriculture, image search and understanding, robot control and self-driving cars, to mention just a few.
The course introduces the basics of image formation, reconstruction and motion model inference, as well as camera calibration theory and practice.
Recent developments in neural networks (Deep Learning) have considerably boosted the performance of visual recognition systems in tasks such as classification, localisation, detection and segmentation. Students will learn the building blocks of a general convolutional neural network, how it is trained and optimized, how to prepare a dataset and how to measure the final performance.
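As a small illustration of what measuring the final performance can mean for the tasks listed above, the sketch below computes classification accuracy and the intersection over union (IoU) of two bounding boxes, a common score for localisation and detection. It is a minimal example in plain Python and not course material: the function names `accuracy` and `box_iou` and the `(x1, y1, x2, y2)` box convention are assumptions made for this example.

```python
def accuracy(predicted, expected):
    """Fraction of predictions that match the ground-truth labels."""
    if len(predicted) != len(expected) or not expected:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2).

    Assumes x1 <= x2 and y1 <= y2 for both boxes (hypothetical convention).
    """
    # Corners of the overlapping rectangle, if there is one.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height are clamped to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

For example, `accuracy([0, 1, 1, 2], [0, 1, 2, 2])` counts three correct labels out of four, and two 2x2 boxes offset by one unit in each direction share one unit of area out of a union of seven.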
Upon completion of this course, students will:
- Be familiar with both the theoretical and practical aspects of computing with images;
- Have described the foundations of image formation, measurement, and analysis;
- Have implemented common methods for robust image matching and alignment;
- Understand the geometric relationships between 2D images and the 3D world;
- Have gained exposure to object and scene recognition and categorization from images;
- Grasp the principles of state-of-the-art deep neural networks; and
- Have developed the practical skills necessary to build computer vision applications.
Teaching is based on theoretical and practical lectures. Students will implement in Python the algorithms taught in class.
Oral session. The student will present the developed project and answer two or more questions on theoretical aspects of the topics studied.
The student must develop a project by choosing a simple practical application that uses some of the algorithms covered during the course. The choice of application is left entirely to the student, as is the decision to work in a group or solo. In a group setting, each student must demonstrate their own contribution to the common project.
The final examination is based on an oral assessment of the topics covered during lectures.
For the LAB practice, students may use Google Colab or Cloud Platform for deep learning development.
Introduction to Computer Vision
Image Formation
2D and 3D geometric primitives - Projections
Color perception, color spaces and processing
Image Filtering
LAB Introduction to Python and operations with images
Interpolation, optimization, image pyramids and blending
Machine learning
LAB (PyTorch basics? DataLoaders, ML?, t-SNE?)
Loss functions, optimization with stochastic gradient descent
Backpropagation and neural networks, computational graphs and gradient estimation
Convolutional Neural Network, CNN activation functions, data preprocessing, weight normalization, batch normalization, monitoring the learning process, hyperparameter optimization, Regularization (Dropout, DropConnect, fractional pooling, Cutout, Mixup)
CNN architectures (AlexNet, VGG, GoogLeNet, ResNet, DenseNet, SENet, EfficientNet), Siamese Architectures (applications to face verification, people and vehicle re-identification)
LAB CNN
Recurrent neural networks, Attention mechanisms
Object detection and segmentation
LAB - object detection - segmentation
Generative Models
Edges, feature matching, RANSAC and alignment
Optical flow, 3D, Depth perception and stereo
SLAM/SfM
Camera Calibration - distortion models and compensations - linear methods for camera parameters. Calibration with a checkerboard
LAB camera calibration
3D shapes
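To give a flavour of the image filtering topic in the list above, the following sketch slides a small kernel over a grayscale image stored as nested lists. It is only an illustrative example under stated assumptions, not the lab code: the name `filter2d`, the constants `BOX_3X3` and `SOBEL_X`, and the choice to compute only the region where the kernel fits entirely inside the image are all made up for this example. The operation shown is cross-correlation; convolution would flip the kernel first.

```python
def filter2d(image, kernel):
    """Cross-correlate a 2D image with a 2D kernel ('valid' region only)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    if out_h <= 0 or out_w <= 0:
        raise ValueError("kernel is larger than the image")
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Weighted sum of the image patch whose top-left corner is (r, c).
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output


# Hypothetical example kernels: a 3x3 averaging (box) filter and the
# horizontal Sobel derivative, which responds to vertical edges.
BOX_3X3 = [[1 / 9] * 3 for _ in range(3)]
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
```

Applied to an image with a vertical step edge, such as four rows of `[0, 0, 1, 1]`, `filter2d(image, SOBEL_X)` gives a positive response wherever the window straddles the edge, while a constant image gives zero everywhere.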
There is no requirement to buy a book. The course aims to be self-contained, but sections from the following textbooks will be suggested for a more formal treatment and further information.
The primary course text will be Rick Szeliski's draft of Computer Vision: Algorithms and Applications, 2nd Edition, 2022; we will use an online copy (fill in the form) at this link.
We will be using Piazza for all course notes, homework and the final project.
A copy and link will be provided on the website.
A textbook for deep learning with PyTorch scripts can be accessed at this link.
Deep Learning, Ian Goodfellow, Yoshua Bengio and Aaron Courville, MIT Press.
Semester
Second Semester (from 01/03/2022 to 10/06/2022)
Exam type
Compulsory
Type of assessment
Oral - Final grade
Course timetable
https://easyroom.unisalento.it/Orario