Teaching in Italian
Subject area
Reference degree course
Course type
Master's Degree
Teaching hours
Frontal Hours: 81.0
Academic year
Year taught
Course year
Reference professor for teaching

Teaching description

No prior experience with computer vision is assumed, although previous knowledge of visual computing or signal processing will be helpful. The following skills are necessary for this class:

  • Math: Linear algebra, vector calculus, and probability. Linear algebra is the most important.
  • Data structures: Students will write code that represents images as feature and geometric constructions.
  • Programming: A good working knowledge. All lecture code and project starter code will be in Python, with PyTorch for deep learning; students familiar with other frameworks such as TensorFlow are welcome to use them.

Computer vision is now everywhere in our society and images have become pervasive. Applications span many sectors, including mobile apps, drones, healthcare and precision medicine, precision agriculture, image search and understanding, robot control, and self-driving cars.

The course introduces the basics of image formation, reconstruction and motion model inference, as well as the theory and practice of camera calibration.

Recent developments in neural networks (deep learning) have considerably boosted the performance of visual recognition systems in tasks such as classification, localisation, detection and segmentation. Students will learn the building blocks of a general convolutional neural network, how it is trained and optimized, how to prepare a dataset, and how to measure the final performance.
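
As an illustration of one such building block, the following minimal sketch applies a single convolutional filter followed by a ReLU activation to a toy image. It uses NumPy only; the function names, the 4x4 image and the hand-picked kernel are assumptions made for this example and are not taken from the course material. In a real network the kernel weights are learned during training.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the response map.

    Like most deep learning frameworks, this computes cross-correlation:
    the kernel is not flipped.
    """
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the current window with the kernel, then sum.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Rectified linear unit: keep positive responses, zero out the rest."""
    return np.maximum(x, 0.0)

# Toy 4x4 image with a vertical edge between columns 1 and 2 (illustrative data).
image = np.array([[0, 0, 1, 1]] * 4)
# Hand-picked horizontal-gradient kernel; in a CNN these weights would be learned.
kernel = np.array([[-1.0, 1.0]])

feature_map = relu(conv2d_valid(image, kernel))
```

The resulting feature map is non-zero only where the window straddles the edge, which is the sense in which a filter "detects" a pattern.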

Upon completion of this course, students will:

  1. Be familiar with both the theoretical and practical aspects of computing with images;
  2. Have described the foundation of image formation, measurement, and analysis;
  3. Have implemented common methods for robust image matching and alignment;
  4. Understand the geometric relationships between 2D images and the 3D world;
  5. Have gained exposure to object and scene recognition and categorization from images;
  6. Grasp the principles of state-of-the-art deep neural networks; and
  7. Have developed the practical skills necessary to build computer vision applications.

Teaching is based on theoretical and practical lectures. Students will implement in Python the algorithms taught in class.

Oral examination. The student presents the project they developed and answers two or more questions on theoretical aspects of the topics studied.

The student must develop a project by choosing a simple practical application that uses some of the algorithms covered during the course. The choice of topic is left entirely to the student, as is the decision to work in a group or alone. In a group project, each student must demonstrate their own contribution to the shared application.

The final examination is based on oral assessment of the topics covered during lectures.

For the lab practice, students may use Google Colab or Google Cloud Platform for deep learning development.

Introduction to Computer Vision

Image Formation

2D and 3D geometric primitives - Projections
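
As a small illustration of the projection topic, the sketch below maps 3D points expressed in the camera frame to pixel coordinates with an ideal pinhole model. The intrinsic matrix K (focal length 800 pixels, principal point at (320, 240)) and the sample points are made-up values chosen for the example, and `project_points` is a hypothetical helper name.

```python
import numpy as np

def project_points(points_3d, K):
    """Project Nx3 camera-frame points to Nx2 pixel coordinates (pinhole model)."""
    points_3d = np.asarray(points_3d, dtype=float)
    homogeneous = points_3d @ K.T            # each row is (u*z, v*z, z)
    # Perspective division by depth z gives pixel coordinates (u, v).
    return homogeneous[:, :2] / homogeneous[:, 2:3]

# Assumed intrinsics: fx = fy = 800 px, principal point (320, 240), no skew.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

# Illustrative points in front of the camera (z > 0), in metres.
points = np.array([[0.0, 0.0, 2.0],
                   [0.5, -0.25, 2.0],
                   [0.5, -0.25, 4.0]])

pixels = project_points(points, K)
```

The second and third points differ only in depth: doubling the depth halves the offset from the principal point, which is the perspective foreshortening effect.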

Color perception, color spaces and processing

Image Filtering

LAB Introduction to Python and Operations with images

Interpolation, optimization, image pyramids and blending

Machine learning

LAB (PyTorch basics? DataLoaders, ML? t-SNE?)

Loss functions, optimization with stochastic gradient descent
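
To make this topic concrete, here is a minimal sketch of mini-batch stochastic gradient descent minimizing a mean squared error loss for a one-dimensional linear model. The synthetic data (generated from y = 3x + 1), the learning rate and the batch size are assumptions chosen for the example; it uses NumPy rather than PyTorch so the gradient computation stays visible.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data from y = 3x + 1 plus a little noise (made-up values).
x = rng.uniform(-1.0, 1.0, size=200)
y = 3.0 * x + 1.0 + 0.01 * rng.normal(size=200)

w, b = 0.0, 0.0      # model parameters, initialised at zero
lr = 0.1             # learning rate (assumed value)
batch_size = 20      # mini-batch size (assumed value)

def mse(w, b):
    """Mean squared error of the model w*x + b on the full dataset."""
    return float(np.mean((w * x + b - y) ** 2))

initial_loss = mse(w, b)
for epoch in range(200):
    order = rng.permutation(len(x))              # reshuffle every epoch
    for start in range(0, len(x), batch_size):
        idx = order[start:start + batch_size]
        err = w * x[idx] + b - y[idx]            # residuals on the mini-batch
        grad_w = 2.0 * np.mean(err * x[idx])     # d(MSE)/dw
        grad_b = 2.0 * np.mean(err)              # d(MSE)/db
        w -= lr * grad_w                         # gradient descent step
        b -= lr * grad_b
final_loss = mse(w, b)
```

After training, w and b should be close to the values used to generate the data, and the final loss should be close to the noise level.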

Backpropagation and neural networks, computational graphs and gradient estimation

Convolutional neural networks, CNN activation functions, data preprocessing, weight normalization, batch normalization, monitoring the learning process, hyperparameter optimization, regularization (Dropout, DropConnect, fractional max pooling, Cutout, Mixup)

CNN architectures (AlexNet, VGG, GoogLeNet, ResNet, DenseNet, SENet, EfficientNet), Siamese architectures (applications to face verification, people and vehicle re-identification)


Recurrent neural networks, Attention mechanisms

Object detection and segmentation

LAB - object detection - segmentation

Generative Models

Edges, feature matching, RANSAC and alignment
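
As a sketch of the RANSAC idea, the example below fits a line to 2D points contaminated by outliers: it repeatedly samples a minimal set (two points), counts the points that agree with the resulting line, and refits on the largest consensus set. The function name, the inlier threshold, the iteration count and the synthetic data are assumptions for illustration. In image alignment the same loop is typically run with a homography or similar transform in place of the line model.

```python
import numpy as np

def ransac_line(points, n_iters=200, threshold=0.15, seed=0):
    """Fit y = a*x + b with RANSAC; return (a, b, inlier_mask)."""
    rng = np.random.default_rng(seed)
    best_mask = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        # 1. Sample a minimal set: two distinct points define a line.
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if x1 == x2:
            continue  # vertical pair: cannot be written as y = a*x + b
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        # 2. Count the points whose vertical residual is below the threshold.
        residuals = np.abs(points[:, 1] - (a * points[:, 0] + b))
        mask = residuals < threshold
        # 3. Keep the hypothesis with the largest consensus set.
        if mask.sum() > best_mask.sum():
            best_mask = mask
    # 4. Refit on all inliers of the best hypothesis with least squares.
    a, b = np.polyfit(points[best_mask, 0], points[best_mask, 1], deg=1)
    return a, b, best_mask

# Synthetic data: 50 points near y = 2x + 1 plus 15 uniformly random outliers.
rng = np.random.default_rng(1)
xs = np.linspace(0.0, 10.0, 50)
inliers = np.column_stack([xs, 2.0 * xs + 1.0 + rng.normal(scale=0.02, size=50)])
outliers = rng.uniform(0.0, 20.0, size=(15, 2))
points = np.vstack([inliers, outliers])

a, b, mask = ransac_line(points)
```

A plain least-squares fit on all 65 points would be pulled towards the outliers, whereas the refit on the consensus set recovers a slope and intercept close to the generating values.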

Optical flow, 3D, Depth perception and stereo


Camera calibration - distortion models and compensation - linear methods for camera parameters. Calibration with a checkerboard.
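
As one example of a distortion model and its compensation, the sketch below applies a two-coefficient polynomial radial distortion to normalized image coordinates and then inverts it by fixed-point iteration. The coefficients k1 and k2, the sample points and the function names are assumptions chosen for illustration; a real calibration would estimate the coefficients from checkerboard images.

```python
import numpy as np

def distort_radial(xy, k1, k2):
    """Apply x_d = x * (1 + k1*r^2 + k2*r^4) to Nx2 normalized image points."""
    xy = np.asarray(xy, dtype=float)
    r2 = np.sum(xy ** 2, axis=1, keepdims=True)   # squared radius per point
    return xy * (1.0 + k1 * r2 + k2 * r2 ** 2)

def undistort_radial(xy_d, k1, k2, n_iters=20):
    """Invert the model by fixed-point iteration (it has no simple closed form)."""
    xy_d = np.asarray(xy_d, dtype=float)
    xy = xy_d.copy()                              # initial guess: distorted point
    for _ in range(n_iters):
        r2 = np.sum(xy ** 2, axis=1, keepdims=True)
        xy = xy_d / (1.0 + k1 * r2 + k2 * r2 ** 2)
    return xy

# Assumed coefficients (k1 < 0 gives barrel distortion) and sample points.
k1, k2 = -0.2, 0.05
pts = np.array([[0.0, 0.0], [0.3, -0.2], [0.5, 0.5]])

distorted = distort_radial(pts, k1, k2)
recovered = undistort_radial(distorted, k1, k2)
```

The point at the optical centre is unaffected, points farther from the centre are displaced more, and the iteration recovers the original coordinates for mild distortion such as this.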

LAB camera calibration

3D shapes 


There is no requirement to buy a book. The course aims to be self-contained, but sections from the following textbooks will be suggested for a more formal treatment and further information.

The primary course text will be Rick Szeliski's draft of Computer Vision: Algorithms and Applications, 2nd Edition, 2022; we will use an online copy (available after filling in the form) at this link

We will be using Piazza for all course notes, homework and the final project.

A copy and a link will be provided on the course website.

A textbook on deep learning with PyTorch code can be accessed at this link

Ian Goodfellow, Yoshua Bengio and Aaron Courville, Deep Learning, MIT Press




Second Semester (from 01/03/2022 to 10/06/2022)

Exam type

Type of assessment
Oral - Final grade

Course timetable
