Computer Vision

CMPT 412 / 462, Fall 2024, with full online support

TAs: S. Mahdi H. Miangoleh, Khaled Al Butainy

Mon 10:30 – 12:20; Wed 10:30 – 11:20; Fri 12:30 – 14:20

Computer vision is the process of automatically extracting information from images and videos. Building up from introductory image processing and computer vision background, this advanced course covers deep neural architectures for image classification, object detection, image segmentation, metric learning, and other image/video understanding tasks; as well as 3D Computer Vision, including camera models, two-view geometry, stereo reconstruction, and camera pose estimation.

Attendance

Attendance at lectures, with in-person and Zoom options, is mandatory for the entire semester. Attendance will be checked at random times, 5 to 8 times in total during the semester. Students who are absent for more than 1 of these attendance checks will receive an NA grade.

Pre-reqs

It’s a good idea for most of you to brush up on your linear algebra skills as soon as possible to get the most out of this class. 3Blue1Brown has an introductory linear algebra series on YouTube with great visual explanations of concepts that we will use during the class.

This course builds on CMPT 361. A refresher on the camera model and image processing fundamentals is advised before taking this course. You can check out this playlist of CV-related lectures from CMPT 361 - Intro. Visual Computing here.

COVID-19 Policy

In order to have a safe learning environment for everyone, we have several guidelines for in-person lectures:
- If you are feeling sick or you suspect you might have contracted COVID-19 or flu, please do not attend the in-person lectures and instead join the Zoom sessions.
- If you have contracted COVID-19, please do not attend the in-person lectures for 2 weeks after your initial diagnosis. Please join the Zoom sessions instead.

Grading

Programming assignments - 4 x 25% = 100%

Tentative Schedule

Weeks 1-3: CNNs, classification
Weeks 4-6: Stereo, Multi-view, SfM
Weeks 7-8: Segmentation
Weeks 9-11: Metric learning, RNNs, Transformers, GANs
Weeks 12-13: NeRFs, current research directions

Programming assignments

There will be 4 programming assignments.

Assignment 1: Digit recognition with convolutional neural networks
Deadline: Sep 29, 23:59

Assignment 2: Deep learning with PyTorch
Deadline: Oct 20, 23:59

Assignment 3: 3D Reconstruction
Deadline: Nov 10, 23:59

Assignment 4: Object Detection, Semantic Segmentation, and Instance Segmentation
Deadline: Dec 8, 23:59

Textbook

There is no required textbook for the course. One useful resource that is also available online for free is the textbook Computer Vision: Algorithms and Applications by Richard Szeliski. There are many other resources you can find online, and don't forget that Wikipedia is always your friend.

Announcements, Questions and Discussion

Our main medium of communication is the CourSys discussion forum.

Academic Integrity

You are encouraged to discuss coding assignments and projects with your classmates. You are allowed to use existing code or libraries (e.g., an optimization library or a vector calculus library), in which case you must explicitly describe what you used in your report. Apart from this case, every single line of code must be written by you, and you are not allowed to copy from other sources. Writing code by exactly or closely following existing code is not technically copy-and-paste, but it is also considered to be copy-and-paste. Use your fair judgement. You know what is good and bad. When in doubt, consult the instructor. You are expected to maintain the highest standards of academic integrity and refrain from all forms of misconduct.