Course Description

This advanced seminar focuses on recent research on transformer models for visual recognition. The course consists of research paper presentations and a semester-long project. Topics include vision transformers, MLP-based models, self-supervised learning, multi-modal learning, and various image- and video-based applications. A background in deep learning is required.

Administrative Information

Instructor: Gedas Bertasius
Time: Mon & Wed 11 am - 12:15 pm
Location: FB 009
Office Hours: by appointment
Canvas Site: https://uncch.instructure.com/courses/49024

Grading

Class Participation: 10%
Paper Critiques: 20%
Paper Presentations: 30%
Course Project: 40%

Course Policies

Class Participation: Please come to class prepared to discuss the assigned papers with your peers. Please do not discuss the papers with your peers before class; I am interested in hearing your own opinions about them.

Late Submissions: The class is structured around a tight paper presentation schedule. Therefore, late assignments will not be accepted.

Academic Integrity: For your presentations and projects, you are allowed to use materials from external sources. However, you must clearly acknowledge those sources.
