Course Description

This advanced seminar covers recent research on transformer models for visual recognition. The course consists of research paper presentations and a semester-long course project. Topics include vision transformers, MLP-based models, self-supervised learning, multi-modal learning, and a range of image- and video-based applications. A background in deep learning is required.

Administrative Information


  • Class Participation: 10%

  • Paper Critiques: 20%

  • Paper Presentations: 30%

  • Course Project: 40%

Course Policies

  • Class Participation: Please come to class prepared to discuss the assigned papers with your peers. Please do not discuss the papers with your peers before class; I'm interested in hearing your own opinions about them.

  • Late Submissions: The class is structured around a tight paper presentation schedule, so late assignments will not be accepted.

  • Academic Integrity: You may use materials from external sources in your presentations and projects, but you must clearly acknowledge those sources.
