Gedas Bertasius

Assistant Professor

I am an Assistant Professor in the Computer Science department at the University of North Carolina, Chapel Hill. Before joining UNC, I was a postdoctoral researcher at Facebook AI Research (FAIR) working with Lorenzo Torresani. I finished my Ph.D. at the University of Pennsylvania, advised by Jianbo Shi, and my undergraduate degree at Dartmouth College.

Research

I lead the Multimodal Video Perception (MVP) group at UNC. We develop foundational models for multimodal video understanding, enabling machines to comprehend, reason about, and interact with complex video, audio, and language data. Moving beyond perception, we ask: what spatiotemporal abstractions are needed for AI to truly grasp complex human behaviors over long horizons? Representative projects include TimeSformer, Video ReCap, LLoVi, BIMBA, and VideoTree.

Video Recognition

Developing spatiotemporal models for automatic video analysis (e.g., TimeSformer, ViS4mer).

Multimodal AI

Building models that can learn from video, audio, and text (e.g., Video ReCap, LLoVi, BIMBA).

Perceptual Assistants

Developing AI models that can assist people with daily tasks and skill learning (e.g., VidAssist, Ego-Exo4D, and ExAct).

Sports Analytics

Elevating strategic insights using state-of-the-art multimodal video models (e.g., BASKET).

Video for Robotics

Translating visual inputs into effective real-world actions (e.g., BOSS, ReBot, and ARCADE).

Generative Video Modeling

Enabling applications such as video-to-music generation and audio-visual editing (e.g., VMAs, AvED, and V2M-Zero).

Recent News

March 6, 2026

Gave a keynote talk "From Perception to Agency in Strategic Video Understanding" at the WACV 2026 Computer Vision for Winter Sports workshop held in Tucson, Arizona.