
Learning to Retrieve Videos by Asking Questions
Avinash Madasu, Junier Oliva, Gedas Bertasius
[arxiv]

ECLIPSE: Efficient Long-range Video Retrieval using Sight and Sound
Yan-Bo Lin, Jie Lei, Mohit Bansal, Gedas Bertasius
[arxiv]

TALLFormer: Temporal Action Localization with Long-memory Transformer
Feng Cheng, Gedas Bertasius

Long Movie Clip Classification with State-Space Video Models
Md Mohaiminul Islam, Gedas Bertasius
[arxiv]

Learning To Recognize Procedural Activities with Distant Supervision
Xudong Lin, Fabio Petroni, Gedas Bertasius, Marcus Rohrbach, Shih-Fu Chang, Lorenzo Torresani
CVPR 2022

Long-Short Temporal Contrastive Learning of Video Transformers
Jue Wang, Gedas Bertasius, Du Tran, Lorenzo Torresani
CVPR 2022

Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius, Heng Wang, Lorenzo Torresani
ICML 2021
[arxiv] [code] [talk] [slides] [Facebook AI Blog] [VentureBeat] [SiliconAngle] [bibtex]

Vx2Text: End-to-End Learning of Video-Based Text Generation from Multimodal Inputs
Xudong Lin, Gedas Bertasius, Jue Wang, Shih-Fu Chang, Devi Parikh, Lorenzo Torresani
CVPR 2021
[arxiv] [VentureBeat] [bibtex]

Supervoxel Attention Graphs for Long-Range Video Modeling
Yang Wang, Gedas Bertasius, Tae-Hyun Oh, Abhinav Gupta, Minh Hoai, Lorenzo Torresani
WACV 2021

COBE: Contextualized Object Embeddings from Narrated Instructional Video
Gedas Bertasius, Lorenzo Torresani
NeurIPS 2020
[arxiv] [talk] [slides] [HowTo100M_BB pseudo annotations] [bibtex]

Attentive Action and Context Factorization
Yang Wang, Vinh Tran, Gedas Bertasius, Lorenzo Torresani, Minh Hoai
BMVC 2020
[arxiv]

Classifying, Segmenting, and Tracking Object Instances in Video with Mask Propagation
Gedas Bertasius, Lorenzo Torresani
CVPR 2020 (Best Paper Nominee)
Ranked 1st on YouTube-VIS Leaderboard and in the EPIC-Kitchens Detection Challenge.
[arxiv] [talk] [slides] [bibtex]

Learning Temporal Pose Estimation from Sparsely-Labeled Videos
Gedas Bertasius, Christoph Feichtenhofer, Du Tran, Jianbo Shi, Lorenzo Torresani
NeurIPS 2019
Ranked 1st on the PoseTrack Leaderboard for multi-frame pose estimation.
[arxiv] [poster] [code] [bibtex]

Object Detection in Video with Spatiotemporal Sampling Networks
Gedas Bertasius, Lorenzo Torresani, Jianbo Shi
ECCV 2018
[arxiv] [results] [bibtex]

Egocentric Basketball Motion Planning from a Single First-Person Image
Gedas Bertasius, Aaron Chan, Jianbo Shi
CVPR 2018
[arxiv] [results] [MIT SSAC Poster] [bibtex]

Am I a Baller? Basketball Performance Assessment from First-Person Videos
Gedas Bertasius, Stella X. Yu, Hyun Soo Park, Jianbo Shi
ICCV 2017
[arxiv] [results] [bibtex]

Unsupervised Learning of Important Objects from First-Person Videos
Gedas Bertasius, Hyun Soo Park, Stella X. Yu, Jianbo Shi
ICCV 2017
[arxiv] [bibtex]

Convolutional Random Walk Networks for Semantic Image Segmentation
Gedas Bertasius, Lorenzo Torresani, Stella X. Yu, Jianbo Shi
CVPR 2017
[arxiv] [bibtex]

First-Person Action-Object Detection with EgoNet
Gedas Bertasius, Hyun Soo Park, Stella X. Yu, Jianbo Shi
RSS 2017
[arxiv] [New Scientist Article] [Impact Article] [results] [bibtex]

Local Perturb-and-MAP for Structured Prediction
Gedas Bertasius, Qiang Liu, Lorenzo Torresani, Jianbo Shi
AISTATS 2017
[arxiv] [bibtex]

Semantic Segmentation with Boundary Neural Fields
Gedas Bertasius, Jianbo Shi, Lorenzo Torresani
CVPR 2016
[arxiv] [code] [bibtex]

High-for-Low, Low-for-High: Efficient Boundary Detection from Deep Object Features and its Applications to High-Level Vision
Gedas Bertasius, Jianbo Shi, Lorenzo Torresani
ICCV 2015
[arxiv] [code] [bibtex]

DeepEdge: A Multi-Scale Bifurcated Deep Network for Top-Down Contour Detection
Gedas Bertasius, Jianbo Shi, Lorenzo Torresani
CVPR 2015
[arxiv] [bibtex]