Learning to Retrieve Videos by Asking Questions

Avinash Madasu, Junier Oliva, Gedas Bertasius

ACM Multimedia 2022

[arxiv] [bibtex]

ECLIPSE: Efficient Long-range Video Retrieval using Sight and Sound

Yan-Bo Lin, Jie Lei, Mohit Bansal, Gedas Bertasius

ECCV 2022 (Oral)

[arxiv] [code] [project page] [bibtex]

Long Movie Clip Classification with State-Space Video Models

Md Mohaiminul Islam, Gedas Bertasius

ECCV 2022

[arxiv] [code] [bibtex]

Learning To Recognize Procedural Activities with Distant Supervision

Xudong Lin, Fabio Petroni, Gedas Bertasius, Marcus Rohrbach, Shih-Fu Chang, Lorenzo Torresani

CVPR 2022

[arxiv] [code] [project page] [bibtex]

Long-Short Temporal Contrastive Learning of Video Transformers

Jue Wang, Gedas Bertasius, Du Tran, Lorenzo Torresani

CVPR 2022

[arxiv] [bibtex]

Vx2Text: End-to-End Learning of Video-Based Text Generation from Multimodal Inputs

Xudong Lin, Gedas Bertasius, Jue Wang, Shih-Fu Chang, Devi Parikh, Lorenzo Torresani

CVPR 2021

[arxiv] [VentureBeat] [bibtex]

Supervoxel Attention Graphs for Long-Range Video Modeling

Yang Wang, Gedas Bertasius, Tae-Hyun Oh, Abhinav Gupta, Minh Hoai, Lorenzo Torresani

WACV 2021

Attentive Action and Context Factorization
Yang Wang, Vinh Tran, Gedas Bertasius, Lorenzo Torresani, Minh Hoai
BMVC 2020


Classifying, Segmenting, and Tracking Object Instances in Video with Mask Propagation
Gedas Bertasius, Lorenzo Torresani
CVPR 2020 (Best Paper Nominee)
Ranked 1st on YouTube-VIS Leaderboard and in the EPIC-Kitchens Detection Challenge.
[arxiv] [talk] [slides] [bibtex]

Learning Temporal Pose Estimation from Sparsely-Labeled Videos
Gedas Bertasius, Christoph Feichtenhofer, Du Tran, Jianbo Shi, Lorenzo Torresani
NeurIPS 2019
Ranked 1st on PoseTrack Leaderboard for multi-frame pose estimation.
[arxiv] [poster] [code] [bibtex]

Object Detection in Video with Spatiotemporal Sampling Networks
Gedas Bertasius, Lorenzo Torresani and Jianbo Shi
ECCV 2018
[arxiv] [results] [bibtex]


Am I a Baller? Basketball Performance Assessment from First-Person Videos
Gedas Bertasius, Stella X. Yu, Hyun Soo Park and Jianbo Shi
ICCV 2017
[arxiv] [results] [bibtex]

Unsupervised Learning of Important Objects from First-Person Videos
Gedas Bertasius, Hyun Soo Park, Stella X. Yu and Jianbo Shi
ICCV 2017
[arxiv] [bibtex]

Convolutional Random Walk Networks for Semantic Image Segmentation
Gedas Bertasius, Lorenzo Torresani, Stella X. Yu and Jianbo Shi
CVPR 2017
[arxiv] [bibtex]

First-Person Action-Object Detection with EgoNet
Gedas Bertasius, Hyun Soo Park, Stella X. Yu, and Jianbo Shi
RSS 2017
[arxiv] [New Scientist Article] [Impact Article] [results] [bibtex]

Local Perturb-and-MAP for Structured Prediction
Gedas Bertasius, Qiang Liu, Lorenzo Torresani, and Jianbo Shi
[arxiv] [bibtex]

Semantic Segmentation with Boundary Neural Fields
Gedas Bertasius, Jianbo Shi and Lorenzo Torresani
CVPR 2016
[arxiv] [code] [bibtex]

DeepEdge: A Multi-Scale Bifurcated Deep Network for Top-Down Contour Detection
Gedas Bertasius, Jianbo Shi, and Lorenzo Torresani
CVPR 2015
[arxiv] [bibtex]