CVPR 2021

2nd Comprehensive Tutorial on Video Modeling

Recorded videos can be found below.


June 20th, 2021, 11am to 4pm (all times in Pacific Time, California)




This is the 2nd tutorial on video modeling organized by Amazon AWS. Our target audience includes students, researchers, and engineers who are interested in learning about recent advances in video modeling, conducting research, and applying these advances to real-world problems.

If you are interested in the basics of video action recognition and the edge deployment of video algorithms, please check out our [1st Comprehensive Tutorial on Video Modeling] at CVPR 2020.


11:00 - 11:40 : Do you see what I see? Large-scale learning from multimodal videos by Cordelia Schmid [YouTube] [Bilibili]

11:40 - 12:20 : Multimodal Learning from Videos by Chuang Gan [YouTube] [Bilibili]

12:20 - 13:00 : Video-Modelling for Fine-Grained Understanding by Dima Damen [YouTube] [Bilibili]

13:00 - 13:20 : Panel discussion

13:20 - 13:30 : Break

13:30 - 14:10 : Representing Longer Videos - TokenLearner by Michael S. Ryoo [YouTube] [Bilibili]

14:10 - 14:50 : First-Person Video and Introducing the Ego4D Dataset by Kristen Grauman [YouTube] [Bilibili]

14:50 - 15:30 : Leveraging Motion in Videos by Joseph Tighe [YouTube] [Bilibili]

15:30 - 16:10 : Efficient and Compositional Human Event Understanding by Juan Carlos Niebles [YouTube] [Bilibili]


For offline Q&A, please post your questions in the Google Doc.



Please contact Yi Zhu if you have any questions.

Special thanks to Xueqing Deng and Yuxin Tian for helping with the video recording!