ICCV 2021 Tutorial: Multi-Modality Learning from Videos and Beyond

Location: Zoom
In this tutorial, we cover many aspects of multi-modality learning in vision, such as learning from language, audio, video, wireless signals, and touch. Our target audience includes students, researchers, and engineers who are interested in learning about recent advances in multi-modality learning, conducting research in the area, and applying these techniques to real-world problems.
If you are interested in video action recognition basics and edge deployment of video algorithms, please check out our [1st Comprehensive Tutorial on Video Modeling] at CVPR 2020.
If you are interested in more recent research in video understanding, please check out our [2nd Comprehensive Tutorial on Video Modeling] at CVPR 2021.
14:00 - 14:40 : Computer Perception with Perceivers by João Carreira [YouTube] [Bilibili]
14:40 - 15:20 : Neuro-Symbolic Dynamic Visual Reasoning by Chuang Gan [YouTube] [Bilibili]
15:20 - 16:00 : Video, Multimodality & Similarity by Cees Snoek [YouTube] [Bilibili]
16:00 - 16:10 : Break
16:10 - 16:50 : Computer Vision with Sight, Sound, and Speech by Lorenzo Torresani [YouTube] [Bilibili]
16:50 - 17:30 : Knowledgeable and Spatial-Temporal Vision+Language by Mohit Bansal [YouTube] [Bilibili]
17:30 - 18:10 : Learning the Physical World from High-Resolution Tactile Sensing by Wenzhen Yuan [YouTube] [Bilibili]
For offline Q&A, please post your questions to the Google Doc.
Please contact Yi Zhu if you have any questions.