Multisensory Machine Intelligence

Talk
Ruohan Gao
Time: 03.09.2023 11:00 to 12:00

The future of Artificial Intelligence demands a paradigm shift towards multisensory perception: towards systems that can digest ongoing multisensory observations, discover structure in unlabeled raw sensory data, and intelligently fuse useful information from different sensory modalities for decision making. While we humans perceive the world by looking, listening, touching, smelling, and tasting, traditional forms of machine intelligence mostly focus on a single sensory modality, particularly vision. My research aims to teach machines to see, hear, and feel as humans do, so that they can perceive, understand, and interact with the multisensory world. In this talk, I will present my research on multisensory machine intelligence, which studies two important aspects of the multisensory world: 1) multisensory objects, and 2) multisensory space. For both aspects, I will talk about how I design systems to reliably capture multisensory data, how I model that data effectively with new differentiable simulation algorithms and deep learning models, and how I explore creative cross-modal and multi-modal applications with sight, sound, and touch. I will conclude with my future plans.