Seeing is not always believing—at least not for Ph.D. alumnus Ruohan Gao, whose research in machine learning uses senses beyond just sight to inform artificial intelligence.
Gao’s research investigates how algorithms locate and understand sound-emitting objects when multiple sound sources are present. While conventional approaches to object identification in unsupervised machine learning rely solely on visual cues, his work leverages audio as a semantic signal to disentangle object sounds in unlabeled video.