“When you run your finger across a wine glass, the sound it makes reflects how much liquid is in it,” says CSAIL PhD student Andrew Owens, who was lead author on the paper. “An algorithm that models such sounds can reveal key information about objects’ shapes and material types, as well as the force and motion of their interactions with the world.”

The team used techniques from the field of “deep learning,” which involves teaching computers to sift through huge amounts of data to find patterns on their own. Deep learning approaches are especially useful because they free computer scientists from having to hand-design algorithms and supervise their progress. The first step to training a sound-producing algorithm is to give it sounds to study.

The project represents much more than just a clever computer trick: researchers envision future versions of similar algorithms being used to automatically produce sound effects for movies and TV shows, as well as to help robots better understand objects’ properties.

The paper will be presented later this month at the annual Conference on Computer Vision and Pattern Recognition (CVPR) in Las Vegas. The paper’s co-authors include recent PhD graduate Phillip Isola and MIT professors Edward Adelson, Bill Freeman, Josh McDermott, and Antonio Torralba.