An instructor of mine once compared mixing a song to baking a cake. The various tracks are the ingredients, and once everything is mixed and baked, as with a cake, those ingredients can’t be deconstructed. Or can they? AI researchers at MIT’s Computer Science and Artificial Intelligence Laboratory have created an app that can isolate an individual instrument’s performance in a video when the user simply clicks on that instrument.
Using a deep neural network trained by analyzing over 60 hours of video of musicians playing instruments, the software can identify over 20 different instruments without being told what they are, and then automatically isolate the sound of one of them from the video’s audio. All the user has to do is click on the instrument they want to hear, a task that would otherwise require hours of processing and tweaking by a sound engineer.
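For the curious, the trick in work like this is usually a self-supervised “mix-and-separate” setup: mix the audio of two videos, then train a network to predict a spectrogram mask that recovers each source, conditioned on visual features of the selected region. Here is a minimal PyTorch sketch of that idea; the module shapes, names, and toy training step are illustrative assumptions, not MIT’s published architecture.

```python
# Illustrative sketch of a mix-and-separate training step. All shapes,
# layer choices, and names here are assumptions for demonstration only.
import torch
import torch.nn as nn

class MaskPredictor(nn.Module):
    """Predicts a spectrogram mask from the mixture spectrogram plus a
    visual feature vector for the clicked instrument/region."""
    def __init__(self, vis_dim=32):
        super().__init__()
        # Toy stand-in for the audio analysis network (often a U-Net).
        self.audio_net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, vis_dim, 3, padding=1),
        )

    def forward(self, mix_spec, vis_feat):
        # mix_spec: (B, 1, F, T) magnitude spectrogram of the mixture
        # vis_feat: (B, vis_dim) visual embedding of the clicked region
        a = self.audio_net(mix_spec)                # (B, C, F, T)
        w = vis_feat[:, :, None, None]              # (B, C, 1, 1)
        # Weight audio channels by the visual features, then squash to a mask.
        mask = torch.sigmoid((a * w).sum(dim=1, keepdim=True))  # (B, 1, F, T)
        return mask

model = MaskPredictor()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Random tensors stand in for real spectrograms and visual features.
spec_a = torch.rand(4, 1, 256, 64)   # source A (e.g. trumpet)
spec_b = torch.rand(4, 1, 256, 64)   # source B (e.g. piano)
vis_a = torch.randn(4, 32)           # visual features for A's pixels

mixture = spec_a + spec_b            # the supervision-free "mix" step
mask = model(mixture, vis_a)
# "Separate": the masked mixture should recover source A on its own.
loss = nn.functional.mse_loss(mask * mixture, spec_a)

opt.zero_grad()
loss.backward()
opt.step()
```

Predicting a mask over the mixture’s spectrogram, rather than synthesizing audio from scratch, is a common design choice in source-separation work: the network only has to decide which time-frequency bins belong to the clicked instrument, and the original audio supplies everything else.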
The CSAIL researchers suggest that as the software improves, and learns to tell the difference between instruments in the same family, it could become a vital tool for remixing and remastering older performances where the original multi-track recordings no longer exist. For example, the sound of a trumpet could be boosted while a piano is turned down, improving the overall mix years after a performance was first recorded. Or, musicians still learning an instrument could easily focus on the specific part of a song they’re trying to master.
The software also has the potential to revolutionize remixing songs and creating mashups, though that’s probably not an application MIT wants to promote at this point. Still, being able to simply click and extract a specific instrument’s performance is a feature plenty of remix artists would love to add to their toolkits.
[MIT]