Sleepy Mech

ITP Thesis 2024. Developed with Alina Liu.

Sleepy Mech is something between a toy and an instrument, developed as my thesis project at ITP. The concept for this project arose from work I was doing in another course to create ML-powered stems for various parts of existing songs. I found the output of the work to be fascinating, but the process really annoying, leading me to wonder if there was a way to bring ML audio even closer to musicians and learners, and actually pull it off of a computer entirely.

The resulting project combines the technical capabilities of a looper and the aesthetics of vinyl toys with the pioneering sounds of machine learning style-transfer models to create an all-in-one interface. I developed all of the software and hardware for this project, along with a suite of custom machine learning models to use with it. 3D design was handled by Alina, and some graphics were designed by Proud Aiemkrusa.



To develop the software, I used RAVE, a variational autoencoder style-transfer framework that allows transfer of timbral and rhythmic features of an audio signal from one source to another. It doesn’t make a lot of sense just reading about it, so it’s probably best to hear (and see) for yourself.



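For readers who think in code, the encode/decode round trip behind this kind of timbre transfer can be sketched roughly as below. This is a minimal illustration, not the code running inside Sleepy Mech: `TimbreModel`, `transfer`, and `HalfGain` are hypothetical names, and the toy model only scales numbers so the sketch runs without PyTorch or a trained model. The commented lines at the end assume a RAVE model exported as a TorchScript file with `encode` and `decode` methods; the file name there is made up.

```python
from typing import Any, Protocol


class TimbreModel(Protocol):
    """Anything with RAVE-style encode/decode methods (an assumption of this sketch)."""

    def encode(self, audio: Any) -> Any: ...

    def decode(self, latent: Any) -> Any: ...


def transfer(model: TimbreModel, audio: Any) -> Any:
    """Compress the input into latents, then resynthesize it in the model's timbre."""
    latent = model.encode(audio)  # input audio -> compact latent representation
    return model.decode(latent)   # latent representation -> audio in the trained timbre


class HalfGain:
    """Toy stand-in for a trained model: 'encoding' just halves each sample."""

    def encode(self, audio):
        return [0.5 * sample for sample in audio]

    def decode(self, latent):
        return list(latent)


# With a real exported model, usage might look like this (hypothetical file name):
#   import torch
#   model = torch.jit.load("sh101.ts")
#   voice = torch.zeros(1, 1, 2 ** 16)  # assumed layout: (batch, channel, samples)
#   out = transfer(model, voice)
```

The point of the sketch is that the input audio is never edited directly: it is squeezed into a latent representation and then regenerated by a decoder that has only ever learned to produce one kind of sound.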
Essentially, the models can take your voice (for example, beatboxing) and turn it into another instrument (for example, a bass guitar, darbouka, NASA sounds, artificial voices, or anything else). The only constraint is that the models work better the more similar the input audio is to their training data, which makes sense. Here’s a look inside:




The biggest challenge here was getting all of this tech safely housed inside the body! Also, RFID tags love to wear out, which was another source of “fun”.

I also developed a suite of machine learning models for this project, which I trained on electronic instruments commonly used in the composition of techno and house songs. I always found these instruments' interfaces very unintuitive, so I attempted to make them controllable by the human voice through this project. Here are the models I made:



Model 1: SH-101
• ~ 10 hours of SH-101 recordings


Model 2: TB-303
• ~ 3 hours of acid basslines



    
Model 3: LinnDrum
• ~ 4 hours of LinnDrum MIDI



          
Model 4: TR-08
• ~ 4 hours of TR-08 MIDI



            

All of these models are available to download for free; just send me an email :)

In addition, I included open-source models from Acids-Ircam and Intelligent Instruments Lab. Here are the ones I used, along with some amazing art Proud made:

This is by no means a finished product, but I enjoyed exploring this space and am continuing to develop it. Here are some process photos and a video of me presenting my thesis. You can also read this blog if you want to know more about the development process.





     ︎︎