Sleepy Mech
ITP Thesis 2024. Developed With Alina Liu
Sleepy Mech is something between a toy and an instrument, developed as my thesis project at ITP. The concept for this project arose from work I was doing in another course to create ML-powered stems for various parts of existing songs. I found the output of the work fascinating but the process really annoying, which led me to wonder if there was a way to bring ML audio even closer to musicians and learners, and actually take it off the computer entirely.
The resulting project combines the technical capabilities of a looper and the aesthetics of vinyl toys with the pioneering sounds of machine learning style-transfer models to create an all-in-one interface. I developed all of the software and hardware for this project, along with a suite of custom machine learning models to use with it. 3D design was handled by Alina, and some graphics were designed by Proud Aiemkrusa.

To develop the software, I used RAVE, a variational autoencoder framework for style transfer that allows the timbral and rhythmic features of an audio signal to be transferred from one source to another. It doesn’t make a lot of sense just reading about it, so it’s probably best to hear (and see) for yourself.
Essentially, the models can take your voice (for example, beatboxing) and turn it into another instrument (for example, a bass guitar, darbouka, NASA sounds, artificial voices, or anything else). The only constraint is that the models work better the more similar the input audio is to their training data, which makes sense. Here’s a look inside:


The biggest challenge here was getting all of this tech safely housed inside the body! Also, RFID tags love to wear out, which was another source of “fun”.
I also developed a suite of machine learning models for this project, which I trained on electronic instruments commonly used in the composition of techno and house songs. I always found the interfaces of these instruments very unintuitive, so I attempted to make them controllable by the human voice through this project. Here are the models I made:
Model 1: SH-101
• ~ 10 hours of SH-101 recordings

Model 2: TB-303
• ~ 3 hours of acid basslines

Model 3: LinnDrum
• ~ 4 hours of LinnDrum MIDI

Model 4: TR-08
• ~ 4 hours of TR-08 MIDI

All of these models are available for free download; just send me an email :)
In addition, I included open-source models from Acids-Ircam and Intelligent Instruments Lab. Here are the ones I used, along with some amazing art Proud made:

