Yesterday’s surfing churned up a couple of interesting links on the subject of speech synthesis and computer singing. I wasn’t explicitly looking for this stuff, and can’t reconstruct what led me to it, but I thought I’d archive the links here. The first is a link to a formant-based speech synthesizer in just 150 lines of code. The quality is, well, not amazing, but the code is simple enough to follow, and with a little explanation from Tom it helped me get at least a basic grasp of formant synthesis.
They have a video too!
Apparently this is derived from Cantarino, a project to do speech synthesis on the Arduino. It can sing too, and a bit better to my ear.
Cantarino — Speech synthesis on the Arduino
Arduino sings ‘Daisy Bell’ from Tinker on Vimeo.
Both of these synthesizers have difficulty rendering consonants clearly, but it is fascinating that such a simple technique can approach intelligible speech at all. Bookmarked for future tinkering.
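As a note to self for that tinkering, here is a minimal sketch of the general idea behind formant synthesis. It is not the code from either project above, just the textbook approach as I understand it: a train of pulses at the voice pitch is fed through a few two-pole resonators, each tuned to one formant frequency. The names `resonator`, `vowel`, and `AH` are my own, and the formant frequencies and bandwidths for the "ah" vowel are rough illustrative values, not measurements.

```python
import math

def resonator(signal, freq, bandwidth, rate):
    """Two-pole digital resonator: boosts energy near `freq` (Hz)."""
    c = -math.exp(-2.0 * math.pi * bandwidth / rate)
    b = (2.0 * math.exp(-math.pi * bandwidth / rate)
         * math.cos(2.0 * math.pi * freq / rate))
    a = 1.0 - b - c  # normalizes the gain at 0 Hz to 1
    out = []
    y1 = y2 = 0.0  # the two previous output samples
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def vowel(formants, pitch=110.0, seconds=0.5, rate=8000):
    """Return samples in [-1, 1] for a steady vowel."""
    n = int(seconds * rate)
    period = int(rate / pitch)
    # Crude glottal source: one unit pulse per pitch period.
    signal = [1.0 if i % period == 0 else 0.0 for i in range(n)]
    # Run the source through one resonator per formant, in series.
    for freq, bandwidth in formants:
        signal = resonator(signal, freq, bandwidth, rate)
    peak = max(abs(s) for s in signal)
    return [s / peak for s in signal]

# Assumed (frequency, bandwidth) pairs in Hz for a rough "ah" sound.
AH = [(700.0, 90.0), (1200.0, 110.0), (2600.0, 170.0)]
samples = vowel(AH)
```

To hear the result, the samples can be scaled to 16-bit integers and written out with Python's standard `wave` module. Changing the pitch over time while holding the formants steady is, roughly, what turns this kind of speech into singing. Consonants are harder because they need noise and fast transitions rather than a steady set of resonances, which fits with what both demos struggle with.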
On that first one, I thought “hey, this is actually pretty good!”, then I closed my eyes and realized that, even knowing the tune, reading the words had been doing a lot to fill in what I was hearing.
Relatedly, I just listened to Coverville Episode #726 which featured a Japanese “vocaloid” performance of an English song that I was pretty impressed by, at least from a technical standpoint.