Well, today I wasn’t feeling especially well, but I did manage to get a tiny bit of tinkering done. My idea was to implement the framework necessary for playing with a vocoder as a first step toward creating some more robotic voice filters and experimenting with pitch shifting.
I noticed that FFTW, the Fast Fourier Transform library I have used in the past, has undergone a major upgrade since the last time I tinkered with this stuff, so I spent some time reading the documentation. As a side benefit, I discovered that the GNU C compiler has extensions for handling complex arithmetic. Interesting.
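To give a flavor of that complex support, here is a minimal sketch, not code from the actual framework. The names `butterfly_t`, `butterfly` and `mag2` are made up for illustration. The ordinary `+`, `-` and `*` operators work directly on `double complex` values, and `__real__` / `__imag__` are the GNU accessor keywords (Clang accepts them too).

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical result type: the two outputs of one radix-2 FFT butterfly. */
typedef struct {
    double complex sum;
    double complex diff;
} butterfly_t;

/* One butterfly step: combine a and b using the twiddle factor w.
 * The complex multiply and add/subtract use plain C operators. */
butterfly_t butterfly(double complex a, double complex b, double complex w)
{
    double complex t = w * b;
    butterfly_t r = { a + t, a - t };
    return r;
}

/* Squared magnitude of a frequency bin, using the GNU
 * __real__ / __imag__ accessors instead of creal() / cimag(). */
double mag2(double complex z)
{
    double re = __real__ z;
    double im = __imag__ z;
    return re * re + im * im;
}
```

For example, `mag2(3.0 + 4.0 * I)` evaluates to 25.0, which is the kind of per-bin magnitude computation a vocoder ends up doing a lot of.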
I also dug around to find a library that allowed me to read and write WAV format sound files, and I settled on libsndfile. It allows a certain amount of data type transparency: you can read WAV files and get the data returned as shorts, floats or doubles. That simplifies things a bit.
So after tinkering for an hour, I have a simple framework in place that:
- opens a WAV file for reading
- creates a similar format WAV file for output
- reads in frames of 512 samples, which are overlapped by 25%
- windows the resulting buffers
- performs an FFT on the data
- (this is where all the good stuff will happen, then…)
- performs an inverse FFT on the frequency bins
- writes each output buffer to the output WAV file
So far, it just acts as an identity filter: the output WAV is pretty much a copy of the input, but all the data buffering and the like seems to work just fine. When I have a moment or two of clear thought, I’ll get to work on the good stuff.
Addendum: You can find the original Bell Labs paper describing the phase vocoder here.