Inigo Quilez
I'm not a sound coder or synth architect, but I would like to quickly explain a DSP issue that many of us amateurs have faced at least once while doing this kind of fun programming. It's about frequency modulation: something very simple in concept, yet something that I believe most people don't really understand.

The big misconception

Let's start with the well-known example of a simple cosine of constant frequency, say 440 Hz:

y(t) = cos( 2π*440*t )

where t is the time, which can be expressed for example as t=n/fs, where n is the current sample number and fs is the sampling frequency, say 44100 Hz.
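As a minimal sketch of this sampling (in Python; the function name tone and the one-second duration are illustrative choices, not something from the text), the constant tone can be generated sample by sample like this:

```python
import math

FS = 44100  # sampling frequency in Hz, as in the text


def tone(freq_hz, seconds, fs=FS):
    """Return samples of cos(2*pi*freq_hz*t), with t = n/fs."""
    count = int(seconds * fs)
    return [math.cos(2.0 * math.pi * freq_hz * (n / fs)) for n in range(count)]


# One second of the 440 Hz tone (illustrative duration).
samples = tone(440.0, 1.0)
```

Each entry of samples corresponds to one value of n, so one second of sound is simply fs consecutive evaluations of y(t).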

Now, imagine we want this tone to change frequency over time. Say we want the frequency to decrease linearly so that it reaches zero hertz after three seconds. Many people will then make the following mistake:

y(t) = cos( 2π*440*t*(1-t/3) )

hoping that at t=3 the 440 is cancelled out. Wrong!!! If you listen to it (or draw the graph) you will see that the signal actually reaches zero frequency at time t=1.5, and that by t=3 it has completely recovered its original frequency (although the waveform is mirrored around t=1.5).
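For those who would rather count than listen, here is a rough sketch that counts sign changes of the mistaken signal in three short windows. The window sizes and the helper name zero_crossings are arbitrary choices made for this illustration.

```python
import math

FS = 44100  # sampling frequency in Hz


def naive_fade(t):
    """The mistaken attempt: 440*(1 - t/3) placed in front of t."""
    return math.cos(2.0 * math.pi * 440.0 * t * (1.0 - t / 3.0))


def zero_crossings(signal, t_start, t_end, fs=FS):
    """Count sign changes of signal(t) sampled between t_start and t_end."""
    first = int(round(t_start * fs))
    last = int(round(t_end * fs))
    count = 0
    previous = signal(first / fs)
    for n in range(first + 1, last + 1):
        current = signal(n / fs)
        if (previous < 0.0) != (current < 0.0):
            count += 1
        previous = current
    return count


at_start = zero_crossings(naive_fade, 0.00, 0.10)   # the tone has just started
at_middle = zero_crossings(naive_fade, 1.45, 1.55)  # around t = 1.5
at_end = zero_crossings(naive_fade, 2.90, 3.00)     # where we hoped for silence

print(at_start, at_middle, at_end)
```

With these settings the window near t=3 should contain about as many crossings as the one near t=0, while the window around t=1.5 contains only a couple.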

Another example: we have all, at some point, tried to do

y(t) = cos( 2π*(440+40*cos(2π*t))*t )

in the belief that it would create a nice vibrato effect or something. But we get an explosion of beeps instead.

So, what's wrong again?

The explanation

The big misconception is that

y(t) = cos( 2π*440*t )

actually sounds at 440 Hz because the thing in front of "t" reads "2π*440", and that therefore anything placed in front of "t" will give the appropriate pitch to our tone, which obviously proves to be false. The truth is that the pitch of a cosine is given by the derivative of its argument.

So, let's write the tone above as

y(t) = cos(p(t)), p(t) = 2π*440*t

p(t) is the argument of the cosine, and its derivative is

dp(t)/dt = p'(t) = 2π*440

thus its pitch is 2764.6 radians per second, or 440 cycles per second (hertz). Now take the example of the fading sound we wanted to achieve: we were using

p(t) = 2π*440*t*(1-t/3)

meaning the pitch was

p'(t) = 2π*440*( 1-2*t/3 )

so we were clearly starting at a pitch of 440 Hz (t=0), reaching 0 at t=1.5 and ending at -440 Hz at t=3 (which sounds just like 440 Hz, since the cosine is an even function). What we wanted instead was

p'(t) = 2π*440*(1-t/3), therefore

p(t) = 2π*440*(t-t*t/6) = 2π*440*t*(1-t/6),

so the signal we were after is

y(t) = cos( 2π*440*t*(1-t/6) )
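The two derivatives above can be double-checked numerically. This is only a sketch: it estimates p'(t) with a central difference, and the step size h and the function names are assumptions made for this illustration.

```python
import math


def naive_phase(t):
    """Argument of the cosine in the mistaken fade: 2*pi*440*t*(1 - t/3)."""
    return 2.0 * math.pi * 440.0 * t * (1.0 - t / 3.0)


def fixed_phase(t):
    """Argument of the cosine in the corrected fade: 2*pi*440*t*(1 - t/6)."""
    return 2.0 * math.pi * 440.0 * t * (1.0 - t / 6.0)


def pitch_hz(phase, t, h=1e-5):
    """Estimate the pitch in Hz as p'(t)/(2*pi) with a central difference."""
    slope = (phase(t + h) - phase(t - h)) / (2.0 * h)
    return slope / (2.0 * math.pi)


for t in (0.0, 1.5, 3.0):
    naive = round(pitch_hz(naive_phase, t), 3)
    fixed = round(pitch_hz(fixed_phase, t), 3)
    print("t =", t, " naive:", naive, "Hz  fixed:", fixed, "Hz")
```

The naive phase should report roughly 440, 0 and -440 Hz at t=0, 1.5 and 3, while the corrected one should report roughly 440, 220 and 0 Hz, which is the linear fade we were after.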

For the vibrato effect, we want

p'(t) = 2π*(440 + 40*cos(2π*t) )

and therefore

p(t) = 2π*( 440*t + 40*sin(2π*t)/(2π) )

and this one does not explode, but does what it's supposed to do.
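Applying the same derivative rule to the naive vibrato hints at where the explosion of beeps comes from: the product rule leaves an extra term, -80π*t*sin(2π*t), whose size grows with t. The sketch below evaluates both pitch curves from formulas derived by hand; the function names and the 10 second range are illustrative assumptions.

```python
import math


def naive_pitch_hz(t):
    """p'(t)/(2*pi) for the naive p(t) = 2*pi*(440 + 40*cos(2*pi*t))*t.

    The product rule adds a term whose amplitude grows linearly with t.
    """
    w = 2.0 * math.pi * t
    return 440.0 + 40.0 * math.cos(w) - 80.0 * math.pi * t * math.sin(w)


def vibrato_pitch_hz(t):
    """p'(t)/(2*pi) for the corrected p(t) = 2*pi*(440*t + 40*sin(2*pi*t)/(2*pi))."""
    return 440.0 + 40.0 * math.cos(2.0 * math.pi * t)


times = [n / 1000.0 for n in range(10001)]  # first 10 seconds, 1 ms steps
naive = [naive_pitch_hz(t) for t in times]
fixed = [vibrato_pitch_hz(t) for t in times]

print("naive pitch range:", round(min(naive)), "to", round(max(naive)), "Hz")
print("fixed pitch range:", round(min(fixed)), "to", round(max(fixed)), "Hz")
```

The corrected pitch stays between 400 and 480 Hz, whereas the naive one swings by thousands of hertz within the first ten seconds and keeps getting worse.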


So, remember: the pitch of a cosine is given by the derivative of its argument, not by the expression in front of "t" in the argument. Some people know this as the famous "additive fm is better than multiplicative fm", which tells me they don't really know what's going on. Now you have the right interpretation of that sentence.
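When the pitch curve has no convenient closed-form integral, a common workaround (not something this text prescribes) is to build the argument of the cosine numerically, adding 2π*pitch/fs to a running phase at every sample. The sketch below assumes a simple rectangle-rule sum; the names synth, linear_fade and write_wav, and the file name fade.wav, are hypothetical.

```python
import math
import struct
import wave

FS = 44100  # sampling frequency in Hz


def synth(pitch_hz, seconds, fs=FS):
    """Render cos(p(t)), building p(t) by summing 2*pi*pitch_hz(t)/fs per sample.

    This approximates the integral of the pitch curve with a rectangle rule,
    so it is a numerical sketch rather than an exact formula.
    """
    phase = 0.0
    samples = []
    for n in range(int(seconds * fs)):
        samples.append(math.cos(phase))
        phase += 2.0 * math.pi * pitch_hz(n / fs) / fs
    return samples, phase


def linear_fade(t):
    """Pitch curve from the text: 440 Hz falling linearly to 0 Hz at t = 3."""
    return 440.0 * (1.0 - t / 3.0)


def write_wav(path, samples, fs=FS):
    """Store samples in [-1, 1] as a mono 16-bit PCM WAV file."""
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    with wave.open(path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)  # 2 bytes = 16 bits per sample
        out.setframerate(fs)
        out.writeframes(struct.pack("<%dh" % len(ints), *ints))


samples, final_phase = synth(linear_fade, 3.0)

# Closed-form argument from the text, evaluated at t = 3, for comparison.
exact_phase = 2.0 * math.pi * 440.0 * 3.0 * (1.0 - 3.0 / 6.0)

# write_wav("fade.wav", samples)  # hypothetical file name; uncomment to listen
```

For the linear fade, the accumulated phase should land within a few hundredths of a radian of the closed-form value, and any other pitch curve can be plugged in by passing a different function to synth.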