Discussion:
[music-dsp] Wavetable interpolation
ChordWizard Software
13 years ago
Permalink
Hi all,

I am working on a new project using PortAudio and testing it with a waveform stored in a buffer. This could be generated myself (sine, square, sawtooth, etc) or a more complex waveform loaded from a file.

I want to be able to render the waveform at different frequencies, but I can see that I will be limited by quantisation if I just play out the samples themselves.

For example, to generate a 440 Hz tone at a sample rate of 44100 Hz, I need 100.2272727 samples per cycle. This gets rounded to 100 samples, which then produces an actual tone of 441 Hz.
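The rounding arithmetic above can be written out as a quick check (an illustration, not code from the thread):

```python
# The frequency quantisation problem: playing an integer number of
# samples per cycle quantises the output frequency.
sample_rate = 44100
target_hz = 440.0

exact_period = sample_rate / target_hz   # 100.227... samples per cycle
rounded_period = round(exact_period)     # 100 samples
actual_hz = sample_rate / rounded_period

print(exact_period, rounded_period, actual_hz)  # 100.227..., 100, 441.0
```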

So it looks like I will be needing to interpolate between the samples in the wavetable. Can someone point me to some discussion or resources on this?

Simple linear interpolation looks easy enough to achieve, but does it introduce an unacceptable level of distortion? Is there a rule of thumb for the minimum number of samples per waveform to keep artifacts of this type undetectable?

Also, as an aside, is it general practice to calculate sample values with doubles (8 bytes) or are floats (4 bytes) generally adequate? I'm targeting CD-quality audio (shorts @ 2 bytes) so I'm not sure if the extra precision during calculations helps much.

Regards,

Stephen Clarke
Managing Director
ChordWizard Software Pty Ltd
***@chordwizard.com
http://www.chordwizard.com
ph: (+61) 2 4960 9520
fax: (+61) 2 4960 9580
Didier Dambrin
13 years ago
Permalink
If you oversample your waveforms (store larger, pre-filtered waveforms) you
can get away with linear interpolation.
..or use a more complex interpolation on smaller buffers, as these days you
may favor more instructions over more memory accesses.

& yes of course single floats are more than enough, even 16bit would be (but
you need floats for speed here).



-----Original Message-----
From: ChordWizard Software
Sent: Monday, May 07, 2012 11:45 PM
To: music-***@music.columbia.edu
Subject: [music-dsp] Wavetable interpolation

Phil Burk
13 years ago
Permalink
Hello Stephen,
Post by ChordWizard Software
Simple linear interpolation looks easy enough to achieve, but does it
introduce an unacceptable level of distortion?
Simple linear interpolation is fairly decent and is far better than not
doing any interpolation. But note that the further you stretch the pitch,
the more important the interpolation becomes. If you are stretching
a sample up a couple of octaves then you should consider using fancier
interpolation.
Post by ChordWizard Software
Is there a rule of thumb for the minimum number of samples per waveform
to keep artifacts of this type undetectable?
If your samples are recorded at a decent rate, 44100 Hz for example,
then you should be fine.
Post by ChordWizard Software
Also, as an aside, is it general practice to calculate sample values
with doubles (8 bytes) or are floats (4 bytes) generally adequate?
A lot of audio synthesis is done using 32-bit floats. It is fine for
most cases. The places where I tend to use 64-bit calculations are where
there is feedback, for example in an IIR filter, because then the errors
can accumulate.
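Phil's point about feedback can be illustrated numerically (my own construction, not his code): running the same one-pole lowpass recursion with results rounded to 32-bit precision after each operation drifts from the 64-bit result, because each rounding error is fed back into the next sample.

```python
# Compare a one-pole IIR lowpass computed in double precision against
# the same recursion with every intermediate rounded to IEEE single
# precision (emulated via struct round-tripping).
import math
import struct

def f32(x):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack('f', struct.pack('f', x))[0]

a = 0.999       # pole close to the unit circle: errors linger and accumulate
y64 = 0.0
y32 = 0.0
err = 0.0
for n in range(100000):
    x = math.sin(0.01 * n)
    y64 = (1 - a) * x + a * y64
    y32 = f32(f32((1 - a) * x) + f32(a * y32))
    err = max(err, abs(y64 - y32))

print(err)  # small but nonzero: accumulated single-precision error
```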

I recommend implementing your app using 32-bit floats and with linear
interpolation. If it sounds good enough then be happy. It should be fine
for consumer grade apps. Take into account the other parts of the audio
chain. If your app is likely to only be heard over crappy PC speakers
then the interpolation technique is the least of your worries.

Phil Burk
ChordWizard Software
13 years ago
Permalink
Hi Didier, Phil,

Thanks so much for these quick and helpful replies.

It's great to get this sort of reality check on a few basic points before I go charging off in the wrong direction!

Regards,

Stephen Clarke
Managing Director
ChordWizard Software Pty Ltd
***@chordwizard.com
http://www.chordwizard.com
ph: (+61) 2 4960 9520
fax: (+61) 2 4960 9580
robert bristow-johnson
13 years ago
Permalink
...
i don't quite understand what you would be using the number 100.2272727
for anyway.

you have a wavetable length based on other parameters. it's usually
handy to make the wavetable length a power of two. and then you
calculate a phase increment or step size through that wavetable to get
you a 440 Hz tone. this phase increment will have an integer part and a
fractional part. the integer part says which samples in the wavetable
are used for interpolation and the fractional part says how these
samples are combined in calculating the interpolated value.
without considering oversampling, the length of the wavetable must be
more than twice the number of the highest harmonic you want in the tone.
if you're oversampling the wavetable (say, a 2048-length wavetable and
only 64 harmonics), then this wavetable can accommodate increments of up
to 16.00000 with no aliasing (the 64th harmonic then lands at nyquist).
also, to reduce the effects from linear interpolation, an oversampled
wavetable is good. and sometimes memory is cheap, so long interpolated
wavetables cost little and have a good payoff.

one last thing, if you're doing linear interpolation (i recommend that),
then your very first sample of the wavetable (at index 0) should be
copied and placed in the slot just beyond (at index N) the last sample
(which is at index N-1). so a 2048 point wavetable would take 2049
entries, and the 0th and 2048th entries are the same value.
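The scheme described above (power-of-two table, guard sample at index N, fixed-point phase accumulator with integer and fractional parts) can be sketched as follows; the table size, the 32-bit fraction width, and the function names are my own choices for illustration:

```python
import math

TABLE_BITS = 11                 # 2048-point table
N = 1 << TABLE_BITS
FRAC_BITS = 32                  # fractional phase resolution

# Build a sine table with the guard point: entry N duplicates entry 0,
# so linear interpolation never has to wrap mid-lookup.
table = [math.sin(2.0 * math.pi * i / N) for i in range(N)]
table.append(table[0])          # index N == index 0

def make_phase_inc(freq_hz, sample_rate):
    """Fixed-point phase increment: (freq / rate) * N * 2^FRAC_BITS."""
    return int(round(freq_hz / sample_rate * N * (1 << FRAC_BITS)))

def render(freq_hz, sample_rate, num_samples):
    inc = make_phase_inc(freq_hz, sample_rate)
    mask = (N << FRAC_BITS) - 1     # wrap the accumulator at table length
    phase = 0
    out = []
    for _ in range(num_samples):
        idx = phase >> FRAC_BITS    # integer part: table index
        frac = (phase & ((1 << FRAC_BITS) - 1)) / (1 << FRAC_BITS)
        # linear interpolation; the guard point makes idx+1 always valid
        out.append(table[idx] + frac * (table[idx + 1] - table[idx]))
        phase = (phase + inc) & mask
    return out
```

With a 2048-point table the 440 Hz increment is about 20.43, so the fractional part does real work on every sample.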
Post by ChordWizard Software
So it looks like I will be needing to interpolate between the samples in the wavetable. Can someone point me to some discussion or resources on this?
Simple linear interpolation looks easy enough to achieve, but does it introduce an unacceptable level of distortion?
not if you oversample and make a big, big, big wavetable.
Post by ChordWizard Software
Is there a rule of thumb for the minimum number of samples per waveform to keep artifacts of this type undetectable?
every factor of 2 in the oversampling ratio, reduces the aliasing
distortion from linear interpolation by 12 dB.
floats are okay. but use a long int for the phase increment and the
"phase accumulator". i think you'll have less trouble if you do the
phase increment using fixed-point arithmetic.
--
r b-j ***@audioimagination.com

"Imagination is more important than knowledge."
Ross Bencina
13 years ago
Permalink
Post by ChordWizard Software
Simple linear interpolation looks easy enough to achieve, but does it
introduce an unacceptable level of distortion?
The distortion is a function of (at least) the spectrum of the audio
stored in the wavetable, the interpolation method, and the change in
playback rate.

Note also that interpolation is a kind of filter. Consider linear
interpolation of the point half way between two samples:

y = (x[0] + x[1]) / 2.

That's a moving average with a window size of two. It's going to damp the
high frequencies (and cancel at nyquist). Although in general you're not
always interpolating half way between samples, this filtering effect is
sometimes detectable (if you can hear above 10k!), but it's not exactly
distortion.
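The averaging effect is easy to verify numerically (a small sketch of my own, not from the post): halfway interpolation cancels a Nyquist-frequency signal completely, while a low-frequency signal passes almost unchanged.

```python
import math

def halfway_lerp(x):
    """Linear interpolation exactly halfway between successive samples."""
    return [(a + b) / 2.0 for a, b in zip(x, x[1:])]

n = 64
nyquist = [(-1.0) ** i for i in range(n)]                # +1, -1, +1, ...
low = [math.sin(2 * math.pi * i / n) for i in range(n)]  # 1 cycle in 64

print(max(abs(v) for v in halfway_lerp(nyquist)))  # 0.0: fully cancelled
print(max(abs(v) for v in halfway_lerp(low)))      # near 1.0: passes through
```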
Post by ChordWizard Software
Is there a rule of
thumb for minimum numer of samples per waveform to keep artifacts of
this type undetectable?
"Undetectable" by whom?


I remember back in the day reading about some later Emulator samplers
using optimised 8-point sinc interpolation with pre-filtered source
samples (I think there's a patent if you want to look it up). Even that
would not be classed as undetectable under some criteria.

On the other hand there is plenty of sound making it onto recordings
that has been linear interpolated.

Potentially you can oversample the audio using a high quality off-line
interpolator to limit the bandwidth and then apply a low-order
interpolator (some discussion here, although it's not the best reference):

http://www.student.oulu.fi/~oniemita/dsp/deip.pdf

The standard introduction to interpolation is the "Splitting the Unit
Delay" article:

http://signal.hut.fi/spit/publications/1996j5.pdf

I realise that's possibly more science than you're looking for.


As for rules of thumb, aside from what Robert already said, I would say
the rule of thumb is "use linear or cubic hermite interpolation" unless
it is important to have better (and slower) interpolation, in which case
the sky's the limit -- you'll need to define proper error margins and
evaluate your approach mathematically etc.

As usual there's a trade-off between space, time and quality.
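For reference, the cubic Hermite interpolator Ross mentions is only a few lines. This is the standard 4-point Catmull-Rom form (the function name is my own choosing); it interpolates between the two middle samples using their neighbours:

```python
def hermite4(frac, xm1, x0, x1, x2):
    """Cubic Hermite (Catmull-Rom) interpolation between x0 and x1.

    frac is the fractional position in [0, 1); xm1 and x2 are the
    neighbouring samples used to estimate slopes at x0 and x1.
    """
    c0 = x0
    c1 = 0.5 * (x1 - xm1)                       # central-difference slope
    c2 = xm1 - 2.5 * x0 + 2.0 * x1 - 0.5 * x2
    c3 = 0.5 * (x2 - xm1) + 1.5 * (x0 - x1)
    # Horner evaluation of the cubic in frac
    return ((c3 * frac + c2) * frac + c1) * frac + c0
```

Unlike linear interpolation it is C1-continuous across segment boundaries, at the cost of one extra table read per side and a few multiplies.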
Post by ChordWizard Software
Also, as an aside, is it general practice to calculate sample values
with doubles (8 bytes) or are floats (4 bytes) generally adequate?
the extra precision during calculations helps much.
General practice is to do what is both efficient and accurate enough on
your given architecture. If doubles are not slower, then you'd use them.

On a PC I don't think it would be uncommon to use a double for the phase
index and phase increment. You don't necessarily need to use the full
precision to perform the actual interpolation but it helps for frequency
precision.

For a variable frequency design the precision of the phase index can
also impact signal-to-noise.

Ross.
Nigel Redmon
13 years ago
Permalink
Note also that interpolation is a kind of filter...It's going to damp the high frequencies...
But if you oversample your wavetable, the corner frequency is above...and the passband is otherwise pretty flat.
...
Nigel Redmon
13 years ago
Permalink
Hi Stephen,

This is so funny. I just posted the first part of an article on wavetable oscillators a few days ago, and I'm just editing the remainder. I used that exact example of ~440 Hz with a 100-sample wavetable (coincidence, or did you happen upon it? lol). I used it in the rationale for why we need to interpolate in the first place.

More is answered in part 2 (done but I wanted to let it stew a day or so and final-edit it tomorrow).

Go with linear interpolation, but oversample the wavetable (at least 2x). By this I mean that if the highest harmonic in the table is, say, the 350th, the minimum table size would be double that, and you probably want to use a power of two (1024); double that, at minimum, to 2048. In a nutshell, the fewer harmonics relative to the table capacity, the smoother the waveform is, and the more accurate linear interpolation is. (Plus it keeps you away from the frequency rolloff inherent in linear interpolation.)
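The sizing rule above can be written out as a tiny helper (my own wording of it, not Nigel's code):

```python
def wavetable_size(num_harmonics, oversample=2):
    """Smallest power-of-two table holding num_harmonics without aliasing,
    then scaled by an oversampling factor for interpolation headroom."""
    n = 1
    while n < 2 * num_harmonics:   # Nyquist bound: more than 2 samples
        n <<= 1                    # per cycle of the highest harmonic
    return n * oversample

print(wavetable_size(350))  # 2048, matching the example in the post
```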

Single precision floating point is plenty for the samples. That's 25 bits of precision.

Nigel

http://www.earlevel.com/main/2012/05/04/a-wavetable-oscillator—part-1/
...
ChordWizard Software
13 years ago
Permalink
Hey, thanks everyone, this is great.

Robert, the 100.2272727 is because I'm a total noob to this and I was fixing the waveform sample points to the device sample rate, and playing them out one for one. But I've learned a lot today!

Ross, thanks for the insights as always.

Nigel, just a coincidence! Is your article already online somewhere? Sounds like just what I need, although I feel like I am already well on the way with all the help I've been given.

Regards,

Stephen Clarke
Managing Director
ChordWizard Software Pty Ltd
***@chordwizard.com
http://www.chordwizard.com
ph: (+61) 2 4960 9520
fax: (+61) 2 4960 9580
Nigel Redmon
13 years ago
Permalink
Hi Stephen,

It sounds like you have a handle on the main issues, so I'm sure you won't have a problem.

I put the link in my previous message, but I see that it doesn't get parsed correctly because of the em-dash. Just copy and paste this into your browser (Part 2 will be up later in the day when I get a chance):

www.earlevel.com/main/2012/05/04/a-wavetable-oscillator—part-1

Nigel
...
Andrew Simper
11 years ago
Permalink
---------- Forwarded message ----------
From: Andrew Simper <***@cytomic.com>
Date: 8 May 2012 11:41
Subject: Re: [music-dsp] Wavetable interpolation
To: A discussion list for music-related DSP <music-***@music.columbia.edu>


Hi Stephen,

Even if you did generate a tone whose period is an exact whole number
of samples, the buffer you are reading from still needs to be
bandlimited, otherwise the aliasing energy will not vanish; it
will align with the harmonic content and give you a waveform that is
impossible to create by bandlimited sampling of an analog signal.

The easiest example is filling a buffer with a trivial square wave and
playing it back at an exact submultiple of the sample rate. Every
transition will be a hard step: ...0 0 1 1... or ...1 1 0 0....
Now if you generated a perfect square wave in analog and then sampled
it using a perfect brickwall filter, you could never get such a sharp
transition. I went through the maths, got it wrong, and thought
(incorrectly) that you could get such a transition, but when Paul Frindle
was adamant it wasn't possible I went through all the maths again and
found where I went wrong (thanks Paul!).

Here is an image of the analytic bandlimited step function to show this:

www.cytomic.com/files/dsp/analytic-bandlimited-step.jpg
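The point can also be checked numerically: a square wave built from a finite number of harmonics never makes a one-sample jump, and it overshoots near each edge (the Gibbs phenomenon). A small sketch of my own, not from the post:

```python
import math

def bandlimited_square(n_samples, n_harmonics):
    """One cycle of a square wave summed from its odd harmonics."""
    out = []
    for i in range(n_samples):
        t = i / n_samples
        s = sum(math.sin(2 * math.pi * (2 * k - 1) * t) / (2 * k - 1)
                for k in range(1, n_harmonics + 1))
        out.append(4 / math.pi * s)
    return out

sq = bandlimited_square(256, 16)
print(max(sq))  # greater than 1: overshoot near the edges,
                # unlike the trivial 0/1 square in the buffer
```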

Andy
--
cytomic - sound music software
...