Element Green
2010-11-25 03:26:09 UTC
Hello music-dsp list,
I'm the author of a SoundFont instrument editing application called
Swami (http://swami.sourceforge.net). A while back an interested
developer added a loop finding algorithm which I integrated into the
application. This feature is supposed to generate a list of start/end
loop points which are optimal for "seamless loops".
The original algorithm was based on autocorrelation. There were many
bugs in the implementation and I was having trouble understanding how
it functioned, so I wrote a new algorithm which currently does not use
autocorrelation. The new algorithm seems to come up with good
candidates for "seamless loops", but is slower than the old algorithm
by a factor of 5, but at least it does not suffer from bugs, which
often resulted in unpredictable results.
I have limited knowledge in the area of DSP, so I thought I would seek
advice on this list to determine if the new algorithm makes sense, if
anyone has any ideas on ways to optimize it or if there are better
ways of tackling this task.
First off, is autocorrelation even a solution for this? That is what
the old algorithm used and it seemed to me that sample points closer
to the loop start or end should be given higher priority in the
"quality" calculation, which was not done in that case.
Inputs to the algorithm:
float sample_data[]: Audio data array of floating point samples
normalized to -1.0 to 1.0
int analysis_size: Size of analysis window (number of points compared
around loop points, the loop point is in the center of the window).
int half_analysis_size = analysis_size / 2 which is the center point
of the window (only really center on odd analysis_size values)
int win1start: Offset in sample_data[] of 1st search window (for loop
start point).
int win1size: Size of 1st search window (for loop start point).
int win2start: Offset in sample_data[] of 2nd search window (for loop
end point).
int win2size: Size of 2nd search window (for loop end point).
Description of the new algorithm:
A multiplication "window" array of floats (lets call it
analysis_window[]) is created which is analysis_size in length and
contains a peak value in the center of the window, each point away
from the center is half the value of its closer neighbor and all
values in the window add up to 0.5. 0.5 was chosen because the
maximum difference between two sample points is 2 (1 - -1 = 2), so
this results in a maximum "quality" value of 1.0 (worst quality).
The two search windows are exhaustively compared with two loops, one
embedded in the other. For each loop start/end candidate a quality
factor is calculated. The quality value is calculated from the sum of
the absolute differences of the sample points surrounding the loop
points (analysis window size) multiplied individually by values in the
analysis_window[] array.
C code:
/* Calculate fraction divisor */
for (i = 0, fract = 0, pow2 = 1; i <= half_window; i++, pow2 *= 2)
{
fract += pow2;
if (i < half_window) fract += pow2;
}
/* Even windows are asymetrical, subtract 1 */
if (!(analysis_window & 1)) fract--;
/* Calculate values for 1st half of window and center of window */
for (i = 0, pow2 = 1; i <= half_window; i++, pow2 *= 2)
anwin_factors[i] = pow2 * 0.5 / fract;
/* Copy values for 2nd half of window */
for (i = 0; half_window + i + 1 < analysis_window; i++)
anwin_factors[half_window + i + 1] = anwin_factors[half_window - i - 1];
for (win1 = 0; win1 < win1size; win1++)
{
startpos = win1start + win1;
for (win2 = 0; win2 < win2size; win2++)
{
endpos = win2start + win2;
for (i = 0, quality = 0.0; i < analysis_size; i++)
{
diff = sample_data[startpos + i - half_analysis_size]
- sample_data[endpos + i - half_analysis_size];
if (diff < 0) diff = -diff;
quality += diff * analysis_window[i];
}
...
}
}
Optimization ideas:
If a single value (integer or float) could be calculated which
"describes" an analysis_size window of audio, then these values could
be calculated for each window individually and then used as a
pre-filter prior to performing the calculation using the
analysis_window. I tested this using a simple signal power
calculation sum (sample_val * sample_val) and it increased the speed
substantially, but threw away some good loop candidates.
Thanks in advance for any ideas on this subject. Seems like the
result of this algorithm could be something useful to add to the
music-dsp code archive.
Best regards,
Joshua Element Green
I'm the author of a SoundFont instrument editing application called
Swami (http://swami.sourceforge.net). A while back an interested
developer added a loop finding algorithm which I integrated into the
application. This feature is supposed to generate a list of start/end
loop points which are optimal for "seamless loops".
The original algorithm was based on autocorrelation. There were many
bugs in the implementation and I was having trouble understanding how
it functioned, so I wrote a new algorithm which currently does not use
autocorrelation. The new algorithm seems to come up with good
candidates for "seamless loops", but is slower than the old algorithm
by a factor of 5, but at least it does not suffer from bugs, which
often resulted in unpredictable results.
I have limited knowledge in the area of DSP, so I thought I would seek
advice on this list to determine if the new algorithm makes sense, if
anyone has any ideas on ways to optimize it or if there are better
ways of tackling this task.
First off, is autocorrelation even a solution for this? That is what
the old algorithm used and it seemed to me that sample points closer
to the loop start or end should be given higher priority in the
"quality" calculation, which was not done in that case.
Inputs to the algorithm:
float sample_data[]: Audio data array of floating point samples
normalized to -1.0 to 1.0
int analysis_size: Size of analysis window (number of points compared
around loop points, the loop point is in the center of the window).
int half_analysis_size = analysis_size / 2 which is the center point
of the window (only really center on odd analysis_size values)
int win1start: Offset in sample_data[] of 1st search window (for loop
start point).
int win1size: Size of 1st search window (for loop start point).
int win2start: Offset in sample_data[] of 2nd search window (for loop
end point).
int win2size: Size of 2nd search window (for loop end point).
Description of the new algorithm:
A multiplication "window" array of floats (lets call it
analysis_window[]) is created which is analysis_size in length and
contains a peak value in the center of the window, each point away
from the center is half the value of its closer neighbor and all
values in the window add up to 0.5. 0.5 was chosen because the
maximum difference between two sample points is 2 (1 - -1 = 2), so
this results in a maximum "quality" value of 1.0 (worst quality).
The two search windows are exhaustively compared with two loops, one
embedded in the other. For each loop start/end candidate a quality
factor is calculated. The quality value is calculated from the sum of
the absolute differences of the sample points surrounding the loop
points (analysis window size) multiplied individually by values in the
analysis_window[] array.
C code:
/* Calculate fraction divisor */
for (i = 0, fract = 0, pow2 = 1; i <= half_window; i++, pow2 *= 2)
{
fract += pow2;
if (i < half_window) fract += pow2;
}
/* Even windows are asymetrical, subtract 1 */
if (!(analysis_window & 1)) fract--;
/* Calculate values for 1st half of window and center of window */
for (i = 0, pow2 = 1; i <= half_window; i++, pow2 *= 2)
anwin_factors[i] = pow2 * 0.5 / fract;
/* Copy values for 2nd half of window */
for (i = 0; half_window + i + 1 < analysis_window; i++)
anwin_factors[half_window + i + 1] = anwin_factors[half_window - i - 1];
for (win1 = 0; win1 < win1size; win1++)
{
startpos = win1start + win1;
for (win2 = 0; win2 < win2size; win2++)
{
endpos = win2start + win2;
for (i = 0, quality = 0.0; i < analysis_size; i++)
{
diff = sample_data[startpos + i - half_analysis_size]
- sample_data[endpos + i - half_analysis_size];
if (diff < 0) diff = -diff;
quality += diff * analysis_window[i];
}
...
}
}
Optimization ideas:
If a single value (integer or float) could be calculated which
"describes" an analysis_size window of audio, then these values could
be calculated for each window individually and then used as a
pre-filter prior to performing the calculation using the
analysis_window. I tested this using a simple signal power
calculation sum (sample_val * sample_val) and it increased the speed
substantially, but threw away some good loop candidates.
Thanks in advance for any ideas on this subject. Seems like the
result of this algorithm could be something useful to add to the
music-dsp code archive.
Best regards,
Joshua Element Green