Hi,
My input is a microphone, so I need the second option.
Could you give me a reference with an example and code so that I can understand how to implement it?
Thanks,
Alex
2018-05-29 17:04 GMT+03:00 Eder Souza <***@gmail.com <mailto:***@gmail.com> >:
WSOLA is a time-domain algorithm. Pitch shifters can work in the time domain too; there are several ways to do this...
WSOLA is just a time-scaler, but you can pitch shift by combining WSOLA and resampling:
- Change the duration using WSOLA (e.g. time-scale by 2.0)
- Apply some interpolation (e.g. resample by 1/2.0 = 0.5)
In this example you time-scale your signal by a factor of two (2.0) and then resample the time-scaled signal by 0.5. This gives you a pitch-shifted signal, and it is one example of a pitch shift that does not work in the frequency domain (important: this does not preserve the formants)...
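To illustrate the resampling half of the two steps above, here is a minimal Python sketch. The function name `resample_linear`, the 8 kHz rate, and the 100 Hz test tone are assumptions made for illustration only. The sine wave stands in for a signal that WSOLA has already stretched to twice its length, and linear interpolation is simply the easiest interpolator to show, not necessarily the one a real pitch shifter would use.

```python
import math

def resample_linear(x, scale):
    """Resample x so the output has about scale * len(x) samples.

    Linear interpolation; scale = 0.5 reads every second input sample,
    which halves the duration and doubles the pitch.
    """
    step = 1.0 / scale                      # read-pointer increment in input samples
    n_out = int((len(x) - 1) / step) + 1
    out = []
    for n in range(n_out):
        pos = n * step
        i = int(pos)
        frac = pos - i
        nxt = x[min(i + 1, len(x) - 1)]     # clamp at the last sample
        out.append((1.0 - frac) * x[i] + frac * nxt)
    return out

fs = 8000                                   # assumed sample rate for the demo
# Stand-in for the WSOLA output: 2000 samples of a 100 Hz tone
stretched = [math.sin(2.0 * math.pi * 100.0 * n / fs) for n in range(2000)]
shifted = resample_linear(stretched, 0.5)   # back to 1000 samples, now a 200 Hz tone
```

If the original input was 1000 samples long, the time-scaled signal has 2000 samples and the resampled result has 1000 again, so the duration is unchanged and only the pitch moves.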
Do you want to use WSOLA in real time for time-scaling? Or do you want to use WSOLA in real time and then resample after time-scaling (pitch shift)? What is your input data (audio files or microphone)?
The first option is not possible in real time with microphone input.
If you want the first option with audio files as input, you can control the time-scale factor in real time, changing the tempo of the output audio while you listen. You need to build a ring buffer to control the read and write positions of your input/output data.
The second option, with audio files or a microphone, can be done using a ring buffer too. For microphone input you need to store some data before processing it, which adds some delay to the output.
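A ring buffer of the kind mentioned above could look roughly like this in Python. This is only a sketch: the class name `RingBuffer` and its methods are hypothetical, and a real Android audio path would use a preallocated native buffer with thread-safe access rather than a Python list.

```python
class RingBuffer:
    """Fixed-capacity FIFO of samples with separate read and write positions."""

    def __init__(self, capacity):
        self.data = [0.0] * capacity
        self.capacity = capacity
        self.write_pos = 0
        self.read_pos = 0
        self.count = 0                      # samples written but not yet read

    def write(self, samples):
        if len(samples) > self.capacity - self.count:
            raise OverflowError("writer would overtake the reader")
        for s in samples:
            self.data[self.write_pos] = s
            self.write_pos = (self.write_pos + 1) % self.capacity  # wrap around
        self.count += len(samples)

    def read(self, n):
        if n > self.count:
            raise ValueError("underrun: not enough buffered samples")
        out = []
        for _ in range(n):
            out.append(self.data[self.read_pos])
            self.read_pos = (self.read_pos + 1) % self.capacity    # wrap around
        self.count -= n
        return out

rb = RingBuffer(8)
rb.write([1, 2, 3, 4, 5, 6])
first = rb.read(4)                          # oldest four samples
rb.write([7, 8, 9, 10])                     # this write wraps past the end of the storage
rest = rb.read(6)
```

The `count` of unread samples is what turns into output delay: the processing stage can only start once enough microphone samples have accumulated for one analysis window.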
So which option are you trying?
Regards,
Eder
Sent From The Moon and Written With My Thumbs !
On Tue, May 29, 2018 at 6:22 AM, Alex Dashevski <***@gmail.com <mailto:***@gmail.com> > wrote:
Hi,
From what I understood, WSOLA is an algorithm that works in the time domain. Pitch shifting is a technique that works in the frequency domain.
Thus, I don't understand your answer.
Could you explain in more detail what I need to do?
Thanks,
Alex
2018-05-29 12:04 GMT+03:00 robert bristow-johnson <***@audioimagination.com <mailto:***@audioimagination.com> >:
Do you mean as a time-scaler or as a pitch-shifter?
WSOLA can and does work real-time in a pitch-shifter. But a time-scaler can't be real-time, whether it's WSOLA or a phase-vocoder, because a real-time process requires the output to keep processing the input indefinitely without the input and output pointers colliding or diverging from each other without bound.
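The pointer argument can be put into numbers with a small Python sketch. The function name `input_backlog` and the 10-second duration are assumptions for illustration; the 48 kHz rate is the one mentioned elsewhere in this thread. The model is simply that the microphone delivers `fs` samples per second, the output plays `fs` samples per second, and a time-scaler with stretch factor `stretch` needs only `1/stretch` input samples per output sample.

```python
def input_backlog(stretch, fs=48000, seconds=10):
    """Input samples waiting unprocessed (+) or missing (-) after `seconds`."""
    delivered = fs * seconds                # what the microphone has produced
    consumed = fs * seconds / stretch       # what the time-scaler has used
    return delivered - consumed

slow_down = input_backlog(2.0)              # stretching: backlog grows by fs/2 every second
speed_up = input_backlog(0.5)               # compressing: needs input that does not exist yet
unchanged = input_backlog(1.0)              # only factor 1.0 stays balanced
```

With a stretch factor of 2.0 the unprocessed input grows by 24000 samples every second, so no finite buffer is enough. With 0.5 the read pointer catches up with the write pointer almost immediately. Following a time-scale by a resample with the reciprocal factor brings the net ratio back to 1.0.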
--
r b-j ***@audioimagination.com <mailto:***@audioimagination.com>
"Imagination is more important than knowledge."
-------- Original message --------
From: Alex Dashevski <***@gmail.com <mailto:***@gmail.com> >
Date: 5/28/2018 10:22 PM (GMT-08:00)
To: robert bristow-johnson <***@audioimagination.com <mailto:***@audioimagination.com> >, music-***@music.columbia.edu <mailto:music-***@music.columbia.edu>
Subject: Re: [music-dsp] WSOLA
Hi,
I mean WSOLA in real time. How can I prove to my instructor that it's not possible?
Why do I need to do resampling? Android samples and resamples at the same frequency (48 kHz in my case). Or do you mean processing at 8 kHz (subsampling)?
I also want to achieve high performance and minimum latency.
How can I prove to my instructor that the correct way to implement this is pitch shifting and not real-time WSOLA?
Thanks,
Alex
2018-05-29 4:19 GMT+03:00 robert bristow-johnson <***@audioimagination.com <mailto:***@audioimagination.com> >:
---------------------------- Original Message ----------------------------
Subject: Re: [music-dsp] WSOLA
From: "Alex Dashevski" <***@gmail.com <mailto:***@gmail.com> >
Date: Sun, May 27, 2018 2:56 pm
To: ***@mobileer.com <mailto:***@mobileer.com>
music-***@music.columbia.edu <mailto:music-***@music.columbia.edu>
--------------------------------------------------------------------------
Post by Alex Dashevski
Hi,
I don't understand your answer.
I already have an audio echo application on Android. Buffer size and sample
frequency influence latency.
Could you explain to me how to implement WSOLA in real time? It is a bit more
difficult.
yes WSOLA is a little difficult, but less difficult than a phase-vocoder.
now, when you say "WSOLA" and "Real-time" in the same breath, do you mean a pitch shifter? not a time-scaler, right? because pitch shifting can be done real-time, but time-scaling has to be done with an input buffer (with some number of samples) getting made into a longer (more samples) or shorter (fewer samples) buffer with the same sample rate. that can't be done in an operation that runs on indefinitely, even with a long throughput delay. eventually the input and output pointers will collide.
but you can combine time-scaling and resampling (the latter is mathematically well defined) to get pitch shifting that can run on forever. one operation increases the number of samples and the other reduces the number of samples exactly in reciprocal proportion. so the number of samples coming out every buffer of time is the same as the number going in.
now the "S" in the acronym stands for "Similarity", so you have to position the windows in the input waveform to be similar to the waveform in the output. the waveform in the first half of the input window should be similar to the waveform in the last half of the output window of the previous frame. normally the frame hop is exactly half of the window width. and the window shape should be complementary, like a Hann window.
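"Complementary" here means that two windows offset by half the window width sum to a constant, so overlap-adding them does not change the signal level. A short Python check, assuming the periodic form of the Hann window and an arbitrary length of 8 (both are illustration choices, not values from this thread):

```python
import math

def hann(n_len):
    """Periodic Hann window: w[n] = 0.5 - 0.5 * cos(2*pi*n / N)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_len) for n in range(n_len)]

N = 8                                       # window width (assumed for the demo)
hop = N // 2                                # frame hop is half the window width
w = hann(N)

# Where one window's second half overlaps the next window's first half,
# the two weights should add up to 1 at every sample.
overlap_sums = [w[n] + w[n + hop] for n in range(hop)]
```

Every entry of `overlap_sums` equals 1 up to rounding, because the cosine terms of the two windows are half a period apart and cancel.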
i believe that the 240-sample buffer in Android is an input/output sample buffer for the media I/O. you can't really do anything with that buffer except pull in input samples and push out output samples. you will have to (using whatever programming environment one uses to make Android apps) allocate memory and create your own buffers to hold about 100 ms of sound. in that buffer, you will use a technique called AMDF, ASDF, or autocorrelation to measure waveform similarity. your input frame hop distance (which has both integer and fractional parts) is the output frame hop size times the reciprocal of the time-stretch factor. so, if you're time-stretching (instead of time-compressing), your input frame will advance more slowly than your output frame, which increases the number of samples. but in that output buffer, you will resample (interpolate) with a step size equal to the time-stretch factor (do this only for output samples that have already been overlapped and added), thus reducing the final output number of samples back to the original number. you will allow some jitter on the input window position, informed by the result of the waveform similarity analysis.
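The similarity search and the hop arithmetic described above might be sketched like this in Python. All names (`amdf`, `best_offset`, `max_jitter`) and all numbers (a 50-sample period, a 40-sample template, a hop of 512) are assumptions chosen for the demo. Only the AMDF variant is shown, and the search is a plain brute-force scan rather than anything optimized.

```python
import math

def amdf(a, b):
    """Average magnitude difference of two equal-length segments (0 = identical)."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)

def best_offset(x, nominal, template, max_jitter):
    """Scan nominal +/- max_jitter for the segment of x most similar to template."""
    best, best_score = nominal, float("inf")
    for k in range(-max_jitter, max_jitter + 1):
        start = nominal + k
        if start < 0 or start + len(template) > len(x):
            continue                        # candidate window falls outside the buffer
        score = amdf(x[start:start + len(template)], template)
        if score < best_score:
            best, best_score = start, score
    return best

period = 50                                 # test tone with a 50-sample period
x = [math.sin(2.0 * math.pi * n / period) for n in range(400)]
template = x[100:140]                       # stands in for the tail of the previous output frame
found = best_offset(x, nominal=160, template=template, max_jitter=15)

output_hop = 512
stretch = 2.0
input_hop = output_hop / stretch            # input advances more slowly when stretching
```

Here the nominal position 160 is 10 samples out of phase with the template, and the search moves it to 150, exactly one period after sample 100, which is the kind of jitter the similarity analysis is meant to supply.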
that's how you do WSOLA, as best as i understand it.
--
r b-j ***@audioimagination.com <mailto:***@audioimagination.com>
"Imagination is more important than knowledge."
_______________________________________________
dupswapdrop: music-dsp mailing list
music-***@music.columbia.edu <mailto:music-***@music.columbia.edu>
https://lists.columbia.edu/mailman/listinfo/music-dsp