Audio: Phase Vocoder: Add new component #10541
The rename avoids a possible conflict with other math libraries. The change prepares for adding a 32-bit square root function. Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
This patch adds a higher-precision 32-bit fractional integer square root function to the SOF math library. The algorithm takes its initial value from a lookup table and refines it with two Newton-Raphson iterations. Both the input and the output use the Q2.30 format, which was chosen to match the number range of complex-to-polar conversion of Q1.31 complex values. Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
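To make the described approach concrete, here is a minimal sketch of a Q2.30 square root that seeds Newton-Raphson from a small lookup table and runs two iterations. It is not the code from this patch: the function name `sqrt_q30_sketch()`, the six-entry table, the input normalization step, and the handling of non-positive input are all assumptions made for illustration.

```c
#include <stdint.h>

/* Convert a floating-point constant to Q2.30 at compile time. */
#define Q30(x) ((int32_t)((x) * 1073741824.0 + 0.5))

/* Hypothetical initial-guess table: sqrt() at the midpoint of six
 * 0.25-wide intervals covering the normalized range [0.5, 2.0).
 * The table used by the actual patch is not shown in this conversation.
 */
static const int32_t sqrt_init_q30[6] = {
	Q30(0.790569), Q30(0.935414), Q30(1.060660),
	Q30(1.172604), Q30(1.274755), Q30(1.369306),
};

/* Square root of a Q2.30 value, result in Q2.30 (illustrative only). */
int32_t sqrt_q30_sketch(int32_t x)
{
	uint32_t m;
	uint64_t y;
	int shift = 0;
	int i;

	if (x <= 0)
		return 0; /* assumption: non-positive input returns zero */

	/* Normalize so that m = x * 4^shift is in [0.5, 2.0).
	 * Then sqrt(x) = sqrt(m) / 2^shift.
	 */
	m = (uint32_t)x;
	while (m < (1u << 29)) {
		m <<= 2;
		shift++;
	}

	/* (m >> 28) is 2..7 for m in [0.5, 2.0), one step per 0.25. */
	y = (uint64_t)sqrt_init_q30[(m >> 28) - 2];

	/* Two Newton-Raphson steps: y = (y + m / y) / 2, all in Q2.30. */
	for (i = 0; i < 2; i++)
		y = (y + (((uint64_t)m << 30) / y)) >> 1;

	return (int32_t)(y >> shift);
}
```

With a table this small the result is only approximate after two iterations; a larger table gives a better starting point and therefore higher precision for the same iteration count.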
This patch allows more generic use of complex numbers outside the FFT domain. It prepares for adding a polar complex number format, which is commonly used in frequency domain signal processing. Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
This patch adds the functions sofm_icomplex32_to_polar() and sofm_ipolar32_to_complex(). In polar format the Q1.31 (real, imag) number pair is converted to (magnitude, angle). The magnitude is in Q2.30 format and the angle is in radians from -pi to +pi in Q3.29 format. The conversion to polar and back loses some quality, so there is currently no support for icomplex16. Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
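The choice of Q formats follows from the value ranges: a Q1.31 complex number with both parts near full scale has a magnitude of about sqrt(2), and the angle reaches pi, so neither fits the format one step narrower. The sketch below checks this at compile time and computes the squared magnitude as a Q2.30 value. The macro names and `mag2_q30_sketch()` are hypothetical and not part of the SOF API.

```c
#include <stdint.h>

/* pi scaled by 2^29 and sqrt(2) scaled by 2^30, rounded to nearest. */
#define PI_Q29		1686629713LL
#define SQRT2_Q30	1518500250LL

/* The angle range up to pi fits Q3.29 but would overflow Q2.30. */
_Static_assert(PI_Q29 <= INT32_MAX, "pi fits Q3.29");
_Static_assert(2 * PI_Q29 > INT32_MAX, "pi overflows Q2.30");

/* A magnitude of sqrt(2) fits Q2.30 but would overflow Q1.31. */
_Static_assert(SQRT2_Q30 <= INT32_MAX, "sqrt(2) fits Q2.30");
_Static_assert(2 * SQRT2_Q30 > INT32_MAX, "sqrt(2) overflows Q1.31");

/* Squared magnitude of a Q1.31 complex value, returned as Q2.30.
 * The range is 0..2.0, which is why a Q2.30 square root input is a
 * natural fit for computing the magnitude.
 */
static int32_t mag2_q30_sketch(int32_t re, int32_t im)
{
	/* Each product is Q2.62; the sum can reach 2^63, so use uint64_t. */
	uint64_t sum = (uint64_t)((int64_t)re * re) +
		       (uint64_t)((int64_t)im * im);
	uint64_t q30 = sum >> 32; /* Q2.62 to Q2.30 */

	/* Saturate the single corner case of re = im = -1.0. */
	return q30 > INT32_MAX ? INT32_MAX : (int32_t)q30;
}
```

For example, `mag2_q30_sketch(1 << 30, 1 << 30)` corresponds to 0.5 + 0.5j and gives 0.5 in Q2.30.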
The testbench quits after three file module copies without data written. That value is too low for components that accumulate more than one LL period of data before producing output, or that can't produce output at every copy. The value 10 should better ensure that the testbench run does not end too early. The testbench currently lacks the DP scheduler, so modules designed for DP can be run with this workaround. Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
This patch adds the Phase Vocoder SOF module. It provides render speed control in the range 0.5-2.0x. The pitch is preserved when the audio waveform is stretched or shortened. The module uses a frequency domain algorithm in the STFT domain to interpolate the magnitude and phase of the output IFFT frames from the input FFT frames. The render speed can be controlled with an enable/disable switch and an enum control in steps of 0.1, or with finer precision with a bytes control. (WIP) The STFT parameters are configured with a bytes control blob. The default is a 1024 size FFT with a hop of 256 and a Hann window. Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
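The frame interpolation can be illustrated with the textbook phase vocoder time-stretch step. The sketch below is an assumption about the general technique, not the module's code: it uses float instead of the module's fixed-point formats, and `struct pv_polar`, `pv_wrap()` and `pv_interp_frame()` are hypothetical names. For each bin the output magnitude is a linear blend of two neighboring input frames, and the output phase is a running sum that advances by the phase difference measured between those two frames.

```c
#include <stddef.h>

#define PV_PI 3.14159265358979f

/* One spectral bin in polar form (float for readability). */
struct pv_polar {
	float mag;
	float phase;
};

/* Wrap an angle to the range [-pi, pi). */
static float pv_wrap(float a)
{
	while (a >= PV_PI)
		a -= 2.0f * PV_PI;
	while (a < -PV_PI)
		a += 2.0f * PV_PI;
	return a;
}

/* Build one output frame located between two input frames.
 * frac in [0, 1) is the fractional position between prev and next.
 * phase_acc holds the running output phase per bin and is updated.
 */
void pv_interp_frame(const struct pv_polar *prev,
		     const struct pv_polar *next,
		     float frac, float *phase_acc,
		     struct pv_polar *out, size_t bins)
{
	size_t k;

	for (k = 0; k < bins; k++) {
		/* Magnitude: linear interpolation between the frames. */
		out[k].mag = (1.0f - frac) * prev[k].mag + frac * next[k].mag;

		/* Phase: use the accumulated phase, then advance it by the
		 * phase step seen between the two input frames.
		 */
		out[k].phase = phase_acc[k];
		phase_acc[k] = pv_wrap(phase_acc[k] +
				       pv_wrap(next[k].phase - prev[k].phase));
	}
}
```

In such a scheme the caller would advance the read position in the input frame sequence by the speed factor for every output frame, so a speed of 0.5 produces two output frames per input hop and a speed of 2.0 skips every other input frame, while the synthesis hop stays constant.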
5c1da31 to e816969
Fixed s16 processing and made some improvements. On the MTL platform the load is 175-313 MCPS, depending on the selected speed.
Do I understand it correctly that "vocoder" is a family of algorithms / audio processing methods, and this module implements one of them - speed control? Maybe call it vocoder_speed? Or does "Phase Vocoder" actually mean the same - only the speed processing part?
# SPDX-License-Identifier: BSD-3-Clause

config COMP_PHASE_VOCODER
	tristate "Phase Vocoder component"
do you want to make it default m if LIBRARY_DEFAULT_MODULAR?
# tests it can't use extra CONFIGs. See #9410, #8722 and #9386
CONFIG_COMP_GOOGLE_RTC_AUDIO_PROCESSING=m
CONFIG_GOOGLE_RTC_AUDIO_PROCESSING_MOCK=y
CONFIG_COMP_PHASE_VOCODER=y
if you make it modular by default, then please drop =y from all ACE 3.0+ platforms
fft_buf_ptr = fft->fft_buf;
for (j = 0; j < prev_data_size; j++) {
	fft_buf_ptr->real = prev_data[j];
	fft_buf_ptr->imag = 0;
I'm wondering if using *fft_buf_ptr = (struct icomplex32){.real = prev_data[j], .imag = 0} would give the compiler a better chance to optimise 64-bit writes, but maybe not.
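For comparison, the two forms of the store can be written side by side as below. This is only a sketch: `struct icomplex32` is redefined locally as a stand-in with the two members used in the diff, and the function names are hypothetical. Whether the compound literal form actually results in a single 64-bit store depends on the compiler and target, so it needs to be checked from the generated code or with a profiler.

```c
#include <stdint.h>

/* Local stand-in for the complex type used in the diff above. */
struct icomplex32 {
	int32_t real;
	int32_t imag;
};

/* Field-by-field stores, as in the current patch. */
static void fill_fieldwise(struct icomplex32 *dst, const int32_t *src, int n)
{
	int j;

	for (j = 0; j < n; j++) {
		dst[j].real = src[j];
		dst[j].imag = 0;
	}
}

/* Whole-struct store from a compound literal, as suggested in review. */
static void fill_compound(struct icomplex32 *dst, const int32_t *src, int n)
{
	int j;

	for (j = 0; j < n; j++)
		dst[j] = (struct icomplex32){ .real = src[j], .imag = 0 };
}
```

Both functions write the same data, so the choice is purely about the code the compiler generates.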
Thanks, I'll check with a profiler run! As the heaviest open-source component so far, this will need a lot of optimization.
FILE *stft_debug_ifft_out_fh;
#endif

__cold static void phase_vocoder_reset_parameters(struct processing_module *mod)
it is called from .reset(), so it shouldn't be __cold?
LOG_MODULE_REGISTER(phase_vocoder_common, CONFIG_SOF_LOG_LEVEL);

/*
 * The main processing function for PHASE_VOCODER
to which function is this comment referring?
The name needs some thought. Phase vocoder is more generic than this; the same technique also works for pitch shift. The history of this algorithm is as described by this Google AI response:
_"The phase vocoder algorithm, implemented in the Short-Time Fourier ...
Original Paper: J. L. Flanagan and R. M. Golden, "Phase Vocoder," ...
Key Contribution: Flanagan and Golden proposed the phase vocoder as an ...
Significance: While earlier vocoders (Dudley, 1930s) used analog ..._
Name suggestions are welcome. Later we could also have in SOF a more generic architecture for frequency domain modules, with the STFT half spectrum streamed from module to module. Instead of samples we could stream spectral coefficients, e.g. in the 32-bit (magnitude, phase) format used here, which is quite generic for all frequency domain processing.
WIP