### Mode Extraction Techniques

The goal of resonator factoring is to identify and remove the least-damped resonant modes of the impulse response. In principle, this means ascertaining the precise resonance frequencies and bandwidths associated with each of the narrowest ``peaks'' in the resonator frequency response, and dividing them out via inverse filtering, so they can be implemented separately as resonators in series. If in addition the amplitude and phase of a resonance peak are accurately measurable in the complex frequency response, the mode can be removed by complex spectral subtraction (equivalent to subtracting the impulse-response of the resonant mode from the total impulse response); in this case, the parametric modes are implemented in a parallel bank as in [66]. However, in the parallel case, the residual impulse response is not readily commuted with the string.

In the inverse-filtering case, the factored resonator components are
in cascade (series) so that the damped modes left behind may be
commuted with the string and incorporated in the excitation table by
convolving the residual impulse response with the desired string
excitation signal. In the parallel case, the damped modes do not
commute with the string since doing so would require somehow canceling
them in the parallel filter sections. In principle, the string would
have to be duplicated so that one instance can be driven by the
residual signal with no body resonances at all, while the other is
connected to the parallel resonator bank and driven only by the
natural string excitation without any commuting of string and
resonator. Since duplicating the string is unlikely to be
cost-effective, the impulse response of the high-frequency modes can
be commuted and convolved with the string excitation as in the series
case to obtain *qualitative* results. The error in doing this is
that the high-frequency modes are being multiplied by the parallel
resonators rather than being added to them.

Various methods are available for estimating the mode parameters for inverse filtering:

- Amplitude response peak measurement

- Weighted digital filter design

- Linear prediction

- Sinusoidal modeling

- Late impulse-response analysis

#### Amplitude response peak measurement

The longest ringing modes are associated with the narrowest bandwidths. When they are important resonances in the frequency response, they also tend to be the tallest peaks in the frequency response magnitude. (If they are not the tallest near the beginning of the impulse response, they will be the tallest near the end.) Therefore, one effective technique for measuring the least-damped resonances is simply to find the precise location and width of the narrowest and tallest spectral peaks in the measured amplitude response of the resonator. The center-frequency and bandwidth of a narrow frequency-response peak determine two poles in the resonator to be factored out. Expressing a filter in terms of its poles and zeros is one type of ``parametric'' filter representation, as opposed to ``nonparametric'' representations such as the impulse response or frequency response. Prony's method [449,297,273] is one well known technique for estimating the frequencies and bandwidths of sums of exponentially decaying sinusoids (two-pole resonator impulse responses).

In the factoring example presented in §8.8.6, the frequency and bandwidth of the main Helmholtz air mode are measured manually using an interactive spectrum analysis tool. However, it is a simple matter to automate peak-finding in FFT magnitude data. (See, for example, the peak finders used in sinusoidal modeling, discussed a bit further in §8.8.1 below.)

#### Weighted digital filter design

Many methods for digital filter design support spectral weighting functions
that can be used to focus in on the least-damped modes in the frequency
response. One is the *weighted equation-error* method which is
available in the matlab `invfreqz()` function (§8.6.4).
Figure 8.13 illustrates use of it. For simplicity, only one
frequency-response peak plus noise is shown in this synthetic example.
First, the peak center-frequency is measured using a quadratically
interpolating peak finder operating on the dB spectral magnitude. This is
used to set the spectral weighting function. Next, `invfreqz()` is
called to design a two-pole filter having a frequency response that
approximates the measured data as closely as possible. The weighting
function is also shown in Fig.8.13, renormalized to
overlay on the scale of the plot. Finally, the amplitude response of the
two-pole filter designed by the equation-error method is shown overlaid in
the figure. Note that the agreement is quite good near the peak which is
what matters most. The interpolated peak frequency measured initially in
the nonparametric spectral magnitude data can be used to fine-tune the
pole-angles of the designed filter, thus rendering the equation-error
method a technique for measuring only the peak bandwidth in this case.
There are of course many, many techniques in the signal processing
literature for measuring spectral peaks.

#### Linear prediction

Another well known method for converting the least-damped modes into parametric form is Linear Predictive Coding (LP) followed by polynomial factorization to obtain resonator poles. LP is particularly good at fitting spectral peaks due to the nature of the error criterion it minimizes [428]. The poles closest to the unit circle in the plane can be chosen for the ``ringy'' part of the resonator. It is well known that when using techniques like LP to model spectral peaks for extraction, orders substantially higher than twice the number of spectral peaks should be used. The extra degrees of freedom in the LP fit give more poles for modeling spectral detail other than the peaks, allowing the poles modeling the peaks to fit them with less distraction. On the other hand, if the order chosen is too high, sometimes more than two poles will home in on the same peak. In some cases, this may be appropriate since the body resonances are not necessary resolvable so as to separate the peaks, especially at high frequencies.

#### Sinusoidal modeling

Another way to find the least-damped mode parameters is by means of an
intermediate *sinusoidal model* of the body impulse response, or,
more appropriately, the *energy decay relief* (EDR) computed
from the body impulse response (see §3.2.2).
Such sinusoidal models have been used to determine the string loop filter in
digital waveguide strings models.
In the case of string loop-filter estimation,
the sinusoidal model is applied to the impulse response (or ``pluck''
response) of a vibrating string or acoustic tube. In the present
application, it is ideally applied to the EDR of the body impulse response (or
``hammer-strike'' response).

Since sinusoidal modeling software
(*e.g.* [424]) typically quadratically interpolates the peak
frequencies, the resonance frequencies are generally quite accurately
estimated provided the frame size is chosen large enough to span many
cycles of the underlying resonance.

The sinusoidal amplitude envelopes yield a particularly robust measurement
of resonance bandwidth. Theoretically, the modal decay should be
exponential. Therefore, on a dB scale, the amplitude envelope should decay
*linearly*. Linear regression can be used to fit a straight line to
the measured log-amplitude envelope of the impulse response of each
long-ringing mode. Note that even when amplitude modulation is present due
to modal couplings, the ripples tend to average out in the regression and
have little effect on the slope measurement. This robustness can be
enhanced by starting and ending the linear regression on local maxima in
the amplitude envelope. A method for estimating modal decay parameters
in the presence of noise is given in [125,234,235].

Below is a section of matlab code which performs linear regression:

function [slope,offset] = fitline(x,y); %FITLINE fit line 'y = slope * x + offset' % to column vectors x and y. phi = [x, ones(length(x),1)]; p = phi' * y; r = phi' * phi; t = r\p; slope = t(1); offset = t(2);

#### Late impulse-response analysis

All methods useable with inverse filtering can be modified based on the observation that late in the impulse response, the damped modes have died away, and the least-damped modes dominate. Therefore, by discarding initial impulse-response data, the problem in some sense becomes ``easier'' at the price of working closer to the noise floor. This technique is most appropriate in conjunction with the inverse filtering method for mode extraction (discussed below), since for subtraction, the modal impulse response must be extrapolated back to the beginning of the data record. However, methods used to compute the filter numerator in variations on Prony's method can be used to scale and phase-align a mode for subtraction [428,297].

One simple approximate technique based on looking only at the late impulse response is to take a zero-padded FFT of the latest Hanning-windowed data segment. The least-damped modes should give very clearly dominant peaks in the FFT magnitude data. As discussed above, the peak(s) can be interpolated to estimate the mode resonance frequency, and the bandwidth can be measured to determine the time-constant of decay. Alternatively, the time-constant of decay can be measured in the time domain by measuring the decay slope of the log amplitude envelope across the segment (this time using a rectangular window). Since the least-damped mode is assumed to be isolated in the late decay, it is easy to form a pitch-synchronous amplitude envelope.

**Next Section:**

Inverse Filtering

**Previous Section:**

Further Reading in Commuted Synthesis