Spatial Audio Filtering
S. Delikaris-Manias, J. Vilkamo and V. Pulkki
Spatial filtering with microphone arrays is a technique that can be utilized to obtain the signal of a target sound source from a specific direction. Typical approaches in the field of audio underperform in practical environments with multiple sound sources and diffuse sound. In this contribution we propose a post-filtering technique to suppress the effect of interferers and diffuse sound. The proposed technique utilizes the cross-spectral estimates of the output of two beamformers to formulate a time-frequency soft masker. The beamformers' outputs are used only for parameter estimation and not for generating an audio signal. Two sets of beamformer weights, one constant and one adaptive, are applied to the microphone array signals for the parameter estimation. The weights of the constant beamformer are designed such that they provide a spatially narrow beam pattern that is time- and frequency-invariant, with unity gain towards the direction of interest. The weights of the adaptive beamformer are formulated using linearly constrained optimization, with the constraints of weighted orthogonality with respect to the constant beamformer weights and unity gain towards the look direction. The orthogonality constraint provides diffuse sound suppression, while the unity-gain constraint ensures a distortionless response. The cross spectrum of these two beamformers provides the target energy at a given look direction for the post-filter. The study focuses on compact microphone arrays, for which typical beamforming techniques exhibit a trade-off between noise amplification and spatial selectivity, especially in the low-frequency region. The proposed method is evaluated with instrumental measures and listening tests under different reverberation times, in dual and multi-talker scenarios.
The evaluation shows that the proposed method performs better than a previous state-of-the-art spatial filter based on cross-pattern coherence, a linearly constrained beamformer and a Wiener post-filter.
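The idea of deriving a time-frequency soft mask from the cross spectrum of two beamformer outputs can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `cross_spectral_mask`, the recursive averaging coefficient `alpha`, the normalization by the power of a reference signal `ref`, and the spectral floor `floor` are all assumptions made for illustration.

```python
import numpy as np

def cross_spectral_mask(b_const, b_adapt, ref, alpha=0.8, floor=0.1, eps=1e-12):
    """Soft mask from the cross spectrum of two beamformer outputs.

    b_const, b_adapt, ref: complex STFT arrays of shape (frames, bins).
    All parameter names and the normalization are illustrative assumptions.
    """
    n_frames, n_bins = ref.shape
    cross = np.zeros(n_bins, dtype=complex)   # smoothed cross spectrum
    power = np.zeros(n_bins)                  # smoothed reference power
    gains = np.empty((n_frames, n_bins))
    for t in range(n_frames):
        # First-order recursive averaging over time (assumed estimator).
        cross = alpha * cross + (1 - alpha) * b_const[t] * np.conj(b_adapt[t])
        power = alpha * power + (1 - alpha) * np.abs(ref[t]) ** 2
        # Keep only the positive real part as the target-energy estimate.
        target = np.maximum(cross.real, 0.0)
        # Amplitude gain, limited to the range [floor, 1].
        gains[t] = np.clip(np.sqrt(target / (power + eps)), floor, 1.0)
    return gains
```

In this sketch the mask would be multiplied with the STFT of the signal to be filtered. When both beamformer outputs equal the reference, the gain approaches one; when they are in anti-phase, the gain falls to the floor.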
S. Delikaris-Manias and V. Pulkki
A parametric spatial filtering algorithm with a fixed beam direction is proposed in this paper. The algorithm utilizes the normalized cross-spectral density between signals from microphones of different orders as a criterion for focusing in specific directions. The correlation between microphone signals is estimated in the time-frequency domain. A post-filter is calculated from a multichannel input and is used to assign attenuation values to a coincidentally captured audio signal. The proposed algorithm is simple to implement and is capable of coping with interfering sources at different azimuthal locations, with or without the presence of diffuse sound. It is implemented using directional microphones that point in the same look direction and have the same magnitude and phase response. Experiments are conducted with simulated and real microphone arrays employing the proposed post-filter, and the results are compared to previous coherence-based approaches, such as the McCowan post-filter. A significant improvement is demonstrated in terms of objective quality measures. Formal listening tests conducted to assess the audibility of artifacts of the proposed algorithm in real acoustical scenarios show that no annoying artifacts are present for certain spectral floor values.
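A normalized cross-spectral criterion with a spectral floor can be sketched per time-frequency tile as follows. The normalization by the summed energies, the function name `normalized_cross_spectrum_gain`, and the default `floor` value are assumptions for illustration; the published algorithm may normalize differently.

```python
import numpy as np

def normalized_cross_spectrum_gain(x_low, x_high, floor=0.2, eps=1e-12):
    """Attenuation values from two coincident microphone signals.

    x_low, x_high: complex time-frequency tiles (same shape) from
    microphones of different orders; names are hypothetical.
    """
    # Real part of the cross spectrum between the two signals.
    cross = np.real(np.conj(x_low) * x_high)
    # Summed energy normalizes the criterion to the range [-1, 1].
    energy = np.abs(x_low) ** 2 + np.abs(x_high) ** 2
    g = 2.0 * cross / (energy + eps)
    # A non-negative floor both rectifies negative values and limits attenuation.
    return np.maximum(g, floor)
```

The resulting values would be multiplied with the coincidentally captured audio signal. Raising `floor` trades suppression for fewer audible artifacts, which is consistent with the role of the spectral floor described above.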
Spatial Audio Reproduction
J. Vilkamo and S. Delikaris-Manias
Adaptive perceptual spatial sound reproduction techniques that employ a parametric model describing the properties of the sound field can reproduce spatial sound with high perceptual accuracy when compared to linear techniques. On the other hand, applying a sound-field model to control the reproduced sound may compromise the perceived quality of individual channels in cases where the model does not match the sound field. An alternative parametrization is proposed that directly estimates the perceptually relevant parameters for the target loudspeaker signals without modeling the sound field. At the synthesis stage, the loudspeaker signals with the target parametric properties are generated from the microphone signals with regularized least-squares mixing and decorrelation. It is shown through listening experiments that the proposed method provides on average the overall perceived spatial sound reproduction quality of a state-of-the-art parametric spatial sound reproduction technique, while solving the past shortcomings related to the perceived quality of the individual channels.
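The regularized least-squares mixing step can be illustrated with a generic Tikhonov-regularized solution. This is a sketch under stated assumptions, not the formulation used in the paper: the function name `regularized_ls_mixing`, the energy-scaled regularization `reg`, and the availability of explicit target signals `y_target` are all hypothetical, and the decorrelation stage is omitted.

```python
import numpy as np

def regularized_ls_mixing(x, y_target, reg=1e-3):
    """Mixing matrix M minimizing ||y_target - M x||^2 + lam * ||M||^2.

    x:        (n_mics, n_frames) complex microphone signals in one band.
    y_target: (n_out, n_frames) complex target signals (assumed available).
    Returns M with shape (n_out, n_mics).
    """
    n_mics = x.shape[0]
    cxx = x @ x.conj().T            # input covariance (unnormalized)
    cyx = y_target @ x.conj().T     # cross-covariance target/input
    # Scale the regularization by the mean input energy (an assumption).
    lam = reg * np.trace(cxx).real / n_mics
    a = cxx + lam * np.eye(n_mics)  # Hermitian, positive definite for lam > 0
    # Solve M a = cyx without forming an explicit inverse.
    return np.linalg.solve(a, cyx.conj().T).conj().T
```

A larger `reg` shrinks the mixing gains, which limits amplification of microphone noise at the cost of a less exact match to the target.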
Parametric Binaural Rendering Utilizing Compact Microphone Arrays, ieee.icassp2015
S. Delikaris-Manias, J. Vilkamo and V. Pulkki
The spatial capture patterns according to head-related transfer functions (HRTFs) can be approximated using linear beamforming techniques. However, assuming a fixed spatial aliasing frequency, reducing the number of sensors and thus the array size causes the linear approach to amplify the microphone noise excessively, unless the beam patterns are made broader than determined by the HRTFs. An adaptive technique is proposed that builds upon the assumption that binaural perception is largely determined by a set of short-time inter-aural parameters in frequency bands. The parameters are estimated from the noisy HRTF beam pattern signals as a function of time and frequency. As a result of temporal averaging, the effect of the noise is mitigated while the perceptual spatial information is preserved. Signals with higher SNR from broader patterns are adaptively processed to obtain the parameters from the estimation stage by means of least-squares optimized mixing and decorrelation. Listening tests confirmed the perceptual benefit of the proposed approach with respect to linear techniques.
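Estimating short-time inter-aural parameters with temporal averaging might look like the sketch below. The choice of parameters (level difference, phase difference, coherence), the first-order recursive averaging, and the names `interaural_parameters` and `alpha` are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def interaural_parameters(left, right, alpha=0.9, eps=1e-12):
    """Smoothed inter-aural parameters per frame and frequency bin.

    left, right: complex STFT arrays of shape (frames, bins), e.g. the
    outputs of two HRTF-like beam patterns (hypothetical inputs).
    Returns (ild_db, ipd_rad, coherence), each of shape (frames, bins).
    """
    n_frames, n_bins = left.shape
    e_l = np.zeros(n_bins)
    e_r = np.zeros(n_bins)
    c_lr = np.zeros(n_bins, dtype=complex)
    ild = np.empty((n_frames, n_bins))
    ipd = np.empty((n_frames, n_bins))
    coh = np.empty((n_frames, n_bins))
    for t in range(n_frames):
        # Recursive temporal averaging reduces the variance of the estimates.
        e_l = alpha * e_l + (1 - alpha) * np.abs(left[t]) ** 2
        e_r = alpha * e_r + (1 - alpha) * np.abs(right[t]) ** 2
        c_lr = alpha * c_lr + (1 - alpha) * left[t] * np.conj(right[t])
        ild[t] = 10.0 * np.log10((e_l + eps) / (e_r + eps))  # level difference
        ipd[t] = np.angle(c_lr)                              # phase difference
        coh[t] = np.abs(c_lr) / np.sqrt(e_l * e_r + eps)     # coherence
    return ild, ipd, coh
```

A value of `alpha` closer to one averages over a longer time window, giving steadier estimates that react more slowly to changes in the scene.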
Spatial Audio Effects
Parametric Spatial Audio Effects, dafx12
A. Politis, T. Pihlajamäki and V. Pulkki
Parametric spatial audio coding methods aim to efficiently represent the spatial information of recordings with psychoacoustically relevant parameters. This study presents how these parameters can be manipulated in various ways to achieve a series of spatial audio effects that modify the spatial distribution of a captured or synthesised sound scene, or alter the relation of its diffuse and directional content. Furthermore, it discusses how the same representation can be used for spatial synthesis of complex sound sources and scenes. Finally, it is argued that the parametric description provides an efficient and natural way of designing spatial effects.
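Two simple manipulations of this kind, rotating the scene and changing the balance between diffuse and directional content, can be sketched on per-tile parameters. The representation (an azimuth in radians and a diffuseness value in [0, 1] per time-frequency tile) and the names `apply_spatial_effects`, `rotation` and `diffuse_gain` are assumptions for illustration only.

```python
import numpy as np

def apply_spatial_effects(azimuth, diffuseness, rotation=0.0, diffuse_gain=1.0):
    """Modify per-tile spatial parameters.

    azimuth:     direction estimates in radians.
    diffuseness: values in [0, 1], same shape as azimuth.
    """
    # Rotate the scene and wrap the result to the interval (-pi, pi].
    new_azimuth = np.angle(np.exp(1j * (np.asarray(azimuth) + rotation)))
    # Scale the diffuse content and keep the parameter in its valid range.
    new_diffuseness = np.clip(np.asarray(diffuseness) * diffuse_gain, 0.0, 1.0)
    return new_azimuth, new_diffuseness
```

The modified parameters would then be passed to the usual synthesis stage in place of the analysed ones.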
S. Delikaris-Manias, J. Gómez Bolaños, J. Eskelinen, I. Huhtakallio, E. Hæggström, and V. Pulkki
A method for auralizing arbitrary radiation patterns of acoustic sources inside a real room is presented. The method utilizes laser-induced breakdown (LIB) as a point source. In this study we (1) demonstrated the performance of a volumetric array of LIBs for synthesizing arbitrary radiation patterns, (2) auralized the radiation pattern of a loudspeaker and compared the measured and synthesized impulse responses in a reverberant room, and (3) evaluated the method using listening tests. The synthesized room response matched the target response well both in room response reconstruction and in listening tests.
L. McCormack, S. Delikaris-Manias, and V. Pulkki
This paper details a software implementation of an acoustic camera, which utilises a spherical microphone array and a spherical camera. The software builds on the Cross Pattern Coherence (CroPaC) spatial filter, which has been shown to be effective in reverberant and noisy sound field conditions. It is based on determining the cross spectrum between two coincident beamformers. The technique is exploited in this work to capture and analyse sound scenes by estimating a probability-like parameter of sounds appearing at specific locations. Current techniques that utilise conventional beamformers perform poorly in reverberant and noisy conditions, due to the side-lobes of the beams used for the power-map. In this work we propose an additional algorithm to suppress side-lobes based on the product of multiple CroPaC beams. A Virtual Studio Technology (VST) plug-in has been developed for both the transformation of the time-domain microphone signals into the spherical harmonic domain and the main acoustic camera software; both are available for download.
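The side-lobe suppression idea, taking the product of several beams whose side-lobes fall in different places, can be illustrated on already-computed maps. This sketch assumes each map holds probability-like values in [0, 1] over the same grid of directions; the function name `combine_power_maps` is hypothetical, and how the individual beams are generated is not shown.

```python
import numpy as np

def combine_power_maps(maps):
    """Element-wise product of several probability-like maps.

    maps: array-like of shape (n_beams, n_directions), values in [0, 1].
    Directions where every beam responds strongly survive; a side-lobe
    present in only some of the beams is attenuated.
    """
    maps = np.clip(np.asarray(maps, dtype=float), 0.0, 1.0)
    return np.prod(maps, axis=0)

# Example: the main lobe (index 0) is shared, the side-lobes are not.
combined = combine_power_maps([[1.0, 0.5, 0.0],
                               [1.0, 0.0, 0.5]])
```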