Third-Order Tensor Decomposition Based Multichannel Linear Prediction for Robust Dereverberation

George, Nithin V.

doi:10.1109/IWAENC61483.2024.10694189

Source

2024 18th International Workshop on Acoustic Signal Enhancement (IWAENC 2024), Proceedings

Date Issued

2024-01-01

Author(s)

Yadav, Shekhar Kumar

George, Nithin V.

DOI

10.1109/IWAENC61483.2024.10694189

Abstract

Reverberation is one of the major causes of speech degradation. The popular weighted prediction error (WPE) technique performs dereverberation by estimating the late room reflections using a multichannel prediction filter. However, the prediction filter in each short-time Fourier transform (STFT) band must be long enough to model the late reverberation component accurately. This requires inverting a large matrix in every frequency bin, which makes the WPE method computationally expensive. The WPE method is also vulnerable to additive noise. To tackle these issues, we present a computationally efficient dereverberation technique. We decompose the long prediction filter into three smaller sub-filters using third-order tensor decomposition. One sub-filter acts as a spatial filter, while the other two act as temporal prediction filters. We then develop an iterative algorithm to obtain optimal solutions for all three sub-filters. The spatial filter is optimized as a weighted distortionless beamformer to deal with noise, while the temporal filters are optimized as weighted Wiener filters. Since the sub-filters are shorter, their covariance matrices are computationally easier to invert, leading to an efficient algorithm. Simulation results show that the proposed algorithm is robust to noise and outperforms current WPE-based algorithms in terms of dereverberation.
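
The computational argument in the abstract can be illustrated with a small numerical sketch. The abstract does not state the exact factorization, so the code below assumes a rank-1 Kronecker-product form, in which the long filter equals the Kronecker product of one spatial and two temporal sub-filters. The sizes `M`, `L1`, `L2`, the names `g_s`, `g_t1`, `g_t2`, and the stacking order of the observation vector are all hypothetical choices made for illustration, not values or notation taken from the paper.

```python
import numpy as np

def kron3(a, b, c):
    """Build the long filter a (x) b (x) c from three short 1-D sub-filters."""
    return np.kron(np.kron(a, b), c)

def apply_factored(a, b, c, x):
    """Evaluate (a (x) b (x) c)^H x without ever forming the long filter.

    Assumes x is stacked so that the index of `a` varies slowest and the
    index of `c` varies fastest (an assumption, matching np.kron ordering).
    """
    X = x.reshape(a.size, b.size, c.size)  # view x as a third-order tensor
    return np.einsum("i,j,k,ijk->", a.conj(), b.conj(), c.conj(), X)

rng = np.random.default_rng(0)

def crandn(n):
    """Complex Gaussian vector, standing in for STFT-domain quantities."""
    return rng.standard_normal(n) + 1j * rng.standard_normal(n)

# Hypothetical sizes: M microphones, temporal filter length L1 * L2.
M, L1, L2 = 4, 5, 6
g_s, g_t1, g_t2 = crandn(M), crandn(L1), crandn(L2)
x = crandn(M * L1 * L2)  # stacked multichannel observation vector

g_full = kron3(g_s, g_t1, g_t2)
y_full = np.vdot(g_full, x)                    # g^H x with the long filter
y_fact = apply_factored(g_s, g_t1, g_t2, x)    # same output, factored form

n_full = g_full.size       # unknowns in the long filter: M * L1 * L2
n_fact = M + L1 + L2       # unknowns across the three sub-filters

# Rough cubic cost model for inverting the covariance matrices involved.
cost_full = n_full ** 3
cost_fact = M ** 3 + L1 ** 3 + L2 ** 3

print(np.allclose(y_full, y_fact), n_full, n_fact, cost_full, cost_fact)
```

Under these assumptions, each iteration would solve three small systems (of sizes `M`, `L1`, and `L2`) instead of one system of size `M * L1 * L2`. The sketch covers only the factorization and its cost; the weighted distortionless beamformer and weighted Wiener updates described in the abstract are not implemented here.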

URI

https://d8.irins.org/handle/IITG2025/29073

Subjects

Dereverberation | Microphone array | Multichannel linear prediction | Speech enhancement | Tensor decomposition