Noise-aware Speech Enhancement using Diffusion Probabilistic Model

Yuchen Hu, Chen Chen, Ruizhe Li, Qiushi Zhu, Eng Siong Chng

Research output: Working paper (Preprint)

Abstract

With recent advances in diffusion models, generative speech enhancement (SE) has attracted a surge of research interest due to its great potential for handling unseen testing noises. However, existing efforts mainly focus on the inherent properties of clean speech for inference, underexploiting the varying noise information in real-world conditions. In this paper, we propose a noise-aware speech enhancement (NASE) approach that extracts noise-specific information to guide the reverse process in the diffusion model. Specifically, we design a noise classification (NC) model to produce an acoustic embedding that serves as a noise conditioner for guiding the reverse denoising process. Meanwhile, a multi-task learning scheme is devised to jointly optimize the SE and NC tasks, in order to enhance the noise specificity of the extracted noise conditioner. Our proposed NASE is shown to be a plug-and-play module that can be generalized to any diffusion SE model. Experimental evidence on the VoiceBank-DEMAND dataset shows that NASE achieves significant improvement over multiple mainstream diffusion SE models, especially on unseen testing noises.
Original language: English
Publisher: ArXiv
Number of pages: 5
Publication status: Published - 16 Jul 2023

Bibliographical note

5 pages, 2 figures

Keywords

  • eess.AS
  • cs.LG
  • cs.SD
