Abstract: In this work, we propose CleanMel, a single-channel Mel-spectrogram denoising and dereverberation network for improving both speech quality and automatic speech recognition (ASR) performance ...
Abstract: Environmental Sound Recognition (ESR) is an essential task in audio analysis, involving the identification and classification of sounds from various environmental contexts. This study ...
This repo is the implementation of a research project aimed at enhancing Acoustic Side-Channel Attacks (ASCAs) using a novel combination of Vision Transformers (VTs) and Large Language Models (LLMs).
An open-source framework for analyzing birdsong recordings through acoustic feature extraction, dimensionality reduction, and neural audio synthesis. Transform audio signals into interactive 3D ...