Speaker-independent source cell-phone identification for re-compressed and noisy audio recordings

Authors: Vinay Verma; Nitin Khanna
Published: 2021-06-01 (June 2021)
Record date: 2025-08-31
Type: Article (Journal)
ISSN: 1573-7721
Pages: 23581-23603
DOI: 10.1007/s11042-020-10205-z
Scopus ID: 2-s2.0-85099090489
Handle: https://d8.irins.org/handle/IITG2025/25402
Keywords: Additive white Gaussian noise | Audio forensics | Audio re-compression | Cell-phone identification | Convolutional neural network | Device signature

Abstract: With the rapid increase in user-generated multimedia content, its extensive outreach over social media, and its potential in critical applications such as law enforcement, source identification from re-compressed and noisy multimedia is of great importance. This paper proposes a system for speaker-independent cell-phone identification from recorded audio. The system can handle test audio whose speech content and speaker both differ from those of the training audio. Each recorded audio has the device fingerprint implicitly embedded in it, which motivates the design of a CNN-based system that learns device-specific signatures directly from the magnitude of the discrete Fourier transform of the audio. This paper also addresses the scenario where the recorded audio is re-compressed to meet storage and network transmission requirements, which is common in this age of social media. Cell-phone classification from audio recordings in the presence of additive white Gaussian noise is addressed as well. We show that our proposed system performs as well as the state-of-the-art systems for the speaker-dependent case with clean audio recordings and exhibits much higher robustness in the speaker-independent case with clean, re-compressed, and noisy audio recordings.
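The abstract names two signal-processing steps: corrupting audio with additive white Gaussian noise and feeding the magnitude of the discrete Fourier transform to a CNN. The following is a minimal Python sketch of those two steps only, not the authors' implementation. The function names add_awgn and magnitude_dft_frames, the frame length, hop size, Hann window, sampling rate, and the 20 dB SNR are all assumptions chosen for illustration; the record does not state the paper's actual settings, and the CNN itself is not shown.

```python
import numpy as np


def add_awgn(signal, snr_db, rng=None):
    """Return the signal plus white Gaussian noise at the requested SNR (dB).

    Hypothetical helper: the noise variance is set from the measured
    signal power so that 10*log10(P_signal / P_noise) equals snr_db.
    """
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=np.float64)
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def magnitude_dft_frames(signal, frame_len=1024, hop=512):
    """Split the signal into overlapping frames and return |DFT| per frame.

    Hypothetical helper: frame_len, hop, and the Hann window are
    illustrative assumptions, not values taken from the paper.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (signal.size - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # rfft keeps only the non-negative frequencies of a real-valued input.
    return np.abs(np.fft.rfft(frames, axis=1))


# Usage: a one-second synthetic tone stands in for a recorded utterance.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0          # assumed 16 kHz sampling rate
clean = np.sin(2 * np.pi * 440.0 * t)
noisy = add_awgn(clean, snr_db=20.0, rng=rng)
features = magnitude_dft_frames(noisy)  # one row per frame, one column per bin
```

Each row of features is the magnitude spectrum of one frame, which is the kind of two-dimensional input a CNN classifier could consume. Real recordings would replace the synthetic tone.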