Challenge Tracks
To address the challenges outlined in the introduction, the AT-ADD challenge comprises two complementary tracks: robust speech deepfake detection (Track 1) and universal deepfake detection across multiple audio types (Track 2).
Track 1: Robust Speech Deepfake Detection
Track 1 focuses on developing robust countermeasures for speech deepfake detection under realistic deployment conditions.
In practical scenarios, speech recordings may be captured by diverse devices (e.g., mobile phones, wearable devices, and in-vehicle systems), transmitted through different channels, and affected by varying acoustic environments. These factors introduce substantial domain shifts between training data and real-world audio. In addition, emerging synthesis paradigms, such as audio large language models and neural codec-based generation frameworks, further increase the diversity of synthetic speech and pose new challenges for existing detection methods.
This track aims to evaluate the ability of countermeasures to maintain robust performance under device variability, environmental conditions, and unseen generation methods.
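The device and channel variability described above is often approximated during training through data augmentation. The sketch below illustrates this idea with a crude band-limiting filter (standing in for low-quality capture devices) and additive noise at a random signal-to-noise ratio (standing in for acoustic environments); the specific filter and SNR range are illustrative assumptions, not the challenge's official augmentation pipeline.

```python
import numpy as np

def simulate_channel(wave, rng, snr_db_range=(5.0, 25.0)):
    """Illustrative device/channel simulation for robustness training.

    A moving-average (low-pass) filter stands in for band-limited
    recording devices; white noise at a random SNR stands in for
    varying acoustic environments. Both are placeholder choices.
    """
    # Crude band limitation: moving-average filter of random length.
    k = int(rng.integers(2, 8))
    kernel = np.ones(k) / k
    filtered = np.convolve(wave, kernel, mode="same")

    # Additive noise at a random signal-to-noise ratio.
    snr_db = rng.uniform(*snr_db_range)
    signal_power = np.mean(filtered ** 2) + 1e-12
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=filtered.shape)
    return filtered + noise

rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)  # 1 s, 440 Hz tone
augmented = simulate_channel(clean, rng)
```

Applying such transformations on the fly during training exposes a detector to a wider range of recording conditions than the clean training corpus alone, which is one common way to reduce the train/deployment domain shift this track targets.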
Track 2: All-Type Audio Deepfake Detection
Track 2 extends the task beyond speech to all types of audio.
While existing research often develops dedicated detectors for specific audio types, real-world audio content can belong to any of several types, including speech, sound, singing voice, and music.
This track aims to promote the development of universal audio deepfake countermeasures capable of detecting synthetic audio across diverse audio types and synthesis mechanisms within a unified framework.
Participants are encouraged to develop countermeasures that can generalize across diverse audio types and unseen generation methods.
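Countermeasures of this kind are typically framed as binary classifiers (bonafide vs. spoofed) and scored with the equal error rate (EER), the operating point where the false-acceptance and false-rejection rates coincide; note that the challenge's official metric is an assumption here, as it is not stated in this section. A minimal EER computation can be sketched as:

```python
import numpy as np

def equal_error_rate(scores, labels):
    """Compute the EER from detection scores and binary labels.

    Assumes higher scores indicate bonafide audio; labels use
    1 = bonafide, 0 = spoofed. Sweeps every observed score as a
    threshold and returns the rate where FAR and FRR are closest.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    best_gap, eer = np.inf, 1.0
    for t in np.unique(scores):
        far = np.mean(scores[labels == 0] >= t)  # spoofed accepted
        frr = np.mean(scores[labels == 1] < t)   # bonafide rejected
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2.0
    return eer

# A perfectly separable toy example yields an EER of 0.
eer = equal_error_rate([0.9, 0.8, 0.7, 0.3, 0.2, 0.1],
                       [1, 1, 1, 0, 0, 0])
print(eer)  # 0.0
```

Because the EER is threshold-free, it allows countermeasures trained on different audio types and score scales to be compared within the unified framework this track promotes.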