VocID is a dataset of short, command-based utterances collected for research on speaker verification and voice spoofing in realistic access-control scenarios.
It includes over 10,000 recordings from 30 native Italian speakers, captured under controlled acoustic conditions using multiple mobile devices.
The dataset is particularly suited for evaluating the effectiveness of deepfake attacks, biometric verification pipelines, and automatic speech recognition (ASR) in voice-controlled systems (e.g., smart homes, banking interfaces).
- 30 speakers (18 male, 12 female)
- 8 fixed voice commands (in Italian and English)
- Reading tasks and multi-command sessions
- Recorded with Google Pixel 3a and iPhone 14
- 16 kHz, 16-bit PCM WAV format
- Annotated metadata: speaker ID, gender, device, language
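After obtaining the data, it can be useful to confirm that each recording matches the audio format listed above. The following is a minimal sketch using only Python's standard `wave` module. The function name `wav_matches_spec` and the file path in the usage note are hypothetical and not part of the dataset distribution. The channel count is not checked because it is not stated here.

```python
import wave


def wav_matches_spec(path, sample_rate=16000, sample_width_bytes=2):
    """Return True if the WAV file is uncompressed PCM at the expected
    sample rate and sample width (2 bytes = 16 bits).

    The defaults mirror the format listed above (16 kHz, 16-bit PCM).
    """
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == sample_rate
            and wav.getsampwidth() == sample_width_bytes
            and wav.getcomptype() == "NONE"  # "NONE" means uncompressed PCM
        )
```

For example, `wav_matches_spec("some_recording.wav")` (a placeholder path) would return `False` for a file that had been resampled to a different rate. Note that `wave.open` raises `wave.Error` for files it cannot parse as WAV.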
Due to the inclusion of human voice data, the VocID dataset is distributed only for non-commercial research purposes and requires an access request.
To obtain the dataset, follow the instructions below.
- Download the License Agreement (PDF)
- Complete the License Agreement and send it to us ([email protected]), signed by the requestor of the data and by a legal representative of your institution. The document must be certified by the stamp of your company or institution or by an electronic signature.