audio datasets | Oxen.ai

Audio

These audio datasets help train and evaluate a computer's ability to understand, interpret, and manipulate audio. Using digital audio from microphones and deep learning models, machines can identify and classify sounds, and then react to what they “hear.”

LJ-Speech-Dataset

This is a public domain speech dataset consisting of 13,100 short audio clips of a single speaker reading passages from 7 non-fiction books. A transcription is provided for each clip. Clips vary in length from 1 to 10 seconds and have a total length of approximately 24 hours.

Audio, Speech Transcription

3.8 GB

313K | 5

Updated: 3 years ago
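As a quick sanity check on the LJ-Speech-Dataset figures above, the short Python sketch below works out the average clip length implied by the stated totals (13,100 clips, roughly 24 hours). The function name `mean_clip_seconds` is hypothetical and used only for illustration; the 24-hour total is approximate, so the result is an estimate rather than a measured value.

```python
# Figures quoted in the LJ-Speech-Dataset description.
NUM_CLIPS = 13_100
TOTAL_HOURS = 24  # approximate, per the description


def mean_clip_seconds(num_clips: int, total_hours: float) -> float:
    """Return the average clip duration in seconds implied by the totals."""
    return total_hours * 3600 / num_clips


avg = mean_clip_seconds(NUM_CLIPS, TOTAL_HOURS)
print(f"Average clip length: {avg:.1f} s")
```

The estimate falls inside the stated 1 to 10 second range, so the clip count and total duration are consistent with each other.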

MiniSpeechCommands

Audio, Audio Classification

252.5 MB

88K | 1

Updated: 3 years ago

Speaker-Recognition

Audio, Audio Classification

258.2 MB

17.5K

Updated: 3 years ago