Channel: Blog – Center for Data Innovation

An Open Dataset for Multilingual Speech Research 

Facebook AI has released Multilingual LibriSpeech (MLS), a multilingual audio dataset intended to advance speech research for AI-powered services such as voice assistants. MLS extends English-only audiobook data from LibriVox with seven additional languages (German, Dutch, French, Spanish, Italian, Portuguese, and Polish), for a total of more than 50,000 hours of audio. MLS also provides language-model training data and pretrained language models, so that researchers can compare different automatic speech recognition systems on the same data.

Get the data.

Image credit: Mahesh Patel 

