Google introduced Voice Search support for Zulu and Afrikaans, as well as South African-accented English. “The development of Natural Language Processing (NLP) technology in these languages is critical for enabling information access for everybody. Indeed, these technologies have the potential to break language barriers.
“With Prof. Barnard’s team, we collected acoustic data in the three languages and developed lexicons and grammars, and Charl and others developed the three Voice Search systems. A team of language specialists traveled to several cities, collecting audio samples from hundreds of speakers in multiple acoustic conditions, such as street noise and background speech. Speakers were asked to read typical search queries into an Android app specifically designed for audio data collection,” stated Google.
“For Zulu, we faced the additional challenge of having few text sources on the web. We often analyze search queries from local versions of Google to build lexicons and language models. However, for Zulu there weren’t enough queries to build a useful language model. Furthermore, since the language has few online data sources, native speakers have learned to use a mix of Zulu and English when searching for information on the web. So for the Zulu Voice Search product, we had to build a truly hybrid recognizer, allowing a free mixture of both languages. Our phonetic inventory covers both English and Zulu, and our grammars allow natural switching from Zulu to English, emulating speaker behavior,” explained Google.
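To make the idea of a hybrid recognizer more concrete, here is a minimal Python sketch of one piece of it: merging an English and a Zulu pronunciation lexicon so that a single query can mix words from both languages. This is an illustration only, not Google's implementation. The function names (`merge_lexicons`, `phone_inventory`, `tag_query`), the word lists, and the phone symbols are all hypothetical placeholders, and a real system would also need acoustic and language models.

```python
# Toy lexicons: word -> list of phone symbols.
# ASSUMPTION: entries and phone symbols are illustrative placeholders,
# not real transcriptions and not data from Google's system.
ENGLISH_LEXICON = {
    "weather": ["w", "eh", "dh", "er"],
    "in": ["ih", "n"],
    "durban": ["d", "er", "b", "ah", "n"],
}
ZULU_LEXICON = {
    "izulu": ["i", "z", "u", "l", "u"],
    "namhlanje": ["n", "a", "m", "hl", "a", "n", "j", "e"],
}


def merge_lexicons(*named_lexicons):
    """Combine (language, lexicon) pairs into one word -> [(language, phones)] map."""
    merged = {}
    for lang, lexicon in named_lexicons:
        for word, phones in lexicon.items():
            merged.setdefault(word, []).append((lang, phones))
    return merged


def phone_inventory(merged):
    """Return the union of all phone symbols used by any language."""
    return {
        phone
        for entries in merged.values()
        for _lang, phones in entries
        for phone in phones
    }


def tag_query(query, merged):
    """Label each token with the language of its first lexicon entry."""
    tags = []
    for token in query.lower().split():
        entries = merged.get(token)
        tags.append((token, entries[0][0] if entries else "unknown"))
    return tags


HYBRID = merge_lexicons(("en", ENGLISH_LEXICON), ("zu", ZULU_LEXICON))

if __name__ == "__main__":
    # A mixed Zulu/English query is covered by the single merged lexicon.
    print(tag_query("izulu in Durban namhlanje", HYBRID))
    print(sorted(phone_inventory(HYBRID)))
```

Because both languages share one lexicon and one phone set in this sketch, a speaker can switch languages mid-query without the system having to decide up front which language is being spoken.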