How this engineer built the first Igbo voice-to-text AI model

July 18, 2024
Ijemma Onwuzulike

There's no doubt that artificial intelligence is moving fast, from ChatGPT, which blew our minds in 2022, to Sora and GPT-4o, which have left people bewildered.

While we're seeing consistent breakthroughs in AI, it is safe to say Africa has not made much progress. Though some countries, like Nigeria, have created policies and initiatives to improve AI development, these have not yielded many results.

However, American-born Ijemma Onwuzulike has built IgboSpeech, the first Igbo voice-to-text AI model Africans can identify with.

On a call with Techpoint Africa, Onwuzulike said the motivation for building the model was to make people enjoy speaking Igbo as casually as they would Spanish or French, and not only when it's being taught in a classroom.


This motivation led her to study and research languages. She found that learning a language follows recognisable patterns and processes.

She even found out what it takes for a platform like Duolingo to compile data on the different languages it offers.

The process

Ijemma Onwuzulike

"The end product that people see on Duolingo is just a language learning app, but behind the scenes, there is so much data input, research, and linguistic work that goes into that friendly user experience."

To make it easier for a platform like Duolingo to add Igbo as one of its languages, Onwuzulike decided to build an Igbo API in 2020.

This API would serve as a foundational infrastructure that would be useful for anyone building anything around the Igbo language.


A language API is like a digital dictionary that provides access to information about words in that language. It allows developers to easily retrieve data related to words, including definitions, synonyms, antonyms, and examples of usage.
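To make the "digital dictionary" idea concrete, here is a minimal sketch of a word lookup in Python. It is not the Igbo API's actual schema or code: the `WordEntry` fields, the `ENTRIES` store, and the `lookup` function are hypothetical names chosen for illustration, and the two sample words are placeholders.

```python
import unicodedata
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WordEntry:
    """One dictionary record. Field names are illustrative, not the Igbo API schema."""
    word: str
    definitions: List[str]
    examples: List[str] = field(default_factory=list)


def normalize(word: str) -> str:
    """Trim, lowercase, and NFC-normalize so that dotted vowels such as 'ụ'
    compare equal whether typed precomposed or with combining marks."""
    return unicodedata.normalize("NFC", word.strip().lower())


# Tiny in-memory stand-in for the word collection (sample data only).
_SAMPLE = [
    WordEntry("mmiri", ["water"], ["Nye m mmiri."]),
    WordEntry("ụlọ", ["house"]),
]
ENTRIES = {normalize(entry.word): entry for entry in _SAMPLE}


def lookup(word: str) -> Optional[WordEntry]:
    """Return the entry for a word, or None if it is not in the collection."""
    return ENTRIES.get(normalize(word))
```

A real service would put a function like `lookup` behind an HTTP endpoint and a database; the normalization step is shown because Igbo spelling relies on diacritics that can be encoded in more than one way.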

Building a word API is quite challenging. First, you need a massive and accurate collection of words and their meanings. This was particularly hard for Onwuzulike and her team because existing Igbo dictionaries had inconsistencies.

After sorting through these inconsistencies, the next step is to create a system that can quickly and efficiently respond to many requests from users.

This involves setting up a strong backend, adding caching to speed things up, and ensuring the system can grow as more people use it.
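As a rough illustration of the caching step, the sketch below wraps a slow lookup in Python's standard `functools.lru_cache`, so repeated requests for the same word skip the expensive call. The `STORE` data, the `load_from_store` function, and the call counter are assumptions made for this example; the article does not describe how the Igbo API implements caching.

```python
from functools import lru_cache
from typing import Tuple

# Hypothetical backing store standing in for a database.
STORE = {"mmiri": ("water",), "ụlọ": ("house",)}

# Counts how often the slow path actually runs.
CALLS = {"store_reads": 0}


def load_from_store(word: str) -> Tuple[str, ...]:
    """Pretend this is an expensive database query."""
    CALLS["store_reads"] += 1
    return STORE.get(word, ())


@lru_cache(maxsize=1024)
def cached_definitions(word: str) -> Tuple[str, ...]:
    """Serve repeat requests from memory; only the first request per word
    reaches the store. Tuples are returned so cached values stay immutable."""
    return load_from_store(word)
```

Calling `cached_definitions("mmiri")` twice reads the store once; `cached_definitions.cache_info()` reports the hit and miss counts. An in-process cache like this is the simplest option; a service running on several servers would more likely use a shared cache.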

Despite these difficulties, the Igbo API currently has 5,119 words, 29,922 Igbo sentences, 5,151 audio recordings, 571 Igbo definitions, and 464 Igbo proverbs.

The API is already integrated into Nkowa Okwu, an Igbo language learning and course creation platform founded by Onwuzulike.

This feat is a demonstration of her technological expertise.

Currently a software engineer at Google, she has spent almost ten years of her life learning and working in tech.

"I've always been interested in technology ever since I was a kid."

This interest in technology led her to study Computer Science at Dartmouth College, Hanover, USA. She also studied Japanese and literature, which set her on the path to building technologies around languages.

What's more, she tinkered around with machine learning and artificial intelligence, which now serve her well.

Before she graduated from Dartmouth College, she had interned twice at eBay. She went on to work for Squarespace after graduation.

IgboSpeech

Screenshot of IgboSpeech demo website

The IgboSpeech demo website was launched on July 1, 2024. IgboSpeech is an automatic speech recognition (ASR) model tailored to Igbo speech and text. It is partially built on the Igbo API but has its own dataset.

According to the website, when you record Igbo speech, the model transcribes the recording into text.

The use cases Onwuzulike envisages with IgboSpeech include "automatically generating subtitles for Igbo movies, YouTube videos or even a note-taking app."
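The subtitle use case can be sketched as follows: given transcript segments with start and end times, write them out in the widely used SubRip (`.srt`) layout. This assumes a speech model that returns timestamped segments, which the article does not confirm for IgboSpeech; the `(start, end, text)` tuple shape and both function names are hypothetical.

```python
from typing import Iterable, Tuple

Segment = Tuple[float, float, str]  # (start seconds, end seconds, text)


def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: Iterable[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}"
        )
    return "\n\n".join(cues) + "\n"
```

For example, `to_srt([(0.0, 1.5, "Kedu")])` produces a single cue running from `00:00:00,000` to `00:00:01,500`.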

She explained that the focus on voice-to-text and not text-to-voice or even translating Igbo to English was because "it supports people who already speak Igbo.

"This would be incredibly helpful for translators, people contracted to translate things from Igbo to English or to write out large bodies of text."

She also indicated that she and her team intend to work on text-to-voice and translation down the line, but right now, voice-to-text is the best value for money.

Because the project is non-profit, one way Onwuzulike and her team hope to raise money to build out IgboSpeech's features is through grants.

One fund they're currently looking at is the Lacuna Fund, which gives grants to create datasets in areas such as Antimicrobial Resistance (AMR) and Natural Language Processing (NLP).

Getting the grant would help ensure that the best hands are working on IgboSpeech. Onwuzulike said the plan is to have the best software engineers, audio recorders to collect data, and Igbo linguists.

From the Igbo API to IgboSpeech, her work has received an incredibly positive reception.

"When we released Igbo API back in 2020, there were a lot of Nigerian engineers who were super excited about working on an open-source project."

Perhaps her project will inspire other Africans within and outside Africa to build and develop AI for the continent.

He's a geek, a sucker for Blockchain and an all-round tech lover. Find me on Twitter @BoluAbiodun1.


 Techpremier Media Limited. All rights reserved