Working AI: Building Bespoke Models With Jade Abbott

Title: Senior Machine Learning Engineer, Retro Rabbit

Location: Johannesburg, South Africa

Education: BS and MS in Computer Science, University of Pretoria

Favorite ML area: Natural Language Processing

Favorite ML researchers: Emily Bender, Timnit Gebru, Julia Kreutzer, Vukosi Marivate, Graham Neubig, Sebastian Ruder


Jade Abbott turned a childhood desire for a robotic best friend into a career training computers to understand human language. Having studied AI in school, she got her first job coding traditional software, but she found ways to apply machine learning to her work until that became central to her role. In the meantime, she founded an open source project to train NLP models on African languages. She spoke with us about her fascination with language, the importance of community, and how to incorporate wine into your learning practice.

How did you get interested in AI?

When I was growing up, my father was very much into science fiction and my mother used to be a developer. So I was always around the idea of AI from a very young age. I used to read a lot of Isaac Asimov and watch a lot of Star Trek with my dad, and obviously I fell in love with Lieutenant Commander Data. So, as a child, I already knew that I wanted to build a robot friend that I could talk to. I went to school for computer science and first chose to study robotics. Along the way, I realized it wasn’t a robot I was interested in but natural language processing and communication.

How did you become a machine learning engineer?

While I was doing my master’s, around seven years ago, I started working as a software engineer. Back then machine learning wasn’t really a thing like it is now. The deep learning revolution was just getting started. There, I got to work on a project where they did have quite a hard NLP problem. I was like, wait, I’ve been keeping up to date with all that’s been going on with this, and we actually need to use some deep learning here.

What was that project?

It was for a startup called Kalido, which matches people with job opportunities. They needed a piece of natural language processing that could do semantic relatedness. So if I’m looking for someone to make a website, and somebody’s profile says they are a JavaScript developer, the system will know that these concepts refer to the same thing. 

We started off with really basic, dummy rules: If a word occurred in both sentences, that was a match. However, that doesn’t work for more advanced semantic matches. And so we brought in a number of deep learning models. It was quite difficult, because it was pre-transformer. There was very little literature back then, and we had very little data. So we had to improvise on how to make models robust, how to ensure reproducibility, and how to put machine learning into the software engineering process.
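As an illustration only (this is not Kalido's code, and the function names are hypothetical), a word-overlap rule of the kind described above might look like the following minimal Python sketch. It shows why such a rule misses matches like "website" and "JavaScript developer", which share no words.

```python
import re


def tokens(text: str) -> set:
    """Lowercase a sentence and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_match(a: str, b: str) -> bool:
    """Naive rule: two sentences match if they share at least one word."""
    return bool(tokens(a) & tokens(b))


# Shared words ("javascript", "developer"), so the rule fires.
print(overlap_match("Looking for a JavaScript developer",
                    "Senior JavaScript developer"))  # True

# Related in meaning but no shared words, so the rule misses it.
print(overlap_match("I need someone to make a website",
                    "JavaScript developer"))  # False
```

A rule like this only captures surface overlap, which is the gap that learned models of semantic relatedness are meant to close.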

What was it like witnessing the transformer come on the scene?

It was really crazy. “Attention Is All You Need” was published, and six or eight months later BERT came out. I was working on a project using an older LSTM-RNN model and I was about to give up. Then I pulled in BERT code and fine-tuned the model, and it just destroyed the old model I had been working with for a long time, beating it by 10 percent. From there, it took about three weeks to go into production.

How would you describe Retro Rabbit’s work, and how does machine learning fit in?

We’re software consultants and we build bespoke things for clients. We always say, we’ll solve whatever your problem is with whatever technology we have at our disposal. We’ve hired a lot of people with an interest in machine learning, so we wind up utilizing those skills. And just as often, a client wants us to use machine learning, and we’ll look at their problem and see they only need a database.

What do you like most about your job?

I like the fact that I always feel like I’m working on a problem that hasn’t quite been solved yet. When we started, there weren’t many people trying to put these models into production. So figuring out the process around that was really, really, really new. And I continue to enjoy those sorts of challenges. The other part I love is that I have to keep up with research to be able to do my job. I have a love-hate relationship with research. The job forces me to read, go to conferences, and try new things.

And what do you find most challenging?

I get really obsessed with ensuring that users will find the app useful, and that pulls me away from my favorite thing, which is training new models. At this point in my life, I suppose it’s correct for me to be doing more of that managerial work, making sure we’re delivering the right thing. But in that role, I’m not actually doing the building, and that’s a downside.

What are you working on? 

There’s stuff I can’t talk about yet because it’s early days and I don’t want to break an NDA. Some projects are around employee wellness, a couple more are insurance-based applications. 

I also founded a research initiative called Masakhane (“we build together” in isiZulu), which is a grassroots collaborative online effort to produce NLP for African languages. It’s completely open source, open research, open data, and has over 150 members.

Why is NLP for African languages important?

People have a strange belief that if an NLP tool works in English, it will work in other languages, which is not proven at all. African researchers might want to publish a model that’s useful in their own language, but they get comments from reviewers saying their model should be run in English or it won’t be impactful. There is a horrifying state of the world right now where, if you can’t speak English correctly or if you speak it with an African accent, you’re deemed less intelligent. This bias becomes replicated in the tools we use, and the last thing I want is for machine learning to reinforce that. Which is why I think we need to start supporting African languages.

These models would also help people communicate across the world. And I think the ability to communicate well is really important. 

What’s the AI scene like in Johannesburg? What about South Africa more broadly? 

Taking in Africa as a whole, there are particular hotspots. South Africa is one of them. Nigeria, Kenya, and Rwanda are others. In these places, you have universities, startups, and machine learning organizations. It’s very grassroots: Let’s found a startup, let’s start a meetup. There’s a DIY element to it that I love. This is particularly evident in the Nigerian machine learning scene. 

South Africa has universities that offer the right degrees. There’s also a lot more funding from the government. We’ve got industry conferences focused on startups, like AI Expo, and big research conferences such as Deep Learning Indaba, which is actually Africa-wide but has been held twice in South Africa.

Here in Johannesburg, it feels like a community. You’re surrounded by others who are interested in machine learning. There’s always a lot going on. 

Following on that, what are your thoughts about diversity in the machine learning industry? 

There are different kinds of bias. There’s bias that we talk about often, in terms of individual racism and sexism, but it also emerges as institutional bias. Everyone at every part of the process has to accept that it’s their responsibility to try to do better. As soon as we all stop pointing fingers and take a little responsibility, we could make huge progress as a group. If you have some level of power, like working for a well-known machine learning group, you should self-reflect on what levers you have at your disposal to reduce bias and improve diversity. 

I’ve got a great story about that, actually. At ICLR, we had a workshop about NLP for African languages. Ethiopian NLP researcher Surafel M. Lakew mentioned how difficult it was for him to attend graduate school in the U.S. because his home city didn’t have a testing facility where he could take the GRE. Graham Neubig, who studies NLP at Carnegie Mellon University and was in attendance, went back and spoke to his university about this. The university agreed to make the GRE optional, and Neubig was able to get registration fees waived for applicants from diverse backgrounds. 

How do you continue learning?

I’ve had various strategies for staying on top of the machine learning literature, all of which tend to fail in one way or another. The most useful thing I’ve done is surround myself with other people in the community. That way learning is always shoved in my face. I also love to attend conferences, where I can get a lot of learning done at once. 

Another thing I’ve done, which obviously I can’t do now because of Covid, is have a reading group. A friend or two and I will read a paper, usually one that’s classic, popular, or controversial, then go out for dinner and wine and discuss it. 

You’ve done a bit of mentoring. What makes a good mentor?

Being a good mentor takes a lot of listening and the ability to be very humble. You have to be open to the possibility that the solution you’re proposing might not be right. I’ve also found that empathy is an important trait. For mentees, the most important thing is to be direct. You have to ask for what you need, whatever that might be, and be specific about what you want feedback on. You have to drive the relationship.

What problems do you think AI is well positioned to solve?

Probably my favorite area is education. For instance, a number of studies show that a child will learn mathematics better if they’re taught in the language they use at home. We can use NLP to produce content in the language people understand best.

What advice do you have for people who are hoping to break into AI?

You should definitely take courses and read books, but you can’t expect that to lead directly to a job in machine learning. It’s much easier to find work as a junior developer or data analyst. Go for something adjacent that lets you get exposure. You can learn from the people around you and dabble in machine learning, and maybe show your employers how they can modernize into a fully fledged data science system. Find a job that’s close to what you want to do and then find ways to apply machine learning to your work. If you wait for the perfect machine learning job to come along, you’re probably never going to find it.