Leading in the industry
Meet
ABEBA BIRHANE
Senior adviser in AI Accountability at Mozilla Foundation and adjunct assistant professor at Trinity College Dublin, Birhane audits the datasets used to train AI models.
Achievements
Abeba Birhane’s journey into AI began with a critical gap she noticed in the field: as datasets used to train AI models grew exponentially, almost no one was scrutinizing their contents. These datasets, often scraped indiscriminately from the internet, contain harmful material that embeds biases into AI systems. Birhane, now a senior adviser in AI Accountability at Mozilla Foundation and adjunct assistant professor at Trinity College Dublin, has made it her mission to address this oversight.
Auditing AI at Scale
Working with a small team, Birhane has pioneered the field of auditing AI training datasets. The job is grueling. “Most of the time, my screen is not safe for work,” she says, noting that she can no longer work in public spaces like cafés. In her latest study, Birhane found that larger datasets are more likely to perpetuate harmful biases, contrary to the common belief that scaling up improves AI. “As datasets scale, hateful content also scales,” she explains.
To counteract these issues, some AI companies have built supplementary systems like content-moderation classifiers and reinforcement learning models to filter harmful material and encourage harmless behavior. But Birhane critiques these as unsustainable solutions. “It comes at a huge cost to disenfranchised, underpaid workers, often in the so-called third world. That’s not a good solution—not for the people that have to suffer and pay for it,” she says. She also calls out the opacity of corporate practices. “Corporations like OpenAI tend to be completely closed. We don’t know how they source or detoxify their datasets, so it’s difficult to suggest solutions. The first step is to open up.”
Reluctant Pioneer
Birhane didn’t initially seek out this line of work. A cognitive scientist by training, she was drawn into the field while working alongside machine-learning researchers during her Ph.D. program. She noticed a troubling lack of attention to how data was sourced, even though data is critical to model performance. Her dataset audits have had a profound impact on the AI industry, urging it to confront the foundational biases in its systems. Her work serves as a rallying cry for transparency, equity, and accountability in AI development.