Bridging the Gender Gap in AI: Is It Possible?

by Leila Nilipour

In March 2018, a draft biography of Canadian optical physicist Donna Strickland was rejected for inclusion on English Wikipedia. A volunteer editor considered her not “notable” enough to merit a page, per the encyclopedia’s standards. Later that year, Strickland won a Nobel Prize in Physics, becoming the third woman to do so. Her Wikipedia page was quickly created.  

The Wikimedia Foundation, host of Wikipedia, explained why the recent Nobel laureate did not have an entry in the largest online encyclopedia before receiving the prize, attributing the omission to insufficient media coverage rather than a lack of significant contributions to her field. 

Strickland’s case reflects a greater issue: Wikipedia’s gender gap. According to the Wikimedia Foundation, only 19% of all biographies on Wikipedia are about women, while about 13% of editors identify as women. A 2021 study found that women’s biographies that meet the encyclopedia’s standards are “more frequently considered non-notable and nominated for deletion compared to men’s biographies.”[2] Although activists constantly organize training workshops and edit-a-thons to reduce this gender gap, those efforts have not been enough to close it.

The fact that women worldwide disproportionately shoulder the burden of unpaid labor likely further complicates efforts to reduce the gender gap among Wikipedia editors. I started reflecting on this during an online Wikipedia workshop with a group of Wikipedia editors focused on getting more Latin American women involved. In this process, I realized that many women, including myself, often missed weekly training sessions due to caring responsibilities. Additional research suggests that psychological factors may also discourage women from engaging as editors in the online encyclopedia.[4]  

Wikipedia’s gender gap has implications that slide surreptitiously into our daily routines and into the new ways people search for information online. For example, when we use popular AI-powered tools such as Siri, Alexa, or ChatGPT, whose training models integrate data from Wikipedia, the answers we get are based on biased data.[3] This is far from new. The Pitt Cyber report Who Authors the Internet claims that “the omission of women’s voices in AI is the latest manifestation of a long history of underrepresenting women in data collection and analysis.”

Can we do something about it? Considering what happened to Donna Strickland before she received the Nobel, I thought one effective approach could be to organize edit-a-thons that actively involve journalists. Through their reporting, media professionals are important producers of the secondary sources needed for a subject to be considered “notable” on Wikipedia. Engaging them is one step toward raising awareness about the significance of featuring underrepresented groups in their narratives, and toward shifting the tide in favor of less biased responses from artificial intelligence tools that use Wikipedia in their training models.

As it turns out, it is much more complicated than that, although I still think an edit-a-thon with journalists wouldn’t hurt. According to an article by Ravit Dotan, founder of TechBetter, a consultancy focusing on responsible AI, there are multiple layers to data bias in Generative AI beyond the gender gap. For instance, the data used for training typically comes from Western countries, so it doesn’t fully represent reality outside those geographies. However, Dotan argues that providing representative data to the algorithm isn’t enough either, as “the design choices engineers make when building the algorithm” play a critical role in reinforcing or mitigating bias.[6]

So, what is left for us, the mere mortals who don’t design algorithms? In another article, Dotan offers several suggestions on ways that we can proactively reduce bias when using chatbots. One idea is to employ “anti-bias prompts” before asking for an answer, such as “don’t make any stereotypical or biased assumptions.” [7]  
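To picture what an “anti-bias prompt” looks like in practice, here is a minimal, purely illustrative sketch: a small function that prepends Dotan’s suggested instruction to whatever question the user wants to ask a chatbot. The function name and prefix wording are my own illustration of the idea, not code from Dotan’s article, and the actual chatbot call is left out since it would depend on whichever tool you use.

```python
# Illustrative sketch of Dotan's "anti-bias prompt" idea:
# prepend a debiasing instruction to a question before
# sending it to a chatbot. The chatbot call itself is omitted.

ANTI_BIAS_PREFIX = "Don't make any stereotypical or biased assumptions. "

def with_anti_bias_prompt(question: str) -> str:
    """Return the user's question wrapped with an anti-bias instruction."""
    return ANTI_BIAS_PREFIX + question

# The wrapped string is what you would paste or send to the chatbot.
prompt = with_anti_bias_prompt("Describe a typical physicist.")
print(prompt)
```

The same effect can be achieved by simply typing the instruction before your question in the chat window; the point is the habit, not the code.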

Meanwhile, Elise Silva, the Director of Policy Research at Pitt Cyber and one of the minds behind Who Authors the Internet, suggests that we follow legislation or public policy surrounding AI, given that states can regulate it. “Calling your legislators is one way of influencing how these systems are implemented, even if not changing the systems themselves,” she said.  

Another way may be to engage with AI literacy work, one of the areas Pitt Cyber is active in, focusing on K-12 learners. As graduate students or faculty members, we can follow their lead by educating undergraduates on responsible interactions with AI technologies. 

We can safely assume that AI-powered tools are here to stay and may significantly influence how many of us engage with and make sense of the world. Although the ways these technologies are developed and trained are beyond the control of individual users, as we have seen, we do have avenues for individual action. By raising awareness about these tools’ strengths and flaws, pressuring our policymakers for responsible AI legislation, and, why not, editing Wikipedia, we can play an active role in shaping the future of AI and its impact on society.