AI Model Favors Wealthier, Western Perspectives, Amplifying Inequality in Digital Technology Representation
A recent study by University of Michigan researchers has revealed that biases within OpenAI's CLIP, a large image-text AI model, disproportionately favor wealthier and Western perspectives. Because the model performs worse on images from lower-income and non-Western households, it could exacerbate inequality in how people are represented in digital technology.
The study, initiated and advised by Rada Mihalcea, the Janice M. Jenkins Collegiate Professor of Computer Science and Engineering, underscores the importance of making AI tools work for everyone at a time when these tools are being widely deployed. The researchers found, however, that a significant portion of the population, especially people in the lowest income brackets, is not adequately reflected in these applications.
The CLIP AI model serves as a foundation for many applications, including the popular DALL-E image generator. When such models are trained on data that represents only a narrow view of the world, the resulting bias can propagate into downstream applications, perpetuating inequalities in digital representation.
Joan Nwatu, a doctoral student in computer science and engineering on the research team, expressed concern over the potential consequences: "If a software was using CLIP to screen images, it could exclude images from a lower-income or minority group instead of truly mislabeled images. It could sweep away all the diversity that a database curator worked hard to include."
To assess the bias in CLIP, the researchers evaluated its performance using Dollar Street, an image dataset encompassing diverse households across Africa, the Americas, Asia, and Europe. The dataset features over 38,000 images collected from households with monthly incomes ranging from $26 to nearly $20,000. These images depict everyday items and are manually annotated with relevant contextual topics.
CLIP generates a score that represents the compatibility between a piece of text and an image; downstream applications use this score for tasks such as image flagging and labeling. The researchers found that CLIP consistently assigned higher scores to images from higher-income households than to those from lower-income households. For example, under the topic "light source," images of electric lamps from wealthier households received higher scores than images of kerosene lamps from poorer households.
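To make the scoring step concrete, here is a minimal sketch of how such an image-text compatibility score can be computed with the open-source Hugging Face transformers implementation of CLIP. The checkpoint name and image filenames are illustrative assumptions, not details taken from the study.

```python
# Minimal sketch: computing CLIP image-text compatibility scores.
# Assumes the Hugging Face `transformers` and `torch` packages are installed;
# the checkpoint and image files below are illustrative, not the study's setup.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Two hypothetical household photos annotated with the same topic.
images = [Image.open("electric_lamp.jpg"), Image.open("kerosene_lamp.jpg")]
texts = ["light source"]

inputs = processor(text=texts, images=images, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image[i, j] is the scaled cosine similarity between image i and
# text j -- the kind of score downstream tools use to flag or label images.
print(outputs.logits_per_image)
```

The study's finding is that, for pairs like this one, the image from the wealthier household tends to receive the higher score even when both images are correct matches for the topic.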
The researchers also observed geographic bias, with images from low-income African countries predominantly receiving lower CLIP scores. This bias could reduce diversity in large image datasets and lead to the underrepresentation of low-income and non-Western households in applications that depend on CLIP.
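The curation risk Nwatu describes can be illustrated with a small hypothetical: if a curator filters a dataset using a single score threshold, a model that systematically under-scores certain groups will silently remove them. The scores, filenames, and income labels below are invented for illustration and do not come from the study.

```python
# Hypothetical illustration of the curation risk described above: filtering a
# dataset with one CLIP-score threshold silently drops the groups the model
# scores lower. All values here are made up for illustration.
records = [
    {"image": "lamp_us.jpg",  "income_group": "high", "clip_score": 0.31},
    {"image": "lamp_ke.jpg",  "income_group": "low",  "clip_score": 0.18},
    {"image": "stove_no.jpg", "income_group": "high", "clip_score": 0.29},
    {"image": "stove_bf.jpg", "income_group": "low",  "clip_score": 0.21},
]

THRESHOLD = 0.25  # a seemingly neutral quality cutoff

kept = [r for r in records if r["clip_score"] >= THRESHOLD]
dropped = [r for r in records if r["clip_score"] < THRESHOLD]

# Every dropped image comes from the lower-income group, even though each
# image is a correctly labeled example of its topic.
print("kept:", [r["image"] for r in kept])
print("dropped:", [r["image"] for r in dropped])
```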
Oana Ignat, a postdoctoral researcher in the same department, highlights the need to address this performance gap across demographics: "Many AI models aim to achieve a 'general understanding' by utilizing English data from Western countries. However, our research shows this approach results in a considerable performance gap across demographics."
The researchers outline actionable steps AI developers can take to build more equitable models. Nwatu emphasizes the need for transparency: "The public should know what the AI was trained on so that they can make informed decisions when using a tool."
The study was financially supported by the John Templeton Foundation (#62256) and the U.S. Department of State (#STC10023GR0014). As AI technology continues to evolve and impact various aspects of society, addressing bias and striving for inclusivity become imperative in creating a more equitable digital landscape for all.