The Bias Problem: When AI Reflects Our Worst Selves

“Resist, don’t accept the coded gaze.”

— Joy Buolamwini, Algorithmic Justice League

Cambridge, Massachusetts. 2018. Joy Buolamwini is sitting at a computer in MIT’s Media Lab, running tests on commercial facial recognition systems from major technology companies. She is a computer science researcher with a specific interest in how AI systems perform across different demographic groups, and the systems in front of her are being used, or considered for use, in real-world applications including law enforcement.

She is not surprised by what she finds. But she is troubled.

Gender Shades — Buolamwini’s facial-recognition audit

Date:: February 2018
Location:: MIT Media Lab, Cambridge, Massachusetts, USA
Significance:: First rigorous intersectional audit of commercial facial recognition systems, demonstrating catastrophic performance disparities along lines of race and gender
Outcome:: Three major technology companies (IBM, Microsoft, Amazon) revised or withdrew their facial recognition products from law enforcement use; algorithmic bias entered mainstream conversation

The systems she is testing perform significantly worse on darker-skinned faces than on lighter-skinned faces. Significantly worse on women’s faces than on men’s faces. And catastrophically worse on darker-skinned women’s faces than on lighter-skinned men’s faces. Asked to classify the gender of a face, the systems get lighter-skinned men right more than 99% of the time, while the worst-performing system misclassifies darker-skinned women nearly 35% of the time.

The error rate is not a small gap. It is a canyon. And the canyon falls precisely along the lines of race and gender — the dimensions along which society has historically discriminated, the dimensions along which the people most likely to be wrongly identified by these systems are the people least able to absorb the consequences of being wrongly identified.

Important

The canyon falls precisely along the lines of race and gender — the dimensions along which society has historically discriminated, the dimensions along which the people most likely to be wrongly identified by these systems are the people least able to absorb the consequences of being wrongly identified.

She publishes the results in a paper she calls “Gender Shades.” The paper becomes one of the most consequential in the history of AI ethics. Three major technology companies revise or withdraw their facial recognition products from law enforcement use. The concept of algorithmic bias enters the mainstream conversation.

But the problem is not solved. It is not even close to solved. And it is not limited to facial recognition.

What Algorithmic Bias Is and Where It Comes From

Algorithmic bias — the tendency of AI systems to produce systematically different outcomes for different demographic groups — is one of the most practically consequential challenges in AI deployment, and one of the most widely misunderstood.

Definition

Algorithmic bias — The tendency of an AI system to produce systematically different outcomes for different demographic groups, typically arising not from intentional discrimination but from the data and optimisation objectives through which the system learns.

The misunderstanding is often about what causes it. Algorithmic bias is sometimes described as if it were the result of deliberate discrimination — as if AI systems were programmed to treat certain groups differently. This is almost never true. Algorithmic bias typically arises not from intentional discrimination but from the data and the optimisation objectives through which AI systems learn.

The sources of algorithmic bias are numerous and interconnected, but they fall into several broad categories.

Info

1. Historical bias in training data

AI systems learn from data generated by human activity. Human activity, particularly in domains like hiring, lending, criminal justice, and healthcare, has historically been shaped by discrimination. If an AI system learns to predict “good employees” from historical hiring data, and if historical hiring was discriminatory — if the people who were hired, and who were subsequently evaluated as performing well, were drawn from a non-representative sample because of discriminatory practices — the AI system will learn to reproduce those discriminatory patterns. The system is not discriminating intentionally; it is accurately reflecting the patterns in the data it was trained on.

2. Representation bias

If the population represented in the training data is not representative of the population the system will be deployed on, the system’s performance will be worse for underrepresented groups. Buolamwini’s facial recognition findings were partly a consequence of the fact that the training datasets for these systems contained substantially more images of lighter-skinned people and men than of darker-skinned people and women. A system trained primarily on one group’s faces will be less accurate for groups that were less represented in training.

3. Measurement bias

The features that AI systems use to make decisions are often proxies for the things we actually care about. A credit scoring system that uses zip code as a feature is using a proxy for creditworthiness that is also correlated with race — because residential segregation, itself a product of historical discrimination, makes zip code a proxy for race in many American cities. When the proxy is correlated with the outcome we want to predict, using it improves the model’s accuracy. When the proxy is also correlated with protected characteristics, using it introduces bias.

4. Feedback loops

When AI decisions affect the data that is used to train future versions of the system, the initial bias can be amplified. A predictive policing system that allocates police resources based on predicted crime rates will generate more arrests in the areas it targets. More arrests in an area increase the measured crime rate in that area. A model trained on the updated data will predict higher crime rates in those areas and allocate more resources there, reinforcing the initial bias in a self-amplifying loop.
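The loop described in item 4 can be made concrete with a toy simulation. The sketch below is illustrative only: the two districts, the starting record counts, the 70/30 “hot spot” patrol split, and every other number are assumptions invented for the example, not figures from any real deployment. Both districts have the same true offence rate; only the historical records differ slightly.

```python
def simulate_patrol_feedback(rounds=10, patrols=100, true_rate=0.1, hot_share=0.7):
    """Toy model of a predictive-policing feedback loop (all numbers hypothetical).

    Returns district A's share of all recorded arrests after each round.
    """
    # Historical records are slightly skewed towards district A,
    # even though both districts share the same true offence rate.
    recorded = {"A": 55.0, "B": 45.0}
    share_of_a = []
    for _ in range(rounds):
        # The "model" simply flags the district with more recorded arrests.
        hot = max(recorded, key=recorded.get)
        for district in recorded:
            share = hot_share if district == hot else 1 - hot_share
            # Arrests scale with patrol presence, not with any real difference in crime.
            recorded[district] += patrols * share * true_rate
        share_of_a.append(recorded["A"] / sum(recorded.values()))
    return share_of_a


history = simulate_patrol_feedback()
print([round(s, 3) for s in history])
```

In this toy run, district A’s share of recorded arrests climbs round after round from its starting 55% towards the 70% patrol split, although the underlying offence rates never differ. The data confirms the model’s prediction because the model shaped the data.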

The Specific Domains Where Bias Matters Most

Algorithmic bias exists across almost all AI applications, but its consequences are most serious in specific domains where AI decisions affect fundamental life opportunities and where errors cause direct harm to individuals.

Criminal justice. The use of AI in criminal justice — for pretrial risk assessment, for parole decisions, for predictive policing — has been among the most contested applications of AI and one of the most thoroughly documented in terms of bias.

The COMPAS ProPublica investigation

Date:: May 2016
Location:: Broward County, Florida, USA
Significance:: ProPublica’s analysis of the COMPAS recidivism risk assessment tool found it was nearly twice as likely to falsely flag Black defendants as high risk compared to white defendants, sparking the modern algorithmic fairness debate
Outcome:: Triggered an intense, ongoing debate about algorithmic fairness in criminal justice; led to formal proofs that different intuitive definitions of fairness are mathematically incompatible

COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) is a risk assessment tool widely used in American courts to inform bail and sentencing decisions. In 2016, the investigative journalism organisation ProPublica published an analysis of COMPAS’s performance in Broward County, Florida, that found the tool was nearly twice as likely to falsely flag Black defendants as high risk (when they did not reoffend) as it was to falsely flag white defendants. Conversely, it was more likely to falsely label white defendants as low risk (when they did reoffend). The analysis sparked an intense debate about algorithmic fairness in criminal justice that continues today.

The COMPAS controversy illustrated a specific difficulty: it is mathematically impossible to simultaneously satisfy all the definitions of fairness that are intuitively appealing. COMPAS’s developers argued that the tool was fair in the sense that a given risk score corresponded to the same likelihood of reoffending for Black and white defendants (calibration). ProPublica’s analysis showed it was unfair in the sense that it produced systematically different false positive rates for Black and white defendants (error rate balance). These two definitions of fairness are mathematically incompatible when the base rates of recidivism differ between groups: short of a perfect predictor, you cannot have both equal calibration and equal error rates if the underlying rates are unequal.

Definition

The fairness impossibility result (Chouldechova 2017; Kleinberg, Mullainathan, Raghavan 2016) — When base rates of an outcome differ across groups, no risk assessment system can simultaneously satisfy equal calibration (predicted probabilities matching actual frequencies within each group) and equal error rates (false positive and false negative rates equal across groups). Any system will be unfair by at least one reasonable definition.

This impossibility result — proved formally by several researchers in 2016 — is one of the most important and most troubling results in algorithmic fairness research. It means that any risk assessment system will be unfair by at least one reasonable definition of fairness, and that choosing between these definitions involves value judgments that cannot be resolved by technical means alone.
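The arithmetic behind the result can be checked directly. The sketch below uses the identity from Chouldechova (2017) that ties a group’s false positive rate to its base rate, positive predictive value (PPV), and false negative rate. PPV stands in here for calibration at a single decision threshold, which is a simplification. The base rates and accuracy figures are hypothetical, chosen only to show the effect.

```python
def implied_false_positive_rate(base_rate, ppv, fnr):
    """FPR implied by a group's base rate, PPV, and FNR.

    Identity: FPR = p / (1 - p) * (1 - PPV) / PPV * (1 - FNR)
    """
    return base_rate / (1 - base_rate) * (1 - ppv) / ppv * (1 - fnr)


# Hypothetical groups that share the same PPV and FNR but differ in base rate.
shared = {"ppv": 0.6, "fnr": 0.35}
fpr_low_base = implied_false_positive_rate(base_rate=0.3, **shared)
fpr_high_base = implied_false_positive_rate(base_rate=0.5, **shared)
print(f"base rate 30%: FPR = {fpr_low_base:.3f}")
print(f"base rate 50%: FPR = {fpr_high_base:.3f}")
```

With predictive value and miss rate held equal, the group with the higher base rate ends up with a false positive rate of roughly 43% against roughly 19% for the other group. Equalising the false positive rates instead would force the predictive values or the miss rates apart.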

Healthcare. AI systems in healthcare have been found to produce systematically different recommendations for patients from different racial and ethnic groups, sometimes in ways that have direct consequences for health outcomes.

A 2019 study published in Science examined an algorithm widely used by hospitals and insurance companies to allocate healthcare resources. The algorithm used healthcare costs as a proxy for health needs, a seemingly reasonable choice, since patients with greater health needs tend to incur higher costs. But healthcare costs are not just a function of health needs; they are also a function of access to healthcare. Black patients with the same health conditions as white patients tended to have lower costs, because historical inequities in access meant they received less healthcare. The algorithm, trained on costs as a proxy for need, systematically underestimated the needs of Black patients.

Example

The healthcare cost proxy failed because the proxy variable (cost) was correlated with both the target (health needs) and a protected attribute (race). Costs reflected not only need but also historical access inequities. The algorithm was technically accurate on its proxy objective and systematically biased against the patients it was supposed to help. The study estimated that remedying the disparity would raise the share of Black patients receiving additional help from 17.7% to 46.5%.

The consequence was that Black patients were less likely than equally ill white patients to be flagged for additional care. The study’s authors estimated that remedying the disparity would raise the share of Black patients receiving additional help from 17.7% to 46.5%, with potentially significant health consequences.
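The mechanism is easy to reproduce on invented data. In the sketch below, two groups have exactly the same distribution of health need, but group B generates 70% of group A’s cost at each level of need. That 70% figure, the ten-point need scale, and the cut-off of six flagged patients are all assumptions made for illustration; none of them comes from the study.

```python
def build_patients():
    """Ten patients per group with identical need levels 1..10 (hypothetical data)."""
    patients = []
    for need in range(1, 11):
        # Assumption for illustration: group B receives less care than group A
        # at the same level of need, so it generates lower costs.
        patients.append({"group": "A", "need": need, "cost": need * 1000})
        patients.append({"group": "B", "need": need, "cost": need * 700})
    return patients


def flag_top(patients, key, k):
    """Flag the k patients ranked highest on the chosen field."""
    return sorted(patients, key=lambda p: p[key], reverse=True)[:k]


def summarise(flagged):
    """Per group: number flagged and the lowest need that still got flagged."""
    summary = {}
    for p in flagged:
        count, lowest = summary.get(p["group"], (0, p["need"]))
        summary[p["group"]] = (count + 1, min(lowest, p["need"]))
    return summary


patients = build_patients()
by_cost = summarise(flag_top(patients, "cost", 6))
by_need = summarise(flag_top(patients, "need", 6))
print("ranked by cost:", by_cost)
print("ranked by need:", by_need)
```

Ranking by cost flags four patients from group A and two from group B, and a group B patient needs a need level of 9 to be flagged where a group A patient needs only 7. Ranking by need flags three from each group. The model is identical in both runs; only the choice of label changes.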

Hiring. Amazon’s experience with an AI recruiting tool illustrates the training data bias problem with particular clarity. Amazon developed a machine learning system to screen resumes, trained on data from resumes submitted over the preceding decade. The company discovered the system was penalising resumes that contained the word “women’s” (as in “women’s chess club” or “women’s college”) and was downgrading graduates of all-women’s colleges. The system had learned, from the historical pattern of who applied to and was hired at Amazon (predominantly men), to prefer male-associated features in resumes. Amazon scrapped the project, and the episode became public through a Reuters report in 2018.

Amazon abandons AI recruiting tool

Date:: 2018
Location:: Amazon, Seattle, Washington, USA
Significance:: A high-profile demonstration that an AI hiring tool trained on historical resume data will reproduce the demographic patterns of historical hiring — including its biases
Outcome:: Amazon shut down the system; became a frequently cited cautionary tale in algorithmic hiring

The Amazon case illustrates that training AI systems on historical hiring data will reproduce historical hiring biases. Even without any intent to discriminate, a system that learns “what makes a good Amazon employee” from data on who Amazon historically hired will encode the demographic patterns in that hiring history.

Lending and financial services. AI systems used in credit scoring, loan approval, and financial risk assessment have been documented to produce systematically different outcomes for applicants from different racial groups. The Fair Housing Act and the Equal Credit Opportunity Act prohibit discrimination in lending on the basis of race, but demonstrating that an AI system’s decisions constitute illegal discrimination — rather than legitimate risk assessment that happens to correlate with race — is legally and technically complex.

The specific legal challenge is that AI systems often use variables that are not themselves protected characteristics but that are correlated with protected characteristics. Income, credit history, zip code, employment type — these are all legitimate factors in credit assessment that are also correlated with race in ways that reflect historical discrimination. A system that uses these factors may produce discriminatory outcomes without explicitly using race as a variable.

The Researchers Who Named the Problem

The algorithmic bias problem has been documented, analysed, and publicised by a community of researchers whose work has moved the field from one where bias was acknowledged but poorly understood to one where it is a central concern with significant research attention and policy implications.

Joy Buolamwini

Born:: 1989, Edmonton, Alberta, Canada
Nationality:: Ghanaian-American
Role:: Computer scientist, digital activist
Known for:: The “Gender Shades” research (2018); founding the Algorithmic Justice League; coining the term “the coded gaze” for the perspective embedded in AI systems; intersectional audits of commercial facial recognition

Timnit Gebru

Born:: 1983, Addis Ababa, Ethiopia
Nationality:: Ethiopian-American
Role:: Computer scientist, AI ethics researcher
Known for:: Co-authorship of “Gender Shades” (2018); foundational research on bias in NLP and the environmental and social costs of large language models; co-founding Black in AI; her dismissal from Google in December 2020 became a flashpoint in debates over AI ethics inside industry labs

Safiya Umoja Noble

Born:: 1970, Los Angeles, California, USA
Nationality:: American
Role:: Scholar of media studies, information studies, and African American studies
Known for:: Algorithms of Oppression (2018), documenting how search algorithms encode and amplify racial bias; Professor at UCLA; MacArthur Fellow (2021)

Cathy O’Neil

Born:: 1972
Nationality:: American
Role:: Mathematician, data scientist, author
Known for:: Weapons of Math Destruction (2016), the most accessible account of how algorithmic systems cause harm at scale by appearing objective while amplifying inequality; founder of the consultancy O’Neil Risk Consulting & Algorithmic Auditing (ORCAA)

Solon Barocas

Born:: 1979
Nationality:: American
Role:: Computer scientist, AI fairness researcher
Known for:: Co-author of Fairness and Machine Learning (2019, with Hardt and Narayanan), the standard technical textbook on algorithmic fairness; Principal Researcher at Microsoft Research; a founder of the Fairness, Accountability, and Transparency (FAccT) research community

Moritz Hardt

Born:: 1980 (approximate)
Nationality:: German
Role:: Computer scientist, machine learning researcher
Known for:: Co-author of Fairness and Machine Learning; foundational work on fairness in machine learning, including the equalised odds and equality of opportunity criteria; Professor at the Max Planck Institute for Intelligent Systems and ETH Zürich

Joy Buolamwini’s “Gender Shades” research, and her subsequent founding of the Algorithmic Justice League, was the highest-profile single intervention, bringing the bias in facial recognition systems to mainstream attention and triggering direct policy and commercial responses. Her emphasis on the specific harm done to darker-skinned women, where racial and gender bias intersect to produce the worst outcomes, was important for moving the conversation beyond simple categories of “racial bias” or “gender bias” to the intersectional reality of how bias operates.

Timnit Gebru’s research on the environmental and social costs of large language models, on bias in NLP systems, and on the ethical implications of AI development more broadly contributed both technical findings and a broader framework for thinking about whose perspectives are included and excluded in AI development. Her dismissal from Google in 2020, following a dispute over a paper that the company asked her to withdraw, became a flashpoint in the debate about how AI companies manage internal dissent on safety and ethics.

Safiya Umoja Noble’s “Algorithms of Oppression” (2018) documented the ways that search algorithms encoded and amplified racial biases — showing how searches for racialized terms on Google returned results that reinforced stereotypes and perpetuated harm. Noble’s work connected the technical problem of bias to the broader social and political context in which AI systems were developed and deployed.

Cathy O’Neil’s “Weapons of Math Destruction” (2016) provided the most accessible comprehensive account of how algorithmic systems could cause harm at scale — through credit scoring, hiring algorithms, and educational assessment tools that amplified inequality while appearing objective. O’Neil’s framing — that the appearance of objectivity in algorithmic systems could be a specific form of injustice — was influential in policy and media discussions of AI.

Solon Barocas, Moritz Hardt, and their collaborators helped develop and systematise the mathematical foundations of algorithmic fairness: the formal definitions of fairness, the analysis of the impossibility results, and the specific technical approaches to measuring and mitigating bias. This theoretical work provided the intellectual scaffolding for the broader research community working on bias.

The bias problem was not discovered by the AI industry. It was named — by researchers outside the labs that built the systems, working with communities that bore the consequences — and forced into a conversation the industry would have preferred to defer.

The Technical Responses: What Can Be Done

The recognition that algorithmic bias was a serious problem triggered a significant research effort to develop technical approaches to measuring and mitigating it. The technical approaches are varied and each has specific limitations.

Fairness metrics. The first step toward addressing bias is measuring it, and developing precise, quantitative definitions of fairness has been a significant research effort. Different metrics capture different aspects of fairness:

Definition

Three core fairness metrics:

Demographic parity requires that the proportion of positive decisions is equal across groups. A hiring algorithm with demographic parity would hire candidates from different racial groups at equal rates.
Equalised odds requires that the true positive rate and false positive rate are equal across groups. A risk assessment system with equalised odds would correctly identify high-risk individuals and incorrectly flag low-risk individuals at equal rates across racial groups.
Calibration requires that the probability assigned by the system to an outcome matches the actual frequency of that outcome within each group. Under a calibrated risk assessment system, defendants given a 70% risk score would reoffend about 70% of the time, whether they are Black or white.

The impossibility result — that these metrics cannot all be simultaneously satisfied when group base rates differ — means that choosing a fairness metric is itself a value judgment that involves tradeoffs.
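The three metrics can be computed from a table of decisions and outcomes. The sketch below is a minimal illustration on hypothetical audit data; the helper names and the counts are assumptions made for the example. For binary decisions it uses positive predictive value as a coarse stand-in for calibration, which strictly concerns predicted probabilities.

```python
def group_metrics(records):
    """Compute per-group rates from (group, actual, predicted) triples of 0/1 values."""
    counts = {}
    for group, actual, predicted in records:
        cell = counts.setdefault(group, {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
        outcome = ("t" if actual == predicted else "f") + ("p" if predicted else "n")
        cell[outcome] += 1

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    metrics = {}
    for group, c in counts.items():
        total = sum(c.values())
        metrics[group] = {
            "selection_rate": ratio(c["tp"] + c["fp"], total),  # demographic parity
            "tpr": ratio(c["tp"], c["tp"] + c["fn"]),           # equalised odds, part 1
            "fpr": ratio(c["fp"], c["fp"] + c["tn"]),           # equalised odds, part 2
            "ppv": ratio(c["tp"], c["tp"] + c["fp"]),           # coarse calibration check
        }
    return metrics


def make_records(group, tp, fp, fn, tn):
    """Expand confusion-matrix counts into individual (group, actual, predicted) rows."""
    return ([(group, 1, 1)] * tp + [(group, 0, 1)] * fp
            + [(group, 1, 0)] * fn + [(group, 0, 0)] * tn)


# Hypothetical audit data: group A has a 40% base rate, group B a 20% base rate.
records = (make_records("A", tp=30, fp=10, fn=10, tn=50)
           + make_records("B", tp=15, fp=5, fn=5, tn=75))
metrics = group_metrics(records)
for group, values in metrics.items():
    print(group, {name: round(v, 3) for name, v in values.items()})
```

On this data the two groups have the same true positive rate and the same predictive value (both 0.75), yet their selection rates differ (0.4 against 0.2) and so do their false positive rates (about 17% against about 6%). The system looks fair on two of the metrics and unfair on the other two, which is the trade-off the impossibility result describes.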

Info

1. Preprocessing

One approach to mitigating bias is to modify the training data before training the system. This might involve removing protected attributes and their proxies from the training data, resampling the training data to improve representation of underrepresented groups, or reweighting training examples to reduce the influence of biased historical patterns.

Preprocessing approaches are limited by the depth of the correlation between protected attributes and legitimate features. If race is deeply correlated with many legitimate predictors — because historical discrimination has created systematic differences in income, credit history, and other legitimate factors — removing the race feature from the training data may not significantly reduce racial disparities in outcomes.

2. In-processing

Another approach is to modify the training objective to include a fairness constraint — to train the system not just to be accurate but to be accurate while satisfying a specified fairness metric. This approach requires choosing a fairness metric, and the impossibility result means that optimising for one fairness metric may worsen performance on others.

3. Postprocessing

A third approach is to adjust the system’s outputs after training to satisfy fairness constraints — for example, by setting different decision thresholds for different demographic groups. This approach is technically straightforward but may require using the protected attribute explicitly in the decision process, raising legal concerns in some jurisdictions.

Each of these approaches is technically implementable and each reduces some forms of bias in some contexts. None of them addresses the underlying problem that the training data reflects historical discrimination, and none of them resolves the fundamental tension between different definitions of fairness.
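Two of these approaches are simple enough to sketch in a few lines. The first function below illustrates preprocessing by reweighting, using the scheme often called reweighing in the fairness literature: each training example is weighted so that group and label look statistically independent. The second illustrates postprocessing with a separate decision threshold per group. The function names, the hiring history, and the scores are hypothetical, and neither function is presented as a production method.

```python
from collections import Counter


def reweighing_weights(rows):
    """Preprocessing: weight each (group, label) pair by P(group) * P(label) / P(group, label)."""
    n = len(rows)
    by_group = Counter(group for group, _ in rows)
    by_label = Counter(label for _, label in rows)
    by_pair = Counter(rows)
    return {
        (group, label): by_group[group] * by_label[label] / (n * count)
        for (group, label), count in by_pair.items()
    }


def group_thresholds(scores_by_group, target_rate):
    """Postprocessing: per-group score cut-offs that select roughly target_rate of each group."""
    thresholds = {}
    for group, scores in scores_by_group.items():
        ranked = sorted(scores, reverse=True)
        k = max(1, round(target_rate * len(ranked)))
        thresholds[group] = ranked[k - 1]  # anyone scoring at or above this is selected
    return thresholds


# Hypothetical hiring history: 60 of 100 men hired, 20 of 100 women hired.
hiring_history = [("M", 1)] * 60 + [("M", 0)] * 40 + [("W", 1)] * 20 + [("W", 0)] * 80
weights = reweighing_weights(hiring_history)

# Hypothetical model scores for five applicants per group.
scores = {"M": [0.9, 0.8, 0.7, 0.6, 0.5], "W": [0.7, 0.6, 0.5, 0.4, 0.3]}
cutoffs = group_thresholds(scores, target_rate=0.4)
print(weights)
print(cutoffs)
```

The weights raise the influence of hired women (weight 2.0) and lower that of hired men (about 0.67), so that the weighted hiring rate is 40% in both groups. The thresholds come out at 0.8 for men and 0.6 for women, which selects two of five applicants from each group. The second function has to know each applicant’s group in order to work, which is exactly the legal difficulty noted under postprocessing.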

Warning

Technical bias mitigation is necessary but not sufficient. Each of the three approaches can reduce some forms of bias in some contexts. None addresses the underlying reality that training data reflects historical discrimination, and none resolves the fundamental tension between different definitions of fairness. Treating technical mitigation as a solution rather than a partial measure is itself a form of technological solutionism.

The Regulatory Response: Governments Enter the Conversation

The documentation of algorithmic bias in consequential domains — criminal justice, healthcare, lending, hiring — triggered regulatory responses in several jurisdictions that represent the first systematic attempts to govern the use of AI in high-stakes decisions.

Info

The European Union’s AI Act classifies AI systems used in employment, credit scoring, criminal justice, and essential services as “high risk” and subjects them to specific requirements: risk assessments, documentation, human oversight, and data governance standards that address the specific concerns of bias and accuracy.
In the United States, the Biden administration’s Executive Order on AI directed federal agencies to develop guidance on using AI in contexts where bias is a concern. The Equal Employment Opportunity Commission has issued guidance on AI in hiring, noting that AI hiring tools that produce disparate impact — disproportionate effects on protected groups without business justification — may violate existing anti-discrimination law even if they do not explicitly use protected attributes.
Several cities and states have enacted more specific AI regulation. New York City’s Local Law 144 requires employers who use AI in hiring to conduct bias audits of their AI tools. Illinois’s Artificial Intelligence Video Interview Act requires disclosure and consent when AI is used to analyse job interview videos. Several cities have imposed moratoria on the use of facial recognition by law enforcement, citing the bias documentation from Gender Shades and subsequent research.

These regulatory responses are important but incomplete. They address specific, high-visibility applications rather than providing a comprehensive framework for AI governance. They often require bias assessments without specifying what standards those assessments must meet. And they are being developed in a context where the technical understanding of bias is still evolving and where the impossibility results make it unclear what “passing” a bias audit would even mean.
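One common way to quantify disparate impact in a hiring audit is the impact ratio: each group’s selection rate divided by the selection rate of the most-selected group. The sketch below computes it for hypothetical applicant counts. The 0.8 cut-off is the “four-fifths” rule of thumb from US employment-selection guidance, used here as an assumption for illustration and not as a statement of what any particular law requires.

```python
def impact_ratios(selected, total):
    """Selection rate of each group divided by the highest group's selection rate."""
    rates = {group: selected[group] / total[group] for group in total}
    best = max(rates.values())
    return {group: rate / best for group, rate in rates.items()}


# Hypothetical applicant pool and outcomes from an automated screening tool.
total = {"group_a": 200, "group_b": 150}
selected = {"group_a": 60, "group_b": 27}

ratios = impact_ratios(selected, total)
# Rule-of-thumb threshold (assumption): ratios below 0.8 warrant a closer look.
flagged = [group for group, ratio in ratios.items() if ratio < 0.8]
print(ratios, flagged)
```

Here group_b is selected at 18% against 30% for group_a, an impact ratio of about 0.6. The number is easy to compute. Whether 0.6 counts as failing, and whether a different fairness metric would tell a different story, are the questions the regulations leave open.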

The Deeper Problem: What Fairness Actually Requires

The technical and regulatory responses to algorithmic bias are necessary but insufficient, because they treat bias as a technical problem that can be solved by better algorithms and better measurement. The deeper problem is that algorithmic systems operate in societies that are already unequal, and no technical fix can correct for the underlying social inequality.

This is the most important insight that the algorithmic fairness research has produced, and it is one that the field is still working through. AI systems are not creating inequality — they are interacting with existing inequality in specific ways that can amplify or reduce it, but the root causes of the inequality are social, historical, and political rather than technical.

A credit scoring system that uses historical credit data will reflect the effects of historical discrimination in lending on the credit histories of different groups. Improving the algorithm cannot correct for the decades of discriminatory lending that shaped those credit histories. Expanding credit access to previously excluded groups cannot eliminate the wealth gap that discriminatory lending created. Technical improvements are valuable at the margin, but they cannot substitute for the substantive justice that addressing historical discrimination requires.

This has specific implications for what AI fairness research can and cannot achieve. Algorithmic fairness research can reduce the extent to which AI systems amplify existing inequality. It can measure and document disparate impact. It can identify specific algorithmic choices that are unnecessary and harmful. All of this is valuable.

What algorithmic fairness research cannot do is make AI systems fair in a deeper sense if the societies in which they operate are unfair. Equalising the false positive rates in a criminal justice risk assessment system does not address the underlying conditions — poverty, lack of opportunity, inadequate education — that drive the differences in recidivism rates. Removing bias from a hiring algorithm does not address the educational and economic inequalities that shape the pool of qualified candidates.

The relationship between technical AI fairness and substantive social justice is not that the former produces the latter. The relationship is that technical AI fairness is a necessary condition for responsible AI deployment in unjust societies, but it is not sufficient for justice, and treating it as sufficient is a form of technological solutionism that can obscure the political work that justice actually requires.

The Intersectionality Dimension: How Bias Compounds

One of the most important insights from the bias research — exemplified by Buolamwini’s Gender Shades work — is that bias does not affect different demographic groups independently. Bias at the intersection of multiple dimensions of identity can be dramatically worse than bias along any single dimension.

Definition

Intersectional bias — The phenomenon by which bias at the intersection of multiple dimensions of identity (e.g. race × gender, disability × class) can be dramatically worse than bias along any single dimension. A system that performs reasonably on white women and on Black men may still perform catastrophically on Black women, because the specific combination is underrepresented in training data in ways neither single dimension reveals.

A facial recognition system that performs reasonably well for white women and for Black men may perform catastrophically for Black women, because the specific combination of race and gender is underrepresented in the training data in ways that neither race nor gender alone would reveal. A hiring algorithm that is fair to women on average and fair to Black applicants on average may still systematically disadvantage Black women, if the algorithm has learned that the combination of being Black and being a woman is associated with lower outcomes in the biased historical data.

The mathematical framework for capturing intersectional bias is more complex than the framework for capturing bias along single dimensions, and the research on intersectional fairness is less mature than the research on single-dimension fairness. This means that bias audits that focus only on single protected attributes can give misleadingly positive results — a system can pass a race bias audit and a gender bias audit while still producing severely biased outcomes for specific demographic subgroups.
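The masking effect can be shown with a small audit on invented numbers. In the sketch below, the subgroup sizes and accuracy counts are hypothetical, and the 10-point tolerance used to “pass” an audit is an assumption chosen for illustration. The smallest subgroup is also the worst served, so its errors are diluted in every single-attribute average.

```python
# (skin type, gender) -> (number of test images, number classified correctly)
# All counts are hypothetical.
cells = {
    ("lighter", "man"): (400, 396),
    ("lighter", "woman"): (300, 285),
    ("darker", "man"): (300, 285),
    ("darker", "woman"): (50, 33),
}


def accuracy_by(cells, key):
    """Aggregate accuracy over whatever grouping the key function extracts."""
    totals = {}
    for cell, (n, correct) in cells.items():
        group = key(cell)
        seen, right = totals.get(group, (0, 0))
        totals[group] = (seen + n, right + correct)
    return {group: right / seen for group, (seen, right) in totals.items()}


def max_gap(accuracy):
    """Difference between the best-served and worst-served group."""
    return max(accuracy.values()) - min(accuracy.values())


by_skin = accuracy_by(cells, key=lambda cell: cell[0])
by_gender = accuracy_by(cells, key=lambda cell: cell[1])
by_both = accuracy_by(cells, key=lambda cell: cell)

TOLERANCE = 0.10  # assumed audit threshold, for illustration only
for name, acc in [("skin type", by_skin), ("gender", by_gender), ("intersection", by_both)]:
    verdict = "pass" if max_gap(acc) < TOLERANCE else "fail"
    print(f"{name}: gap = {max_gap(acc):.3f} ({verdict})")
```

The audit by skin type and the audit by gender each show a gap of about 6 points and pass. The intersectional audit shows a 33-point gap between lighter-skinned men (99%) and darker-skinned women (66%) and fails.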

The intersectionality insight also has implications for who is most harmed by algorithmic bias. The groups that are most harmed tend to be those at the intersection of multiple disadvantaged dimensions — Black women, disabled women of color, transgender people of color, undocumented immigrants with other disadvantaged characteristics. These groups are often the least represented in training data, the least likely to have participated in the development of the AI systems that affect them, and the least able to access the resources needed to contest unfair algorithmic decisions.

The political implication is that addressing algorithmic bias requires active attention to the specific groups most harmed by it — attention that goes beyond ensuring that the overall population is represented and includes ensuring that the most vulnerable and most affected subgroups are specifically considered.

The Accountability Gap: When Algorithms Make Consequential Decisions

One of the most troubling features of algorithmic decision-making is the accountability gap — the difficulty of assigning responsibility when an algorithm makes a decision that harms someone.

When a human judge makes a bail decision that is wrong — releasing someone who then commits a crime, or detaining someone who would not have reoffended — there is a specific person who made the decision and who can, in principle, be held accountable. The accountability is imperfect, but the mechanism exists.

When an algorithmic risk assessment tool makes the same decision — when it flags someone as high risk who is not, or fails to flag someone as high risk who is — the accountability is diffuse. Who is responsible? The company that developed the algorithm? The court that chose to use it? The judge who relied on its recommendation? The jurisdiction that approved its use?

Pitfall

When nobody is specifically responsible for an algorithmic decision, nobody has a specific incentive to investigate whether the decision was correct. The feedback loop that allows human decision-makers to learn from their errors — recognising that a decision was wrong, investigating why, adjusting practice to prevent similar errors — is interrupted by the insertion of an algorithm between the decision and the accountability.

The diffusion of responsibility is not just a legal abstraction — it has practical consequences for how errors are identified and corrected. When nobody is specifically responsible for an algorithmic decision, nobody has a specific incentive to investigate whether the decision was correct. The feedback loop that allows human decision-makers to learn from their errors — the recognition that a decision was wrong, the investigation of why it was wrong, the adjustment of practice to prevent similar errors — is interrupted by the insertion of an algorithm between the decision and the accountability.

The accountability gap is a specific form of the alignment problem: not the catastrophic misalignment of a future superintelligent system, but the everyday misalignment of systems that are deployed without adequate mechanisms for detecting, attributing, and correcting the harms they cause.

The Data Problem: Who Generated the Data and Who Benefits

Underlying many of the specific bias problems in AI is a more fundamental question: who generated the data that AI systems are trained on, and who benefits from the value that training data creates?

Info

The data that trains AI systems was generated, for the most part, by people who were not compensated for producing it and did not consent to its use for AI training. The text that trains large language models was written by humans, including writers, journalists, academics, social media users, and programmers, most of whom had no expectation that their writing would be used to train AI systems and who received no compensation for that use. The images that train computer vision systems were taken by people who neither knew of nor consented to their use in AI training.

The question of who should control data and who should benefit from the value it creates is one of the central political questions raised by AI. The current answer — that companies can train AI systems on publicly available data without compensating its creators — is the answer produced by the current legal framework, which was not designed with AI training in mind. It may not be the right answer.

The data question has a specific connection to bias. If the people whose data is used to train AI systems have no power over how that data is used, the systems trained on their data may not serve their interests. If the training data reflects primarily the perspectives and experiences of the people who have historically been most active on the internet (disproportionately wealthy, educated, English-speaking, and from the Global North), the AI systems trained on that data will reflect those perspectives rather than the full diversity of human experience.

Building AI systems that are genuinely fair to all people, including people whose perspectives are underrepresented in training data, requires addressing the data question: not just ensuring that underrepresented groups are in the training data, but ensuring that their perspectives and interests are genuinely represented in ways that influence how the systems are trained and deployed.

The Progress: What Has Changed Since 2016

The documentation of algorithmic bias in consequential AI applications began in earnest around 2016 with the COMPAS controversy, followed by the Gender Shades research in 2018 and a wave of academic work on fairness, accountability, and transparency in machine learning. In the years since, significant progress has been made on some dimensions, while others remain deeply challenging.

Example

Progress on awareness. Before 2016, the mainstream AI community treated bias primarily as a technical problem to be addressed in future work, not as an urgent practical concern. The research and advocacy of the past decade have made bias a central concern in AI development, with major AI conferences featuring extensive work on fairness and with bias audits becoming increasingly standard practice in high-stakes applications.

Progress on measurement. The development of formal fairness metrics, bias audit tools, and standardised testing methodologies has made it possible to measure bias more precisely and more systematically than was possible before. The Gender Shades methodology — testing AI systems across demographic subgroups and documenting differential performance — has been applied to many AI applications and has produced important documentation of where bias is most severe.

Progress on specific applications. The facial recognition controversy triggered concrete responses: in 2020, IBM, Microsoft, and Amazon all withdrew or restricted their facial recognition products for law enforcement use, following the Gender Shades research and the scrutiny it prompted. New York City’s bias audit law and similar regulations in other jurisdictions have created market incentives for conducting and disclosing bias audits. The EEOC guidance on AI in hiring has created legal risk for deploying hiring AI without adequate bias testing.

What has not progressed as much. The underlying causes of algorithmic bias — historical discrimination, representation gaps in training data, the use of proxies that encode historical inequalities — have not been adequately addressed. The accountability gap — the difficulty of assigning responsibility for algorithmic harms — remains largely unaddressed in law. The data governance questions — who controls training data and who benefits from its use — are not resolved. And the deepest insight — that technical fairness interventions cannot substitute for substantive social justice — is not yet widely reflected in policy or practice.
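The subgroup-testing approach described under "Progress on measurement" can be sketched in a few lines. This is a minimal illustration, not the Gender Shades code: the record fields (`skin_type`, `gender`, `label`, `prediction`) are hypothetical names, and the records themselves are made-up toy data chosen only to show how an acceptable-looking overall accuracy can hide a large gap between intersectional subgroups.

```python
from collections import defaultdict


def subgroup_accuracy(records):
    """Return accuracy for each (skin_type, gender) subgroup.

    Each record is a dict with hypothetical keys:
    'skin_type', 'gender', 'label' (ground truth), 'prediction'.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for r in records:
        key = (r["skin_type"], r["gender"])
        total[key] += 1
        if r["prediction"] == r["label"]:
            correct[key] += 1
    return {key: correct[key] / total[key] for key in total}


def largest_gap(accuracies):
    """Return (best_group, worst_group, accuracy_gap) across subgroups."""
    best = max(accuracies, key=accuracies.get)
    worst = min(accuracies, key=accuracies.get)
    return best, worst, accuracies[best] - accuracies[worst]


# Toy, invented records for illustration only (not real audit data).
records = (
    [{"skin_type": "lighter", "gender": "male",
      "label": "male", "prediction": "male"}] * 10
    + [{"skin_type": "darker", "gender": "female",
        "label": "female", "prediction": "female"}] * 6
    + [{"skin_type": "darker", "gender": "female",
        "label": "female", "prediction": "male"}] * 4
)

accuracies = subgroup_accuracy(records)
best, worst, gap = largest_gap(accuracies)
overall = sum(r["prediction"] == r["label"] for r in records) / len(records)

print(f"overall accuracy: {overall:.2f}")
for group, acc in accuracies.items():
    print(f"{group}: {acc:.2f}")
print(f"largest gap: {gap:.2f} between {best} and {worst}")
```

The design point is the grouping key: reporting by skin type alone or by gender alone would average away part of the disparity that the combined key exposes.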

The Future: Bias in More Capable Systems

As AI systems become more capable, the bias problem becomes both more urgent and more complex.

Warning

More capable language models trained on larger, more diverse datasets might in principle produce less biased outputs by virtue of having learned from more representative data. But they also interact with the world in more complex ways, making their biases more difficult to detect and document. A large language model’s outputs can be subtly biased in ways that are invisible to standard testing but that accumulate into significant effects when the system is deployed at scale.

The deployment of AI systems in more consequential decisions — in clinical diagnosis, in sentencing, in benefit eligibility determination — makes the stakes of bias higher. The systems that are being built today will increasingly affect fundamental life decisions, and their biases will have correspondingly more serious consequences.

The specific biases of generative AI systems, meaning the ways that language and image-generation models represent different demographic groups in text and images, raise their own concerns. Research has documented that language models generate different kinds of language when describing people of different races and genders, that image generation models produce stereotyped outputs that reflect the biases of their training data, and that the outputs of these systems can reinforce stereotypes in ways that have social consequences.

The solutions to these problems are not primarily technical. They require attention to who is involved in developing AI systems, whose perspectives are represented in training data, what values are embedded in design choices, and what accountability mechanisms exist for identifying and correcting harms. These are political and organisational questions as much as technical ones, and addressing them requires the kind of sustained, broad engagement with the people affected by AI systems that the AI development community has been slow to undertake.

The Essential Insight: Technology Cannot Fix Injustice, Only People Can

The most important thing that the algorithmic bias research has established is also the most uncomfortable: there is no technical solution to inequality.

Important

There is no technical solution to inequality. AI systems that are deployed in unequal societies will interact with that inequality, and how they interact with it is a choice that reflects values. The choice to use a system that amplifies inequality is a value choice. The choice to use a system that reduces inequality — at the cost of some aggregate accuracy — is a value choice. The choice to not deploy a system at all in contexts where its bias would cause disproportionate harm is a value choice.

AI systems that are deployed in unequal societies will interact with that inequality, and how they interact with it is a choice that reflects values. The choice to use a system that amplifies inequality is a value choice. The choice to use a system that reduces inequality — at the cost of some aggregate accuracy — is a value choice. The choice to not deploy a system at all in contexts where its bias would cause disproportionate harm is a value choice.

The technical work on fairness can inform these value choices — it can measure the tradeoffs, identify the groups most affected, compare the consequences of different algorithmic choices. But it cannot make the value choices. That requires the kind of democratic engagement with the people affected by AI systems that is still largely absent from AI development practice.
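As a rough illustration of what "measuring the tradeoffs" can look like, the sketch below compares two decision thresholds on a risk score and reports overall accuracy alongside the false positive rate for each group. Everything here is an assumption made for the example: the function name `evaluate`, the group labels, and the invented scores and outcomes. It shows the measurement only, and says nothing about which policy ought to be chosen.

```python
def evaluate(groups, threshold):
    """Return (overall_accuracy, fpr_by_group) for one decision threshold.

    `groups` maps a group name to a list of (score, outcome) pairs, where
    outcome is 1 if the predicted event actually happened and 0 otherwise.
    Assumes every group contains at least one outcome-0 case.
    """
    correct = 0
    total = 0
    fpr_by_group = {}
    for name, rows in groups.items():
        false_pos = 0
        negatives = 0
        for score, outcome in rows:
            flagged = 1 if score >= threshold else 0
            correct += int(flagged == outcome)
            total += 1
            if outcome == 0:
                negatives += 1
                false_pos += flagged  # flagged although nothing happened
        fpr_by_group[name] = false_pos / negatives
    return correct / total, fpr_by_group


# Invented toy data: group B's scores run higher for the same outcomes.
groups = {
    "A": [(0.2, 0), (0.3, 0), (0.6, 1), (0.8, 1)],
    "B": [(0.4, 0), (0.6, 0), (0.55, 1), (0.9, 1)],
}

for threshold in (0.5, 0.65):
    accuracy, fpr = evaluate(groups, threshold)
    gap = max(fpr.values()) - min(fpr.values())
    print(f"threshold={threshold}: accuracy={accuracy:.3f}, FPR gap={gap:.2f}")
```

On this toy data the lower threshold is more accurate overall but wrongly flags half of group B's negative cases and none of group A's, while the higher threshold closes that gap at the cost of accuracy and of more missed positives in both groups. The numbers make the tradeoff visible; deciding which cost is acceptable remains a value choice.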

Joy Buolamwini’s slogan for the Algorithmic Justice League — “resist, don’t accept the coded gaze” — captures the essential political dimension of the bias problem. The “coded gaze” is the perspective embedded in AI systems, and that perspective reflects who built the systems and whose data trained them. Resisting the coded gaze means refusing to accept AI decisions as neutral or objective, insisting on the humanity behind the code, and demanding accountability for the choices embedded in the technology.

Definition

The coded gaze (Joy Buolamwini’s term): the perspective embedded in AI systems, reflecting who built them and whose data trained them. Like the “male gaze” of film theory or the “imperial gaze” of colonial discourse, the coded gaze is a structured way of seeing that prioritises some perspectives and renders others invisible. To resist it is to refuse to treat algorithmic outputs as neutral by default.

This is not an argument against AI. It is an argument for AI that is built with genuine attention to the people it affects — AI developed with diverse teams, trained on representative data, tested for disproportionate harms, and deployed with accountability mechanisms that can detect and correct errors. This kind of AI development is possible. It is also more expensive, more time-consuming, and more politically demanding than the kind of AI development that has been the default.

The question is whether the people who are building and deploying AI systems will make the choices required. That question is not technical. It is political. And the answer is not yet clear.

The bias problem is not a glitch to be patched. It is the technology reflecting, with brutal accuracy, the society that built it. There is no technical solution to inequality. There are only choices about how to deploy tools in an unequal world — and those choices belong to people, not algorithms.
