
Bias in Search Engines and Algorithms

A critical analysis of the explicit and implicit biases present in various search engines, databases, and algorithms that people regularly interact with in their daily lives

Media


  1. Image Credit: Memac Ogilvy & Mather Dubai
  2. Image Credit: Jasminder Bains
  3. Image Credit: Pixabay


Introduction

Search engines aren't neutral. They're shaped by the same white, cis-male patriarchy that permeates American society and government. Dr. Safiya Noble's book Algorithms of Oppression revealed how typing "black girls" into Google returned results for pornographic sites. Google changed the results for that search term in light of Dr. Noble's findings, but the problem persists for terms related to other women of color, such as "Asian girls." It's also no accident that the Library of Congress subject headings classify Native American sand art as "arts and crafts" (W. Wiegand, personal communication, Oct 10, 2018). Systematic oppression is occurring behind the computer screen. "Algorithms can reproduce and support the same negative stereotyping that occurs in society" (Farkas, 2017). Algorithms don't create themselves. Data scientists write the code behind them, and, like all human beings, they have biases, for better or worse.

This library guide examines the biases present in those algorithms, offers guidance on how to navigate them, and suggests ways to fight bias. The topics in this library guide include:

  • Glossary of Terms to understand key vocabulary
  • Suggested Readings to learn more about the issue
  • Examples that illustrate how bias is still occurring today
  • Current Events that tell the stories of people who are fighting bias
  • People to Follow who are thought leaders in this area
  • Ways to Get Involved in fighting bias


How Subject Heading Bias Occurs

Subject headings are derived from mainstream articles and books because a controlled vocabulary can't realistically encompass everything that has ever been published. This becomes precarious because marginalized groups often have difficulty publishing their work, so mainstream literature frequently excludes them. Subject headings also tend to follow what mainstream culture considers socially acceptable at the time, which pushes non-dominant cultures to the side and allows unrelated search terms to become associated with marginalized groups. For example, Strottman (2007) documented several headings used to categorize Native Americans that were completely inaccurate and clearly biased representations.

On the flip side, the LC subject headings don't include white privilege at all, because the Library of Congress chooses not to "include specific headings for groups discriminated against" (Subject Authority Cooperative Program, 2016). Unless these headings are contested by librarians or the general public, the bias remains. It's important to note that librarians have successfully lobbied to change such headings in the past, but there's still a lot more work to be done.

How Algorithmic Bias Occurs

Algorithms have to be fed data in order to make decisions and judgments, much as people need evidence and information to reach conclusions. There's one key difference between people and algorithms: an algorithm will make the same decision 100% of the time, unless a programmer adjusts it, while people are prone to making exceptions or being swayed by other variables. If an algorithm is biased, it will therefore make biased choices consistently, which can cause even more damage than a biased human being. This isn't just a hypothetical scenario. Algorithms are regularly fed biased data because they rely on historical data. For example, a technology company looking to automate its hiring process may use an algorithm to scan applicants' resumes. Technology has been a historically male-dominated industry, so the data the company feeds the algorithm will be male-dominated. As a result, the algorithm will learn that pattern and filter out resumes not written by male applicants, as the sketch below illustrates.
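
To make the resume example concrete, below is a minimal sketch, written in Python, of how a screening model trained on historically male-dominated hiring decisions reproduces that bias. It is not any real company's system: the data is synthetic, and the features (skill, gender) are simplified assumptions made only for illustration.

# A hypothetical sketch: a resume-screening model trained on biased
# historical hiring data learns to favor male applicants.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000

gender = rng.integers(0, 2, n)     # 1 = male, 0 = not male (deliberately simplified)
skill = rng.normal(0.0, 1.0, n)    # skill is distributed identically across genders

# Past hiring decisions favored male applicants regardless of skill.
hired = ((skill + 2.0 * gender + rng.normal(0.0, 0.5, n)) > 1.5).astype(int)

# The model is trained on that biased history, with gender included as a feature.
X = np.column_stack([skill, gender])
model = LogisticRegression().fit(X, hired)

# Two equally skilled applicants who differ only in gender:
applicants = np.array([[1.0, 1.0], [1.0, 0.0]])
print(model.predict_proba(applicants)[:, 1])
# The first (male) applicant scores far higher, despite identical skill.

The model never "decides" to discriminate; it simply reproduces the pattern in the historical data it was given.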

One may think the solution is simply to remove gender from the algorithm's decision-making protocol altogether. However, removing data — biased or not — will decrease the algorithm's accuracy. This is the source of the fairness-versus-accuracy trade-off curve that many computer programmers work with. The ideal is to create an algorithm that is as fair and as accurate as possible.
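
Continuing the same illustrative setup, the sketch below shows that trade-off in miniature: dropping the gender column lowers agreement with the (biased) historical labels, but it narrows the gap between how often each group is selected. The fairness measure used here, a simple demographic-parity gap, and all of the numbers are assumptions chosen only for illustration.

# A hypothetical sketch of the fairness/accuracy tension: compare a model
# trained with and without the gender feature on the same biased labels.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 5000
gender = rng.integers(0, 2, n)
skill = rng.normal(0.0, 1.0, n)
hired = ((skill + 2.0 * gender + rng.normal(0.0, 0.5, n)) > 1.5).astype(int)  # biased labels

def evaluate(features):
    model = LogisticRegression().fit(features, hired)
    pred = model.predict(features)
    accuracy = (pred == hired).mean()  # agreement with the biased historical labels
    parity_gap = abs(pred[gender == 1].mean() - pred[gender == 0].mean())
    return round(accuracy, 3), round(parity_gap, 3)

with_gender = np.column_stack([skill, gender])
without_gender = skill.reshape(-1, 1)

print("with gender:   ", evaluate(with_gender))     # higher accuracy, larger parity gap
print("without gender:", evaluate(without_gender))  # lower accuracy, smaller parity gap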

Why It Matters

  • Algorithms have become yet another way to systematically oppress and silence the voices of marginalized groups
  • Stereotypes only make it harder for minority groups to shatter the glass ceiling
  • Bias in search engines towards more affluent websites makes it even more difficult for smaller companies (usually run by people of color) to get exposure
  • Information organizations such as public/academic libraries and government institutions are increasingly using — or getting replaced by — algorithms and digital tools, which has political, social, and economic consequences (Noble, 2016)
  • Harmful stereotypes in search engines normalize racism, sexism, homophobia, etc. and encourage complacency (Noble, 2016)

Key Takeaways

  • It's important to ask yourself: Who's the assumed audience and who benefits?
  • Search engines and artificial intelligence are neither neutral nor free from human judgment.
  • Shock factor is profitable, so content moderators may not take down discriminatory advertisements because more people click on them.