I Spent a Week Asking AI to Assess Media Bias for Israel-Palestine.

Paul Nabil Matthis
7 min read · Nov 18, 2023


Introduction

The Israeli-Palestinian conflict is one of the most divisive and controversial issues in the world today. However, with so much misinformation and bias in reporting on the conflict, it can be challenging for news consumers to identify truly objective sources.

This article delves into an exploratory exercise performed using OpenAI’s language model GPT-4 to gauge media bias when reporting on this complex geopolitical issue. Utilizing a customized variation of the renowned Ad Fontes Media Bias Chart, this study probes how AI can provide valuable insights into alignment and reliability when assessing publications covering Israel-Palestine affairs.

The resulting Media Bias Chart shows data points generated over multiple iterations with GPT-4, plotted with chart.js, and labeled with further aesthetic touch-ups in Adobe Photoshop.

Direct image download.

The Challenge of Evaluating Media Bias

Media bias, or the perceived slant in journalism and mass communication, has long been a topic of contention. Academics have devoted substantial research to understanding how bias manifests in news coverage and impacts public opinion. But empirically measuring and visualizing media bias remains an imperfect science.

One arena that frequently finds itself under scrutiny is reporting on Israel-Palestine. The complex historical, religious, and socio-political dynamics make objective coverage inherently difficult. When paired with the objective scarcity of authoritative Palestinian media outlets, pronounced imbalances emerge in narratives reaching global audiences.

This exploratory study utilizes artificial intelligence to critically examine bias tendencies in a diverse sample of English-language news outlets and advocacy groups covering issues in the Southern Levant. By leveraging the pattern recognition and text analysis capabilities of advanced large language models, this project aims to probe how machine learning technologies might enhance our understanding of bias in media ecosystems.

A version of the static Ad Fontes Media Bias Chart used as reference (source). All images © 2016–2023 Ad Fontes Media.

Adapting the Ad Fontes Media Bias Chart

To visualize variations in bias and reporting quality, this study employs a customized variant of the renowned Ad Fontes Media Bias Chart as a framework (shown above). The Ad Fontes chart succinctly maps media outlets across a spectrum from “Original Fact Reporting” to “Misinformation,” with positioning on the X-axis indicating partisan leanings from “Extreme Left” to “Extreme Right.”

For assessing bias in Israel-Palestine coverage specifically, the X-axis parameter was adapted to represent alignment towards Palestinians on the left and Israelis on the right. The scale spanned from 1 to 7, with 1 indicating extreme pro-Palestinian alignment and 7 denoting extreme pro-Israel sympathies. A rating of 4 represents neutrality; in the visualization, this midpoint is re-centered to 0.0, the center axis of a bipolar Likert scale.

The Y-axis “Reliability” parameter from the original Ad Fontes chart was retained to measure each outlet’s factuality, accuracy, and thoroughness, labeled with the shorthand “Quality.” Scores ranged from 1 to 8, with 1 representing low-quality reporting and 8 indicating highly reliable original journalism with rigorous standards.
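As a rough illustration of this mapping, the sketch below (in plain JavaScript, with hypothetical names not taken from the actual pipeline) converts a raw GPT-4 score pair into chart coordinates, re-centering the 1–7 bias scale around 0:

```javascript
// Hypothetical helper: convert a raw GPT-4 score pair into chart coordinates.
// The 1–7 bias scale is re-centered so that 4 (neutral) maps to 0.0,
// giving a bipolar range of -3 (extreme pro-Palestinian) to +3 (extreme pro-Israel).
// The 1–8 quality scale is plotted as-is.
function toChartPoint(outlet, biasScore, qualityScore) {
  if (biasScore < 1 || biasScore > 7) throw new RangeError("bias must be 1-7");
  if (qualityScore < 1 || qualityScore > 8) throw new RangeError("quality must be 1-8");
  return {
    label: outlet,
    x: biasScore - 4, // 4 -> 0.0 (neutral center of the bipolar Likert scale)
    y: qualityScore,  // reliability ("Quality"), higher is better
  };
}

// Example: an outlet scored 3.5 (bias) and 6 (quality)
console.log(toChartPoint("Example Outlet", 3.5, 6)); // { label: 'Example Outlet', x: -0.5, y: 6 }
```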

Diverse Publications Under AI Review

The list of media sources analyzed comprised roughly 40 diverse publications and organizations with global reach, including major outlets like The Guardian and The Wall Street Journal, alongside alternative platforms such as Mondoweiss and Electronic Intifada, known for their pro-Palestinian perspectives.

The media selection was not randomized; rather, it attempts to be thorough. Some smaller publications were excluded because there was not enough available data for GPT-4 to assess them reliably. The list also incorporated advocacy groups like BDS, JVP, AIPAC, and StandWithUs, whose openly stated positions served as controls for the X-axis (Bias), just as Reuters and AP served as controls for the Y-axis (Quality). In total, mainstream and niche media across the ideological spectrum were represented.

AI Analysis and Scoring

The large language model GPT-4 was provided with the adapted Ad Fontes framework for Israel-Palestine, along with the sample list of media publications and advocacy groups. Through a week-long iterative process, GPT-4 produced bias and reliability scores for each outlet based on its aggregated analysis of their content and framing.

For example, The Guardian scored 3.5 and 6 on the X and Y axes respectively, indicating left-leaning tendencies with highly reliable reporting. Meanwhile, Electronic Intifada registered 1 and 4, reflecting extreme pro-Palestinian alignment with medium reliability. GPT-4 most commonly cited consistent fact-checking as its primary criterion for quality, factuality, and reliability.

In other words, the concept of “Quality” is distinct from “Bias” in that bias can exist in both factual and non-factual accounts. Consider, for example:

  1. A publication that regularly shares information about Israel-Palestine that is later debunked (low quality, high bias), versus
  2. a publication that reliably fact-checks its information before reporting, but only shares information that makes one side or the other look bad (high quality, high bias).

In this way, “Bias” might also be interpreted as “imbalanced” reporting regardless of factuality, while “Quality” might be interpreted as “Reliability.”

Across the sample, GPT-4’s scoring demonstrated nuanced discernment of variations in both partisan sympathies and journalistic standards. To create the interpretable visualization, aggregated results were plotted as (x, y) coordinates in a customized JavaScript chart using Chart.js, then labeled and further touched up in Adobe Photoshop.
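For readers who want to reproduce the plot, a minimal Chart.js scatter configuration along these lines would work. This is an illustrative sketch, not the author’s actual chart code, and it assumes the re-centered bias values described earlier:

```javascript
// Minimal Chart.js scatter sketch (illustrative, not the author's configuration).
// X-axis: re-centered bias (-3 to +3); Y-axis: quality (1 to 8).
import Chart from "chart.js/auto";

const ctx = document.getElementById("biasChart");
new Chart(ctx, {
  type: "scatter",
  data: {
    datasets: [{
      label: "Outlets",
      data: [
        { x: -0.5, y: 6 }, // e.g. The Guardian (3.5 re-centered to -0.5)
        { x: -3.0, y: 4 }, // e.g. Electronic Intifada (1 re-centered to -3)
      ],
      pointRadius: 5,
    }],
  },
  options: {
    scales: {
      x: { min: -3, max: 3, title: { display: true, text: "Bias (pro-Palestinian <- 0 -> pro-Israel)" } },
      y: { min: 1, max: 8, title: { display: true, text: "Quality (reliability)" } },
    },
  },
});
```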

An example prompt at the fairly restrictive temp=0.5. Note the use of the “sandwich method,” which states the bot’s task both before and after the sample list (represented here as { … }).
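Since the exact prompt text appears only in the screenshot and the linked spreadsheet, the snippet below is a hypothetical reconstruction of the setup using the official openai Node package: the task is stated before and after the sample list (the “sandwich”), and the temperature is set to 0.5. The wording, list, and output format are illustrative assumptions, not the author’s exact prompt.

```javascript
// Hypothetical reconstruction of the "sandwich" prompt structure; wording is illustrative.
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

const task =
  "Rate each outlet below on its Israel-Palestine coverage: Bias from 1 (extreme " +
  "pro-Palestinian) to 7 (extreme pro-Israel) and Quality from 1 (unreliable) to 8 " +
  "(highly reliable original reporting). Note that regional terminology has shifted " +
  "over the decades. Reply as JSON: [{\"outlet\": ..., \"bias\": ..., \"quality\": ...}].";

const sampleList = ["Reuters", "AP", "The Guardian", "Electronic Intifada"]; // ~10-15 items per batch

// Sandwich method: task, then the list, then the task restated.
const prompt = `${task}\n\n${sampleList.join("\n")}\n\n${task}`;

const response = await openai.chat.completions.create({
  model: "gpt-4",
  temperature: 0.5, // fairly restrictive, as in the example prompt above
  messages: [{ role: "user", content: prompt }],
});

console.log(response.choices[0].message.content);
```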

Interpreting AI-Generated Media Bias Assessments

When interpreting the results, it is crucial to recognize the experiment’s limitations and maintain responsible skepticism. AI has no intrinsic opinions on a given subject, but the bulk of its training data necessarily does. Thus, GPT-4’s scores constitute its synthesized computation of available data on each media source, not definitive declarations.

With those caveats, analysis of the AI-generated bias chart yields several intriguing observations:

  • Authoritative news wires like AP and Reuters ranked closest to the center, aligning with their reputations for premier objectivity and reliable first reporting. This potentially confirms their value to news consumers seeking neutral coverage.
  • Well-resourced (i.e. well-funded) outlets tended to score higher on “Quality” even when leaning partisan. This dynamic favors pro-Israel sources, shifting the overall curve of the chart to the right. Notable exceptions include Fox News and AIPAC.
  • Mainstream media classified as centrist still evidenced subtle leanings one way or another.
  • BDS and Al-Jazeera (both English and Arabic) were outliers consistently placed in the high-quality, pro-Palestinian region.
  • AI rated advocacy groups at pro-Israel ideological extremes as significantly less reliable than pro-Palestine advocacy groups.
  • Pro-Palestinian perspectives concentrated in lower-resourced, more grassroots outlets, reflecting imbalances in media infrastructure.

Collectively, the AI evaluation illuminates how biases shape both editorial tendencies and reporting standards, ultimately resulting in a preference for pro-Israeli narratives.

Weaknesses, Discoveries, and Future Improvements

  1. GPT-4 has a “sweet spot” for the number of tokens it can usefully parse, which in practice worked out to around 10–15 items per sample list. The likely reason is context: with too few items in the list, it struggled to establish a useful spectrum of placement.
  2. Publications with a long history and a wealth of available data tended toward the center, creating a crowded “Zero-7” zone (bias near 0, quality around 7) of many publications just beneath Reuters and AP. This could be addressed with greater nuance in future iterations.
  3. Occasionally, an item’s position in the list affected its assessment, especially for smaller publications with less available data. An automated process that re-runs the same query several times with the list items randomly reordered, then averages the results, would address this far more efficiently (see the sketch after this list).
  4. Another issue was that newer media sources have less available data than older ones; the former tended toward pro-Palestine and the latter toward pro-Israel. Older sources often use language that would have been considered neutral in, e.g., the 1980s but reads as pro-Israel today. To address this, a line was added to the prompt noting the terminology shift regarding the region over the decades.
  5. A future version of this exercise could analyze data scraped directly from the sample sources. Using vector-space embeddings, for example, would allow analysis of a large body of publicly available text without relying so heavily on GPT-4’s out-of-the-box training.
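As a sketch of the reordering fix proposed in point 3, the following assumes a hypothetical scoreBatch() helper that wraps the GPT-4 call (as in the earlier snippet) and returns { outlet, bias, quality } objects; it shuffles the list on each run and averages the scores per outlet:

```javascript
// Hypothetical sketch: re-run the same query with the list shuffled, then
// average each outlet's scores to smooth out list-position effects.
function shuffle(items) {
  const a = [...items];
  for (let i = a.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [a[i], a[j]] = [a[j], a[i]]; // Fisher-Yates swap
  }
  return a;
}

async function averageOverOrderings(outlets, scoreBatch, runs = 5) {
  const totals = new Map(); // outlet -> { bias, quality, n }
  for (let r = 0; r < runs; r++) {
    // scoreBatch() is assumed to query GPT-4 and return [{ outlet, bias, quality }, ...]
    for (const { outlet, bias, quality } of await scoreBatch(shuffle(outlets))) {
      const t = totals.get(outlet) ?? { bias: 0, quality: 0, n: 0 };
      totals.set(outlet, { bias: t.bias + bias, quality: t.quality + quality, n: t.n + 1 });
    }
  }
  return [...totals].map(([outlet, sums]) => ({
    outlet,
    bias: sums.bias / sums.n,
    quality: sums.quality / sums.n,
  }));
}
```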

Conclusion: AI as an Imperfect But Illuminating Lens

This exploratory study demonstrates the potential of AI to uncover illuminating patterns in the complex media bias landscape surrounding Israel-Palestine. While imperfect, responsible application of algorithms can provide a valuable perspective on an issue beset with misinformation and selective narration.

Key insights from this analysis include recognizing the enduring value of credentialed news wires, scrutinizing framing choices even of mainstream media, considering niche sites to diversify perspectives, and grappling with structural inequities that advantage certain narratives.

As AI capabilities continue advancing, machine learning presents intriguing opportunities for mitigating media bias and empowering news audiences with data-driven media literacy. With thoughtful implementation, AI tools can complement human discernment to promote coverage quality and ideological diversity on enormously consequential issues like Israel-Palestine. This study provides promising indications of how AI assessment can not only enhance our understanding of bias tendencies in media ecosystems, but help to circumvent our own inherent biases as well.

For a Google Sheet containing data results, example prompt, output, and other details from this exercise, click here.

For my socials and other links, click here.


Paul Nabil Matthis

Syrian American author and musician. Educational SWANA content creator.