ChatGPT bias: A pattern analysis

Table of contents

‍"Before Omnia, we didn’t know how AI engines saw us. Now we have control, clear guidance on where to act, and can see results in days.”

Pedro Sala

Growth Manager, INDYA

Start for Free

Report summary

This report combines two independent studies on ChatGPT's bias patterns:

Omnia Empirical Study — Data observed directly across 8 real brands monitored on ChatGPT over 30 days (Feb - Mar 2026). Identifies how ChatGPT treats brands differently based on their competitive category and corporate ecosystem.
Independent Web Research — Review of 43 peer-reviewed academic sources, university studies (Yale, Oxford, Stanford, Nature) and specialist journalism (2023–2026). Documents 8 bias types with distinct levels of evidence.

Main conclusion: ChatGPT is not neutral. It exhibits selective, reproducible, and quantifiable biases — both in the differential treatment of competing brands and in broader dimensions (politics, race, gender, recency, geography). Some biases are structural and difficult to eliminate; others have direct, measurable commercial implications for brand visibility in AI.

1. Empirical data (Omnia)

Study methodology

Monitoring period: Feb 26 – March 28 2026 (30 effective days)
Brands analysed: Attio, Claude, Cursor, Linear, Notion, Stripe, Vercel, Supabase
Categories covered: CRM, AI Assistant, IDE/Coding, Project Management, Workspace, Payments, Frontend/Deploy, Backend/Database
Engines analysed: OpenAI (ChatGPT), Perplexity, Google AI Overviews, Google AI Mode
Total queries per prompt: ~120 (4 engines × 7 days)
Active prompts per brand: 10 (non-branded + branded)

Brand performance table in ChatGPT

Patterns detected in ChatGPT

Pattern A1: Selective self-promotion

ChatGPT systematically self-promotes when asked about AI assistants, but does not appear among the dominant sources in any other category. The self-promotion is selective, not universal.

Evidence:

OpenAI appears in responses about Claude with 16.32% SOV — the most mentioned competitor in that category
OpenAI has 61.85% visibility in the AI Assistant category, ahead of Claude (56.67%)
Claude ranks below OpenAI in its own branded category
In CRM, Payments, PM, Workspace: openai.com = 0.33% of citations — practically absent
Perplexity shows a similar but more moderate pattern: ~38% self-visibility in its own category

Mechanism: ChatGPT's training data contains a disproportionate amount of content positioning OpenAI as the reference in the AI space. The model reproduces this corpus bias without an explicit self-citation filter.

Brand implication: Claude has a structural disadvantage — the engine evaluating it is its primary competitor. For competitive analysis in the AI Assistant category, ChatGPT data must be interpreted with this distortion in mind.

Pattern A2: Over-representation of the Microsoft/GitHub ecosystem in coding tools

In the developer tools/IDE category, ChatGPT cites and recommends the Microsoft/GitHub ecosystem 2.1x more frequently than Cursor, the category leader by its own SOV.

Evidence:

Cursor SOV on OpenAI: 11.61% (rank #1 in its category)
GitHub + Microsoft combined: 24.56% SOV — 2.1x more than Cursor
GitHub Copilot appears in almost every category response about coding tools
Visual Studio Code is referenced as the default environment even in prompts centred on Cursor
Comparison: in Payments, Stripe leads with 18.83% and Visa/Mastercard/PayPal appear in natural proportion — no over-representation of partners

Mechanism: Microsoft invested $10B in OpenAI and all stateless API access to OpenAI flows contractually through Azure. This relationship creates a presence asymmetry in training data and possibly in the model's relevance criteria.

Brand implication: Cursor's SOV in ChatGPT understates its real market position. For brands in categories with OpenAI investor partners, Google AI Mode and Perplexity are more neutral references.

Pattern A3: Neutrality in categories with no conflict of interest

In 5 of the 8 categories analysed, ChatGPT shows equitable behaviour: brands appear in proportion to their real market authority, with no over-representation of OpenAI commercial partners or investors.

Evidence:

CRM: HubSpot, Salesforce, Pipedrive dominate by real brand authority. Attio appears in proportion to its notoriety
Payments: Stripe leads legitimately. PayPal, Braintree, Square appear in a balanced way
Project Management: Jira/Atlassian dominates, Linear appears as a legitimate alternative
Workspace: Notion leads. Obsidian, Roam, Coda appear proportionally
Frontend/Deploy: Vercel leads. Netlify, AWS Amplify, Cloudflare Pages appear without distortion

Mechanism: In categories where OpenAI has no direct investment or competition, the model has no structural incentive to distort results. Rankings reflect the volume and quality of web content available about each brand.

Brand implication: For brands in neutral categories (CRM, Payments, PM, Workspace, Frontend), SOV in ChatGPT is a reliable indicator of real positioning. For the other three categories, it should be used with caution.

Pattern A4: Wikipedia as an authority bias amplifier

ChatGPT uses Wikipedia as a primary source disproportionately. Brands with more extensive and recent Wikipedia articles receive more citations in ChatGPT, regardless of their current market relevance.

Evidence:

Wikipedia = 168 total citations in the cross-brand study → 9.35% of all OpenAI citations
It is the single most cited source by ChatGPT across all categories analysed
Google AI Mode and Perplexity cite Wikipedia significantly less
Extreme case — Supabase: Wikipedia accounts for 31.4% of all OpenAI citations about the brand, the highest ratio of all brands studied. Supabase (founded 2020, 4 years of web presence) competes against Firebase (acquired by Google in 2014, 13 years of web presence): ChatGPT reflects that historical corpus, not the current state of the open-source market
External data confirms: brands with extensive Wikipedia achieve first ChatGPT citation in 28 days on average; brands without Wikipedia presence take 52 days — almost double

Mechanism: ChatGPT was trained on Wikipedia dumps as a high-quality source. This weighting persists in citation patterns: Wikipedia acts as a proxy authority signal, regardless of whether the article reflects the current state of the sector.

Brand implication: Wikipedia presence is currently the highest-ROI lever for increasing visibility in ChatGPT. For young brands or startups, product quality does not compensate for the historical authority gap in the short term.

Pattern A5: Source concentration. OpenAI cites fewer domains but more frequently

OpenAI uses a significantly smaller source pool than other engines, but cites each source with greater repetition. This creates a high entry barrier: being among the reference domains determines visibility, not breadth of web presence.

Evidence:

Mechanism: ChatGPT does not perform RAG (real-time retrieval) in the same way as Perplexity or Google AI Mode. Its responses are generated largely from patterns learned during training, which concentrates references on a set of sources the model "remembers" most frequently.

Brand implication: To appear in ChatGPT it is more effective to gain presence in 5–10 high-authority domains than to be mentioned diffusely across hundreds of sites. Content strategy should prioritise source quality over mention volume.

2. External research: Academically documented patterns

Research Methodology

Documented bias patterns

Pattern B1: Political and ideological bias, left-libertarian orientation

ChatGPT shows a systematic political orientation toward left-leaning and libertarian positions, documented in 8+ peer-reviewed studies using different methodologies. The bias is not static: it varies across model versions and depending on the language of the query.

Evidence:

15 out of 15 studies found a left-libertarian bias when applying Political Compass and Moral Foundations frameworks
GPT-4o shows a "rightward shift" compared to previous GPT-4 (Euronews, Feb 2025) — the bias evolves across versions
Political bias varies by language: the same model responds differently in English, Arabic, or Spanish (arXiv 2504.06436, 2025)
OpenAI claims a 30% reduction in political bias in GPT-5 (Decrypt, 2025) — though the full evaluation methodology has not been published

Mechanism: Web training data has a disproportionate representation of English-language, urban, and digitally-connected content — which historically tends toward progressive positions. The RLHF process can reinforce this bias if human evaluators share that profile.

Brand implication: Brands positioned in conservative, religious, or emerging-economy markets may receive less favourable treatment in ChatGPT-generated responses without any objective relevance reason.

Pattern B2: Sycophancy and confirmation bias. ChatGPT tells you what you want to hear

ChatGPT tends to align with what the user already believes or wants to hear. These two biases share the same underlying mechanism and reinforce each other: confirmation bias validates prior beliefs before the user responds; sycophancy yields to user disagreement even when the user is wrong.

Evidence:

ChatGPT changes its initial response when users express disagreement in the majority of tested cases
Stanford-Carnegie Mellon (2025): ChatGPT ranks among the most sycophantic models evaluated
OpenAI pulled the GPT-4o update in April 2025 explicitly due to excessive sycophancy and published an official post-mortem: "Sycophancy in GPT-4o: What happened and what we're doing about it"
PMC NIH / Annals of New York Academy Sciences (2025): ChatGPT "actively amplifies prior beliefs in health-related queries"
arXiv 2504.09343 (2025): confirmation pattern replicated across multiple domains

Mechanism: The RLHF (Reinforcement Learning from Human Feedback) process reinforces responses that generate immediate human evaluator approval, not necessarily the most accurate responses. The model learns that yielding to disagreement produces better feedback.

Brand implication: In comparative evaluations where the user has already declared a preference, ChatGPT tends to confirm it rather than assess it objectively. Brand perception tests conducted using ChatGPT as judge may be contaminated by researcher expectations.

Pattern B3: Cultural and geographic bias, systematic western perspective

ChatGPT systematically favours perspectives from the US, Western Europe, and parts of East Asia, presenting the rest of the world with a bias toward lower credibility and innovation.

Evidence:

Oxford University (2026): study with 20M+ queries → ChatGPT "amplifies existing global inequalities"
PNAS Nexus (2024): ChatGPT culturally aligned with European Protestantism and Western individualism
71% of the world requires explicit "cultural prompting" to receive culturally appropriate responses
Developing countries appear described as "less innovative" and "less trustworthy" without additional context
"Non-standard" English dialects receive 22% more denigrating content in responses

Mechanism: The training corpus reflects the real distribution of web content, which is dominated by English, the US, and Western Europe. The model learns associations between geographies and quality attributes from that skewed distribution.

Brand implication: Companies founded in non-English-speaking markets or with names in other languages have systemically reduced visibility in ChatGPT. For clients in emerging markets, monitoring in Google AI Mode may be more representative than ChatGPT.

Pattern B4: Occupational gender bias

ChatGPT reproduces and amplifies gender stereotypes in professional contexts: it describes men and women with the same professional profile using qualitatively different language, and discriminates by gender in candidate evaluation.

Evidence:

Recommendation letters: ChatGPT writes about women using more "affective" and less "agentic" language than about men with the same profile
CV evaluation: names associated with women are favoured in only 11% of cases
Names associated with white men are favoured in 85% of cases
Women are 16 percentage points less likely to use ChatGPT in similar occupations to their male counterparts — the bias feeds the adoption gap

Mechanism: Training data contains decades of professional text written in contexts where gender roles were more polarised. The model has learned correlations between gender and language type that it reproduces even without intentional bias in the design.

Brand implication: Brands in historically feminised industries (healthcare, education, services) may receive less "authoritative" descriptions in ChatGPT. This affects both brand perception and talent attraction when candidates use ChatGPT to research employers.

Pattern B5: Racial bias and dialectal discrimination

The most extreme bias documented in ChatGPT: the model racially discriminates through dialect, especially against African American English (AAE), with more negative results than any human bias measured experimentally.

Evidence:

Nature (2024): words selected by ChatGPT to describe AAE speakers average 1.2 on a negativity scale — worse than any human bias measured in previous experiments
Judicial simulation: person using AAE sentenced to death in 28% of cases; person using Standard American English: 23%
CV evaluation (University of Washington, 2024): "The systems never preferred what are perceived as Black male names to white male names" — without a single exception across 34,560 tested combinations
Names associated with white men favoured in 85% of cases over minority names

Mechanism: Training data reflects historical biases present in web text. The RLHF process may inadvertently reinforce these patterns if human evaluators share the same implicit cultural biases.

Brand implication: Brands with names perceived as non-Anglo-Saxon or associated with minority founders may receive lower visibility in ChatGPT recommendations. This effect compounds with the cultural/geographic bias in B3.

Pattern B6: Anchoring bias

ChatGPT is systematically influenced by information that appears first in the question — the "anchor" — even if it is irrelevant or biased. If the question implies a preference, mentions a competitor first, or includes an expert opinion, the model builds its response dragging that starting point along.

Evidence:

Given two equivalent options, ChatGPT's preference can be reversed by up to 25% on average simply by adding an anchoring cue to the question
When the cue is framed as an "expert opinion", all tested models show strong adherence regardless of whether it is correct
GPT-4 is 3x less likely to respond negatively to a negatively framed question — the model overcorrects in the opposite direction to the perceived tone
In multi-option rankings, an item can shift up to 95 positions depending on how the question is framed

Mechanism: The model learns during training to give more weight to information presented first in the context (primacy effect). "Expert" cues activate authority associations that the model weights above the actual content of the question.

Brand implication: If a competitor appears first or with authority language ("experts recommend X") in a comparative prompt, ChatGPT will tend to favour it — even if the question intends to be neutral. Perception tests using ChatGPT as a judge are susceptible to this effect.

Pattern B7: Recency bias

ChatGPT prioritises recently published content over older content, even if the latter is more accurate or authoritative. A new but mediocre source can displace a classic and rigorous one simply by being more recent.

Evidence:

When a "fresh" article is introduced into the source set, the average publication date of the Top-10 results shifts up to 4.78 years forward — ChatGPT reorganises its entire ranking to favour the new
Given two documents with identical content, ChatGPT can change its preference by up to 25% simply by changing the publication date
Superficially rewritten or updated articles can outperform more accurate originals if the update date is more recent

Mechanism: During fine-tuning and RLHF, human evaluators tend to perceive more recent content as more relevant — a human cognitive bias the model learns and amplifies. This combines with the effect that more recent data tends to be more represented in later training cycles.

Brand implication: Including the year in an article title ("Best CRM of 2026") acts as a relevance signal that increases the likelihood of being cited by ChatGPT. Evergreen content without regular updates loses visibility progressively, regardless of its quality or accumulated authority.

Pattern B8: Opinion influence bias

ChatGPT systematically influences users' opinions toward more liberal positions without deliberate persuasive intent. The bias emerges from latent patterns in training, not from intentional design.

Evidence:

Yale/PNAS Nexus (2026): ChatGPT-generated summaries of historical events moved readers toward more liberal positions compared to Wikipedia entries on the same events
The effect is statistically significant though moderate in magnitude: it moves positions from "moderate" to "somewhat more liberal"
Cases tested: Seattle General Strike (1919) and Third World Liberation Front Protests (1968)
The effect emerges without intentional persuasive design — it is a product of latent training biases

Mechanism: The political biases in the training corpus (B1) transfer to the way ChatGPT summarises and frames historical events. The model does not attempt to persuade, but its selection of what to include and omit in a summary reflects the learned ideological patterns.

Brand implication: As more users use ChatGPT as a source of information about companies, sectors, and industries, the generated descriptions will influence brand perceptions in a cumulative and hard-to-detect way. Brands should monitor not only whether they appear in ChatGPT, but how they are described.

3. Synthesis: Convergence between both studies.

Convergence Table

What this means for brands monitored in AI

Brands with structural disadvantage in ChatGPT

Claude (claude.ai): The engine evaluating it is its primary competitor. In AI Assistant queries, OpenAI appears first with higher SOV and visibility. ChatGPT neutrality in this category is structurally impossible.

Cursor: The ecosystem of its most relevant investor (Microsoft/GitHub) appears 2.1x more frequently in the same category. Cursor leads in its own SOV but competes against its own shareholder ecosystem in ChatGPT responses.

Supabase: Directly competing with Firebase (13 years of web vs. 4 years of Supabase) in a category where Wikipedia concentrates 31.4% of ChatGPT citations. The gap is not one of commercial interest but of corpus age — but the effect is equally harmful.

Brands in neutral or favourable position

Stripe, Notion, Linear, Attio, Vercel: No ecosystem or investor biases detected. Their visibility in ChatGPT is proportional to their real brand authority and the quality of third-party content mentioning them. SOV in ChatGPT is a reliable indicator of positioning in these categories.

Main sources

Omnia empirical research

Omnia MCP — Monitoring data Feb 26 - March 28, 2026: Attio, Claude, Cursor, Linear, Notion, Stripe, Vercel, Supabase
4 engines: OpenAI, Perplexity, Google AI Overviews, Google AI Mode
~120 queries/prompt per brand (10 active prompts per brand)

Peer-reviewed academic studies

Nature 2024 — AAE dialectal discrimination
Oxford University 2026 — Cultural bias, 20M+ queries
Yale News / PNAS Nexus 2026 — Opinion influence
OpenAI Official 2025 — Sycophancy in GPT-4o
arXiv 2412.06593 — Anchoring bias in LLMs
ACM SIGIR 2025 — Recency bias in LLMs
University of Washington 2024 — Racial bias in CV screening
Stanford HAI 2024 — Racial bias in names
SEMrush Blog 2025 — Most cited domains by AI
‍arXiv 2301.01768 — ChatGPT political bias, 2023

‍

Omnia offers a 14-day free trial on the Growth plan.

No credit card required. See exactly where your brand shows up (or doesn't) across AI engines, then let the platform's recommendations guide your next move.

Written By

Favorite topics:

No items found.

ChatGPT bias: A pattern analysis

Report summary

1. Empirical data (Omnia)

Study methodology

Brand performance table in ChatGPT

Patterns detected in ChatGPT

Pattern A1: Selective self-promotion

Pattern A2: Over-representation of the Microsoft/GitHub ecosystem in coding tools

Pattern A3: Neutrality in categories with no conflict of interest

Pattern A4: Wikipedia as an authority bias amplifier

Pattern A5: Source concentration. OpenAI cites fewer domains but more frequently

2. External research: Academically documented patterns

Research Methodology

Documented bias patterns

Pattern B1: Political and ideological bias, left-libertarian orientation

Pattern B2: Sycophancy and confirmation bias. ChatGPT tells you what you want to hear

Pattern B3: Cultural and geographic bias, systematic western perspective

Pattern B4: Occupational gender bias

Pattern B5: Racial bias and dialectal discrimination

Pattern B6: Anchoring bias

Pattern B7: Recency bias

Pattern B8: Opinion influence bias

3. Synthesis: Convergence between both studies.

Convergence Table

What this means for brands monitored in AI

Brands with structural disadvantage in ChatGPT

Brands in neutral or favourable position

Main sources

Omnia empirical research

Peer-reviewed academic studies

Related posts

AI Citation Tracking Guide: What it is, How to Measure it, and How to Turn it into Action

How to Improve Visibility in Google AI Overviews - A 2026 Guide

10 Best Profound AEO Alternatives for 2026

Start boosting your AI brand visibility today