Duplicate Content Blocks: When You Copy Yourself, AI Stops Trusting You

Repeating the same paragraph in multiple sections of the same page is a negative quality signal that AI engines detect and penalize. Duplicate Content Blocks measures this within-page repetition. When more than 5% of your content is duplicated across sections, AI engines reduce citation confidence because the page appears auto-generated or padded.

Audit each page for paragraphs that appear in multiple sections. If two sections contain paragraphs with more than 40% shingle similarity, rewrite one of them with unique language. This criterion (5% weight, Answer Readiness pillar) acts as a quality gate - severe duplication caps your overall AEO Site Rank at 35/100 regardless of how well you score on everything else.

What this article answers

  • What are Duplicate Content Blocks and how do they affect my AEO Site Rank?
  • How does within-page duplication differ from cross-page duplication?
  • Can duplicate content cap my overall AEO Site Rank?

Key takeaways

  • Duplicate Content Blocks checks for repeated paragraphs WITHIN a single page, not across pages (that is Cross-Page Duplication).
  • The scorer uses shingle-based Jaccard similarity: paragraphs with more than 40% shingle overlap are flagged as duplicates.
  • This criterion acts as a quality GATE: a score of 0/10 caps your overall AEO Site Rank at 35, a score of 1-2 caps it at 45, a score of 3-4 at 55, and a score of 5-6 at 65.
  • Three or more duplicate paragraphs score 0/10 automatically - this is treated as a serious content quality issue.
  • Common offenders: CTA blocks copied into every section, boilerplate disclaimers repeated mid-content, and AI-generated content with repetitive phrasing.

What Are Duplicate Content Blocks?

Duplicate Content Blocks detects within-page content repetition. It looks for paragraphs in different sections of the same page that contain substantially similar text. This is not about cross-page duplication (two different URLs with the same content) - that is a separate criterion. This is about a single page repeating itself.

The scorer extracts all meaningful sections from the page (identified by headings), then compares every paragraph in section A against every paragraph in section B using shingle-based Jaccard similarity. Shingles are overlapping word sequences that capture phrase-level similarity, not just individual word overlap. When two paragraphs from different sections exceed 40% shingle similarity, they are flagged as duplicates.
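
The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the scorer's actual implementation: the shingle size (3 words), the lowercase word tokenizer, and the function names are all assumptions. Only the 40% threshold comes from this article.

```python
import re

# Threshold from the article: more than 40% shingle similarity is a duplicate.
DUPLICATE_THRESHOLD = 0.40


def shingles(text: str, k: int = 3) -> set:
    """Return the set of overlapping k-word sequences in text.

    k=3 and the simple lowercase tokenizer are assumptions for illustration.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Very short text: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: str, b: str, k: int = 3) -> float:
    """Jaccard similarity: shared shingles divided by all distinct shingles."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def is_duplicate(a: str, b: str, k: int = 3) -> bool:
    """Flag two paragraphs whose shingle similarity exceeds the threshold."""
    return jaccard(a, b, k) > DUPLICATE_THRESHOLD
```

For example, "the quick brown fox jumps" and "the quick brown fox sleeps" share two of four distinct 3-word shingles, so jaccard returns 0.5 and the pair would be flagged. Because shingles capture word order, two paragraphs that merely reuse the same vocabulary in different phrases score much lower.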

Why does this matter? Within-page duplication is a strong signal of low-quality or auto-generated content. AI engines use it as a quality filter. A page where the same marketing paragraph appears in three different sections looks like it was assembled from templates without human editing. AI engines deprioritize such pages because they signal low editorial investment.

The Quality Gate Effect

Duplicate Content Blocks is one of the most impactful criteria in the scoring system because it acts as a quality gate. The overall AEO Site Rank has a hard cap based on the Duplicate Content Blocks score:

  • Score 0/10 (3+ duplicates): Overall AEO Site Rank capped at 35/100
  • Score 1-2/10: Overall AEO Site Rank capped at 45/100
  • Score 3-4/10: Overall AEO Site Rank capped at 55/100
  • Score 5-6/10: Overall AEO Site Rank capped at 65/100
  • Score 7+/10: No cap applied

This means a site can score perfectly on all other 48 criteria and still be capped at 35/100 if it has severe within-page duplication. The gate exists because AI engines fundamentally distrust pages that repeat themselves - it undermines the reliability of every other signal on the page.
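
The cap table above amounts to a simple lookup followed by a minimum. The sketch below is illustrative only: the function names are hypothetical, and it assumes the criterion score is an integer from 0 to 10 and that the cap is applied to the rank after all other criteria are combined.

```python
from typing import Optional


def rank_cap(duplicate_score: int) -> Optional[int]:
    """Return the AEO Site Rank cap for a Duplicate Content Blocks score.

    Returns None when no cap applies (score of 7 or higher).
    """
    if not 0 <= duplicate_score <= 10:
        raise ValueError("duplicate_score must be between 0 and 10")
    if duplicate_score == 0:
        return 35
    if duplicate_score <= 2:
        return 45
    if duplicate_score <= 4:
        return 55
    if duplicate_score <= 6:
        return 65
    return None


def apply_gate(raw_rank: float, duplicate_score: int) -> float:
    """Apply the quality gate to a rank computed from all criteria."""
    cap = rank_cap(duplicate_score)
    return raw_rank if cap is None else min(raw_rank, cap)
```

Under these assumptions, apply_gate(100, 0) returns 35, while apply_gate(50, 3) returns 50 because the rank is already below the cap of 55.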

At 5% weight in the Answer Readiness pillar, this is one of the highest-weighted criteria. Combined with the gate effect, it is arguably the single most important criterion to get right.

How Do You Fix Duplicate Content Blocks?

Step 1: Identify duplicated sections

The audit report shows which section pairs contain duplicated paragraphs. Look for the duplicate evidence in your detailed findings - it shows the headings of both sections and a sample of the duplicated text.
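
If you want to check a draft before running an audit, a rough local equivalent is to compare every paragraph in one section against every paragraph in each other section. The sketch below is not the audit tool itself: the input shape (a dict of heading to paragraph list), the 3-word shingle size, and the output tuple are assumptions chosen for illustration.

```python
import re
from itertools import combinations


def _shingles(text: str, k: int = 3) -> set:
    """Overlapping k-word sequences (k=3 is an assumed shingle size)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + k]) for i in range(max(len(words) - k + 1, 0))}


def find_duplicate_pairs(sections: dict, threshold: float = 0.40) -> list:
    """Return (heading_a, heading_b, similarity, sample) for flagged pairs.

    Only paragraphs from different sections are compared, which mirrors
    the within-page, cross-section check described in this article.
    """
    findings = []
    for (head_a, paras_a), (head_b, paras_b) in combinations(sections.items(), 2):
        for pa in paras_a:
            for pb in paras_b:
                sa, sb = _shingles(pa), _shingles(pb)
                if not sa or not sb:
                    continue  # too short to compare
                sim = len(sa & sb) / len(sa | sb)
                if sim > threshold:
                    findings.append((head_a, head_b, round(sim, 2), pa[:80]))
    return findings


page = {
    "Pricing": [
        "Our plans start at ten dollars per month.",
        "Contact our sales team today to get a personalized demo.",
    ],
    "Features": [
        "The dashboard shows live metrics.",
        "Contact our sales team today to get a personalized demo.",
    ],
    "Support": ["Email support is available on weekdays."],
}
findings = find_duplicate_pairs(page)
```

Here the repeated sales paragraph is the only flagged pair, reported against the "Pricing" and "Features" headings with a similarity of 1.0.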

Step 2: Rewrite one version

For each duplicate pair, rewrite one of the two paragraphs with unique language. Do not just swap synonyms - restructure the sentence to make a different point or provide different details.

Step 3: Remove repeated CTAs

The most common offenders are CTA (call-to-action) blocks copied into every section. Instead, use a single CTA at the end or vary the CTA language per section:

  • Section 1 CTA: “See pricing plans”
  • Section 2 CTA: “Compare features in our interactive table”
  • Section 3 CTA: “Start a free trial - no credit card required”

Step 4: Check AI-generated content

If your content was generated by AI, review it for the repetitive phrasing patterns that language models sometimes produce. Look for paragraphs that start with similar templates (“When it comes to…,” “It is important to note that…”) across multiple sections.
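
One way to surface these templated openings is to group paragraphs by their first few words and list any opener that appears in more than one section. This is a rough heuristic sketch, not part of the scorer: the function name, the input shape, and the 3-word opener length are assumptions.

```python
import re
from collections import defaultdict


def repeated_openers(sections: dict, n_words: int = 3) -> dict:
    """Map each paragraph opener to the sections that use it.

    Only openers found in two or more different sections are returned.
    """
    seen = defaultdict(set)
    for heading, paragraphs in sections.items():
        for paragraph in paragraphs:
            words = re.findall(r"[a-z0-9']+", paragraph.lower())
            if len(words) >= n_words:
                seen[" ".join(words[:n_words])].add(heading)
    return {opener: sorted(heads) for opener, heads in seen.items() if len(heads) > 1}


draft = {
    "Setup": ["When it comes to setup, start small."],
    "Pricing": ["When it comes to pricing, compare plans."],
    "FAQ": ["Refunds take five days."],
}
flagged = repeated_openers(draft)
```

With this draft, flagged contains a single entry: the opener "when it comes", used in the "Pricing" and "Setup" sections. A shared opener does not by itself mean the paragraphs exceed the similarity threshold, but it is a useful prompt for a manual rewrite.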

Step 5: Template audit

If your CMS uses content blocks or templates, check whether a shared block is being inserted into multiple sections of the same page. Move shared blocks to the footer or sidebar instead of injecting them into body content.

How AI Engines Evaluate This

ChatGPT processes pages top-to-bottom and builds a content model as it reads. When it encounters a paragraph it already processed earlier on the same page, it reduces the page’s overall quality score. Repeated content also means ChatGPT has fewer unique passages to select for citation - the effective content pool shrinks even though the page length stays the same.

Claude runs content quality checks before citation. Pages with within-page duplication get flagged as lower quality, which reduces Claude’s willingness to cite any content from that page - including the unique, non-duplicated portions. Claude treats duplication as a page-level quality signal, not a paragraph-level one.

Perplexity processes content at the passage level for real-time answer assembly. Duplicate passages waste processing budget - Perplexity encounters the same information twice and has to deduplicate before assembling its response. Pages that require deduplication get lower source quality scores because they impose extra processing cost.

Google AI Overviews uses content uniqueness as a quality signal for source selection. Pages with substantial within-page duplication are less likely to be selected as AI Overview sources because they suggest thin or template-generated content.

External Resources