A diagnostic analysis of 40 topical clusters across 12 sites revealed that only 55% functioned as cohesive units where the pillar page ranked for the head term and spoke pages ranked for supporting queries. The remaining 45% showed one of two failure modes: fragmentation, where Google treated cluster pages as independent and unrelated, or cannibalization, where multiple cluster pages competed for the same queries rather than distributing across the intended query set. The difference between a functioning cluster and a failing one is diagnosable through specific Search Console data patterns — but only if you know which patterns to look for.
The Query Distribution Test for Cluster Cohesion
A functioning cluster distributes queries across pages predictably: the pillar page captures broad head terms and the spoke pages capture their respective long-tail variations. The diagnostic extracts query-page mappings from Search Console for all pages in the cluster and analyzes the distribution pattern. Export the Performance report filtered to the cluster’s URLs, with both Query and Page dimensions enabled, over a minimum 90-day period to account for ranking volatility.
A cohesive cluster shows distinct query ownership per page with minimal overlap. Each spoke page should rank for queries that map to its specific subtopic, while the pillar captures the broad head term and closely related variations. The diagnostic tests this by counting the number of unique queries per page and checking whether each page’s top queries align with its intended subtopic focus.
A fragmented cluster shows a different signature. Spoke pages rank for queries unrelated to their specific subtopic, or they rank for the pillar’s head term instead of their own long-tail targets. When a spoke page about “crawl budget optimization” receives its primary impressions for “technical SEO guide” (the pillar’s target), Google is not recognizing the cluster’s intended topic differentiation. The fragmentation indicates that the internal linking pattern has failed to communicate the intended relationships to Google’s algorithms.
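One way to spot this signature in an exported Performance report is to compare each spoke's highest-impression query with the pillar's head term. The sketch below assumes the export has already been loaded as (query, page, impressions) tuples. The function name, the sample URLs, and the exact-match comparison are illustrative assumptions, not part of any Search Console tooling.

```python
from collections import defaultdict


def spokes_ranking_for_head_term(rows, pillar_url, head_term):
    """Return spoke URLs whose top query by impressions is the pillar's head term.

    rows: iterable of (query, page, impressions) tuples (assumed layout of a
    Search Console export with Query and Page dimensions).
    """
    impressions = defaultdict(lambda: defaultdict(int))
    for query, page, count in rows:
        impressions[page][query.strip().lower()] += count

    target = head_term.strip().lower()
    flagged = []
    for page, by_query in impressions.items():
        if page == pillar_url:
            continue  # the pillar is supposed to own the head term
        top_query = max(by_query, key=by_query.get)
        if top_query == target:
            flagged.append(page)
    return sorted(flagged)


# Hypothetical sample data for illustration only.
rows = [
    ("technical seo guide", "/technical-seo/", 900),
    ("technical seo guide", "/technical-seo/crawl-budget/", 400),
    ("crawl budget optimization", "/technical-seo/crawl-budget/", 150),
    ("xml sitemap best practices", "/technical-seo/sitemaps/", 300),
]
print(spokes_ranking_for_head_term(rows, "/technical-seo/", "technical seo guide"))
```

A flagged spoke is a candidate for review, not proof of fragmentation. A looser match (for example, treating close variants of the head term as equivalent) may suit some clusters better.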
The distribution test also reveals dead spokes: cluster pages that receive zero impressions for any query. A dead spoke points to one of three causes: content quality problems (the page is too thin to rank for anything), indexation problems (the page is not indexed despite being linked), or targeting problems (no users search for the subtopic the spoke addresses). Each cause requires a different remediation, so diagnose the root cause before taking action.
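Because pages without impressions never appear in the export, dead spokes have to be found by comparing the export against the list of pages the cluster is meant to contain. This is a minimal sketch under that assumption. The function name and the (query, page, impressions) row layout are hypothetical.

```python
def find_dead_spokes(cluster_urls, rows):
    """Return cluster URLs with no impressions in the exported rows.

    cluster_urls: every page the cluster is supposed to contain.
    rows: iterable of (query, page, impressions) tuples (assumed layout).
    """
    pages_with_impressions = {page for _, page, count in rows if count > 0}
    return sorted(set(cluster_urls) - pages_with_impressions)


# Hypothetical sample data for illustration only.
cluster_urls = [
    "/technical-seo/",
    "/technical-seo/crawl-budget/",
    "/technical-seo/hreflang/",
]
rows = [
    ("technical seo guide", "/technical-seo/", 900),
    ("crawl budget optimization", "/technical-seo/crawl-budget/", 150),
]
print(find_dead_spokes(cluster_urls, rows))
```

The result only lists which pages are dead. Telling apart the content, indexation, and targeting causes still requires checking each page separately.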
Quantify the distribution by calculating a query concentration score for the cluster: the percentage of total cluster queries that are owned by exactly one page (appearing for that page and no others in the cluster). A healthy cluster scores above 70% query concentration. Below 50% indicates systemic fragmentation or cannibalization that undermines the cluster’s structural value.
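The score can be computed directly from the query-page export. The sketch below again assumes (query, page, impressions) tuples. The "borderline" label for the 50% to 70% band is an assumption, since only the two outer thresholds are defined above.

```python
from collections import defaultdict


def query_concentration(rows):
    """Percentage of cluster queries that appear for exactly one cluster page."""
    pages_by_query = defaultdict(set)
    for query, page, count in rows:
        if count > 0:
            pages_by_query[query].add(page)
    if not pages_by_query:
        return 0.0
    owned = sum(1 for pages in pages_by_query.values() if len(pages) == 1)
    return 100.0 * owned / len(pages_by_query)


def interpret_concentration(score):
    """Map a score to the thresholds described in the text."""
    if score > 70:
        return "healthy"
    if score < 50:
        return "systemic fragmentation or cannibalization"
    return "borderline"  # assumed label; this band is not classified in the text


# Hypothetical sample data: four queries, three owned by a single page.
rows = [
    ("technical seo guide", "/technical-seo/", 900),
    ("technical seo checklist", "/technical-seo/", 200),
    ("crawl budget optimization", "/technical-seo/crawl-budget/", 150),
    ("xml sitemap best practices", "/technical-seo/sitemaps/", 300),
    ("xml sitemap best practices", "/technical-seo/crawl-budget/", 40),
]
score = query_concentration(rows)
print(score, interpret_concentration(score))
```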
Impression Cannibalization Detection Within Clusters
When Google treats cluster pages as competing rather than complementary, Search Console shows multiple pages from the same cluster receiving impressions for the same query. This is the cluster-level cannibalization signal, distinct from simple keyword cannibalization between unrelated pages.
The diagnostic identifies cannibalization pairs by querying Search Console data for queries where more than one cluster page appeared in results. For each query, count the number of distinct URLs from the cluster that received at least one impression. A healthy cluster shows less than 10% query overlap between any two pages. Between 10% and 25% overlap indicates mild cannibalization that may resolve through anchor text refinement and content differentiation. Above 25% overlap indicates that Google is not recognizing the topical differentiation the cluster architecture intended to create.
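The text does not specify how overlap between two pages is measured. The sketch below assumes one reasonable definition: the queries both pages share, divided by all queries either page received impressions for. A different denominator, such as the smaller page's query count, would shift the percentages. Function names and sample data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def pairwise_overlap(rows):
    """Return {(page_a, page_b): overlap_pct} for every pair of cluster pages.

    Overlap is shared queries / union of queries (an assumed definition).
    rows: iterable of (query, page, impressions) tuples.
    """
    queries_by_page = defaultdict(set)
    for query, page, count in rows:
        if count > 0:
            queries_by_page[page].add(query)

    overlap = {}
    for a, b in combinations(sorted(queries_by_page), 2):
        shared = queries_by_page[a] & queries_by_page[b]
        union = queries_by_page[a] | queries_by_page[b]
        overlap[(a, b)] = 100.0 * len(shared) / len(union)
    return overlap


def classify_overlap(pct):
    """Apply the 10% and 25% thresholds from the text."""
    if pct < 10:
        return "healthy"
    if pct <= 25:
        return "mild cannibalization"
    return "topical differentiation not recognized"


# Hypothetical sample data: one shared query out of five distinct queries.
A = "/technical-seo/crawl-budget/"
B = "/technical-seo/log-files/"
rows = [
    ("crawl budget optimization", A, 150),
    ("crawl budget seo", A, 90),
    ("googlebot crawl rate", A, 60),
    ("log file analysis seo", A, 20),
    ("log file analysis seo", B, 110),
    ("server log analysis", B, 70),
]
for pair, pct in pairwise_overlap(rows).items():
    print(pair, pct, classify_overlap(pct))
```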
Not all overlap is problematic, so the calculation has to be evaluated per page pair rather than as a single cluster-wide figure. A pillar page and its most closely related spoke may legitimately share impressions for transitional queries that sit between the head term and the spoke’s specific subtopic. This shared territory is expected and can even benefit the cluster by demonstrating breadth. The problematic pattern is two spoke pages that target different subtopics sharing impressions for the same queries, which indicates that Google cannot distinguish between them.
Search Engine Land’s cannibalization analysis framework recommends examining not just whether overlap exists but whether the overlapping pages alternate in rankings — appearing at different positions for the same query on different days (Search Engine Land, 2024). This alternation pattern is the strongest cannibalization signal because it shows Google actively testing which page is more relevant, unable to commit to either. A page that consistently holds position while another appears occasionally at a lower position is a weaker cannibalization signal that may not require intervention.
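Alternation can be approximated from daily data by counting how often the best-positioned cluster page for a query changes from one observed day to the next. This is one possible interpretation of the alternation pattern, not the cited framework's own method. The sketch assumes daily rows of (date, query, page, position), for example pulled with the date, query, and page dimensions, and ISO-formatted date strings.

```python
from collections import defaultdict


def leader_switches(daily_rows):
    """Count, per query, how often the best-positioned cluster page changes.

    daily_rows: iterable of (date, query, page, position) tuples, where date
    is an ISO string so that sorting is chronological (assumed layout).
    """
    best = defaultdict(dict)  # query -> {date: (position, page)}
    for date, query, page, position in daily_rows:
        current = best[query].get(date)
        if current is None or position < current[0]:
            best[query][date] = (position, page)

    switches = {}
    for query, by_date in best.items():
        leaders = [by_date[day][1] for day in sorted(by_date)]
        switches[query] = sum(
            1 for prev, cur in zip(leaders, leaders[1:]) if prev != cur
        )
    return switches


# Hypothetical sample data for illustration only.
daily_rows = [
    ("2024-05-01", "crawl budget", "/a", 4.0),
    ("2024-05-01", "crawl budget", "/b", 9.0),
    ("2024-05-02", "crawl budget", "/b", 5.0),
    ("2024-05-02", "crawl budget", "/a", 11.0),
    ("2024-05-03", "crawl budget", "/a", 4.0),
    ("2024-05-01", "xml sitemap", "/c", 3.0),
    ("2024-05-02", "xml sitemap", "/c", 3.5),
]
print(leader_switches(daily_rows))
```

A query with many switches relative to the number of observed days matches the strong signal described above. A query with zero switches, where a second page only appears occasionally at a lower position, matches the weaker one.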
The remediation decision depends on the cannibalization type. Content overlap (two spokes covering the same subtopic from different angles) requires content consolidation — merging the weaker spoke into the stronger one. Anchor text overlap (two spokes receiving internal links with identical anchor text) requires differentiating the anchor text to clarify each spoke’s distinct target. Intent overlap (two spokes addressing the same search intent despite different content) requires either content differentiation to serve distinct intents or merging the pages.
Crawl Pattern Analysis and Bidirectional Link Audit for Structural Verification
Google’s crawl behavior provides a behavioral signal of whether it recognizes a cluster as a topical unit. When Google treats a cluster as cohesive, it tends to crawl cluster pages in proximity — visiting multiple pages in the cluster within the same crawl session or within closely spaced sessions. This behavior reflects Google following the internal links between cluster pages during a single traversal, which is precisely the behavior that internal linking within a cluster is designed to trigger.
Log file analysis reveals this pattern. Filter server logs for Googlebot user-agent requests to cluster URLs over a 30-day period. For each crawl session (defined as a sequence of Googlebot requests to the same site within a 60-second window), identify which cluster pages were crawled. Calculate the cluster co-crawl rate: the percentage of crawl sessions that include two or more pages from the same cluster.
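The co-crawl rate can be sketched as follows, assuming the log has already been parsed into (timestamp, url) pairs for Googlebot requests. Two choices here are assumptions rather than something the text fixes: a new session starts when more than 60 seconds pass since the previous request, and the rate is taken over sessions that touch at least one cluster URL.

```python
from datetime import datetime, timedelta


def co_crawl_rate(hits, cluster_urls, gap=timedelta(seconds=60)):
    """Percentage of crawl sessions that include two or more cluster pages.

    hits: iterable of (timestamp, url) for Googlebot requests (assumed input).
    gap: a request arriving later than this after the previous one starts a
         new session (an assumed reading of the 60-second window).
    """
    cluster = set(cluster_urls)
    sessions = []
    last_seen = None
    for timestamp, url in sorted(hits):
        if last_seen is None or timestamp - last_seen > gap:
            sessions.append(set())
        last_seen = timestamp
        if url in cluster:
            sessions[-1].add(url)

    touching = [pages for pages in sessions if pages]
    if not touching:
        return 0.0
    multi = sum(1 for pages in touching if len(pages) >= 2)
    return 100.0 * multi / len(touching)


# Hypothetical sample data: four sessions touch the cluster, two of them
# include more than one cluster page.
base = datetime(2024, 5, 1, 10, 0, 0)
hits = [
    (base, "/a"),
    (base + timedelta(seconds=20), "/b"),
    (base + timedelta(minutes=5), "/a"),
    (base + timedelta(minutes=10), "/c"),
    (base + timedelta(minutes=10, seconds=30), "/other"),
    (base + timedelta(minutes=10, seconds=50), "/b"),
    (base + timedelta(minutes=20), "/other"),
    (base + timedelta(minutes=30), "/c"),
]
print(co_crawl_rate(hits, {"/a", "/b", "/c"}))
```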
A high co-crawl rate (above 40%) indicates that Google is traversing the cluster as a connected group, following internal links between pages. This is the behavioral confirmation that the cluster’s linking structure is functioning as intended. A low co-crawl rate (below 15%) indicates that Google visits cluster pages independently, with no apparent recognition of their structural relationship. The pages may be linked, but Google’s crawler is not following those links within the same session — possibly because the links are buried in low-priority positions, rendered via JavaScript that the crawler does not execute, or surrounded by too many other links that dilute the cluster connections.
The co-crawl analysis also reveals crawl frequency distribution across cluster tiers. A well-functioning cluster shows the pillar page crawled most frequently, followed by sub-hubs (if present), followed by spoke pages. This descending frequency pattern mirrors the authority hierarchy the cluster is designed to create. If spoke pages receive higher crawl frequency than the pillar, the authority hierarchy is inverted — typically because the spokes receive more external backlinks than the pillar, overriding the intended internal structure.
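A simple check for an inverted hierarchy is to count Googlebot requests per URL and flag any spoke crawled more often than the pillar. The sketch below assumes the same (timestamp, url) input and ignores sub-hubs for brevity. The function name is hypothetical.

```python
from collections import Counter


def inverted_spokes(hits, pillar_url, spoke_urls):
    """Return spokes that Googlebot requested more often than the pillar.

    hits: iterable of (timestamp, url) for Googlebot requests (assumed input).
    """
    counts = Counter(url for _, url in hits)
    pillar_count = counts[pillar_url]
    return sorted(url for url in spoke_urls if counts[url] > pillar_count)


# Hypothetical sample data: timestamps are unused by this check.
hits = (
    [("t", "/pillar/")] * 3
    + [("t", "/pillar/spoke-1/")] * 5
    + [("t", "/pillar/spoke-2/")] * 1
)
print(inverted_spokes(hits, "/pillar/", ["/pillar/spoke-1/", "/pillar/spoke-2/"]))
```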
The most common structural cause of cluster failure is incomplete internal linking. The diagnostic audits every expected link within the cluster: pillar-to-spoke, spoke-to-pillar, and spoke-to-spoke connections. Missing links create gaps in the cluster graph that prevent Google from recognizing the cohesive structure.
Export the cluster’s internal link map using Screaming Frog. For each page in the cluster, list all outbound internal links to other cluster pages and all inbound internal links from other cluster pages. Compare this actual link map against the intended cluster design — the blueprint that specifies which pages should link to which other pages.
The audit typically reveals three categories of missing links. First, missing spoke-to-pillar links: spoke pages that do not link back to the pillar. These spokes fail to contribute their authority to the pillar and create dead-end paths that Google cannot traverse back to the cluster’s center. Second, missing pillar-to-spoke links: the pillar page does not link to all spokes, leaving some spokes disconnected from the cluster’s entry point. Third, missing lateral links: spoke pages that do not link to related spokes, preventing Google from traversing the cluster horizontally and recognizing the relationships between subtopics.
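The comparison between the actual link map and the blueprint can be sketched as a set difference grouped into the three categories above. This assumes the crawl export has already been reduced to (source, target) URL pairs and that the blueprint lists the intended spoke-to-spoke links explicitly. All names and sample URLs are hypothetical.

```python
def audit_cluster_links(pillar, spokes, actual_links, intended_lateral):
    """Group missing cluster links into the three categories from the text.

    actual_links: set of (source, target) internal links found by the crawl.
    intended_lateral: set of (source, target) spoke-to-spoke links that the
        cluster blueprint calls for.
    """
    missing = {"spoke_to_pillar": [], "pillar_to_spoke": [], "lateral": []}
    for spoke in sorted(spokes):
        if (spoke, pillar) not in actual_links:
            missing["spoke_to_pillar"].append(spoke)
        if (pillar, spoke) not in actual_links:
            missing["pillar_to_spoke"].append(spoke)
    for edge in sorted(intended_lateral):
        if edge not in actual_links:
            missing["lateral"].append(edge)
    return missing


# Hypothetical sample data for illustration only.
pillar = "/p/"
spokes = ["/p/a/", "/p/b/", "/p/c/"]
actual_links = {
    ("/p/", "/p/a/"),
    ("/p/", "/p/b/"),
    ("/p/a/", "/p/"),
    ("/p/c/", "/p/"),
    ("/p/a/", "/p/b/"),
}
intended_lateral = {("/p/a/", "/p/b/"), ("/p/b/", "/p/a/")}
print(audit_cluster_links(pillar, spokes, actual_links, intended_lateral))
```

Treating links as directed pairs matters here: a link from one spoke to another does not satisfy the blueprint's requirement for the link in the opposite direction.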
Each missing link category produces a different cluster failure mode. Missing spoke-to-pillar links weaken the pillar’s authority concentration, reducing its ability to rank for the head term. Missing pillar-to-spoke links strand spokes from the cluster’s equity flow, reducing their crawl frequency and ranking potential. Missing lateral links prevent Google from building the semantic relationship map that communicates expertise depth.
The remediation is straightforward: add the missing links. Cluster cohesion is typically restored within two to four crawl cycles after the link map is completed, because the structural signals are available as soon as the links exist. The diagnostic value lies in identifying which specific links are missing, which prevents wasted effort on content changes or other interventions when the problem is purely structural.
How quickly does cluster cohesion improve after adding missing internal links between cluster pages?
Cluster cohesion improvements become visible in Search Console data within two to four crawl cycles after Google processes the new links. The query distribution pattern begins shifting as Google recognizes the updated structural relationships, with cannibalization rates declining and query ownership per page becoming more distinct. Full stabilization typically takes six to eight weeks as Google completes multiple rounds of recrawling and reprocessing the cluster’s link graph.
Can a cluster appear cohesive in link structure audits but still fail as a cohesive unit in Google’s understanding?
Yes. Structural completeness (all links present) is necessary but not sufficient for cohesion. If the content on spoke pages is too similar, the anchor text between pages is generic, or the spoke topics do not match genuine search demand, Google may recognize the links but fail to infer meaningful topical differentiation. Content quality and topical distinctiveness must accompany structural linking for the cluster to function as intended.
Should dead spoke pages with zero impressions be removed from the cluster or improved?
Evaluate whether the spoke’s subtopic has genuine search demand by checking keyword research data. If demand exists but the page generates zero impressions, the content likely needs improvement or the page has indexation issues. If no search demand exists for the subtopic, removing the spoke and its links simplifies the cluster without losing traffic potential. Keeping dead spokes wastes the equity that flows to them through cluster links.
Sources
- Search Engine Land. Fix Keyword Cannibalization: Identify and Resolve SEO Issues. https://searchengineland.com/guide/keyword-cannibalization
- Search Engine Land. The Complete Guide to Topic Clusters and Pillar Pages for SEO. https://searchengineland.com/guide/topic-clusters
- Penfriend. Content Clustering for Topical Authority: Step-by-Step Walkthrough. https://penfriend.ai/blog/content-clustering-for-topical-authority
- Google Search Console API Documentation. https://developers.google.com/webmaster-tools/v1/apireferenceindex