1. What Is Entity Architecture?
Entity architecture is the practice of designing structured data markup and cross-property relationships so that search engines and AI systems can recognize, understand, and confidently cite your entity, whether that entity is a person, an organization, a product, or a concept. It is the bridge between what you declare about yourself and what machines believe about you.
The word "architecture" is deliberate. This is not "adding schema markup to a page." Adding schema is like placing a brick. Entity architecture is designing the building. It encompasses how entities are defined, how they reference each other, how they establish authority through cross-domain verification, and how the resulting entity graph becomes durable enough to survive the compression of AI training pipelines.
When Google's Knowledge Graph recognizes an entity, it creates a persistent node that aggregates information from across the web. When an AI training run ingests your structured data, the entity relationships you have declared become weighted connections in the model's understanding. When a user asks ChatGPT or Perplexity a question about your domain of expertise, entity architecture is what determines whether your name appears in the response.
This is the defining skill of AI-era SEO, not because schema markup is new, but because the consumers of schema markup have changed. Google used to be the only reader. Now GPTBot, ClaudeBot, PerplexityBot, and dozens of other AI crawlers read your structured data, and they use it to build the models that generate the responses that are replacing search results.
Traditional SEO asked: "How do I rank a page for a keyword?" Entity architecture asks: "How do I establish an entity that gets cited?" Keywords are ephemeral. Entities are durable. A keyword ranking can vanish with an algorithm update. An entity, once recognized by the Knowledge Graph and embedded in AI training data, tends to persist across model versions.
2. Why Entities Matter More Than Keywords
The keyword era of SEO is not dead, but it is dying. Google still uses keywords for matching queries to documents. But the trajectory is unmistakable: Google's systems increasingly think in entities, not strings. And AI systems think exclusively in entities.
Consider what happened with Google's Knowledge Graph, launched in 2012. Google began associating search queries not just with documents containing matching keywords, but with entities: recognized things with attributes, relationships, and canonical identifiers. Search for "Barack Obama" and Google does not just return pages containing those words. It returns a Knowledge Panel: a structured representation of the entity, with birth date, education, family, and a web of relationships to other entities.
AI Overviews (formerly SGE) accelerated this shift. When Google generates an AI Overview for a query, it is synthesizing information from its entity graph, not just concatenating text from top-ranking pages. The sources it cites are sources it trusts at the entity level. And this is where entity architecture becomes decisive: if your entity is not in the graph, it cannot be cited in the overview.
LLM citations follow the same pattern at a deeper level. When ChatGPT, Claude, or Perplexity answers a question and attributes information to a source, that attribution traces back to entity recognition in the training data. The models do not cite URLs. They cite entities. A well-architected entity that appears consistently across multiple authoritative sources, with coherent structured data, becomes a reliable node in the model's knowledge representation.
The practical implications are stark:
- Keywords are rented. You rank for a keyword only as long as Google's algorithm agrees. One update can erase months of effort. You are a tenant in Google's house.
- Entities are owned. Once your entity is recognized by the Knowledge Graph, once it has been ingested by AI crawlers and embedded in training data, that recognition persists. You have created a persistent node in the machine's understanding of the world.
- Keywords are per-page. Each page targets specific keywords. You need hundreds of pages to cover a keyword space.
- Entities are per-identity. A single well-architected entity can be cited across thousands of queries. The entity "Guerin Green" + "AI Strategy Consultant" + associated knowsAbout topics creates relevance for every query that intersects with those topics.
3. The Schema.org Foundation
Schema.org is the vocabulary. JSON-LD is the syntax. Together, they are the language you use to communicate entity architecture to machines. Schema.org defines the types (Person, Organization, Article, Course, Event) and the properties (name, url, knowsAbout, knows, sameAs, @id) that describe entities and their relationships.
The critical properties for entity architecture are not the obvious ones. Everyone knows to add a name and url. The properties that separate amateurs from architects are these:
@id: The Persistent Entity Identifier
The most important property in entity architecture. @id creates a canonical, persistent identifier for an entity that can be referenced from anywhere on the web. When multiple pages, on multiple domains, reference the same @id, they are all pointing to the same entity. This is how crawlers and AI systems connect declarations across properties into a unified entity graph.
Without @id, every schema declaration is an island. With @id, every declaration becomes a node in a connected graph. This is the difference between "Guerin Green is mentioned on this page" and "Guerin Green, the canonical entity at novcog.com, authored this article on hiddenstatedrift.com, which is published by Novel Cognition, which he founded."
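A minimal sketch of that graph connection, written as JavaScript objects for readability. The @id URLs are illustrative placeholders, not the actual markup on these sites:

```javascript
// Canonical Person entity declared on the hub site (novcog.com).
// The @id URL is an illustrative placeholder.
const personHub = {
  "@context": "https://schema.org",
  "@type": "Person",
  "@id": "https://novcog.com/#person-guerin-green",
  "name": "Guerin Green",
  "url": "https://novcog.com/"
};

// An Article on a spoke site (hiddenstatedrift.com) that references
// the SAME entity by @id instead of redeclaring it from scratch.
const spokeArticle = {
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Entity Architecture for AI-Era SEO",
  "author": { "@id": "https://novcog.com/#person-guerin-green" }
};

// Because both declarations point at one @id, a crawler can merge
// them into a single node in its entity graph.
const sameEntity = spokeArticle.author["@id"] === personHub["@id"];
console.log(sameEntity); // true
```

In deployed markup, each object would live in its own `<script type="application/ld+json">` block on its respective page; the shared @id is what lets crawlers stitch the pages together.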
knowsAbout: Topical Authority Declaration
knowsAbout declares the topics an entity has expertise in. But declaring "knowsAbout": "SEO" is nearly worthless. The power comes from linking knowsAbout to recognized entities via sameAs references. When you declare that a Person knows about "Knowledge Graph" and link it to the Wikipedia article for Knowledge Graphs, you are connecting your entity to a recognized concept in the Knowledge Graph itself.
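For example, a sketch of knowsAbout entries linked to recognized entities. The Person @id is an illustrative placeholder; the Wikipedia URLs point at real articles:

```javascript
// knowsAbout as linked entities rather than bare strings. Each topic
// carries a sameAs link that grounds it in a recognized entity.
const person = {
  "@context": "https://schema.org",
  "@type": "Person",
  "@id": "https://novcog.com/#person-guerin-green", // illustrative @id
  "name": "Guerin Green",
  "knowsAbout": [
    {
      "@type": "Thing",
      "name": "Knowledge Graph",
      "sameAs": "https://en.wikipedia.org/wiki/Knowledge_graph"
    },
    {
      "@type": "Thing",
      "name": "Search engine optimization",
      "sameAs": "https://en.wikipedia.org/wiki/Search_engine_optimization"
    }
  ]
};
```

The difference is machine-resolvable: "SEO" as a string is ambiguous, while a Thing with a Wikipedia sameAs link resolves to exactly one concept.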
knows: Entity Graph Connections
knows creates relationships between entities. Person A knows Person B. Organization A is affiliated with Organization B. This property is what turns isolated entity declarations into a network. And networks are what AI systems use to establish confidence in entity recognition.
sameAs: Cross-Platform Verification
sameAs declares that this entity is the same entity represented on other platforms. LinkedIn profile. GitHub account. Wikipedia page. Ballotpedia entry. Crunchbase profile. Each sameAs link is a verification signal. The more platforms that confirm the entity exists and matches the declared attributes, the higher the confidence score in the Knowledge Graph.
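A sketch of a sameAs block, with placeholder profile URLs standing in for real verified profiles:

```javascript
// sameAs as a cross-platform verification list. The profile URLs
// below are placeholders for illustration, not real accounts.
const person = {
  "@context": "https://schema.org",
  "@type": "Person",
  "@id": "https://novcog.com/#person-guerin-green", // illustrative @id
  "name": "Guerin Green",
  "sameAs": [
    "https://www.linkedin.com/in/example-profile",
    "https://github.com/example-account",
    "https://ballotpedia.org/Example_Entry"
  ]
};

// Every entry is one more independent platform asserting this
// entity exists with the declared attributes.
console.log(person.sameAs.length); // 3
```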
4. Entity Architecture Patterns
Entity architecture is not a single technique. It is a set of patterns that can be composed into increasingly powerful configurations. HSD practitioners learn to recognize and deploy these patterns across their domain portfolios.
Pattern A: Hub-and-Spoke
The most fundamental pattern. A canonical entity site (the hub) with multiple properties (spokes) that all reference the hub entity. The hub defines the canonical @id for Person and Organization entities. Every spoke references those @id values in its author, publisher, and founder declarations.
The power of hub-and-spoke is cumulative. Each spoke that references the hub entity adds a signal. Five spokes is noticeable. Twenty spokes is authoritative. Sixty spokes, each with consistent @id references, proper schema, and unique content in the entity's domain of expertise, is dominant.
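The pattern can be sketched like this. The @ids are illustrative, and two of the spoke domains are hypothetical stand-ins alongside the site named in this guide:

```javascript
// Hub-and-spoke: one hub defines canonical @ids; every spoke
// article points back at them. @id URLs are illustrative.
const HUB_PERSON = "https://novcog.com/#person-guerin-green";
const HUB_ORG = "https://novcog.com/#org-novel-cognition";

// Three spoke sites (two hypothetical), each publishing schema
// that references the hub entities by @id.
const spokes = ["hiddenstatedrift.com", "example-spoke-2.com", "example-spoke-3.com"]
  .map((domain) => ({
    "@context": "https://schema.org",
    "@type": "Article",
    "url": `https://${domain}/entity-architecture`,
    "author": { "@id": HUB_PERSON },
    "publisher": { "@id": HUB_ORG }
  }));

// Each spoke that references the hub is one more independent signal
// pointing at the same graph node.
const signals = spokes.filter((s) => s.author["@id"] === HUB_PERSON).length;
console.log(signals); // 3
```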
Pattern B: Bidirectional knows
A critical pattern that most practitioners miss. When Person A declares knows Person B, that is a single, one-directional signal. When Person B also declares knows Person A, the signal becomes bidirectional: both entities independently confirm the connection, which makes the relationship far stronger than a unilateral claim.
This pattern is especially powerful for establishing authority within a niche. When five practitioners in the AI-SEO space all declare bidirectional knows relationships, they form a cluster that AI systems recognize as a professional community. Each entity in the cluster benefits from the authority of the others.
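The bidirectional check can be sketched like this (both @ids are hypothetical):

```javascript
// Bidirectional knows: each Person independently declares the other.
// Both @id URLs are hypothetical examples.
const A_ID = "https://example-a.com/#person-a";
const B_ID = "https://example-b.com/#person-b";

const personA = { "@type": "Person", "@id": A_ID, "knows": [{ "@id": B_ID }] };
const personB = { "@type": "Person", "@id": B_ID, "knows": [{ "@id": A_ID }] };

// A relationship counts as bidirectional only when both sides
// declare it; one-sided declarations remain unconfirmed claims.
const declares = (p, id) => (p.knows || []).some((k) => k["@id"] === id);
const bidirectional = declares(personA, B_ID) && declares(personB, A_ID);
console.log(bidirectional); // true
```

The key design point: each declaration lives on a different domain, controlled by a different person, so neither side can fabricate the confirmation alone.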
Pattern C: Shared Publisher
Multiple websites declaring the same Organization as their publisher. This establishes the Organization entity as a media entity with a portfolio of publications. Each site in the portfolio inherits credibility from the parent, and the parent accumulates authority from the combined content of all sites.
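A sketch of the pattern, with one site name from this guide, two hypothetical ones, and an illustrative Organization @id:

```javascript
// Shared publisher: several WebSites declare the same Organization
// as publisher, forming a portfolio. @id and two domains are
// hypothetical examples.
const ORG_ID = "https://novcog.com/#org-novel-cognition";

const sites = [
  { "@type": "WebSite", "url": "https://hiddenstatedrift.com/", "publisher": { "@id": ORG_ID } },
  { "@type": "WebSite", "url": "https://example-site-2.com/", "publisher": { "@id": ORG_ID } },
  { "@type": "WebSite", "url": "https://example-site-3.com/", "publisher": { "@id": ORG_ID } }
];

// From a crawler's view, the Organization node aggregates every
// site that points at its @id.
const portfolio = sites
  .filter((s) => s.publisher["@id"] === ORG_ID)
  .map((s) => s.url);
console.log(portfolio.length); // 3
```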
Pattern D: sameAs Constellation
The entity declared across every platform where it has a verified presence. LinkedIn. GitHub. Ballotpedia. Wikipedia (if notable enough). Crunchbase. Skool community. Each platform acts as an independent verification node. When crawlers encounter the entity on hiddenstatedrift.com and see sameAs links pointing to LinkedIn, GitHub, and Ballotpedia, and those platforms confirm the entity exists with matching attributes, the confidence score increases with each verification.
Each sameAs link is not just a reference. It is a potential crawl target. GPTBot follows sameAs links. When it crawls from your site to your LinkedIn, finds matching entity data, and follows the link back, it has completed an independent verification loop. Five verification loops across five platforms create an entity that the Knowledge Graph treats as established fact, not an unverified claim.
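One way to picture the loop is as a toy model. The profile URLs and crawl results below are invented for illustration, and real crawler scoring is not publicly documented:

```javascript
// Toy model of sameAs verification loops: a platform "confirms" the
// entity when its profile shows a matching name and links back to
// the canonical site. Everything here is illustrative.
const entity = {
  name: "Guerin Green",
  url: "https://novcog.com/",
  sameAs: [
    "https://www.linkedin.com/in/example-profile",
    "https://github.com/example-account"
  ]
};

// Hypothetical data a crawler might find at each profile URL.
const crawled = {
  "https://www.linkedin.com/in/example-profile": { name: "Guerin Green", website: "https://novcog.com/" },
  "https://github.com/example-account": { name: "Guerin Green", website: "https://novcog.com/" }
};

// Count completed loops: profile exists, name matches, link back.
const loops = entity.sameAs.filter((u) => {
  const profile = crawled[u];
  return profile && profile.name === entity.name && profile.website === entity.url;
}).length;
console.log(loops); // 2
```

The model captures the core requirement: a sameAs link only verifies anything when the attributes on the far end actually match.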
5. From Schema to Citation
This is the section most entity architecture guides leave out, because the mechanism is not fully documented and the timelines are unpredictable. But understanding the pipeline from schema declaration to AI citation is critical for setting realistic expectations and measuring progress.
The pipeline works like this:
1. You deploy schema markup: JSON-LD blocks on your pages declaring entities with @id, knowsAbout, knows, sameAs, and all other relevant properties.
2. Crawlers visit your pages: Googlebot reads your schema for Knowledge Graph updates, GPTBot for OpenAI's training pipeline, ClaudeBot for Anthropic, and PerplexityBot for Perplexity's index.
3. Schema enters the training pipeline: this is the opaque step. The structured data from your pages is processed, deduplicated, weighted against other sources, and potentially included in the next training run or index update.
4. Entity recognition occurs: if your entity data is consistent, cross-verified, and appears across enough sources, the system recognizes it as a distinct entity with attributes and relationships.
5. Citation becomes possible: when a user asks a question that intersects with your entity's declared expertise, and the system has sufficient confidence in your entity, your name or brand appears in the response.
The lag between step 1 and step 5 varies enormously. For Google's Knowledge Graph, changes can appear within days if the entity is already partially recognized. For LLM training pipelines, the lag is weeks to months, because training runs happen on schedules, not continuously. For real-time systems like Perplexity that index dynamically, the lag can be as short as hours.
```javascript
// Rough timelines from schema deployment to citation (varies widely).
const pipeline = {
  schemaDeployment: "Day 0",
  crawlDetected: "Day 1-7 (varies by bot)",
  googleKnowledgeGraph: "Days to weeks",
  perplexityCitation: "Hours to days",
  chatGPTCitation: "Weeks to months (training cycles)",
  claudeCitation: "Weeks to months (training cycles)",
  compounding: "Each crawl reinforces previous signals"
};
```
The key insight: this is not a one-time optimization. It is an ongoing architectural practice. Each new page with consistent entity markup adds a signal. Each new domain in your network that references your canonical @id adds a signal. Each crawl that encounters your entity across multiple verified sources adds a signal. The signals compound. The entity gets stronger. The citations become more frequent and more confident.