---
title: "The Invisible Tags Shaping Your AI Citations: A Metadata Bundle for 2026"
description: "title, meta description, og:image, canonical: four tags decide whether ChatGPT and Perplexity render your brand on the citation card or just a grey rectangle. Here's the production-ready bundle."
date: 2026-05-05
tags: [aeo, seo, metadata, opengraph, nextjs]
---

The search results page of 2026 is no longer "ten blue links." It's a citation card in Perplexity, an inline link in ChatGPT with browsing, a source list in Google AI Overviews. And when an AI decides whether to surface your brand among the sources, it doesn't just read the article body. It reads metadata.

A title without an `og:image`. A description that got truncated mid-sentence. Two URLs pointing at the same content. Each of these mistakes can cost you a citation, along with the AI traffic that is currently being redistributed to teams that bothered to clean up their `<head>`.

Let's break down which tags actually matter, why, and how to assemble them into a single metadata bundle.

## 1. How LLM engines build the "citation card"

Perplexity, ChatGPT with search, Claude with the web tool, Google AI Overviews: they all work as RAG systems. First retrieval (search), then synthesis, then a rendered answer with links.

According to architectural notes on Perplexity, its orchestration engine embeds citation markers, source metadata (URL, publication date), and ranked fragments directly into a structured prompt before generation. Citations are not added post-hoc; they are architecturally bound to specific documents at the context-assembly stage.

When the model assembles the source card itself, it pulls three things: the page's `<title>`, the text from `meta description` (often reused as a snippet), and `og:image` for the preview.
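
To make those three fields concrete, here is a rough sketch of pulling them out of raw HTML. How each engine actually extracts them is not public; the `extractCardFields` name and the regex-based parsing are assumptions made for illustration, and a real crawler would use a proper HTML parser.

```typescript
// Illustrative only: real engines do not publish how they parse pages.
interface CardFields {
  title: string | null;
  description: string | null;
  image: string | null;
}

// Collect the double-quoted attributes of one tag, e.g. <meta name="x" content="y">.
// Single-quoted or unquoted attributes are ignored in this simplified sketch.
function parseAttributes(tag: string): Record<string, string> {
  const attrs: Record<string, string> = {};
  const attrPattern = /([\w:-]+)\s*=\s*"([^"]*)"/g;
  let match: RegExpExecArray | null;
  while ((match = attrPattern.exec(tag)) !== null) {
    attrs[match[1].toLowerCase()] = match[2];
  }
  return attrs;
}

// Return the three fields the article says end up on a citation card.
function extractCardFields(html: string): CardFields {
  const titleMatch = /<title[^>]*>([^<]*)<\/title>/i.exec(html);
  const fields: CardFields = {
    title: titleMatch ? titleMatch[1].trim() : null,
    description: null,
    image: null,
  };
  const metaTags = html.match(/<meta\b[^>]*>/gi) || [];
  for (const tag of metaTags) {
    const attrs = parseAttributes(tag);
    const key = attrs["name"] || attrs["property"];
    if (key === "description") fields.description = attrs["content"] || null;
    if (key === "og:image") fields.image = attrs["content"] || null;
  }
  return fields;
}

// Hypothetical page head used only to exercise the function.
const sampleHead =
  '<head><title>Metadata Bundle for AI Citations | DocsHub</title>' +
  '<meta name="description" content="Four tags decide how your source is rendered.">' +
  '</head>';

console.log(extractCardFields(sampleHead));
```

In the sample above there is no `og:image` tag, so the `image` field comes back as `null`, which corresponds to the empty preview case.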

Researchers at Analyze AI, who studied 83,670 AI citations, found that pages with a clear `og:title`, accurate `og:description`, and well-structured `og:type` metadata give crawlers several converging signals about content. When those signals align with the visible content, the engine interprets and cites the page more accurately.

The conclusion is simple. If you don't have `og:image`, your source in Perplexity will look like a grey rectangle with a favicon. Next to a competitor with a branded preview. Guess what the user clicks.

## 2. Optimal sizes: tag anatomy

- **`<title>`: 10–60 characters**

  The most visible element of the card. Too short (under 10 characters) is too sparse for semantic query matching. Too long (over 60) gets truncated in SERPs, in Slack previews, in Perplexity's citation card.

  *Working formula:* Primary intent | Brand context. Example: `Metadata Bundle for AI Citations | DocsHub`. Forty-two characters. Fits everywhere.

- **`<meta name="description">`: 50–160 characters**

  This tag doesn't directly affect Google rankings. But it's the primary candidate for the snippet an LLM returns in your card. When an AI system decides whether to cite your page, meta tags often shape first impressions, and a well-phrased `meta description` can decide whether your source gets quoted or skipped. Fifty characters gives you a clear claim, 160 gives you room for an argument.

- **`og:image`: 1200×630, absolute HTTPS, under 5 MB**

  The universal size that works on nine of ten platforms is 1200×630 pixels. Below 600×315, the engine either ignores the image or shows a thumbnail instead of a full card. Keep key details inside the central 80% of the canvas. Format: JPG for photos, PNG for graphics with text.

  *The most common mistake:* a relative URL in `og:image`. Many crawlers and preview bots do not resolve it against your domain. Always write `https://example.com/og/article.png`, not `/og/article.png`.

## 3. Canonical: the main defense against duplication

Here's where SEO careers go to die and AI citations get lost.

LLM engines aggressively deduplicate content. They see the same article on `?utm_source=newsletter`, on `/blog/article/`, on the `m.example.com` mirror, and have to decide which version is canonical. If you didn't tell them explicitly, they decide for you.

For example, ChatGPT uses Bing's index for most real-time queries, and the URL Bing considers canonical is the one that ends up in ChatGPT's citation. If Bing has already indexed a parameterized variant, that holds regardless of what your `<link rel="canonical">` says.

This is foundational. The `canonical` tag is not cosmetics. It's a signal-consolidation tool.

When the `canonical` tag points to one URL, the sitemap lists another, and internal links lead to a third, you're sending three contradictory signals. AI systems may split the citation potential between the three versions. You need three-way alignment: `canonical = sitemap entry = internal links`. One URL. One source of truth.

Syndication is a separate case. If an external portal republishes your article with a cross-domain canonical pointing back to your domain, that's still not a guarantee. From the model's perspective, the partner's domain may have higher brand authority, and the citation can still go to the partner. The defense is structured author data, an explicit author attribution in the article body itself, and consistent entity signals.

## 4. The full metadata bundle: production code

Two implementations follow. The first targets Next.js 16 (App Router); the second is plain HTML, as a fallback or a reference for static sites.
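
Before either implementation, the limits from section 2 and the canonical rule from section 3 can be expressed as a small framework-independent check. This is a minimal sketch, not part of the original bundle: the `MetadataBundle` shape, the `validateBundle` name, and the list of tracking parameters are assumptions made for illustration.

```typescript
// Hypothetical shape holding the four values this article is about.
interface MetadataBundle {
  title: string;
  description: string;
  ogImageUrl: string;
  canonicalUrl: string;
}

// Assumed list of tracking parameters; extend it to match your own campaigns.
const TRACKING_PARAM_PATTERN = /^(utm_[a-z]+|gclid|fbclid)$/i;

// Return a list of human-readable problems; an empty list means the bundle passes.
function validateBundle(bundle: MetadataBundle): string[] {
  const problems: string[] = [];

  if (bundle.title.length < 10 || bundle.title.length > 60) {
    problems.push("title should be 10-60 characters, got " + bundle.title.length);
  }
  if (bundle.description.length < 50 || bundle.description.length > 160) {
    problems.push("description should be 50-160 characters, got " + bundle.description.length);
  }
  if (!/^https:\/\//i.test(bundle.ogImageUrl)) {
    problems.push("og:image must be an absolute HTTPS URL");
  }

  // new URL() throws on relative input, which is exactly what we want to catch.
  let canonical: URL | null = null;
  try {
    canonical = new URL(bundle.canonicalUrl);
  } catch {
    canonical = null;
  }
  if (canonical === null || canonical.protocol !== "https:") {
    problems.push("canonical must be an absolute HTTPS URL");
  } else {
    canonical.searchParams.forEach((_value, name) => {
      if (TRACKING_PARAM_PATTERN.test(name)) {
        problems.push("canonical contains tracking parameter: " + name);
      }
    });
  }
  return problems;
}

// Example draft with two deliberate mistakes.
const draft: MetadataBundle = {
  title: "Metadata Bundle for AI Citations | DocsHub",
  description:
    "Properly configured metadata ensures that AI can correctly identify and parse your source.",
  ogImageUrl: "/og/article.png",
  canonicalUrl: "https://example.com/blog/article?utm_source=newsletter",
};

console.log(validateBundle(draft));
```

For the draft above, the check flags the relative `og:image` and the `utm_source` parameter in the canonical URL. A check like this could run in CI or in a content pipeline before the values reach either implementation.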
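
### Next.js App Router: server-rendered via `generateMetadata`

```tsx title="app/blog/[slug]/page.tsx"
import type { Metadata } from "next";

interface PageProps {
  params: Promise<{ slug: string }>;
}

interface Article {
  title: string;
  excerpt: string;
  contentHtml: string;
  ogImageUrl: string;
  publishedAt: string;
  modifiedAt: string;
  authorName: string;
  slug: string;
}

const SITE_URL = "https://example.com";

// Load one article from the site API, revalidating the cached response hourly.
async function fetchArticle(slug: string): Promise<Article> {
  const response = await fetch(`${SITE_URL}/api/articles/${slug}`, {
    next: { revalidate: 3600 },
  });
  if (!response.ok) {
    throw new Error(`Article not found: ${slug}`);
  }
  return response.json();
}

export async function generateMetadata(
  { params }: PageProps
): Promise<Metadata> {
  const { slug } = await params;
  const article = await fetchArticle(slug);

  // One canonical URL, reused for the canonical link and og:url.
  const canonicalUrl = `${SITE_URL}/blog/${article.slug}`;
  // Hard caps matching the limits from section 2.
  const title = article.title.slice(0, 60);
  const description = article.excerpt.slice(0, 160);

  return {
    title,
    description,
    metadataBase: new URL(SITE_URL),
    alternates: {
      canonical: canonicalUrl,
    },
    openGraph: {
      type: "article",
      url: canonicalUrl,
      title,
      description,
      siteName: "DocsHub",
      locale: "en_US",
      publishedTime: article.publishedAt,
      modifiedTime: article.modifiedAt,
      authors: [article.authorName],
      images: [
        {
          url: article.ogImageUrl, // must be an absolute HTTPS URL
          width: 1200,
          height: 630,
          alt: article.title,
          type: "image/png",
        },
      ],
    },
    twitter: {
      card: "summary_large_image",
      title,
      description,
      images: [article.ogImageUrl],
    },
    robots: {
      index: true,
      follow: true,
      googleBot: {
        index: true,
        follow: true,
        "max-image-preview": "large",
        "max-snippet": -1,
      },
    },
    other: {
      "article:author": article.authorName,
      "article:published_time": article.publishedAt,
      "article:modified_time": article.modifiedAt,
    },
  };
}

export default async function ArticlePage({ params }: PageProps) {
  const { slug } = await params;
  const article = await fetchArticle(slug);

  return (
    <article className="max-w-3xl mx-auto py-10 px-4">
      <header className="mb-8 border-b pb-4">
        <h1 className="text-4xl font-bold text-gray-900">{article.title}</h1>
        <div className="mt-4 flex items-center text-sm text-gray-500">
          <time dateTime={article.publishedAt}>
            {new Date(article.publishedAt).toLocaleDateString("en-US")}
          </time>
          <span className="mx-2">•</span>
          <span>{article.authorName}</span>
        </div>
      </header>
      <div
        className="prose prose-blue lg:prose-lg"
        dangerouslySetInnerHTML={{ __html: article.contentHtml }}
      />
    </article>
  );
}
```

### Plain HTML: reference implementation for static sites

```html
<!doctype html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1" />
    <title>Metadata Bundle for AI Citations | DocsHub</title>

    <!-- The tags below mirror the Next.js bundle above; URLs and text are placeholder values. -->
    <meta name="description" content="Properly configured metadata ensures that AI can correctly identify and parse your source." />
    <link rel="canonical" href="https://example.com/blog/article" />
    <meta name="robots" content="index, follow, max-image-preview:large, max-snippet:-1" />

    <meta property="og:type" content="article" />
    <meta property="og:url" content="https://example.com/blog/article" />
    <meta property="og:title" content="Metadata Bundle for AI Citations | DocsHub" />
    <meta property="og:description" content="Properly configured metadata ensures that AI can correctly identify and parse your source." />
    <meta property="og:site_name" content="DocsHub" />
    <meta property="og:locale" content="en_US" />
    <meta property="og:image" content="https://example.com/og/article.png" />
    <meta property="og:image:width" content="1200" />
    <meta property="og:image:height" content="630" />

    <meta name="twitter:card" content="summary_large_image" />
    <meta name="twitter:title" content="Metadata Bundle for AI Citations | DocsHub" />
    <meta name="twitter:description" content="Properly configured metadata ensures that AI can correctly identify and parse your source." />
    <meta name="twitter:image" content="https://example.com/og/article.png" />
  </head>
  <body>
    <article>
      <h1>Metadata Bundle for AI Citations</h1>
      <p>Properly configured metadata ensures that AI can correctly identify and parse your source.</p>
    </article>
  </body>
</html>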
```

## 5. Pre-deploy checklist

| Check | Tool | Red flag |
|---|---|---|
| canonical = sitemap = internal links | Screaming Frog (impersonating GPTBot) | Three different URLs for the same page. |
| `og:image` returns 200 OK, absolute HTTPS | curl + browser debugger | Redirect chain or 403 for the bot. |
| `<title>` length ≤ 60, description ≤ 160 | SERP simulator | Truncation in Slack/Discord previews. |
| Edge renders don't strip `<head>` | Logs + emulate user-agent | Edge workers that serve a simplified HTML to AI crawlers may inadvertently strip canonical tags during rendering. If the HTML GPTBot receives doesn't contain your canonical tag, it processes the content with no canonicalization signal. |
| AI bots see the same version humans do | Logs of GPTBot, OAI-SearchBot, ClaudeBot | Cloaking or JS-only render. |

## Closing

Title, description, image, canonical. Four tags. It sounds trivial, but these four lines are exactly where 80% of AI visibility breaks for sites.

Not because teams don't know about SEO. Because between *"the tag exists technically"* and *"the model sees the same canonical the human does"* lies a gulf filled with parameterized URLs, edge CDN behavior, JS rendering, and syndication.

Fix this bundle. It's the cheapest AEO optimization with the best ROI.

---

**Verify your project's metadata right now.** If your extension showed red status on basic tags, don't put it off.

Follow me on [LinkedIn](https://linkedin.com/in/alexturik) for more practical breakdowns of AEO algorithms. Need help with a corporate portal audit or Next.js architecture optimization? [Get in touch](mailto:alexturik@gmail.com) for deep technical consulting.