Your site looks flawless. Animations, interactivity, content. The browser purrs. Now open it through the eyes of GPTBot, ClaudeBot, or PerplexityBot — and you'll see emptiness. Literally: <div id="root"></div> and nothing more. That's why your AI audit fails with the same diagnosis every time — the ratio of raw HTML to rendered DOM is below 0.3.
This is the JavaScript trap.
1. What humans see vs. what bots see
A user's browser is a full-blown runtime. It pulls the HTML, executes JavaScript, mounts the React tree, makes API calls, updates the DOM, paints animations. The output: thousands of words, headings, metadata, structure, interactive elements. It works.
The AI crawler works differently. The majority of modern bots — GPTBot, ClaudeBot, PerplexityBot, CCBot — deliberately do not execute JavaScript. It's a cost decision at the scale of millions of pages: spinning up a Chromium instance per URL would cost hundreds of times more than a plain HTTP request. So they take the raw HTML — the same one you see via View Page Source — and work only with that.
Open your SPA via Ctrl+U. If all you see is <script> tags, a skeleton with <noscript>, and an empty root <div>, then bingo, you are invisible to AI.
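You can reproduce the bot's view outside the browser too. Here is a minimal sketch (the URL and the User-Agent string are placeholders), assuming Node 18+ with its built-in fetch: one plain HTTP request, no JavaScript execution, and whatever comes back is everything the crawler will ever see.
async function fetchAsBot(url: string): Promise<string> {
  // A single HTTP GET with a bot User-Agent; no rendering happens afterwards.
  const res = await fetch(url, { headers: { 'User-Agent': 'GPTBot' } });
  return res.text();
}
fetchAsBot('https://example.com/').then((html) => {
  // For a CSR app this prints an empty shell: script tags and <div id="root"></div>.
  console.log(html.slice(0, 600));
});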
Here's the typical raw HTML of a CSR (Client-Side Rendering) app:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8" />
<title>My App</title>
</head>
<body>
<div id="root"></div>
<script type="module" src="/src/main.tsx"></script>
</body>
</html>
Twenty words. The rendered DOM has two thousand. Ratio: 0.01. Fail.
2. Why 0.3 is the red line
Auditors measure the ratio because it's the cheapest, most honest marker of whether an LLM will receive your content at all. The logic is simple: if raw HTML contains at least 30% of what a human sees, the bot extracts something meaningful. Less than that — it skips your page as an empty stub.
The consequences are concrete and unpleasant:
- your site doesn't appear in ChatGPT, Claude, Perplexity, or Gemini answers;
- your content doesn't get cited in generated overviews;
- your pages don't get indexed as a knowledge source for training or RAG.
This is the new SEO reality. Old Googlebot eventually learned to render JS — slowly, through two-pass indexing, but it does. The new AI crawlers do not. And they won't any time soon: executing JavaScript at LLM-training and real-time-inference scale simply isn't economical. It's faster to skip your site and pick the competitor that returns HTML on the first byte.
3. The solution: SSR and static generation
The conclusion is obvious — render HTML on the server. Three approaches, each with its own use case:
- Server-Side Rendering (SSR) — HTML is generated on the fly per request. Fits dynamic, personalized content: dashboards, live prices, A/B tests.
- Static Site Generation (SSG) — HTML is generated at build time and served as a static file from a CDN. Fast, cheap, ideal for blogs, documentation, marketing pages.
- Incremental Static Regeneration (ISR) — a compromise: static files that rebuild on a schedule or via a trigger. Combines SSG performance with SSR freshness.
All three give the bot a complete HTML response from the very first byte. The ratio jumps to 0.9+, the audit goes green, the content starts showing up in AI answers.
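For concreteness, here is roughly how the three modes map onto Next.js App Router route segment config; a sketch assuming Next.js 13.4 or later, with each export sitting at the top of a page.tsx (more on this stack in the next section).
// SSR: render on every request (dashboards, personalization).
export const dynamic = 'force-dynamic';
// SSG: render once at build time (blogs, docs, marketing pages).
// export const dynamic = 'force-static';
// ISR: static output regenerated at most once an hour.
// export const revalidate = 3600;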
4. Personal take: Next.js App Router as the gold standard
This site is built on Next.js with the App Router. Three reasons it's the right stack for AI-friendly architecture:
- Server Components by default. In the App Router, every component is server-rendered until you explicitly write 'use client'. By default you're in SSR mode without any extra configuration. Content lands in HTML automatically.
- Streaming HTML. Next.js streams HTML through React Suspense — bots get the heading, metadata, and base structure instantly, then the main body streams (see the sketch after this list). For crawlers with hard timeouts, this matters.
- Metadata as code. The generateMetadata function gives you control over <title>, <meta>, OpenGraph, and JSON-LD straight from the page component. AI bots read it from raw HTML.
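To make the streaming point concrete, here is a minimal, self-contained sketch; the component names and the artificial delay are illustrative, not taken from this site's code. The static shell is flushed to the crawler immediately, and the slow subtree streams in once its data resolves.
import { Suspense } from 'react';
async function SlowSection() {
  // Stand-in for a slow data dependency (API call, database query).
  const text = await new Promise<string>((resolve) => setTimeout(() => resolve('Late-arriving content'), 2000));
  return <p>{text}</p>;
}
export default function Page() {
  return (
    <article>
      <h1>Present in the very first flush of HTML</h1>
      <Suspense fallback={<p>Loading…</p>}>
        <SlowSection />
      </Suspense>
    </article>
  );
}
The safe pattern is to keep the content you want cited in the immediately flushed shell and reserve Suspense for secondary blocks, since a crawler with a hard timeout may not wait for the streamed tail.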
Here's a sample article page that humans and bots both see equally well:
import { notFound } from 'next/navigation';
import type { Metadata } from 'next';
interface Post {
title: string;
excerpt: string;
contentHtml: string;
publishedAt: string;
author: string;
}
async function getPost(slug: string): Promise<Post | null> {
const res = await fetch(`https://api.example.com/posts/${slug}`, {
next: { revalidate: 3600 },
});
if (!res.ok) return null;
return res.json();
}
export async function generateMetadata(
{ params }: { params: Promise<{ slug: string }> }
): Promise<Metadata> {
const { slug } = await params;
const post = await getPost(slug);
if (!post) return { title: 'Not found' };
return {
title: post.title,
description: post.excerpt,
openGraph: {
title: post.title,
description: post.excerpt,
type: 'article',
publishedTime: post.publishedAt,
authors: [post.author],
},
};
}
export default async function PostPage(
{ params }: { params: Promise<{ slug: string }> }
) {
const { slug } = await params;
const post = await getPost(slug);
if (!post) notFound();
return (
<article>
<header>
<h1>{post.title}</h1>
<p>
<time dateTime={post.publishedAt}>{post.publishedAt}</time>
{' · '}
<span>{post.author}</span>
</p>
</header>
<div dangerouslySetInnerHTML={{ __html: post.contentHtml }} />
</article>
);
}
What's load-bearing here for AI bots:
- fetch() runs on the server — data lands in the HTML before it's ever sent to a client;
- the entire content (heading, metadata, body) appears in raw HTML;
- no 'use client' is needed because there's no interactivity, so no hydration cost;
- revalidate: 3600 provides ISR — static rebuilds once an hour without a redeploy.
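The metadata bullet above also mentions JSON-LD, which this page example doesn't show. Here is a hedged sketch of one way to emit it from the same server component; the inline prop type mirrors the Post interface above, and the schema.org mapping is illustrative rather than a fixed recipe.
function ArticleJsonLd({ post }: { post: { title: string; excerpt: string; publishedAt: string; author: string } }) {
  // Structured data serialized into the server-rendered HTML, readable without JavaScript.
  const jsonLd = {
    '@context': 'https://schema.org',
    '@type': 'Article',
    headline: post.title,
    description: post.excerpt,
    datePublished: post.publishedAt,
    author: { '@type': 'Person', name: post.author },
  };
  return (
    <script
      type="application/ld+json"
      dangerouslySetInnerHTML={{ __html: JSON.stringify(jsonLd) }}
    />
  );
}
Rendered inside the article (for example <ArticleJsonLd post={post} />), it ships as plain markup in the first response, so non-rendering crawlers can read it.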
If you do need an interactive piece (a comment form, a like button), pull it into a separate client component:
'use client';
import { useState } from 'react';
interface LikeButtonProps {
postId: string;
initialLikes: number;
}
export function LikeButton({ postId, initialLikes }: LikeButtonProps) {
const [likes, setLikes] = useState(initialLikes);
const [isPending, setIsPending] = useState(false);
async function handleClick() {
setIsPending(true);
setLikes((n) => n + 1);
try {
await fetch(`/api/posts/${postId}/like`, { method: 'POST' });
} catch {
setLikes((n) => n - 1);
} finally {
setIsPending(false);
}
}
return (
<button onClick={handleClick} disabled={isPending}>
♥ {likes}
</button>
);
}
The server component renders the article into HTML — bots get the full text. The client component hydrates after load — users get interactivity.
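For completeness, here is a sketch of how the two pieces compose; the import path and prop values are placeholders. Everything in the server component lands in raw HTML, and only the button subtree ships JavaScript and hydrates.
import { LikeButton } from './LikeButton';
// Server component: crawlers get this markup as-is; only <LikeButton /> hydrates in the browser.
export default function PostShell() {
  return (
    <article>
      <h1>Server-rendered heading, visible in raw HTML</h1>
      <p>Server-rendered body text, visible in raw HTML.</p>
      <LikeButton postId="hello-world" initialLikes={42} />
    </article>
  );
}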
5. Validation checklist
Check your site in a minute:
- Open the page, hit Ctrl+U (View Page Source).
- Search for the article's main heading in the raw HTML via Ctrl+F.
- Found? You're safe. Not found? You have a CSR trap.
- Run curl -A "GPTBot" https://your-site.com/page | wc -w and compare to the word count in the live browser.
Target ratio — from 0.7 upwards. Anything below 0.3 is a critical fail. (Of course, if the crawler is blocked in robots.txt, it never even reaches this stage — see my earlier post on configuring robots.txt for AI.)
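If you'd rather automate the comparison, here is a rough sketch; the URL is a placeholder and the tag stripping is deliberately naive. It counts words in the raw HTML a bot receives and compares them to a rendered word count you grab once in the browser console with document.body.innerText.split(/\s+/).length.
// Rough ratio check: words in raw (unrendered) HTML vs. words in the rendered DOM.
const RENDERED_WORD_COUNT = 2000; // paste the number from the browser console here
async function rawWordCount(url: string): Promise<number> {
  const res = await fetch(url, { headers: { 'User-Agent': 'GPTBot' } });
  const html = await res.text();
  const text = html
    .replace(/<script[\s\S]*?<\/script>/gi, ' ') // drop inline scripts
    .replace(/<style[\s\S]*?<\/style>/gi, ' ')   // drop inline styles
    .replace(/<[^>]+>/g, ' ');                   // naive tag stripping
  return text.split(/\s+/).filter(Boolean).length;
}
rawWordCount('https://your-site.com/page').then((raw) => {
  const ratio = raw / RENDERED_WORD_COUNT;
  console.log(`raw=${raw} rendered=${RENDERED_WORD_COUNT} ratio=${ratio.toFixed(2)}`);
  console.log(ratio < 0.3 ? 'critical fail' : ratio < 0.7 ? 'needs work' : 'ok');
});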
Closing
JavaScript frameworks rewrote UX forever, but the cost is invisibility to a new generation of crawlers. Move rendering to the server and your content gets a chance to be read and quoted again. Next.js App Router makes this transition almost painless — it's an architecture where the right decision is the default.
Has your audit revealed rendering issues on your project? Follow me on LinkedIn to keep up with new AEO tools. If your site is stuck in the "CSR trap" and you need help migrating to a modern SSR architecture, get in touch for a deep audit.