Korpora · The framework

What we measure

The thesis, the four metrics, the mechanism, the deliverable, and why the measurement instrument has to be separate from the content vendor.

The bet, stated plainly

        ORGANIC AI MINDSHARE
              │
              ├──▶  AGENT TRAFFIC     (tools fetching data on a user's behalf)
              │
              └──▶  HUMAN TRAFFIC     (AI assistants sending users to your site)
                            │
                            ▼
                    REVENUE · USAGE · DISCOURSE
                            │
                            ▼
                    fuel for the next training cycle

Ask your lead engineer about the last token, dev tool, or AI library they added to the stack. The answer is almost always some variant of “I asked Claude” or “Cursor recommended it.” Sophisticated buyers with real budgets now research and decide through AI assistants. The category leader on Google is no longer automatically the category leader on Claude, and that gap is increasingly the strategic story.

AI assistants index the public discourse on roughly a 6 to 12 month lag. The signals that determine your AI-channel mindshare in late 2026 are the foundation-channel posts, owned-domain content, and authority-source mentions that exist by the end of Q2. The brand that ships into that window receives disproportionate inbound for the next two or three cycles. The brand that does not gets locked out by the competitor whose name the model already knows.

Korpora's job is to tell you exactly what to ship into that window, in priority order, with the measurement that says why.

Four metrics. One output, three inputs.

One output number tells you where you stand today. Three input metrics tell you what is driving it and what to change. Findings come from the deltas between metrics; no single number in isolation is the headline.

Output

Organic AI mindshare

the metric that ultimately matters

Share of unprompted, category-shaped queries where frontier models surface your brand without being given the name. Reported with Wilson 95% confidence intervals and split by organic vs evaluation framing. It is the slowest metric to respond to intervention, because it is bounded by training-corpus cycles (roughly 6 to 12 months past the latest model cutoff).
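
A minimal sketch of the Wilson 95% interval mentioned above, using the standard Wilson score formula for a binomial proportion. The counts (8 surfaced answers out of 108 query cells) are hypothetical and chosen only for illustration; the function name and its parameters are assumptions, not part of the Korpora instrument.

```python
import math


def wilson_interval(hits: int, trials: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z of about 1.96 gives 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = hits / trials
    z2 = z * z
    denom = 1 + z2 / trials
    center = (p + z2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials * trials)) / denom
    # Clamp to [0, 1] to absorb floating-point noise at the extremes.
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical counts: the brand surfaced in 8 of 108 unprompted query cells.
low, high = wilson_interval(8, 108)
print(f"mindshare {8 / 108:.1%}, 95% CI [{low:.1%}, {high:.1%}]")
```

The Wilson interval behaves better than the plain normal approximation when the share is near 0% or 100%, which is the usual situation for a brand the models rarely or almost always surface.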

Input 1/3

Human channel velocity

leading indicator for the next training cycle

Post-cutoff publication rate and acceleration on the channels that feed training corpora: Hacker News, Reddit, X, GitHub, arXiv, Substack. The input most likely to produce organic mindshare lift on the next training cycle. Observable weekly.
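
One way to read "rate and acceleration" is as weekly post counts after a model's cutoff and the week-over-week change in those counts. The sketch below assumes that reading; the cutoff date, the post dates, and both function names are hypothetical.

```python
from datetime import date, timedelta


def weekly_velocity(post_dates, cutoff, weeks):
    """Count posts per 7-day bucket, starting at the cutoff date."""
    counts = [0] * weeks
    for d in post_dates:
        offset = (d - cutoff).days
        if 0 <= offset < weeks * 7:  # ignore pre-cutoff and out-of-window posts
            counts[offset // 7] += 1
    return counts


def acceleration(counts):
    """Week-over-week change in the post count."""
    return [later - earlier for earlier, later in zip(counts, counts[1:])]


# Hypothetical data: one pre-cutoff post, then a growing post-cutoff series.
cutoff = date(2025, 6, 1)
posts = [cutoff + timedelta(days=n) for n in (-5, 0, 3, 8, 9, 10, 15, 16, 17, 18, 19)]

velocity = weekly_velocity(posts, cutoff, weeks=3)
print(velocity, acceleration(velocity))
```

Positive acceleration across consecutive weeks is the pattern this input treats as a leading indicator.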

Input 2/3

Ecosystem health

developers, integrations, users + agent sessions

Three sub-signals that together say whether the category cares: developer ecosystem (GitHub dependents, package downloads, community projects), integration partners (other companies embedding your product), and end-user base (active accounts, sessions, agent traffic). Ecosystem health is a leading indicator of mindshare. A healthy ecosystem paired with low mindshare usually signals that real adoption is invisible to the surfaces AI models index.

Input 3/3

Agent-readiness

what happens when an agent meets your product

Task-completion pass rate of the frontier-model panel (Sonnet, Haiku, GPT-5.5, plus the vertical-matched fourth model) on real tasks against your API. Names the specific failure modes docs and DevRel can fix before the next training corpus closes. High organic mindshare paired with low agent-readiness is the most dangerous configuration: agents recommend, users hit walls, reputation erodes by the next cycle.
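
As a rough illustration of how a pass rate and a list of named failure modes could come out of the same run, the sketch below tallies hypothetical task results. The record layout, model labels, task names, and failure-mode strings are all assumptions made for the example.

```python
from collections import Counter

# Each record: (model label, task name, passed?, failure mode or None). All hypothetical.
results = [
    ("model-a", "create_index", True, None),
    ("model-a", "auth_request", False, "auth_header_missing"),
    ("model-a", "paginate", True, None),
    ("model-b", "create_index", True, None),
    ("model-b", "auth_request", False, "auth_header_missing"),
    ("model-b", "paginate", False, "undocumented_cursor"),
]


def pass_rate_by_model(records):
    """Fraction of tasks each model completed."""
    totals, passes = Counter(), Counter()
    for model, _task, passed, _mode in records:
        totals[model] += 1
        passes[model] += int(passed)
    return {model: passes[model] / totals[model] for model in totals}


def failure_modes(records):
    """Failure modes ordered from most to least frequent."""
    return Counter(mode for _m, _t, passed, mode in records if not passed).most_common()


print(pass_rate_by_model(results))
print(failure_modes(results))
```

A failure mode that recurs across models points at the docs or the API surface rather than at any single model, which is what makes it fixable by docs and DevRel.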

How it works

┌──────────────────┐    ┌──────────────────┐    ┌──────────────────┐
│  1. SCRAPE       │    │  2. QUERY        │    │  3. TRIANGULATE  │
│                  │    │                  │    │                  │
│  Reddit · X      │───▶│  108-cell        │───▶│  cutoff-split    │
│  GitHub · arXiv  │    │  battery         │    │  velocity +      │
│  HN · Trends · G2│    │  3 models, 3rds  │    │  framing split   │
└──────────────────┘    └──────────────────┘    └──────────────────┘
   foundation              agent layer             cross-channel
  1. Scrape your category's foundation channels

    Reddit, X, GitHub, arXiv, Hacker News, Google Trends, G2: the instrument draws the subset your category actually surfaces on, not all of them at once. The channels that feed training corpora carry the same buyer signal the models have already absorbed.

  2. Run install-decision queries through the frontier-model panel

    Claude Sonnet, Claude Haiku, and GPT-5.5 are the three always-on models: 108 cells per report (36 per model) across four query framings and 3 rounds. The panel is vertical-matched: developer subjects add a fourth model (GPT-5.3-Codex, for 144 cells), and consumer subjects add a vertical-matched fourth model once one is validated.

  3. Cross-channel triangulate and cutoff-split

    Foundation activity from after each model's training cutoff feeds the velocity metric that predicts which brands enter the next training cycle.
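
The cell counts in step 2 can be checked with a small grid enumeration. This is a sketch under stated assumptions: the text gives three always-on models, four framings, and 3 rounds, which accounts for 36 of the 108 cells, so the remaining factor of 3 is modeled here as a hypothetical `PROMPTS_PER_FRAMING`. Only two framing names (organic, evaluation) appear in the text; the other two are placeholders, and the model strings are labels, not API identifiers.

```python
from itertools import product

MODELS = ["claude-sonnet", "claude-haiku", "gpt-5.5"]  # always-on panel (labels only)
FRAMINGS = ["organic", "evaluation", "framing_3", "framing_4"]  # last two are placeholders
ROUNDS = [1, 2, 3]
PROMPTS_PER_FRAMING = 3  # ASSUMPTION: not stated in the text; chosen so the totals match


def build_cells(models):
    """Enumerate one cell per (model, framing, round, prompt) combination."""
    prompts = range(PROMPTS_PER_FRAMING)
    return [
        {"model": m, "framing": f, "round": r, "prompt": p}
        for m, f, r, p in product(models, FRAMINGS, ROUNDS, prompts)
    ]


base = build_cells(MODELS)
developer = build_cells(MODELS + ["gpt-5.3-codex"])  # fourth model for developer subjects
print(len(base), len(developer))  # prints "108 144"
```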

What a real finding actually looks like

Open Claude or ChatGPT right now. Ask the question a new buyer in your category would ask, without naming your brand. Watch what surfaces. If you don't appear, or a competitor confidently shows up in your place, you're looking at the most common pattern Korpora surfaces.

The findings that move the needle are deltas between metrics. The shape we surface most often: a brand leads on the foundation channels where buying-decision conversations happen, and trails on the AI-assistant layer that indexes those channels on a 6 to 12 month lag.

From a recent published report

Same category, three measured brands, two different measurement surfaces.

OPERATOR DISCOURSE vs AI-ASSISTANT RECALL

                  FOUNDATION CHANNELS       AI-ASSISTANT LAYER
                  (operator discussion)     (buyer research)

Brand             HN 924   Reddit 1,375   Organic mindshare   7.4%
Competitor A      HN  21   Reddit    93   Organic mindshare  86.1%
Competitor B      HN 534   Reddit    33   Organic mindshare  80.6%

The brand leads on the channels where staffing-decision conversations
actually happen. AI assistants index those channels on a 6 to 12 month
lag, so the recall catch-up arrives one to two training cycles out,
provided the right indexable signal is in place when the next corpus
closes.

The gap is the work. The report ranks the top three actions by lift potential (channel weight × current deficit × ship feasibility × time-to-corpus) and scopes each to land before the next training corpus closes.
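
The ranking formula can be sketched as a product of four factors followed by a sort. This is illustrative only: it assumes each factor is normalized to a 0-to-1 score where higher is better, and the candidate actions and their scores are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Action:
    name: str
    channel_weight: float    # assumed 0..1, higher = channel matters more
    current_deficit: float   # assumed 0..1, higher = bigger measured gap
    ship_feasibility: float  # assumed 0..1, higher = easier to ship
    time_to_corpus: float    # assumed 0..1, higher = more likely to land before the corpus closes

    @property
    def lift_potential(self) -> float:
        return (
            self.channel_weight
            * self.current_deficit
            * self.ship_feasibility
            * self.time_to_corpus
        )


def top_actions(actions, k=3):
    """Return the k actions with the highest lift potential."""
    return sorted(actions, key=lambda a: a.lift_potential, reverse=True)[:k]


# Hypothetical candidates and scores.
candidates = [
    Action("conference talk recording", 0.3, 0.6, 0.3, 0.4),
    Action("launch post on a foundation channel", 0.9, 0.7, 0.5, 0.8),
    Action("publish integration guide on owned domain", 0.6, 0.9, 0.8, 0.9),
    Action("refresh API reference examples", 0.4, 0.5, 0.9, 0.9),
]

for action in top_actions(candidates):
    print(f"{action.lift_potential:.3f}  {action.name}")
```

Because the score is a product, an action that scores near zero on any single factor drops out of the top three no matter how strong the others are.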

What lands in your inbox

Three artifacts from a single measurement. The first is the shareable thing. The second is the actionable thing. The third is the optional thing you ask for if a specific decision-maker on your team needs the same data shaped for their lens.

A live dashboard

Per-finding cards with severity, permalink, lift-potential ordering, and copy-pasteable artifacts. The thing your team works from.

A prioritized lift-potential fix list

Top 3 actions ranked by channel weight × current deficit × ship feasibility × time-to-corpus. Each scoped to land before the next training corpus closes.

An audience re-frame on request

Same measurement re-framed for a specific reader (founder, sales lead, eng lead). The version a department head can forward inside their team without translating.

Live samples

For brands that sign on as a design partner, cards push directly into your team's Trello, Linear, or Slack on a quarterly re-measurement cadence.

We hand you the fix, not just the score

Every finding ships with a ranked, paste-ready engineer fix for the exact gap we measured: the hypothesis, the channel, the anchor terms, and the spec to prove it moved. The score always comes with the move that changes it.

Most answer-engine-optimization (AEO) tools hand a marketing team a dashboard and leave the work to them. We aim each fix at the specific surface or channel your measurement says you are losing, so a founder without a content team still knows exactly what to ship next.

Why build this

We built Korpora after watching our own buying behavior shift. Every tool we picked to build this product (the deployment platform, the database, the data-pull API, the AI SDKs, the libraries) came from asking an AI assistant. None came from Google, G2, Reddit, or any of the channels traditional competitive-intelligence (CI) tools measure.

These were real buyers making real tooling decisions worth real money, and none of those decisions touched the channels every other CI tool measures. Korpora is the answer to the obvious next question: if we're buying like this, how is what we sell showing up for the buyers who buy the same way?

Get measured before the next training corpus closes

Submit your brand and a company email. For AI infrastructure teams (vector DBs, AI gateways, eval / observability, AI-agent platforms) and AI-aligned crypto projects with engineers shipping. Limited free measurements per month. Response within a few business days.

Measurement intake is paused while Korpora digests the current prospect pipeline.

Engagements payable in USD, USDC, or your project's native token.