Korpora

Data Provenance

Our core measurement ingests zero proprietary platform content.

Korpora measures where AI points. We do that by running our own prompts through commercial AI models and reading licensed citation data, not by scraping the platforms themselves. Below is exactly what we collect, where it comes from, and on what basis.

The measurement, in one paragraph

Your mindshare score comes from three inputs, none of which requires ingesting a platform’s content: our own prompt battery sent to commercial AI models (we measure their answers), licensed citation data that tells us which domains those models cite, and our own synthesis of the two. That is how we can show that AI cites Reddit heavily for your category without ingesting a single Reddit post.
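As an illustration only, here is a minimal sketch of how a citation share can be computed from cited URLs alone, without reading the pages behind them. The function name, the input shape (one list of cited URLs per model answer), and the example URLs are assumptions made for this sketch. It is not our production code, and it does not describe any provider's data format.

```python
from collections import Counter
from urllib.parse import urlparse


def cited_domain_share(citations_per_answer):
    """Return, for each domain, the fraction of answers citing it at least once.

    Hypothetical input: one list of cited URLs per model answer.
    Only the URL's host is used; the cited page itself is never fetched.
    """
    counts = Counter()
    for urls in citations_per_answer:
        # A set, so a domain cited twice in one answer is counted once.
        domains = {
            urlparse(url).netloc.lower().removeprefix("www.")
            for url in urls
        }
        domains.discard("")  # drop entries that had no parseable host
        counts.update(domains)

    total = len(citations_per_answer)
    if total == 0:
        return {}
    return {domain: n / total for domain, n in counts.items()}


# Made-up example: four answers, two of which cite Reddit.
answers = [
    ["https://www.reddit.com/r/running/comments/abc", "https://example.com/review"],
    ["https://reddit.com/r/running/"],
    [],
    ["https://example.org/guide"],
]
shares = cited_domain_share(answers)
```

In this made-up example, Reddit is cited in two of the four answers, so its share is 0.5. The calculation only ever sees the domain name.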

Every source we use

AI model panel

What
The models' answers to our own prompts.
From
Anthropic, OpenAI and other commercial model APIs.
Basis
Standard API terms. Our prompts, their outputs.

Citation and AI-search-demand data

What
Which domains AI assistants cite, and AI-era search demand.
From
DataForSEO, a licensed data provider.
Basis
Commercial license.

Demand trend

What
Category and brand search interest over time.
From
Google Trends via DataForSEO.
Basis
Licensed and aggregated.

Ad creative

What
The count and creative of ads a brand is currently running.
From
The Meta Ad Library, a public transparency tool.
Basis
Public by design.

Organic social engagement

What
Aggregate engagement on a brand's own public posts.
From
Instagram and TikTok, via Apify.
Basis
Public posts, aggregate metrics only.

What we deliberately do not do

  • We do not ingest or store a platform’s content to produce your numbers.
  • Where we cannot source a signal on clean terms, we remove it rather than scrape it. We measure how often AI cites Reddit, but we do not ingest Reddit posts. We dropped that signal until we hold a commercial license, and we are moving our X signal to the official API on the same principle.
  • We do not buy or use scraped personal data.

Your data stays yours

  • For Radar, our measurement product, we use only public and licensed data. We never connect to, see, or store anything inside your systems.
  • For Co-pilot, our hands-on tier, you share only what you choose. We use it only for the work you have asked us to do, and only with your permission.
  • We do not sell your data, share it, or train models on it.

Why we publish this

If you are going to trust a measurement, you should be able to see how it is made. Transparency about method is part of the product, not a disclosure we bury. And we think clean measurement helps everyone: when brands earn genuine, high-quality presence, the platforms and the AI systems that read them give better answers to the people using them.

Last updated 1 June 2026