Data Provenance

Our core measurement ingests zero proprietary platform content.

Korpora measures where AI points. We do that by running our own prompts through commercial AI models and reading licensed citation data, not by scraping the platforms themselves. Below is exactly what we collect, where it comes from, and on what basis.

The measurement, in one paragraph

Your mindshare score comes from three inputs, none of which requires ingesting a platform’s content: our own prompt battery sent to commercial AI models (we measure their answers), licensed citation data that tells us which domains those models cite, and our own synthesis. That is how we can show that AI cites Reddit heavily for your category without ingesting a single Reddit post.
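To make the last point concrete, here is a minimal sketch of how a per-domain citation share can be computed from citation records alone. It is an illustration, not our production scoring: the function names, the sample URLs, and the assumption that each record is just a cited URL are all hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_of(url: str) -> str:
    """Return the host of a cited URL, lowercased, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def citation_share(cited_urls: list[str]) -> dict[str, float]:
    """Fraction of citations per domain, computed from the URLs only."""
    domains = [domain_of(u) for u in cited_urls]
    counts = Counter(d for d in domains if d)  # skip URLs with no host
    total = sum(counts.values())
    return {d: n / total for d, n in counts.items()} if total else {}


# Hypothetical citation records: only the cited URL, never the page content.
sample = [
    "https://www.reddit.com/r/example/comments/abc",
    "https://reddit.com/r/example/comments/def",
    "https://example.com/review",
    "https://www.reddit.com/r/other/comments/ghi",
]
shares = citation_share(sample)
```

The point of the sketch is that the input never contains a post body: a domain's share is derived from where the citation points, not from what the cited page says.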

Every source we use

AI model panel

What: The models' answers to our own prompts.
From: Anthropic, OpenAI and other commercial model APIs.
Basis: Standard API terms. Our prompts, their outputs.

Citation and AI-search-demand data

What: Which domains AI assistants cite, and AI-era search demand.
From: DataForSEO, a licensed data provider.
Basis: Commercial license.

Demand trend

What: Category and brand search interest over time.
From: Google Trends via DataForSEO.
Basis: Licensed and aggregated.

Ad creative

What: The number of ads a brand is currently running, and their creative.
From: The Meta Ad Library, a public transparency tool.
Basis: Public by design.

Organic social engagement

What: Aggregate engagement on a brand's own public posts.
From: Instagram and TikTok, via Apify.
Basis: Public posts, aggregate metrics only.

What we deliberately do not do

We do not ingest or store a platform’s content to produce your numbers.
Where we cannot source a signal on clean terms, we remove it rather than scrape it. We measure how often AI cites Reddit, but we do not ingest Reddit posts: we dropped post ingestion until we hold a commercial license. We are moving our X signal to the official API on the same principle.
We do not buy or use scraped personal data.

Your data stays yours

For Radar, our measurement product, we use only public and licensed data. We never connect to, see, or store anything inside your systems.
For Co-pilot, our hands-on tier, you share only what you choose. We use it only for the work you have asked us to do, and only with your permission.
We do not sell your data, share it, or train models on it.

Why we publish this

If you are going to trust a measurement, you should be able to see how it is made. Transparency about method is part of the product, not a disclosure we bury. And we think clean measurement helps everyone: when brands earn genuine, high-quality presence, the platforms and the AI systems that read them give better answers to the people using them.

Last updated 1 June 2026

See the full methodology