
Inside the Black Box: Measuring Brand Visibility in AI

As AI-driven search becomes a genuine source of traffic and discovery, one question keeps coming up from clients:

“Are we visible in ChatGPT, Gemini, and Claude?”

The honest answer is — it’s complicated.

That’s why we built the Door4 LLM Visibility Checker. Not as a traditional “SEO tool”, but as a way to observe how brands appear inside large language models when real-world questions are asked.

At its core, the tool runs a structured set of prompts across multiple LLMs (ChatGPT, Gemini, Claude), simulating how a user might search for a product, supplier, or service. Each report involves 30+ individual prompts, designed to test different types of intent — from broad category searches to more specific, use-case-driven queries.
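To make the idea concrete, here is a minimal sketch — not our production code. A hypothetical `query_model` stub stands in for the real ChatGPT, Gemini, and Claude API clients, and we simply count which brands each model surfaces for each prompt:

```python
from collections import Counter

# Hypothetical stand-in for real API calls (OpenAI, Gemini, Anthropic clients).
# The canned answers are illustrative only.
def query_model(provider: str, prompt: str) -> str:
    canned = {
        "chatgpt": "Popular options include Acme Travel and Globetrot.",
        "gemini": "You might consider Globetrot or SunSeeker Holidays.",
        "claude": "Acme Travel is a well-reviewed choice.",
    }
    return canned[provider]

def brand_mentions(brands, providers, prompts):
    """Count how often each brand appears in each provider's answers."""
    counts = Counter()
    for provider in providers:
        for prompt in prompts:
            answer = query_model(provider, prompt).lower()
            for brand in brands:
                if brand.lower() in answer:
                    counts[(brand, provider)] += 1
    return counts
```

The real tool layers prompt design and intent categories on top, but the core loop — many prompts, many models, structured counting — is this shape.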

We’ve trialled major third-party tools in this space — including platforms that attempt to “track” LLM visibility in a similar way to search rankings. But we found them limiting. They tend to rely on fixed datasets, static prompt sets, or black-box methodologies that don’t integrate well with how we actually work with clients.

So instead, we built our own approach.

Because we’re working directly with the APIs, we can:

  • Create custom prompt sets aligned to each client’s market
  • Test new queries dynamically as strategies evolve
  • Run analysis in real time, rather than relying on pre-compiled datasets
  • Integrate findings into broader performance and content workflows

Why this matters

Right now, most interaction with LLMs is still human-driven — much like traditional Google search. Users ask questions, review options, and make their own decisions.

But that behaviour is already starting to shift.

Over the coming years, both consumers and businesses will increasingly delegate selection to AI. Instead of just asking for options, they’ll ask for decisions.

We’re already seeing early versions of this in prompts like:

  • “Find me a holiday provider for a family of four in Spain with a £3k budget”
  • “Recommend a football team near me for my 11-year-old daughter to join”
  • “Shortlist three DBS providers suitable for a small UK business with low admin”
  • “Choose the best CRM for a 20-person B2B sales team and explain why”

In these scenarios, the model isn’t just listing brands — it’s filtering, prioritising, and recommending.

If your brand isn’t present — or isn’t understood correctly — you’re not just missing visibility. You’re being excluded from the decision entirely.


The limitations (and why this isn’t an audit)

There are important constraints to acknowledge.

Firstly, LLMs are inherently non-deterministic. The same prompt can produce different outputs on different runs.
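One way to work with that non-determinism, rather than against it, is to repeat each prompt and report a mention rate instead of a single yes/no. A toy sketch — the stochastic stub is illustrative; real runs would hit the model APIs:

```python
import random

# Illustrative stub: mimics a non-deterministic model that names two of
# three brands per answer. Real code would call the LLM APIs instead.
def ask(prompt: str, seed: int) -> str:
    rng = random.Random(seed)
    picks = rng.sample(["Acme Travel", "Globetrot", "SunSeeker"], k=2)
    return "Consider " + " or ".join(picks) + "."

def mention_rate(brand: str, prompt: str, runs: int = 20) -> float:
    """Fraction of repeated runs in which the brand appears at all."""
    hits = sum(brand in ask(prompt, seed) for seed in range(runs))
    return hits / runs
```

A rate of 0.7 over twenty runs is a far more honest signal than a single appearance in one run.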

Secondly, and more importantly, you can never truly emulate a real user session.

A real interaction isn’t a single prompt — it’s a multi-turn conversation. A user might start broadly, refine their needs, introduce constraints, and build context over several steps. Each turn shapes the final outcome.

This is what we mean by multi-turn: a sequence of connected prompts where context accumulates and influences the model’s reasoning.
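In code terms, a multi-turn session is just a growing message list — the shape most chat APIs expect — where every new turn is answered with the full history in view. A hypothetical sketch, with a `reply` stub standing in for a real chat-completion call:

```python
# Illustrative stub: a real implementation would send `messages` to a chat
# API. Here we just echo how many constraints the user has supplied so far.
def reply(messages):
    user_turns = [m["content"] for m in messages if m["role"] == "user"]
    return f"Noted {len(user_turns)} requirement(s): " + "; ".join(user_turns)

def run_conversation(turns):
    """Feed user turns one at a time; context accumulates across the session."""
    messages = []
    for user_text in turns:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": reply(messages)})
    return messages
```

Because each assistant turn sees everything before it, the final recommendation depends on the whole path — which is exactly what single-turn testing cannot capture.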

Most tools — including ours today — operate primarily on single-turn prompts. That gives useful insight, but it’s only a partial view of reality.

Which is why we push back on the term “audit”.

An audit suggests completeness and precision. LLM visibility analysis is neither. It is directional, not definitive.


What it actually helps with

Despite those limitations, the Door4 LLM Visibility Checker provides something genuinely useful:

  • A snapshot of whether your brand appears in AI-driven discovery
  • Insight into which competitors are being surfaced instead
  • Understanding of where your positioning is strong — and where it drops off
  • A baseline to track change over time
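Tracking that baseline over time can be as simple as diffing two snapshots of per-brand mention rates. The function below is a hypothetical illustration, not our actual reporting code:

```python
def visibility_delta(baseline: dict, current: dict) -> dict:
    """Per-brand change in mention rate between two snapshots.

    Brands present in only one snapshot are treated as 0.0 in the other,
    so new entrants and drop-offs both show up in the diff.
    """
    brands = set(baseline) | set(current)
    return {b: round(current.get(b, 0.0) - baseline.get(b, 0.0), 3) for b in brands}
```

A positive delta suggests growing presence in AI-driven discovery; a brand vanishing from the current snapshot surfaces as a negative delta rather than silently disappearing.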

Our roadmap is focused on going further — particularly into multi-turn simulation, where we model more realistic user journeys across several steps of interaction. While this will never fully replicate real users, it moves closer to how decisions are actually formed within AI conversations.

Ultimately, this isn’t about chasing a “ranking”.

It’s about understanding how your brand is interpreted, surfaced, and recommended by AI systems — and using that insight to shape better content, positioning, and strategy in a rapidly changing discovery landscape.

Try the Door4 LLM Visibility Checker for your brand today.
