Why We Ran This Study

Our clients — B2B SaaS companies — kept asking the same question: how does AI decide which tools to recommend?

It is a fair question. When a buyer asks ChatGPT, Gemini, or Perplexity to recommend software for their problem, the AI produces a confident answer with specific product names. But what governs which products appear? Is it the quality of the product? The company’s marketing spend? Whether the AI searched the web in real time, or simply recalled what it learned during training?

We did not find satisfying answers in the existing literature. So we designed a study to find out. We tested 182 prompts across major AI platforms, covering the kinds of questions real buyers ask when researching B2B SaaS products. The prompt set included 150 single-turn prompts — direct questions like “what is the best project management tool for a remote team” — and 32 compound prompts that mimic a multi-step research workflow, such as asking a follow-up question about pricing after an initial recommendation.

For every response, we logged which products were recommended, whether the AI searched the web or answered from training data, what sources were cited, and how the product was framed — as a primary recommendation, a passing mention, or a comparison point. (If you’re new to AI visibility, our complete guide covers the fundamentals.)
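
To make that logging concrete, here is a minimal sketch of the kind of record this produces per response. The field names are illustrative, not our internal schema:

```python
from dataclasses import dataclass, field
from enum import Enum

class Framing(Enum):
    PRIMARY = "primary recommendation"
    PASSING = "passing mention"
    COMPARISON = "comparison point"

@dataclass
class ResponseLog:
    prompt_id: str               # which of the 182 prompts was asked
    platform: str                # e.g. "chatgpt", "gemini", "perplexity"
    products: list[str]          # every product named in the response
    framing: dict[str, Framing]  # how each named product was framed
    used_web_search: bool        # live retrieval vs. training data only
    cited_sources: list[str] = field(default_factory=list)  # empty when no search
```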

The results changed how we advise our clients. We think they will change how you think about your own AI visibility, too.

Key Findings at a Glance

182 prompts tested across major AI platforms
79% of responses relied on training data, not live search
21% triggered real-time web search
100% of tool comparison prompts triggered web search

These four numbers frame the rest of the study. The vast majority of AI responses about B2B SaaS come from what the model already knows — not from what it finds in real time. The exception is tool comparison prompts, which triggered web search every single time. The type of question a buyer asks determines whether AI can even find you.

Finding 1 — Most AI Answers Come From Training Data, Not Live Search

This was the most consequential finding: 79% of the prompts we tested were answered entirely from the AI model’s training data. No web search. No retrieval. No citations. The AI simply wrote an answer based on patterns it learned during training.

Problem-oriented prompts — the kind real buyers ask most often, such as “how do I reduce churn in my SaaS product” or “what’s the best way to manage a distributed engineering team” — almost never triggered search. The AI treated these as general knowledge questions. It drew from whatever it had internalized during training and produced a fluent, confident answer. If your product was not in that training data, you were not in the answer.

Only 21% of prompts triggered real-time web search. Those tended to be explicit comparison queries (“what are the best tools for X”) or prompts that asked about recent developments. For a closer look at how AI visibility scores are calculated and why nondeterminism makes measurement hard, see our scoring methodology explainer.

The implication is stark: if your product is not represented in AI training data, you are invisible to the majority of buyer queries. Traditional SEO, content marketing, and paid acquisition do not address this. They optimize for the 21% of queries where AI actually searches the web. The other 79% is a different game entirely — one that depends on third-party coverage, reviews, and presence in the kinds of sources that AI training pipelines ingest.

Finding 2 — The Competitive Landscape According to AI

We tracked how often each product was mentioned across all 182 prompts. The distribution was dramatically uneven.

The most-mentioned product (Product A) appeared 101 times across our test set. Product B appeared 74 times. Products C and D appeared 37 and 27 times respectively. At the other end of the spectrum, a newer entrant (Product F) appeared only 5 times — but when it was found, it was recommended as a primary pick 57% of the time.

The most-mentioned products are not necessarily the best. They are the ones with the deepest footprint in AI training data. Products with extensive third-party coverage, long histories of being discussed in blog posts and review sites, and broad web presence dominated the mention counts — regardless of current product quality or customer satisfaction.

Subscription-based products with established market presence dominated generic queries. When a buyer asked a broad question (“what’s the best tool for X”), AI defaulted to the products it had the most data on. Smaller or newer entrants, even those with strong user reviews and competitive features, were crowded out simply because they had less historical presence in the training corpus.

Product F is instructive. It barely appeared in raw mention counts, but its recommendation rate when it did appear was the highest in the study. This suggests that when AI does encounter a newer product — typically through web search — it evaluates it on merit. The problem is that AI rarely encounters it in the first place, because 79% of responses never trigger search.
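
Raw mention count and recommendation rate are two separate metrics, and Finding 2 turns on the gap between them. A quick sketch of the distinction; the primary-pick counts below are assumed for illustration, only the mention counts come from the study:

```python
def primary_rate(primary_picks: int, mentions: int) -> float:
    """Share of a product's appearances where it was the primary recommendation."""
    return primary_picks / mentions if mentions else 0.0

# Mention counts are from the study; primary-pick counts are assumed values.
mentions = {"Product A": 101, "Product F": 5}
primary = {"Product A": 30, "Product F": 3}

for name in mentions:
    rate = primary_rate(primary[name], mentions[name])
    print(f"{name}: {mentions[name]} mentions, {rate:.0%} primary-pick rate")
```

A product can dominate one metric and trail badly on the other, which is exactly the Product A vs. Product F pattern above.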

Finding 3 — What Triggers AI to Actually Search

Not all queries are created equal. The type of prompt determines whether AI draws from memory or goes looking for current information.

Tool comparison prompts triggered web search 100% of the time. Any prompt that asked AI to compare, list, or evaluate tools in a category — “what are the best tools for X,” “compare A vs. B,” “top 5 tools for Y” — reliably kicked the AI into retrieval mode. These prompts generated cited sources, pulled recent information, and produced answers that reflected the current web landscape rather than historical training data.

Problem-oriented and educational prompts, by contrast, triggered search close to 0% of the time. When a buyer asked “how do I solve problem X” or “what should I consider when choosing a tool for Y,” the AI answered from training data. It offered general frameworks, listed common approaches, and mentioned products it already knew about. It did not search. It did not cite sources. It simply produced an answer.

The compound prompts in our set — multi-step conversations where a buyer starts with a broad question and narrows down — showed a mixed pattern. The initial broad question typically did not trigger search. But follow-up questions asking for specific tool recommendations or pricing comparisons often did. This means the entry point of a conversation is critical: the first question a buyer asks often determines whether AI uses training data or web search for the entire thread.
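
The categories above map onto recognizable surface patterns in the prompt text. The heuristic below is purely illustrative: it approximates the pattern we observed from the outside, not any platform's actual routing logic, which is not public:

```python
import re

# Illustrative keyword heuristic only. Real search-triggering behavior is
# internal to each platform and was observed, not reverse-engineered, here.
COMPARISON_PATTERNS = [
    r"\bbest (tools?|software|platforms?)\b",
    r"\bcompare\b",
    r"\bvs\.?\b",
    r"\btop \d+\b",
]

def likely_triggers_search(prompt: str) -> bool:
    """True when the prompt looks like a comparison query (search-triggering)."""
    p = prompt.lower()
    return any(re.search(pat, p) for pat in COMPARISON_PATTERNS)

print(likely_triggers_search("top 5 tools for sprint planning"))   # True
print(likely_triggers_search("how do I reduce churn in my SaaS"))  # False
```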

For B2B SaaS companies, the implication is clear. Your content strategy needs to address both modes. For the 21% of queries where AI searches, you need strong web presence with the right content in the right places. For the 79% where it does not, you need to be embedded in the sources that AI trains on.

Finding 4 — Multi-Page Presence Beats Single-Page Citation

When AI did search the web, we tracked the relationship between how many pages mentioned a product in search results and whether the AI recommended it as a primary pick or merely mentioned it in passing.

The pattern was consistent: brands appearing across 3 or more pages in search results were recommended as primary picks. Brands appearing on a single page were mentioned in passing, listed as alternatives, or cited with qualifications.

This makes intuitive sense. When an AI model retrieves web results and finds a product mentioned on a single page — even a high-authority page — it treats that as a data point. When it finds the same product discussed across multiple independent pages (a review site, a comparison blog, the company’s own content, a case study), it treats that as a pattern. Patterns become recommendations. Isolated mentions become footnotes.
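
One way to operationalize this pattern is to count the distinct domains in a set of retrieved results that mention a brand, using the three-page threshold we observed. A minimal sketch, assuming a hypothetical list of result dicts:

```python
from urllib.parse import urlparse

def coverage(results: list[dict], brand: str, threshold: int = 3) -> str:
    """Classify a brand's footprint across search results.

    `results` is a hypothetical list of {"url": ..., "snippet": ...} dicts,
    and the default threshold reflects the 3-or-more-pages pattern above.
    """
    domains = {
        urlparse(r["url"]).netloc
        for r in results
        if brand.lower() in r["snippet"].lower()
    }
    return "primary-pick territory" if len(domains) >= threshold else "passing mention"

results = [
    {"url": "https://www.g2.com/categories/x", "snippet": "Acme leads the pack..."},
    {"url": "https://blog.example.com/top-tools", "snippet": "Acme and others..."},
    {"url": "https://review.example.org/acme", "snippet": "Our full Acme review..."},
]
print(coverage(results, "Acme"))  # "primary-pick territory"
```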

One landing page, no matter how well-optimized, is not enough. A single G2 listing is not enough. A lone blog post is not enough. What matters is a distributed content footprint: multiple independent pages that discuss your product in different contexts, from different perspectives, with consistent information about what your product does and who it serves.

The companies in our data that earned primary AI recommendations had coverage across review sites, industry blogs, their own content, and comparison pages. Single-source presence led to passing mentions at best.

This finding also explains why newer products struggle. They often have strong product pages and a few early reviews, but lack the breadth of web coverage that established competitors have accumulated over years. Building that breadth is a deliberate, multi-channel effort — and it is now a prerequisite for AI visibility, not just traditional SEO.

Finding 5 — Pricing Model as a Differentiator

An unexpected finding: when a product’s pricing model was distinctive and clearly stated on the web, AI actively surfaced it as a differentiating factor.

Products with pay-per-report or one-time pricing stood out in our test results. When AI encountered these pricing structures during web search, it did not just note them — it cited them as a reason to consider the product. Responses included language like “unlike most competitors which charge monthly subscriptions, this tool offers one-time pricing” or “a pay-per-use model that may be more cost-effective for occasional needs.”

Pricing transparency is a trust signal that AI models actively surface. When a product’s pricing is clearly documented, consistently described across multiple sources, and meaningfully different from the category default, AI treats it as a recommendation-worthy attribute.

The reverse is also true. Products with opaque pricing (“contact us for a quote,” “custom pricing”) received no pricing-related differentiation in AI responses. AI cannot cite what it cannot find. If your pricing is hidden behind a sales wall, AI will describe your product without one of the most important buying criteria — and competitors with visible pricing will look more trustworthy by comparison.

For B2B SaaS companies, this is a straightforward opportunity. If your pricing model is a genuine differentiator, make it explicit, consistent, and findable across your web presence. AI will do the rest.

What This Means for Your AI Visibility Strategy

The five findings above converge on a clear set of priorities for any B2B SaaS company that wants to show up in AI recommendations.

1. Get into training data

79% of buyer queries are answered from training data. If you are not in that data, you are invisible to the majority of potential buyers using AI. Training data comes from publicly available web content, and AI training pipelines prioritize authoritative, widely cited sources.

This means investing in the sources that training pipelines ingest: third-party coverage from industry publications, presence on review platforms like G2 and Capterra, mentions in Wikipedia and other high-authority reference sources, and coverage from analysts and journalists. Your own blog content helps, but third-party validation carries disproportionate weight in training data.

2. Build multi-page web presence

When AI does search the web (21% of queries), it promotes products with distributed coverage across multiple independent sources. A single landing page is not enough. You need a content footprint that spans review sites, comparison articles, your own educational content, customer case studies, and industry-specific publications.

The goal is not just to be mentioned, but to be mentioned consistently across multiple contexts. AI synthesizes information from multiple sources, and consistency of messaging across those sources increases your likelihood of being recommended rather than merely cited.

3. Make pricing and differentiators explicit and consistent

AI actively surfaces distinctive pricing models and clear product differentiators. If your pricing is transparent and meaningfully different from competitors, ensure it is stated clearly and consistently across every page where your product is discussed. If your differentiators are buried in marketing jargon or hidden behind gated content, AI will not find them — and it will recommend competitors whose strengths are more visible.

Consistency is critical. If your website says one thing about pricing, your G2 listing says another, and a review blog says a third, AI will either pick the most authoritative source or hedge with caveats. Unified messaging across all touchpoints gives AI a clear signal to relay to buyers.

The companies that win AI recommendations are not necessarily the best products. They are the ones with the deepest training data footprint, the broadest web presence, and the clearest, most consistent messaging about what they do and how they are different. These are all things you can influence — starting today.

Frequently Asked Questions

How were the 182 prompts selected for this study?

We designed prompts to reflect real user behavior when searching for B2B SaaS products. The set included 150 single-turn prompts (direct questions a buyer would ask) and 32 compound prompts (multi-step queries that mimic a real research workflow). Prompts covered problem-oriented questions, tool comparisons, feature-specific queries, and pricing questions.

Which AI platforms were tested?

We tested across major AI platforms including ChatGPT, Gemini, and Perplexity. Results were aggregated across platforms to identify patterns that hold regardless of which AI a buyer happens to use.

Do these findings apply to industries outside B2B SaaS?

The core dynamics we observed — particularly the dominance of training data over live search and the importance of multi-page web presence — are likely to hold across most categories where buyers use AI for product research. The specific ratios may vary by category, but the structural patterns should generalize.

Why did 79% of prompts rely on training data instead of searching the web?

AI models default to their training data whenever they estimate they already have a confident answer. Only queries that signal a need for current or comparative information — such as tool comparison prompts — reliably triggered web search. Problem-oriented and educational prompts almost never triggered search because the model treated them as general knowledge questions it could answer from memory.

How often should companies test their AI visibility?

AI training data and retrieval behavior change with every model update. We recommend testing at least quarterly, or whenever a major model release occurs. Companies in competitive categories may benefit from monthly monitoring rather than one-time audits to track shifts in which products AI surfaces.

Can a new product improve its AI visibility quickly?

It depends on which discovery mechanism you are targeting. For the 21% of prompts that trigger web search, improving your multi-page web presence and third-party coverage can show results in weeks. For the 79% that rely on training data, you need to build presence in sources that AI training pipelines prioritize — a longer-term effort measured in months rather than days.

Methodology: We tested 182 prompts across major AI platforms including ChatGPT, Gemini, and Perplexity in March 2026. Prompts were designed to reflect real user behavior when searching for B2B SaaS products, including 150 single-turn and 32 compound prompts. For each response, we logged product mentions, recommendation framing, whether web search was triggered, and cited sources. All competitive data has been anonymized. Results are directional — they identify consistent patterns but should not be treated as statistically representative of all AI behavior.