The State of B2B Chatbots
Navless.ai · February 13, 2026
A First-Party Buyer Education Study of 100 Mid-Market B2B SaaS Websites
Based on first-party Navless research · December 2025
Executive Summary
Why this research matters now
The B2B funnel has changed. Buyers now use LLMs to create shortlists, visit websites to evaluate, and expect to self-educate without talking to sales. They arrive with AI-informed questions and want answers immediately. This is the AI funnel, and the post-click evaluation stage is where most pressure now sits.
Chatbots are often deployed to absorb this pressure, promising to answer buyer questions, capture demand, and help people learn without forms. In practice, most visitors don't use them, because the experience too often ends with "Hmm, I can't answer that. Want to talk to sales?"
This led to the core question: if a new prospect arrives with a question, how likely is it that a chatbot educates them instead of pushing them to a demo?
The experiment
Navless evaluated 100 chatbot experiences on mid-market B2B SaaS websites, scoring whether each could handle four top-of-funnel questions: What is this? What makes this different? Can I see a case study? How can I learn more?
What we found
- 66% of chatbots failed to answer "What is [Company Name]?"
- 70% failed to explain how the company is different
- 83% failed to surface a case study when asked
- Only 9 of 34 chatbots labeled as "AI" passed all four questions
"We ran this study because we wanted to test, not assume. The result confirmed what buyers have been telling us for years: most chatbots don't educate. They qualify. And in a world where buyers want to learn first and convert later, that's the wrong job." — Jim Milton, Founder & CEO, Navless
Methodology & Scoring
Sample
- 100 mid-market (100–1,000 employees), US-based B2B SaaS websites
- Tested in Q4 2025
- US-based testers, desktop, business hours
Conversation flow
- Five testers asked each chatbot four questions, each mapped to a different level of buyer intent
- All four questions were asked in a single conversation
Question set, mapped to buyer intent
- Awareness: "What is [Company Name]?"
- Consideration: "How is [Company Name] different?"
- Intent: "Can you send me a case study?"
- Evaluation: "How can I learn more?"
Scoring definitions
- Passed: the chatbot answered the question without an error or a forced handoff to a live agent or form
- Failed: the chatbot returned an error, refused to respond, or routed the conversation to a live agent or a form before answering
- Auto-fail: the chatbot asked for an email or auto-routed to a live agent without first providing an answer
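To make the rubric concrete, here is a minimal Python sketch of the question set and scoring logic. The field names on ChatResponse are hypothetical labels for the behaviors testers recorded, not tooling used in the study; auto-fail reads as a variant of failure in the headline pass/fail numbers.

```python
from dataclasses import dataclass

# The four questions, mapped to buyer intent (from the question set above)
QUESTIONS = {
    "awareness":     "What is {company}?",
    "consideration": "How is {company} different?",
    "intent":        "Can you send me a case study?",
    "evaluation":    "How can I learn more?",
}

@dataclass
class ChatResponse:
    """Hypothetical record of how a chatbot handled one question."""
    errored: bool               # returned an error or could not handle the prompt
    answered: bool              # gave a substantive answer to the question
    routed_before_answer: bool  # pushed to a live agent or form before answering
    asked_email_first: bool     # demanded an email before providing any answer

def score(r: ChatResponse) -> str:
    """Apply the study's rubric to a single question/response pair."""
    # Auto-fail: asked for an email or auto-routed without first providing an answer
    if r.asked_email_first or r.routed_before_answer:
        return "auto-fail"
    # Failed: errored out, refused, or never produced a substantive answer
    if r.errored or not r.answered:
        return "failed"
    # Passed: answered without an error or a forced handoff
    return "passed"
```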
What we tested and what we didn't
We tested a chatbot's ability to educate and warm up a buyer. We did not test its ability to schedule a meeting or hand off to a live sales agent. Both jobs matter, but they are different jobs, and most chatbots are clearly built for the second one.
Why We Focused on Buyer Education
Most B2B chat experiences are primarily built to book demos with the highest-intent visitors. But buyers have changed. They want to self-educate and confirm fit before talking to sales. The shortlist they arrive with was built inside an LLM. Their questions are specific and AI-informed. They expect the website to act as a guide, not a gate.
That is why we focused on education. Not because qualifying web visitors is wrong, but because education is the make-or-break job of the modern B2B website. When qualification becomes the only goal, new prospects self-qualify out, often quietly. They were interested. They just got pushed too early.
Testers echoed real buyers:
- "It only pushes me to book a demo or talk to sales."
- "I can't even ask my question. I can only select from predefined options."
- "I'm unable to have any semblance of a conversation."
- "It's a dead end."
Findings
Finding 1 — Most chatbots fail at teaching buyers the basics
Instead of helping someone get oriented, many chatbots defaulted to scripts, predefined responses, or a fast redirect to "talk to sales."
- "What is [Company Name]?" — 34% Passed, 66% Failed
- "What makes [Company Name] different?" — 30% Passed, 70% Failed
These are foundational questions. Any B2B SaaS company should be able to answer what it does and what makes it different before nudging a visitor toward a demo.
If the chatbot cannot reliably answer the two questions every net-new buyer asks first, it is worth asking what role it is really playing on the site.
So what? If a buyer clicks chat and still cannot get the basics answered, they will not keep digging. When chat blocks learning, it teaches buyers a simple lesson: this site will not help me unless I am ready to convert. That is a tough ask in a world where the same buyer can get instant, personalized answers from an LLM in another tab.
Finding 2 — Some chatbots were never designed to handle real buyer questions
A meaningful portion of the chat experiences failed every question because they were never designed for buyer education. They generally fell into three buckets:
- Rule-based decision trees: The visitor cannot type. They are forced to click from a preset menu of options.
- Live chat intake forms: The visitor cannot proceed without giving up personal information. The chat turns into a form.
- Buggy or broken experiences: The chat errors out or cannot handle basic prompts.
What these have in common is that they run on manual setup and rigid scripts configured by humans. None of that is inherently AI, and none of it matches how modern buyers expect to self-educate.
So what? If a chatbot cannot answer basic buyer questions, it is worth asking whether it is worth keeping live, let alone investing in further. These experiences do not run themselves. Someone has to spend real time and effort to hardwire a system that still cannot do the one job buyers expect when they click chat: help them learn.
Finding 3 — Chatbots that claim to be "AI-powered" performed better, but still disappointed
Of the 100 chat experiences, 34 carried an "AI" label. They outperformed the non-AI group, but most still could not pass the full set of buyer questions.
- "What is [Company Name]?" — 24 AI-labeled passed, 9 not AI-labeled passed
- "How is this different?" — 22 AI-labeled passed, 8 not AI-labeled passed
- "Can you send me a case study?" — 12 AI-labeled passed, 5 not AI-labeled passed
- "How can I learn more?" — 16 AI-labeled passed, 8 not AI-labeled passed
Only 9 of the 34 AI-labeled chat experiences passed all four questions. Many handled simpler prompts but dropped off when asked for a case study. Even when chat is positioned as "AI," it often falls back on the same old playbook, treating case studies and customer stories as gated content.
So what? AI improves performance, but it also raises expectations. When a vendor calls something "AI," they borrow trust from tools like ChatGPT. When an "AI assistant" cannot link to content or pushes a demo too early, the disappointment is sharper. A bad chatbot is annoying. A bad chatbot wearing an AI label is reputational.
Finding 4 — Most chatbots can tell, but cannot show
Videos, infographics, and case studies are still trapped in the resource hub.
We tested: "Send me a case study." Only 17% of chat experiences surfaced a case study summary or relevant link. 83% failed.
What "failed" looked like:
- Redirecting to "Talk to sales" or "Book a demo" instead of sharing anything
- Gating the case study behind an email
- Linking to the homepage or resource hub and leaving the buyer to hunt
The buyer created an opportunity for the brand to show off, and received nothing back.
So what? This is rarely a "we don't have content" problem. Most B2B sites are packed with case studies, logos, testimonials, and customer wins. But when a chatbot cannot surface the content the team has already invested in, those assets become dead weight. The moment a buyer asks for a case study, they are signaling readiness to learn. If they hit a gate or a dead end, they move on.
Bottom line: 83% of chatbot experiences interrupt or block the learning journey at the exact moment the buyer is most willing to be convinced.
What 12 Chatbots Did Right
The top performers did three things differently:
- Answered all four questions clearly.
- Shared relevant links — case studies, guides, and next-step pages — without forcing a form.
- Recommended talking to sales only after the buyer was oriented.
Chatbot gut check
If a chat experience cannot do these reliably, it is not educating buyers:
- Answer "What is [Company]?" in plain language
- Explain how the company is different in one clear sentence
- Share a relevant case study or guide on request
- Deliver value before asking for personal information
- Work without errors or auto-routing to sales
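Teams that want to run this gut check against their own chatbot could wire the prompts into a small harness like the sketch below. The ask callable is a stand-in for however a given chat widget is invoked (there is no standard chatbot API, so that interface is entirely an assumption), and scoring the replies stays with a human reviewer, as in the study.

```python
from typing import Callable

# Prompts taken from the checklist above; the last two checks (value before
# personal info, no errors or auto-routing) are judged from the replies.
GUT_CHECK_PROMPTS = [
    "What is {company}?",
    "How is {company} different?",
    "Can you send me a case study?",
    "How can I learn more?",
]

def run_gut_check(ask: Callable[[str], str], company: str) -> dict[str, str]:
    """Send each prompt through the hypothetical ask() adapter and return
    {prompt: raw reply} for a human reviewer to score against the checklist."""
    replies: dict[str, str] = {}
    for template in GUT_CHECK_PROMPTS:
        prompt = template.format(company=company)
        replies[prompt] = ask(prompt)
    return replies
```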
What to Do If Your Chatbot Fails This Test
Chatbots are at their best when converting the highest-intent visitors — people who already arrived ready to book time. That is real value, but it is a narrow slice of any site's traffic.
Most buyers arrive earlier in their evaluation. They need to learn before they are ready for the next step. If the website only prioritizes buyers who are ready now, the rest self-qualify out and rarely come back.
There are two reasonable paths forward.
Path 1 — Fix the chat experience
Rework the chatbot around education. Make sure it can answer the basics, surface case studies and guides on request, avoid gating, and stop erroring out. This can work, but it takes ongoing maintenance, and the ceiling is still a chat widget that benefits only the small fraction of visitors who choose to click it.
Path 2 — Rethink the website itself
The deeper problem is not chatbot quality. It is that the website was never built to guide a buyer through an evaluation. A chat widget bolted to a brochure site is still a brochure site with a chat widget. Most visitors never click chat at all. The ones who do often hit the failure patterns above.
The alternative is to make the website itself the guide.
Where Navless Fits
Navless is the AI-powered digital marketing platform built for the AI funnel. It is one platform with two solutions — Signal and Guide — that cover the three stages of the AI funnel: getting found and recommended by LLMs (Stage 1), guiding buyers through personalized evaluations on the website (Stage 2), and helping existing customers self-educate through the knowledge base or customer portal (Stage 3).
Guide is the solution that addresses what this study measures. It is an AI agent that sits on top of the existing website and adapts the experience to each visitor in real time, so a buyer who arrives with an AI-informed question gets a personalized path to the answer instead of a navigation maze or a chat dead end.
What that looks like in practice:
- Guide explains what the company does, and why it matters, in plain language for any stage of intent.
- Guide surfaces the most relevant case studies, guides, and proof points in seconds — without gating them.
- Guide moves buyers forward with a clear next step that fits their level of intent, instead of routing every visitor to "book a demo."
Guide is not another chatbot. A chatbot is a widget that waits for a click. Guide is the website experience itself, adapting from the first page load. It works for every visitor, not only the small percentage who opt into a chat window.
FAQ: Can Guide coexist with our chatbot?
Yes. Many Navless customers keep an existing chatbot live for sales routing and lean on Guide to handle education and exploration for everyone else.
Pilot
Navless runs a 90-day paid pilot. A digital twin of the Guide experience deploys on the customer's own site in 2 to 3 business days, and the pilot fee credits toward an annual plan if the customer moves forward.
To learn more, visit navless.ai.
Appendix
A) Full question set
- Awareness: "What is [Company Name]?"
- Consideration: "How is [Company Name] different?"
- Intent: "Can you send me a case study?"
- Testers were allowed to ask "Can you send me a link to a case study?" if the original prompt failed.
- Evaluation: "How can I learn more?"
B) Scoring rubric & definitions
- Passed: the score given when a chatbot answered a question without an error or a handoff to a live agent or form.
- Failed: the score given when a chatbot would not respond to a question, either by erroring out or by routing the chat to a live agent or form before answering.
- Auto-fail: the score given when a chatbot asked for an email or auto-routed to a live agent without first providing an answer.
C) Visual examples
The original report included anonymized screenshots of passed and failed examples for each of the four questions, organized by question category. These are visual artifacts and are not reproduced in this text version. A link to the anonymized survey responses is available on request.