How to Make Your Help Center AI-Readable

The content problems that make AI agents give wrong answers, and how to find them before your customers do.

A view best appreciated before you have to cross it.

Your AI support agent doesn't browse your help center the way a customer does. It doesn't skim headings or click through related articles to piece together an answer. When a customer asks a question, the agent runs a semantic search across your entire knowledge base, pulls the closest-matching chunk of content, and generates an answer from that chunk.
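To make the retrieval step concrete, here is a toy sketch in Python. It assumes word-overlap cosine similarity as a stand-in for the embedding search a real agent would use, and the chunks, the question, and the function names are all made up for illustration.

```python
import math
import re
from collections import Counter

def vectorize(text):
    # Bag-of-words counts; a real agent would use learned embeddings instead.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two word-count vectors.
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical help center chunks.
chunks = [
    "Refunds take 5-7 business days to reach your bank account.",
    "To invite a teammate, open the Members page and click Invite.",
]

def retrieve(question, chunks):
    # Return the single chunk that scores highest against the question.
    q = vectorize(question)
    return max(chunks, key=lambda c: cosine(q, vectorize(c)))

best = retrieve("How long do refunds take?", chunks)
print(best)
```

Nothing in this step checks whether the winning chunk is current or correct, only that it is the closest match.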

The quality of the answer tracks the quality of that chunk. An article that was last updated before your product redesign will generate instructions for a UI that no longer exists. An article that contradicts another article in a different folder will produce an answer that's wrong and fully confident about it. A better model reading an outdated article will produce a more fluently written wrong answer.

A 2025 paper from Google Research, presented at ICLR, studied when and why retrieval-augmented generation (RAG) systems fail. RAG is the architecture behind Intercom Fin, Zendesk AI, and most other support agents. The paper introduced "sufficient context" as a framework: the model is far more likely to answer correctly when the retrieved content is sufficient to answer the question, and tends to hallucinate rather than abstain when it is not. Most of the failure modes trace back to the content, not the model.

Gartner's 2024 survey of customer service leaders found that 61% had a backlog of knowledge articles waiting to be edited. More than a third had no formal process for revising outdated content at all.

These are the same organizations deploying AI agents on top of those knowledge bases and watching resolution rates sit at 30%. Oh well, we did see that coming.

The content patterns that break AI retrieval

Five content patterns cause the majority of wrong AI answers. They show up differently in your analytics, and each one has a different fix.

Contradictions are the most damaging. Two articles give different answers to the same question: your billing FAQ says refunds take 5-7 business days, your getting-started guide says 3 days. A human might notice the conflict and call support. The AI picks whichever article the vector search ranked higher and answers with certainty. Intercom (now Fin) built a contradiction detection tool into their Optimize dashboard specifically for this.

If you're struggling with this, the fix is straightforward, although a bit boring: search for topics that have changed in the past year and check whether the old answer still lives somewhere else.
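If your articles can be exported as text, part of that check can be scripted. The sketch below is a rough starting point and not a full contradiction detector: it assumes a hypothetical dictionary of article titles and bodies, and it only catches conflicting "N days" claims in sentences that mention a given topic.

```python
import re
from collections import defaultdict

# Hypothetical export: article title -> body text.
articles = {
    "Billing FAQ": "Refunds take 5-7 business days to appear on your statement.",
    "Getting started": "If you cancel, refunds take 3 days.",
    "Invite a teammate": "Invites expire after 14 days.",
}

# Matches phrases like "3 days" or "5-7 business days".
DURATION = re.compile(r"\d+(?:\s*-\s*\d+)?\s+(?:business\s+)?days?", re.IGNORECASE)

def durations_by_topic(articles, topic):
    """Map each duration phrase to the articles stating it in a sentence about the topic."""
    found = defaultdict(list)
    for title, body in articles.items():
        for sentence in re.split(r"(?<=[.!?])\s+", body):
            if topic.lower() in sentence.lower():
                for phrase in DURATION.findall(sentence):
                    found[phrase.lower()].append(title)
    return dict(found)

refund_claims = durations_by_topic(articles, "refund")
if len(refund_claims) > 1:
    print("Possible contradiction:", refund_claims)
```

More than one distinct duration for the same topic is a prompt for a human to read both articles, not proof of a conflict.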

Stale instructions happen when the product ships a redesign and nobody updates the article. The AI serves old instructions for a UI that no longer exists.

Unlike contradictions, these are easy to find: search your help center for the names of features, buttons, or navigation elements that changed in your last few releases.
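A minimal sketch of that search, assuming you can export articles as plain text and that someone keeps a list of UI labels retired in recent releases. Both the article data and the term list here are hypothetical.

```python
import re

# Hypothetical export: article title -> body text.
articles = {
    "Change your billing email": "Open Account Settings and click Billing Info.",
    "Invite a teammate": "Go to Workspace, then Members, and click Invite.",
}

# Hypothetical labels that were renamed or removed in recent releases.
retired_terms = ["Account Settings", "Billing Info"]

def find_stale_mentions(articles, retired_terms):
    """Return {title: [retired terms found]} for articles mentioning any retired term."""
    hits = {}
    for title, body in articles.items():
        found = [
            term for term in retired_terms
            if re.search(r"\b" + re.escape(term) + r"\b", body, re.IGNORECASE)
        ]
        if found:
            hits[title] = found
    return hits

stale = find_stale_mentions(articles, retired_terms)
print(stale)
```

The output is a review list. A retired label can still appear legitimately, for example in a migration note.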

Missing scope means an article covers multiple plans, regions, or product versions without labeling which section applies to whom. The AI can't mentally filter for the reader's situation the way a human can.

Adding explicit markers ("On the Pro plan:" or "If you're using API v2:") at the top of each section gives the retrieval system enough to pull the right chunk.
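One way to keep that habit honest is a small lint check over article sections. This sketch assumes you have already split an article into sections and agreed on a fixed list of scope markers. The marker list, the seat limits, and the section text are hypothetical.

```python
# Hypothetical scope markers your team has agreed to use.
SCOPE_PREFIXES = (
    "On the Pro plan:",
    "On the Free plan:",
    "If you're using API v2:",
)

# Hypothetical sections from one article that covers two plans.
sections = [
    "On the Pro plan: you can add up to 50 seats from the Billing page.",
    "You can add up to 5 seats from the Billing page.",
]

def unlabeled_sections(sections, prefixes=SCOPE_PREFIXES):
    """Return the indexes of sections that do not open with a known scope marker."""
    return [
        i for i, text in enumerate(sections)
        if not text.lstrip().startswith(prefixes)
    ]

missing = unlabeled_sections(sections)
print(missing)
```

Only multi-plan or multi-version articles need this check; an article that applies to everyone has nothing to label.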

Assumed context is when an article only makes sense alongside other articles. "To set up SSO, configure your identity provider and paste the callback URL into the field below." A support engineer knows exactly what this means. The AI, retrieving this chunk in isolation, will parrot it back without the surrounding knowledge.

The fix is to read each article as if you've never seen any other page in your help center, and add the missing context inline.

Missing coverage is the simplest pattern (yay!): no article exists for the question. The AI either escalates to a human or generates an answer from loosely related content. Robb Clarke, Head of Technical Operations at RB2B, described identifying frequently asked questions that the help center didn't cover as a significant part of reaching a 65% Fin resolution rate.

How to find the worst articles first

You don't need to audit every article. You need to find the ones that are actively causing wrong answers.

If you're on Fin, the Optimize dashboard shows which content Fin used, how often each article resolved vs. didn't resolve a conversation, and where contradictions exist. Start with articles that have high involvement and low resolution. Those are the ones Fin keeps pulling but customers keep rejecting. Try to focus on the top 20% of content by AI involvement rate and flag anything with a resolution rate below 50%.
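If you can export those numbers, the prioritization rule is easy to script. The sketch below assumes a hypothetical export of (title, conversations the AI used the article in, conversations resolved) and uses raw involvement counts as a proxy for involvement rate. It does not reflect any actual Fin export format.

```python
# Hypothetical export: (title, conversations involved, conversations resolved).
rows = [
    ("SSO setup", 90, 20),
    ("Refund policy", 400, 120),
    ("Export data", 60, 50),
    ("Reset password", 350, 300),
    ("Invite a teammate", 40, 35),
]

def flag_articles(rows, top_percent=20, min_resolution=0.5):
    """Among the most-involved articles, return those resolving below the threshold."""
    ranked = sorted(rows, key=lambda r: r[1], reverse=True)
    # Integer ceiling of len * top_percent / 100, with at least one article kept.
    cutoff = max(1, (len(ranked) * top_percent + 99) // 100)
    flagged = []
    for title, involved, resolved in ranked[:cutoff]:
        rate = resolved / involved if involved else 0.0
        if rate < min_resolution:
            flagged.append((title, round(rate, 2)))
    return flagged

flagged = flag_articles(rows)
print(flagged)
```

In this made-up data, "SSO setup" also resolves poorly but sits outside the top 20% by involvement, so it waits for a later pass.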

The dashboard groups its suggestions into categories: content gaps (questions no article answers), contradictions (articles that conflict), and duplicates (articles covering the same topic). Each category points to a different fix, which means you can work through them systematically instead of reading articles at random.

On Zendesk, content performance insights under AI agents show similar signals: which articles are powering automated replies and which ones are falling short.

If your platform doesn't surface this data, you can get partway there manually. Pull your last month of support tickets, look at the 20 most frequent questions, and search your help center for each one. If the help center doesn't have a good answer, write one. If it has two contradictory answers, reconcile them and archive the outdated version.
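The counting part of that manual pass can be sketched in a few lines. This assumes a hypothetical list of ticket subject lines and groups them only by exact match after light normalization. Real tickets are phrased far more loosely, so expect to merge near-duplicates by hand or by tag.

```python
from collections import Counter

# Hypothetical ticket subjects from the last month.
tickets = [
    "How do I get a refund?",
    "how do i get a refund",
    "Where is the settings page?",
    "How do I get a refund? ",
    "Where is the settings page?",
    "Can I export my data?",
]

def normalize(question):
    # Lowercase, trim, drop trailing punctuation, and collapse whitespace.
    return " ".join(question.lower().strip().rstrip("?!.").split())

def top_questions(tickets, n=20):
    """Return the n most frequent normalized questions with their counts."""
    return Counter(normalize(t) for t in tickets).most_common(n)

top = top_questions(tickets)
for question, count in top:
    print(count, question)
```

Each question on the list then gets a search in the help center: no good answer means a gap to fill, two answers means a conflict to reconcile.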

What the ongoing maintenance looks like

After the initial cleanup, resolution rates go up. They don't stay up on their own. Every product release can make existing articles wrong again, and the gap between "product shipped" and "docs updated" is where wrong AI answers accumulate.

Intercom (now Fin) publishes their own team's numbers for context: roughly 20 hours per week of documentation work to maintain an 80% Fin resolution rate. Their deployment data also shows that teams spending 2-4 weeks on knowledge base cleanup before launching Fin saw resolution rates average 12 percentage points higher than teams that deployed on existing content untouched.

Twenty hours a week is realistic when you have dedicated documentation headcount. Most support teams at 50-to-200-person SaaS companies don't. The work still needs to happen, but it's competing with ticket queues and everything else the support team owns. What actually happens is someone updates the docs after a release, the next release ships, the docs fall behind again, and three months later the help center is back where it started.

The structural problem is the trigger. Your engineering team closes a ticket in Linear that changes how billing disputes work. Your product team ships a redesign that moves the settings page. These events happen in your ticket tracker and your codebase. Your help center doesn't know about them. Nobody searches for every article that mentions billing disputes or the settings page after every release, because that work doesn't fit into any existing workflow.

Intercom's Optimize dashboard and Zendesk's content insights both catch the downstream effect: articles that customers are bouncing off, or that the AI agent keeps pulling without resolving conversations. They're useful for finding what's already broken. ("Stop Blaming the Bot" covers why the documentation layer is almost always the root cause, if you want the longer argument.)

Pageloop works from the other direction. When that Linear ticket about billing disputes closes, Pageloop flags which published articles describe the affected behavior, so your team can review them before the AI agent starts serving the old answer.

If you want a method for finding what's already drifted, "How to Find Stale Content in Your Knowledge Base" walks through that process.

Image Courtesy - Unsplash and The Cleveland Museum of Art
A View from Moel Cynwich, 1850, William Turner (British, 1789–1862)

Author

Fatema

Fatema works across marketing and content at Pageloop. She has an academic background in Ecology, a side-life in fashion, and an irrational loyalty to milk coffee.