The case against building your own help center maintenance

Writing the articles was never the hard part. Keeping them from contradicting each other is.

From here it looks intact. Look closer and it dissolves.

Most teams have reached the same conclusion in the last couple of years: keeping the help center current matters more now than it used to, because an AI agent is reading it and answering customers from it. Some of those teams have also concluded they can handle it themselves with ChatGPT and a weekend of plumbing, and that it does not warrant a real budget. Both conclusions feel correct. But are they?

What everyone agrees on now

A year ago, "keep your documentation up to date" was advice nobody argued with and nobody funded. It sat in the same category as "write good commit messages." Correct, virtuous, perpetually deprioritized.

That changed for a specific reason. The help center stopped being a page humans occasionally read and became the source a machine reads on every single query. When support runs through an AI agent, the knowledge base is no longer reference material. It is the model's memory. A stale article used to be a page that underperformed without anyone noticing. Now it is an input that produces a confident wrong answer to a customer, at scale, without anyone watching. Losing customers' confidence while paying for a bot that answered them incorrectly was a gamble nobody wanted to take.

So the importance is settled. Teams feel it, and they are investing attention they were not spending eighteen months ago. That part is a good sign for everyone who has spent years arguing documentation deserves more care.

The disagreement starts at the next step.

Writing is the easy part

Ask a team how they are handling it and the answer is some version of the same thing. We use ChatGPT or Claude to write the articles. It works, and it should: generating a help center article is a back-and-forth a good model is good at, and the early results are good enough that the demo to leadership goes fine. Our own customers write this way and like it. They are not asking anyone to write the articles for them, because writing an article is not the hard part.

That is the trap. Because generation is easy, the whole job looks easy, and a team concludes the rest of it is the same weekend project. You can produce ten solid articles in an afternoon. What you cannot do in an afternoon is keep two hundred of them from contradicting each other as the product moves underneath them.

We worked with a team that did exactly this. They built a custom GPT to maintain their knowledge base, and it held up while the base was small. As it grew, the same feature ended up described incorrectly in one article and correctly in two others, and nobody could keep track of every place it appeared.

Why it breaks at scale

Ask the bot about that feature and it retrieves all three articles, then contradicts itself inside a single answer. The correct information and the stale information sit in the same knowledge base, and the wrapper has no way of knowing they disagree, so it serves the wrong version with the same confidence it serves the right one. That was what the team saw: not loud failures, which are easy to catch, but fluent answers that blended two versions of a feature into one that never existed.

The reason this is hard is the reason it resists a DIY fix. When you change a feature, you update the articles you can think of and miss the ones you forgot were there. The larger the base gets, the more places the same feature is mentioned and the more of them you miss. Holding every mention of every feature across a few hundred articles in your head is not a discipline problem you solve by trying harder. Catching every scattered reference is exhaustive cross-referencing, the kind of work a person will always eventually miss and a system will not, which is why this is the part you cannot reliably do on your own.

Why support loses the budget fight, and why that is now a mistake

Here is the belief underneath the build-it-yourself reflex, stated plainly: this does not deserve a real line item. Support tooling rarely does. Support is a cost center, the work is invisible when it goes well, and a problem you cannot see is a problem you do not fund.

That logic was defensible when the downside of stale docs was a slightly worse customer experience. The cost was diffuse and the risk was low, so a free-tier model and someone's spare Friday afternoon was a rational allocation. Underfunding it was not negligence. It was a reasonable read of the stakes.

The stakes moved. The same shift that made everyone agree documentation matters more also changed the math on what underfunding it costs. When your docs feed an AI agent, a maintenance failure is no longer a quiet underperformance. It is wrong answers issued to customers in your name, at the speed and volume of automation. The thing you decided was not worth a budget is now the input layer for the most customer-facing system you run.

The DIY GPT is what "this does not deserve a budget" looks like in practice. It is the visible, cheap, good-enough-looking version of solving the problem, and it is good enough right up until the knowledge base gets big enough to matter, which is the same moment it starts to fail. The false economy is not that you saved money. It is that you bought a tool that works in exactly the conditions where the problem is trivial and breaks in exactly the conditions where it is serious.

What the decision comes down to

None of this is an argument that ChatGPT is bad at documentation, or that no team should build anything. A model that writes well is genuinely useful, and for a small, slow-moving knowledge base, a light internal setup may be all you ever need.

The argument is narrower and harder to dodge. If your help center feeds an AI agent, and your product changes faster than one person can track in their spare time, then the part you are treating as a weekend project is the part that determines whether your most-used customer channel tells the truth. That is not a writing problem a better prompt solves. It is a maintenance problem, it has a real cost when it fails, and the cost arrives precisely when you have grown enough to have something to lose. Deciding it does not warrant investment is itself a decision about how much a wrong answer to a customer is worth. Most teams are making that decision without noticing they are making it.

Image courtesy Unsplash and Art Institute of Chicago
The White Bridge, Henry Twachtman, American, 1853–1902

Author

Fatema

Fatema works across marketing and content at Pageloop. She has an academic background in Ecology, a side-life in fashion, and an irrational loyalty to milk coffee. Connect with her on Linkedin.