The Challenge
A digital InsurTech firm's 12-person support team was processing over 800 inbound queries per week across web chat, email, and a newly launched WhatsApp Business channel. The volume itself was not the problem; the composition of that volume was. When we analysed their ticket data from the prior six months, 68% of all queries fell into just 12 categories, including policy status checks, renewal date lookups, premium calculation requests, claim document checklists, cancellation process enquiries, and variations on "how do I...?" questions answerable directly from their public documentation.
These queries were consuming the majority of a skilled 12-person team's working hours. Agents with the knowledge to handle complex dispute escalations and sensitive claim investigations were spending most of their day answering the same questions they had answered the day before. Average first-response time had crept to 4.5 hours — acceptable for a complex query, deeply frustrating for a customer who simply wants to know their renewal date. A customer satisfaction survey showed response time as the top-cited frustration, ahead of even claim outcomes.
Some 60% of queries arrived outside business hours (evenings, weekends, Australian public holidays) and received no response until the next morning. For a digital-first insurer, this was inconsistent with the brand promise. The client had looked at off-the-shelf chatbot platforms and found them inadequate: they required extensive manual intent mapping, produced brittle responses that broke under slight rephrasing, and had no mechanism for maintaining context across a multi-turn conversation. The CEO wanted something that felt like talking to a knowledgeable team member, not navigating a decision tree.
Our Approach
The first three weeks were spent entirely on data work, not bot development. We ingested 18,000 historical support conversations, labelled them by query type, mapped the resolution path for each category, and identified the 40 most common answer patterns. This corpus became the training foundation. We also processed the client's full policy documentation library (3,200 pages across 14 product variants) through a document chunking and embedding pipeline so the bot could retrieve and cite specific policy terms when answering coverage questions.
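To make the chunking step concrete, here is a minimal sketch of overlapping-window chunking. It is an illustration under stated assumptions, not BotStudio's actual pipeline: the function name `chunk_text`, the word-based window, and the default sizes are all hypothetical, and the embedding step that would follow is left out.

```python
def chunk_text(text: str, max_words: int = 200, overlap: int = 40) -> list[str]:
    """Split text into word windows that overlap, so a policy clause that
    straddles a boundary still appears whole in at least one chunk.

    max_words and overlap are illustrative defaults, not values from the project.
    """
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")

    words = text.split()
    step = max_words - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

Each chunk would then be embedded and stored alongside its source reference (document and clause), which is what lets the bot cite the policy term it retrieved.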
BotStudio's NLP engine uses a retrieval-augmented approach rather than a purely generative one. For factual queries (policy lookups, renewal dates, premium calculations) the bot retrieves the answer directly from the client's policy database or document library. This is critical: a purely generative bot will occasionally confabulate plausible-sounding but incorrect answers, which in an insurance context could constitute a mis-selling event. By grounding responses in retrieved facts, we eliminate that risk category entirely. The generative layer handles natural language formatting and follow-up, not answer creation.
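The retrieve-then-answer pattern described above can be sketched as follows. This is a simplified illustration, not BotStudio's engine: a bag-of-words cosine similarity stands in for real embeddings, and `Passage`, `retrieve`, `grounded_answer`, and the `min_score` cutoff are all hypothetical names and values.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Passage:
    source: str  # where the text came from, e.g. a policy document and clause
    text: str


def _vector(text: str) -> Counter:
    # Stand-in for an embedding: lowercase word counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, passages: list[Passage], min_score: float = 0.2) -> Optional[Passage]:
    """Return the best-matching passage, or None if nothing is similar enough."""
    if not passages:
        return None
    q = _vector(query)
    score, best = max(((_cosine(q, _vector(p.text)), p) for p in passages),
                      key=lambda pair: pair[0])
    return best if score >= min_score else None


def grounded_answer(query: str, passages: list[Passage]) -> Optional[str]:
    """Answer only from retrieved text; return None so the caller can hand off."""
    hit = retrieve(query, passages)
    if hit is None:
        return None
    return f"{hit.text} (Source: {hit.source})"
```

The design point is that the answer text always comes from a stored passage with a citable source; when nothing relevant is retrieved, the function returns nothing rather than generating a reply.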
Confidence thresholds were designed conservatively. Any query where the bot's confidence falls below 85% triggers an automatic handoff to a human agent — the bot says "I want to make sure you get the right answer on this one, so I'm going to connect you with a specialist" rather than guessing. We tracked handoff rates weekly and used them as the primary training signal, systematically closing the gaps until the 60-day deflection target of 65% was reached and exceeded.
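The handoff rule can be expressed in a few lines. The sketch below assumes the engine exposes a confidence score between 0 and 1 for each candidate answer; the names `route` and `BotReply` are hypothetical, while the 85% threshold and the handoff wording come from the description above.

```python
from dataclasses import dataclass

HANDOFF_THRESHOLD = 0.85  # below this, the bot hands off instead of answering
HANDOFF_MESSAGE = (
    "I want to make sure you get the right answer on this one, "
    "so I'm going to connect you with a specialist"
)


@dataclass
class BotReply:
    text: str
    handed_off: bool  # True when the conversation goes to a human agent


def route(candidate_answer: str, confidence: float) -> BotReply:
    """Send the bot's answer only when confidence meets the threshold."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence < HANDOFF_THRESHOLD:
        return BotReply(text=HANDOFF_MESSAGE, handed_off=True)
    return BotReply(text=candidate_answer, handed_off=False)
```

Logging every reply with `handed_off=True`, together with the query that triggered it, would give the weekly handoff data used as the training signal.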
The Solution
BotStudio was deployed simultaneously across the client's web chat widget, WhatsApp Business API, and an email-to-chat routing layer that converts incoming email queries into live chat sessions. All three channels share the same NLP engine and conversation context, so a customer who starts a query on WhatsApp can continue it on web without repeating themselves, a capability the client's previous support stack couldn't offer on any channel.
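One way to picture the shared context is a conversation store keyed by customer rather than by channel. The sketch below is an in-memory illustration only; `ConversationStore`, `Turn`, and the `customer_id` key are hypothetical, and a production system would need persistent storage and a way to match identities across channels.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Turn:
    channel: str  # e.g. "whatsapp", "web", "email"
    text: str


class ConversationStore:
    """Keeps one history per customer, whichever channel each message used."""

    def __init__(self) -> None:
        self._history: dict[str, list[Turn]] = defaultdict(list)

    def record(self, customer_id: str, channel: str, text: str) -> None:
        self._history[customer_id].append(Turn(channel, text))

    def context(self, customer_id: str, last_n: int = 10) -> list[Turn]:
        # Most recent turns across all channels, oldest first.
        return list(self._history.get(customer_id, [])[-last_n:])


store = ConversationStore()
store.record("cust-001", "whatsapp", "When does my policy renew?")
store.record("cust-001", "web", "And can I change the payment date?")
recent = store.context("cust-001")  # both turns, in the order they arrived
```

Because the engine reads context by customer, the follow-up on web chat is interpreted with the WhatsApp question already in view.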
The admin console gives the client's support team full visibility into bot performance without requiring technical knowledge. Conversation logs are grouped by topic and outcome. Missed or mishandled queries are surfaced in a review queue where agents can record the correct response; this directly feeds the training loop, improving accuracy without requiring a separate data science process. Weekly performance reports are auto-generated and emailed to the support manager with deflection rate, CSAT, handoff reasons, and volume by channel.
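The report figures could be derived from conversation logs along these lines. This sketch assumes deflection rate means the share of conversations the bot resolved without a human handoff; that definition, the `Conversation` record, and `weekly_summary` are assumptions for illustration, and CSAT is omitted because it would come from survey data.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Conversation:
    channel: str
    handoff_reason: Optional[str] = None  # None means the bot resolved it


def weekly_summary(conversations: list[Conversation]) -> dict:
    """Aggregate one week's conversations into the headline report figures."""
    total = len(conversations)
    deflected = sum(1 for c in conversations if c.handoff_reason is None)
    return {
        "deflection_rate": deflected / total if total else 0.0,
        "volume_by_channel": dict(Counter(c.channel for c in conversations)),
        "handoff_reasons": dict(
            Counter(c.handoff_reason for c in conversations if c.handoff_reason is not None)
        ),
    }
```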
The human agents' workflow changed significantly. Their queue now contains only the queries that genuinely require human judgment: complex claims, coverage disputes, complaints, and sensitive personal circumstances. Response times for these queries improved because agents are no longer interrupted by trivial lookups. In the 90-day post-launch review, the client's support manager noted that two agents who had previously been considering leaving due to the repetitive nature of the work had reversed that decision: the job had, in their words, "become interesting again."
Results & Impact
BotStudio
This project was built on our pre-built BotStudio platform, customised for this client's exact needs.