What actually drives a high resolution rate
An 86% AI resolution rate means that 86 out of every 100 customer conversations are resolved end-to-end by the AI - without human involvement. The conversation opens, the customer's need is met, and the conversation closes. No escalation. No agent callback. No follow-up required.
Achieving this consistently requires four things working together: live system integration, accurate intent detection, a well-designed escalation boundary, and continuous improvement from real conversation data. Remove any one of these and the number drops.
Component 1: live system integration
The single biggest driver of failed resolutions is not poor AI - it's disconnected data. A chatbot that cannot see real-time order status, current inventory, live account balance, or actual appointment availability cannot resolve queries that depend on that information.
Every Eshal deployment integrates with the customer's operational systems on day one. The AI does not answer from a knowledge base - it queries live data. When a customer asks "where is my order?", the AI calls the OMS API, reads the current status, and responds with the real-time answer. Not a template. Not a cached response from yesterday's sync.
Typical integration points include:
- Order management systems (Shopify, SAP, custom OMS)
- CRM and customer data (Salesforce, HubSpot, custom CRM)
- Appointment scheduling systems
- TMS for logistics and freight tracking
- Core banking and payment systems (with appropriate access controls)
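To make the "query live data" pattern concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `OrderSystem` interface, the `InMemoryOMS` stand-in, and the `answer_order_query` function are hypothetical names, not Eshal's API or any vendor's SDK. The point it shows is that the reply is built from a status read at request time, and that a failed lookup becomes a handoff rather than a guess.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class OrderStatus:
    order_id: str
    state: str          # e.g. "shipped", "out_for_delivery"
    eta: Optional[str]  # ISO date string, if the OMS provides one


class OrderSystem(Protocol):
    """Hypothetical interface; a real OMS client would sit behind it."""

    def get_status(self, order_id: str) -> Optional[OrderStatus]: ...


class InMemoryOMS:
    """Stand-in for a live OMS so the sketch runs without network access."""

    def __init__(self, orders: dict[str, OrderStatus]):
        self._orders = orders

    def get_status(self, order_id: str) -> Optional[OrderStatus]:
        return self._orders.get(order_id)


def answer_order_query(oms: OrderSystem, order_id: str) -> str:
    """Build the reply from the status read at request time, not from a cache."""
    status = oms.get_status(order_id)
    if status is None:
        # Without live data the query cannot be resolved; hand off instead of guessing.
        return "ESCALATE: order not found"
    reply = f"Order {status.order_id} is currently {status.state.replace('_', ' ')}."
    if status.eta:
        reply += f" Estimated delivery: {status.eta}."
    return reply


oms = InMemoryOMS({"A100": OrderStatus("A100", "out_for_delivery", "2024-05-02")})
print(answer_order_query(oms, "A100"))
print(answer_order_query(oms, "B200"))
```

Keeping the lookup behind a small interface means the same answer logic could sit in front of Shopify, SAP, or a custom OMS, with only the client behind it changing.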
Component 2: accurate intent detection
Intent detection is the process of determining what a customer actually wants from their message. In Arabic, this is harder than in English for the structural reasons covered in our Arabic NLP guide - morphological complexity, dialect variation, and code-switching.
Eshal's intent model is trained on real MENA customer service conversations - not translated English training data. The difference is meaningful: a model trained on translated data will systematically misclassify intents that map differently across languages and cultures.
Component 3: escalation boundary design
Counterintuitively, a higher resolution rate is not achieved by trying to resolve more things with AI. It is achieved by being very precise about what the AI should and should not handle.
Resolution rate is a function of success within the AI's scope, not the breadth of that scope. If the AI attempts to handle complex complaints and fails, the resolution rate falls. If it identifies complex complaints early and routes them to agents with full context, the resolution rate for its actual scope rises - and agent handle time on escalated contacts falls because no context-gathering is required.
Four practices keep the boundary precise:
- Categorise every contact type by complexity and system dependency. High-volume, low-complexity, fully data-driven = AI scope. Low-volume, high-complexity, judgment-required = human scope.
- Set explicit triggers, not blunt length thresholds. "Escalate if the customer has complained three times in 30 days" is better than "escalate if the conversation exceeds 10 turns."
- Use keyword and sentiment triggers sparingly. Detecting anger is useful; acting on every vaguely negative statement creates unnecessary escalations.
- Review the escalation log weekly. The most common escalation reasons reveal gaps in AI scope - fix the root cause, not the symptom.
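The first two practices, plus the weekly log review, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Eshal's rule engine: the `SCOPE` table, the contact-type names, the `Contact` fields, and the reason labels are all hypothetical. The only rule taken from the text is "three complaints in 30 days".

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Hypothetical scope table: contact type -> who handles it.
SCOPE = {
    "order_status": "ai",         # high volume, low complexity, fully data-driven
    "appointment_booking": "ai",
    "billing_dispute": "human",   # low volume, judgment required
}


@dataclass
class Contact:
    contact_type: str
    complaint_dates: list[date]   # dates of this customer's earlier complaints
    today: date


def escalation_reason(contact: Contact) -> Optional[str]:
    """Return why the contact should go to a human, or None to keep it with the AI."""
    # Contact types missing from the table default to human handling.
    if SCOPE.get(contact.contact_type, "human") == "human":
        return "out_of_scope"
    # Explicit trigger from the text: three complaints within the last 30 days.
    window_start = contact.today - timedelta(days=30)
    recent = [d for d in contact.complaint_dates if d >= window_start]
    if len(recent) >= 3:
        return "repeat_complaints"
    return None


today = date(2024, 5, 31)
contacts = [
    Contact("order_status", [date(2024, 5, 5), date(2024, 5, 20), date(2024, 5, 28)], today),
    Contact("order_status", [date(2024, 1, 10)], today),
    Contact("billing_dispute", [], today),
]

# Weekly review: tally the reasons to see which escalation causes dominate.
reasons = [escalation_reason(c) for c in contacts]
escalation_log = Counter(r for r in reasons if r is not None)
print(escalation_log.most_common())
```

Because each escalation carries a named reason, the weekly review becomes a simple count by reason rather than a read-through of transcripts.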
Component 4: continuous improvement
A resolution rate of 86% is not achieved at launch - it is built over time. Most deployments start at 70–75% and improve through a structured review cycle.
Every week, the quality dashboard highlights three categories: conversations where the AI escalated unnecessarily, conversations where the AI attempted to resolve the query but failed, and conversations where the AI resolved the query but customer satisfaction was low. Each category points to a specific type of improvement - new workflow coverage, intent model refinement, or response quality improvement.
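As a rough sketch of how such a review could be computed, the Python below sorts a week of conversations into the three categories and calculates the resolution rate as conversations resolved without escalation divided by all conversations. The `Conversation` fields, the bucket names, and the low-satisfaction cut-off of 2 on a 1-5 scale are assumptions for illustration, not the dashboard's actual schema.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Conversation:
    escalated: bool
    resolved: bool               # resolved by the AI, end to end
    needed_human: bool = False   # reviewer judgment; only meaningful if escalated
    csat: Optional[int] = None   # 1-5 survey score, if the customer answered


def review_bucket(c: Conversation, low_csat: int = 2) -> Optional[str]:
    """Assign a conversation to one of the three review categories, or None."""
    if c.escalated:
        return None if c.needed_human else "unnecessary_escalation"
    if not c.resolved:
        return "failed_resolution"
    if c.csat is not None and c.csat <= low_csat:
        return "resolved_low_csat"
    return None


def resolution_rate(conversations: list[Conversation]) -> float:
    """Share of all conversations resolved by the AI without escalation."""
    done = sum(1 for c in conversations if c.resolved and not c.escalated)
    return done / len(conversations)


week = [Conversation(escalated=False, resolved=True, csat=5) for _ in range(6)] + [
    Conversation(escalated=False, resolved=True, csat=1),
    Conversation(escalated=False, resolved=False),
    Conversation(escalated=True, resolved=False, needed_human=True),
    Conversation(escalated=True, resolved=False, needed_human=False),
]

buckets = Counter(b for b in map(review_bucket, week) if b is not None)
print(f"resolution rate: {resolution_rate(week):.0%}")
print(dict(buckets))
```

In this toy week, seven of ten conversations are resolved end-to-end, and each of the three review categories contains one conversation to investigate.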
At 90 days post-launch, the average Eshal deployment has improved its resolution rate by 8–12 percentage points from its initial baseline. The curve flattens naturally as coverage approaches the theoretical maximum for the use case.