How Eshal Achieves an 86% AI Resolution Rate (And How You Can Too)

Platform · Eshal Research Team · October 2025 · 13 min read · Last reviewed April 2026

The architecture, training approach, escalation design, and continuous improvement process behind Eshal's 86% AI resolution rate. A practical breakdown for CX leaders.

What actually drives a high resolution rate

An 86% AI resolution rate means that 86 out of every 100 customer conversations are resolved end-to-end by the AI - without human involvement. The conversation opens, the customer's need is met, and the conversation closes. No escalation. No agent callback. No follow-up required.
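The metric itself is simple to compute. A minimal sketch of that definition (field and function names are illustrative, not Eshal's API):

```python
def resolution_rate(conversations):
    """Share of conversations closed end-to-end by the AI.

    A conversation counts as resolved only if it was never escalated
    and required no agent callback or follow-up.
    """
    resolved = sum(
        1 for c in conversations
        if not c["escalated"] and not c["follow_up_required"]
    )
    return resolved / len(conversations)

# 86 clean resolutions out of 100 conversations -> 0.86
convos = (
    [{"escalated": False, "follow_up_required": False}] * 86
    + [{"escalated": True, "follow_up_required": False}] * 14
)
print(resolution_rate(convos))  # 0.86
```

Note the strict definition: an escalation or a required follow-up disqualifies the conversation, even if the AI answered part of it.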

Achieving this consistently requires four things working together: live system integration, accurate intent detection, a well-designed escalation boundary, and continuous improvement from real conversation data. Remove any one of these and the number drops.

86% - Eshal average AI resolution rate across deployed customers

This is the live average across all active Eshal deployments, not a best-case number from a controlled demo. It ranges from 78% (complex healthcare bookings) to 93% (logistics tracking). Industry and use case drive the variance.

Component 1: live system integration

The single biggest driver of failed resolutions is not poor AI - it's disconnected data. A chatbot that cannot see real-time order status, current inventory, live account balance, or actual appointment availability cannot resolve queries that depend on that information.

Every Eshal deployment integrates with the customer's operational systems on day one. The AI does not answer from a knowledge base - it queries live data. When a customer asks "where is my order?", the AI calls the OMS API, reads the current status, and responds with the actual, real-time answer. Not a template. Not a cached response from yesterday's sync.

  • Order management systems (Shopify, SAP, custom OMS)
  • CRM and customer data (Salesforce, HubSpot, custom CRM)
  • Appointment scheduling systems
  • TMS for logistics and freight tracking
  • Core banking and payment systems (with appropriate access controls)
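The "query live data, never a cached sync" rule amounts to a thin tool layer over the operational API. A hedged sketch, assuming a hypothetical `oms_client` wrapper - this is not Eshal's actual integration code:

```python
import datetime

def answer_order_status(order_id, oms_client):
    """Resolve 'where is my order?' from live OMS data, not a cache.

    oms_client is a hypothetical connector to the order management
    system (e.g. a Shopify or SAP adapter exposing get_order).
    """
    order = oms_client.get_order(order_id)  # live API call on every query
    eta = order["estimated_delivery"]
    return (
        f"Your order is currently '{order['status']}' and is expected "
        f"to arrive on {eta.strftime('%d %b %Y')}."
    )

class FakeOMS:
    """Stand-in for a real OMS connector, for demonstration only."""
    def get_order(self, order_id):
        return {"status": "out for delivery",
                "estimated_delivery": datetime.date(2026, 4, 12)}

print(answer_order_status("A-1001", FakeOMS()))
```

The design choice worth copying is that the AI never holds order state itself; every answer is a fresh read, so the response can never be staler than the system of record.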

Component 2: accurate intent detection

Intent detection is the process of determining what a customer actually wants from their message. In Arabic, this is harder than in English for the structural reasons covered in our Arabic NLP guide - morphological complexity, dialect variation, and code-switching.

Eshal's intent model is trained on real MENA customer service conversations - not translated English training data. The difference is meaningful: a model trained on translated data will systematically misclassify intent categories that map differently across languages and cultures.

🧠 Intent detection in Arabic requires domain training

General-purpose Arabic NLP models achieve lower intent detection accuracy on customer service queries than domain-specific models. The vocabularies of a banking conversation, a logistics dispute, and a healthcare booking all differ - and differ further between dialects. Domain-specific training on real customer data is non-negotiable for 80%+ resolution.

Component 3: escalation boundary design

Counterintuitively, a higher resolution rate is not achieved by trying to resolve more things with AI. It is achieved by being very precise about what the AI should and should not handle.

Resolution rate is a function of success within the AI's scope, not the breadth of that scope. If the AI attempts to handle complex complaints and fails, the resolution rate falls. If it identifies complex complaints early and routes them to agents with full context, the resolution rate for its actual scope rises - and agent handle time on escalated contacts falls because no context-gathering is required.

How to define the escalation boundary
  1. Categorise every contact type by complexity and system dependency. High-volume, low-complexity, fully data-driven = AI scope. Low-volume, high-complexity, judgment-required = human scope.
  2. Set explicit triggers, not volume thresholds. "Escalate if customer has complained three times in 30 days" is better than "escalate if conversation exceeds 10 turns."
  3. Use keyword and sentiment triggers sparingly. Detecting anger is useful; acting on every vaguely negative statement creates unnecessary escalations.
  4. Review the escalation log weekly. The most common escalation reasons reveal gaps in AI scope - fix the root cause, not the symptom.
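The "explicit triggers, not volume thresholds" rule can be encoded as a small set of predicates evaluated on every turn. A sketch with illustrative trigger names and thresholds drawn from the steps above - not Eshal's production rules:

```python
from datetime import datetime, timedelta

def should_escalate(conversation, customer_history):
    """Explicit escalation triggers, checked in priority order.

    Returns the first matching reason, or None to keep the AI in
    control. Thresholds are illustrative examples, not defaults.
    """
    # Explicit trigger: three complaints in the last 30 days.
    cutoff = datetime.now() - timedelta(days=30)
    recent = [t for t in customer_history["complaints"] if t > cutoff]
    if len(recent) >= 3:
        return "repeat_complaints_30d"

    # Scope boundary: judgment-required contact types go to humans.
    if conversation["intent"] in {"complex_complaint", "legal_threat"}:
        return "out_of_scope_intent"

    # Sentiment used sparingly: only sustained, strong negativity,
    # not every vaguely negative statement.
    if conversation["anger_score"] > 0.9:
        return "high_anger"

    return None

history = {"complaints": [datetime.now() - timedelta(days=2)] * 3}
print(should_escalate({"intent": "order_status", "anger_score": 0.2},
                      history))
```

Returning a named reason rather than a boolean matters for step 4: the escalation log then groups naturally by trigger, which is what makes the weekly review actionable.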

Component 4: continuous improvement

A resolution rate of 86% is not achieved at launch - it is built over time. Most deployments start at 70–75% and improve through a structured review cycle.

Every week, the quality dashboard highlights: conversations where the AI escalated unnecessarily, conversations where the AI attempted to resolve but failed, conversations where the AI resolved but customer satisfaction was low. Each category points to a specific type of improvement - new workflow coverage, intent model refinement, or response quality improvement.
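The weekly review described above amounts to a simple triage over logged conversations. A sketch of those three buckets (field names are assumptions, not the Eshal dashboard schema):

```python
def triage(conversation):
    """Assign a logged conversation to a weekly improvement bucket."""
    if conversation["escalated"] and conversation["ai_could_have_resolved"]:
        return "unnecessary_escalation"   # -> widen workflow coverage
    if not conversation["escalated"] and not conversation["resolved"]:
        return "failed_resolution"        # -> refine the intent model
    if conversation["resolved"] and conversation["csat"] <= 2:
        return "low_satisfaction"         # -> improve response quality
    return "healthy"

log = [
    {"escalated": True, "ai_could_have_resolved": True,
     "resolved": False, "csat": 3},
    {"escalated": False, "ai_could_have_resolved": False,
     "resolved": True, "csat": 1},
]
print([triage(c) for c in log])
```

Each bucket maps to exactly one remediation, which is what keeps the weekly cycle structured rather than a general quality audit.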

At 90 days post-launch, the average Eshal deployment has improved resolution rate by 8–12 percentage points from its initial baseline. The curve flattens naturally as coverage approaches the theoretical maximum for the use case.

FAQ

What resolution rate should we expect at launch?
For a standard e-commerce or logistics deployment with good system integration, expect 70–78% at launch. Banking and healthcare start slightly lower (65–72%) due to more complex workflows. With a structured improvement cycle, most deployments reach 82–86% within 90 days.

Will automating 86% of contacts reduce our agent team?
In every Eshal deployment to date, the existing team has been retained and redeployed, not reduced. With 86% of repetitive contacts automated, agents focus entirely on the 14% that genuinely need human judgment: complex issues, high-value customers, sensitive situations. Agent job quality and satisfaction typically improve significantly.

Put this into practice.

Eshal deploys in one day. Book a demo for your industry.