Voice AI & Telephony

Replacing Exotel with a Self-Hosted Voice AI Gateway — 60% Cost Reduction at 500K Calls/Day

A production-grade telephony gateway that bridges PSTN networks with AI voice applications, enabling a large social-impact foundation to cut per-minute call costs by 60% while handling half a million calls daily — with zero changes to existing AI application code.

Platform

On-Premises / Cloud (DigitalOcean + Azure)

Duration

4 Months

60%

Cost reduction

500K

Calls/day capacity

<200ms

One-way audio latency

Project overview

Replaced Exotel Voicebot with a self-hosted, Asterisk-based gateway that connects directly to BSNL SIP trunks. The result: a 60% cost reduction (from Rs 0.50 to Rs 0.20 per minute), full compatibility with Exotel’s WebSocket protocol so that no AI code had to change, and a distributed architecture that works around the BSNL VPN constraint.

Type

Voice AI & Telephony

Stack

12 technologies

The challenge

A large social-impact foundation running AI-powered voice surveys across India was spending over Rs 10 million per month on Exotel Voicebot at Rs 0.50 per minute. Migrating away from Exotel looked impractical because the AI applications were tightly coupled to Exotel’s proprietary WebSocket protocol. In addition, the cheaper BSNL SIP trunk was reachable only through a Pritunl VPN, and that VPN blocked the gateway’s outbound internet access, so a single server could not talk to both the trunk and the AI applications.

Exotel vendor lock-in at Rs 0.50/min, 2.5x the direct BSNL trunk rate of Rs 0.20/min

AI applications hardcoded to Exotel’s WebSocket protocol — migration would require rewriting every voice bot

BSNL SIP trunk requires Pritunl VPN which blocks internet access — the gateway cannot reach AI apps

Need to scale from 100K to 500K calls/day (3,000–6,000 concurrent) with high availability

Real-time audio quality requirements: <200ms latency, <1% packet loss across the bridge
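The cost figures in this section can be checked with a few lines of arithmetic. The sketch below works in paise to stay in integers and treats the monthly spend as exactly Rs 10 million, which is a rounding assumption (the text says "over"); the variable names are illustrative, and the projected figure covers trunk charges only.

```typescript
// Rates from the case study, in paise per minute (Rs 0.50 and Rs 0.20).
const exotelPaise = 50;
const bsnlPaise = 20;

// Saving relative to the old rate: (50 - 20) / 50 = 60%.
const reductionPct = ((exotelPaise - bsnlPaise) * 100) / exotelPaise;

// Premium of the old rate over the new one: (50 - 20) / 20 = 150%.
const premiumPct = ((exotelPaise - bsnlPaise) * 100) / bsnlPaise;

// Assumption: monthly spend rounded down to exactly Rs 10,000,000.
const monthlySpendRs = 10000000;

// Minutes implied by that spend at the old rate: 20,000,000.
const impliedMinutes = (monthlySpendRs * 100) / exotelPaise;

// Trunk charges for the same minutes at the new rate: Rs 4,000,000.
const projectedSpendRs = (impliedMinutes * bsnlPaise) / 100;

console.log({ reductionPct, premiumPct, impliedMinutes, projectedSpendRs });
```

Put differently, the old rate is 150% above the new one, and the new rate is 60% below the old one.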

What we set out to do

  • 01

    Build a drop-in Exotel replacement with 100% WebSocket protocol compatibility

  • 02

    Connect directly to BSNL SIP trunks at Rs 0.20/min — cutting costs by 60%

  • 03

    Solve the VPN networking constraint without compromising security or reliability

  • 04

    Achieve sub-200ms audio latency for natural AI voice conversations

  • 05

    Design for 500K calls/day with horizontal scaling and zero single points of failure

How we solved it

01

Exotel Protocol Reverse Engineering

Built a WebSocket client that connects to the existing AI app servers and speaks the Exotel protocol format exactly: connected, start, and media events carrying Base64-encoded PCM audio. To the AI apps the gateway looks like Exotel, so no code changes were required.

Key decision

Gateway as WebSocket client (not server) to match Exotel pattern

Result

100% protocol compatibility. Instant migration with zero AI code rewrites.
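As a rough illustration of the connected, start, and media sequence, the sketch below builds the JSON messages a gateway might send after opening its WebSocket to an AI app. Only the three event names come from the project description; every field name (stream_sid, sequence_number, media.payload and so on), the StreamSession class, and the 20 ms timestamp step are assumptions made for this example, not Exotel's documented schema.

```typescript
// Assumption: all field names below are illustrative placeholders,
// not copied from Exotel's documentation.
type StreamMessage =
  | { event: "connected" }
  | { event: "start"; stream_sid: string; start: { call_sid: string; sample_rate: number } }
  | {
      event: "media";
      stream_sid: string;
      sequence_number: number;
      media: { chunk: number; timestamp: number; payload: string };
    };

const CHUNK_MS = 20; // one 20 ms frame of 8 kHz audio per media event

class StreamSession {
  private streamSid: string;
  private callSid: string;
  private sequence = 0;
  private chunk = 0;

  constructor(streamSid: string, callSid: string) {
    this.streamSid = streamSid;
    this.callSid = callSid;
  }

  // Messages sent once, right after the WebSocket to the AI app opens.
  handshake(): StreamMessage[] {
    return [
      { event: "connected" },
      {
        event: "start",
        stream_sid: this.streamSid,
        start: { call_sid: this.callSid, sample_rate: 8000 },
      },
    ];
  }

  // Wraps one Base64-encoded PCM chunk in a media event.
  media(payloadBase64: string): StreamMessage {
    this.sequence += 1;
    const message: StreamMessage = {
      event: "media",
      stream_sid: this.streamSid,
      sequence_number: this.sequence,
      media: {
        chunk: this.chunk,
        timestamp: this.chunk * CHUNK_MS,
        payload: payloadBase64,
      },
    };
    this.chunk += 1;
    return message;
  }
}

// Usage: serialize the handshake plus two audio chunks as they would go on the wire.
const session = new StreamSession("stream-1", "call-1");
const wire = [...session.handshake(), session.media("AAAA"), session.media("AAAA")].map((m) =>
  JSON.stringify(m),
);
console.log(wire.join("\n"));
```

A real client would pass each string to the socket's send method; the message ordering and counters are the part this sketch is meant to show.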

02

Custom Audio Transcoding Pipeline

Built a real-time bidirectional audio bridge: PSTN μ-law RTP (8 kHz, 160 samples per 20 ms packet) ↔ 16-bit PCM ↔ Base64 string. Custom codec functions process each 20 ms chunk in under 1 ms, at least 20x faster than real time.

Key decision

Custom codec over FFmpeg for minimal latency and zero external dependencies

Result

Sub-1ms transcoding per chunk. <150ms end-to-end audio latency.
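The PSTN-to-AI direction of such a pipeline can be sketched in plain TypeScript with no dependencies. The μ-law expansion follows the standard G.711 algorithm; the function names, the little-endian byte order of the PCM payload, and the hand-written Base64 encoder (Node's Buffer would also work) are assumptions for this example rather than the project's actual code. The reverse direction is omitted.

```typescript
// G.711 mu-law byte -> signed 16-bit linear PCM sample.
function muLawToPcm16(byte: number): number {
  const u = ~byte & 0xff; // mu-law bytes are transmitted bit-inverted
  const exponent = (u >> 4) & 0x07;
  const mantissa = u & 0x0f;
  const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
  return (u & 0x80) !== 0 ? -magnitude : magnitude;
}

const B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Minimal Base64 encoder (standard alphabet, '=' padding).
function toBase64(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const b0 = bytes[i];
    const b1 = i + 1 < bytes.length ? bytes[i + 1] : 0;
    const b2 = i + 2 < bytes.length ? bytes[i + 2] : 0;
    out += B64[b0 >> 2] + B64[((b0 & 0x03) << 4) | (b1 >> 4)];
    out += i + 1 < bytes.length ? B64[((b1 & 0x0f) << 2) | (b2 >> 6)] : "=";
    out += i + 2 < bytes.length ? B64[b2 & 0x3f] : "=";
  }
  return out;
}

// One 20 ms RTP payload (160 mu-law bytes) -> Base64 of 320 PCM bytes.
// Assumption: the AI app expects little-endian 16-bit samples.
function transcodeFrame(muLaw: Uint8Array): string {
  const pcmBytes = new Uint8Array(muLaw.length * 2);
  const view = new DataView(pcmBytes.buffer);
  for (let i = 0; i < muLaw.length; i++) {
    view.setInt16(i * 2, muLawToPcm16(muLaw[i]), true); // true = little-endian
  }
  return toBase64(pcmBytes);
}

// Usage: a frame of mu-law silence (0xFF decodes to sample value 0).
const silence = new Uint8Array(160).fill(0xff);
console.log(transcodeFrame(silence).length); // 428 characters for 320 bytes
```

Each frame costs 160 table-free integer operations plus one string build, which is consistent with a sub-millisecond budget, though actual timings depend on the runtime.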

03

Distributed Architecture for VPN Constraint

Separated the system into two servers: Gateway (DigitalOcean, internet-connected, no VPN) and Asterisk PBX (Azure, VPN-connected to BSNL). RTP media flows between them over a private network link, adding only 1–2ms latency.

Key decision

Split Gateway and Asterisk across servers to solve VPN/internet conflict

Result

Both BSNL trunk access and AI app connectivity work simultaneously.
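For RTP to cross between the two servers, the Gateway side has to pull the audio payload out of each UDP datagram it receives from Asterisk. The sketch below parses the fixed RTP header as defined in RFC 3550; the parseRtp function and RtpPacket type are names invented for this example, and the socket handling around it is left out.

```typescript
interface RtpPacket {
  payloadType: number; // 0 = PCMU (G.711 mu-law)
  marker: boolean;
  sequenceNumber: number;
  timestamp: number;
  ssrc: number;
  payload: Uint8Array;
}

// Parses one RTP datagram (RFC 3550 fixed header, optional CSRCs, extension, padding).
function parseRtp(packet: Uint8Array): RtpPacket {
  if (packet.length < 12) throw new Error("RTP packet shorter than the 12-byte header");
  if (packet[0] >> 6 !== 2) throw new Error("unsupported RTP version");

  const view = new DataView(packet.buffer, packet.byteOffset, packet.byteLength);
  const hasPadding = (packet[0] & 0x20) !== 0;
  const hasExtension = (packet[0] & 0x10) !== 0;
  const csrcCount = packet[0] & 0x0f;

  // Skip the fixed header and any CSRC identifiers (4 bytes each).
  let offset = 12 + csrcCount * 4;
  if (hasExtension) {
    // Extension header: 2 bytes profile id, 2 bytes length in 32-bit words.
    const extensionWords = view.getUint16(offset + 2);
    offset += 4 + extensionWords * 4;
  }

  // With padding, the last byte holds the number of padding bytes.
  const end = hasPadding ? packet.length - packet[packet.length - 1] : packet.length;
  if (offset > end) throw new Error("malformed RTP packet");

  return {
    payloadType: packet[1] & 0x7f,
    marker: (packet[1] & 0x80) !== 0,
    sequenceNumber: view.getUint16(2),
    timestamp: view.getUint32(4),
    ssrc: view.getUint32(8),
    payload: packet.subarray(offset, end),
  };
}

// Usage: a hand-built PCMU packet carrying one 20 ms frame of silence.
const rtpBytes = new Uint8Array(12 + 160);
rtpBytes.set([0x80, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0xa0, 0xde, 0xad, 0xbe, 0xef]);
rtpBytes.fill(0xff, 12);
const parsed = parseRtp(rtpBytes);
console.log(parsed.payloadType, parsed.sequenceNumber, parsed.payload.length);
```

The sequence number and timestamp are what a receiver would use to detect loss and reordering on the inter-server link.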

04

Capacity-Aware Session Management

Built a stateless Session Manager backed by PostgreSQL for persistence and Redis for caching. A node-pair allocator distributes calls across Gateway-Asterisk pairs using least-connections routing. Because session state lives in the datastore rather than in process memory, it survives gateway restarts.

Key decision

Stateless orchestrator + stateful workers with Redis capacity tracking

Result

Even load distribution. Session persistence across restarts.
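Least-connections routing with a capacity ceiling can be sketched as follows. This version keeps its counters in process memory purely for illustration; the system described above tracks capacity in Redis, and the NodePairAllocator class, its method names, and the pair ids are hypothetical.

```typescript
interface NodePair {
  id: string;
  capacity: number; // maximum concurrent calls this Gateway-Asterisk pair accepts
  active: number; // calls currently assigned to the pair
}

class NodePairAllocator {
  private pairs: NodePair[] = [];

  register(id: string, capacity: number): void {
    this.pairs.push({ id, capacity, active: 0 });
  }

  // Least-connections: pick the pair with the fewest active calls that still has room.
  // Ties go to the pair registered first. Returns null when every pair is full.
  allocate(): string | null {
    let best: NodePair | null = null;
    for (const pair of this.pairs) {
      if (pair.active >= pair.capacity) continue;
      if (best === null || pair.active < best.active) best = pair;
    }
    if (best === null) return null;
    best.active += 1;
    return best.id;
  }

  // Called when a call ends so the slot becomes available again.
  release(id: string): void {
    const pair = this.pairs.find((p) => p.id === id);
    if (pair && pair.active > 0) pair.active -= 1;
  }
}

// Usage: two pairs with room for two calls each.
const allocator = new NodePairAllocator();
allocator.register("pair-a", 2);
allocator.register("pair-b", 2);
const picks = [allocator.allocate(), allocator.allocate(), allocator.allocate()];
console.log(picks); // ["pair-a", "pair-b", "pair-a"]
```

With a shared store, the increment and the capacity check would need to happen atomically so that two Session Manager instances cannot oversubscribe the same pair.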

05

Pluggable Trunk Provider Interface

Designed an abstraction layer supporting BSNL, Exotel, and Twilio as interchangeable SIP trunk providers. Switching providers requires only a YAML config change — no code modifications.

Key decision

TrunkProvider interface with YAML-based configuration

Result

Zero vendor lock-in. Failover between providers in seconds.
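One way to picture a provider abstraction of this kind is shown below. It is a sketch under assumptions: the TrunkConfig keys stand in for whatever the parsed YAML contains, the per-provider number formatting rules are invented for illustration, and only the PJSIP/number@endpoint dial syntax is standard Asterisk.

```typescript
// Assumption: shape of the parsed YAML config; the keys are hypothetical.
interface TrunkConfig {
  provider: "bsnl" | "exotel" | "twilio";
  endpoint: string; // name of the PJSIP endpoint defined in Asterisk
}

interface TrunkProvider {
  readonly name: string;
  dialString(destination: string): string;
}

// Hypothetical number formatting per provider, for illustration only.
const numberFormat: Record<TrunkConfig["provider"], (n: string) => string> = {
  bsnl: (n) => "0" + n,
  exotel: (n) => n,
  twilio: (n) => "+91" + n,
};

function createTrunk(config: TrunkConfig): TrunkProvider {
  const format = numberFormat[config.provider];
  return {
    name: config.provider,
    // Asterisk dial syntax for a PJSIP trunk: PJSIP/<number>@<endpoint>
    dialString: (destination) => `PJSIP/${format(destination)}@${config.endpoint}`,
  };
}

// Switching providers means changing this object (the YAML file in the real system).
const activeTrunk: TrunkConfig = { provider: "bsnl", endpoint: "bsnl-trunk" };
console.log(createTrunk(activeTrunk).dialString("9876543210"));
```

Call-handling code depends only on the TrunkProvider interface, so it stays unchanged when the configured provider changes.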

Measurable impact

60%

Cost reduction (Rs 0.50 to Rs 0.20/min)

0

AI application code changes required

<200ms

One-way audio latency achieved

500K

Daily call capacity (design target)

<1%

Packet loss rate

$400/mo

Infrastructure cost vs Rs 10M/mo Exotel

Tech stack

NestJS, Asterisk PBX, PJSIP, TypeScript, PostgreSQL, Redis, BullMQ, WebSockets, RTP/UDP, Docker, PM2, Prometheus

What we learned

This project proved that enterprise telephony vendor lock-in is solvable with precise protocol engineering. By replicating Exotel’s exact WebSocket format, we eliminated the migration barrier entirely. The distributed architecture — born from a real-world VPN constraint — became a strength: it enables independent scaling of the media gateway and PBX layers.

  • 01

    Protocol-level compatibility eliminates migration friction — the AI apps never knew they switched providers

  • 02

    Network constraints (VPN conflicts) can be solved architecturally by separating concerns across servers

  • 03

    Custom audio codecs can outperform general-purpose tools such as FFmpeg when latency budgets are tight (<200ms)

  • 04

    Pluggable provider interfaces prevent future lock-in — switching trunks is now a config change, not a code change
