A voice AI agent in 2026 costs roughly $0.05 to $0.15 per minute on self-serve platforms like Retell, Bland, and Vapi, or a monthly retainer on a managed agency plan that includes audit, deployment, tuning, and integration. Per-minute platforms scale cheaply for low-volume testing. Agencies make sense for businesses taking the deployment seriously and expecting ROI in month one.
Below is every pricing model in the market, the hidden costs, and how to figure out which option makes financial sense for your situation.
The Four Pricing Models in Voice AI
Voice AI pricing doesn't work like SaaS. There are four distinct models, and they stack differently depending on your call volume.
1. Per-Minute Pricing
The most common model on infrastructure platforms like Vapi, Retell AI, and Bland AI. You pay for every minute the AI is on a call. Typical rates:
- → Vapi: ~$0.05 to $0.10/minute (depending on voice and LLM choice)
- → Retell AI: ~$0.07 to $0.12/minute
- → Bland AI: ~$0.09/minute flat
On paper, those numbers look tiny. A 3-minute call at $0.09/min is $0.27. Run 500 calls/month at 4 minutes each and that's 2,000 minutes, or $180/month in usage, before you add telephony, LLM tokens, voice synthesis, and whatever it cost to build the thing.
Per-minute pricing rewards low call volume and short calls. If your average call runs 6+ minutes or monthly volume goes over 1,000 calls, model the math carefully.
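As a rough sketch, the per-minute arithmetic above fits in one function. The function name and parameters are illustrative, not part of any platform's API, and the result covers platform usage only.

```python
def per_minute_usage_cost(calls_per_month: int, avg_minutes: float,
                          rate_per_minute: float) -> float:
    """Monthly platform usage in dollars, before telephony, tokens, and voice."""
    return calls_per_month * avg_minutes * rate_per_minute


# The example above: 500 calls at 4 minutes each on a $0.09/min plan.
print(f"${per_minute_usage_cost(500, 4, 0.09):.2f}")   # $180.00

# Same rate at the thresholds mentioned above: 1,000 calls of 6 minutes.
print(f"${per_minute_usage_cost(1000, 6, 0.09):.2f}")  # $540.00
```

Tripling the bill by moving from 500 four-minute calls to 1,000 six-minute calls is the kind of jump worth modeling before committing to a per-minute plan.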
2. Per-Call Pricing
Some platforms and agencies charge per completed call rather than per minute. This model is more predictable if your call durations vary a lot. Typical rates run $0.50 to $2.50 per call depending on complexity.
Per-call pricing tends to appear in more opinionated products: platforms where the flow is constrained and call length is somewhat controlled. It can get expensive fast at scale, but it's easy to forecast.
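One way to compare a per-call quote with a per-minute quote is to find the call length at which they cost the same. The sketch below assumes the per-minute rate is the only per-minute charge, which is a simplification, and the function name is hypothetical.

```python
def break_even_minutes(price_per_call: float, rate_per_minute: float) -> float:
    """Call length (minutes) at which per-call and per-minute pricing cost the same."""
    if rate_per_minute <= 0:
        raise ValueError("rate_per_minute must be positive")
    return price_per_call / rate_per_minute


# Low and high ends of the per-call range, against a $0.09/min plan.
print(f"{break_even_minutes(0.50, 0.09):.1f} min")  # 5.6 min
print(f"{break_even_minutes(2.50, 0.09):.1f} min")  # 27.8 min
```

Calls shorter than the break-even length are cheaper per minute; longer calls favor the per-call price.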
3. Monthly Retainer (Managed Agency)
Agencies charge a monthly retainer that covers the full deployment: setup, integration, voice design, ongoing optimization, support. This model trades variable cost for predictability. You know what you're spending every month. The agency handles the infrastructure, keeps the agent updated, and manages issues. You don't need a technical team to maintain it.
Retainers vary widely by scope. A bare-bones FAQ bot looks nothing like a multi-location clinic with CRM integration and multilingual coverage. Any agency worth talking to will scope and quote based on your actual call volume and integration surface, not a shelf price.
4. ROI-Based / Revenue Share
A smaller number of agencies, mostly in high-value verticals like medical, legal, and real estate, price on outcomes rather than usage. If the agent books a $3,000 consultation, the agency takes a percentage. This model aligns incentives but is rare, and usually reserved for high-ticket businesses where attribution is clean.
DIY Platforms vs. Agencies: What You're Actually Comparing
The real tradeoff isn't price per minute. It's who does the work.
A platform like Vapi gives you the infrastructure: an API, a dashboard, and docs. Building a functional, production-grade voice agent on top of that requires prompt engineering (more complex than it sounds for voice), voice selection and tuning, telephony setup (phone number provisioning, call routing), integration with your CRM or booking system, error handling and fallback logic, ongoing monitoring, and iteration for the times the agent fails in unexpected ways.
None of that is free time. A competent developer charges $75 to $150 an hour. A full build usually takes 40 to 80 hours the first time, which puts you at $3,000 to $12,000 in dev cost before you've made a single call. Then someone has to maintain it.
An agency builds and maintains all of this for you. The monthly retainer looks more expensive until you account for the hidden cost of doing it yourself.
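The build-cost range quoted above is just hours multiplied by hourly rate at each end. A minimal sketch, with a hypothetical function name and the article's figures as defaults:

```python
def build_cost_range(hours: tuple[int, int] = (40, 80),
                     hourly_rate: tuple[int, int] = (75, 150)) -> tuple[int, int]:
    """Return (low, high) first-build cost in dollars, excluding maintenance."""
    low = hours[0] * hourly_rate[0]
    high = hours[1] * hourly_rate[1]
    return low, high


low, high = build_cost_range()
print(f"${low:,} to ${high:,}")  # $3,000 to $12,000
```

Maintenance hours are not included, so this is a floor on DIY cost rather than a full estimate.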
The Hidden Costs No One Puts in Their Pricing Page
Every platform pricing page shows you one number. Here's what else goes into the actual monthly cost.
Telephony Costs
Your voice AI still needs a phone number and a telephony provider to make and receive calls, and that is typically billed separately from the AI platform. Twilio is the most common provider. Expect to pay $1.15/month per number plus $0.0085/minute for inbound calls and $0.013/minute for outbound. On 2,000 minutes of inbound, that's $17 in usage plus the number fee, or about $18/month in telephony alone, before the AI.
Some platforms bundle telephony. Most don't. Read the fine print.
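For a quick estimate, the telephony line can be computed from the rates quoted above. This is a sketch with a hypothetical function name; the default rates are the article's figures and should be checked against your provider's current price list.

```python
def telephony_monthly_cost(inbound_minutes: float,
                           outbound_minutes: float = 0.0,
                           numbers: int = 1,
                           number_fee: float = 1.15,
                           inbound_rate: float = 0.0085,
                           outbound_rate: float = 0.013) -> float:
    """Monthly telephony cost in dollars: number rental plus metered minutes."""
    return (numbers * number_fee
            + inbound_minutes * inbound_rate
            + outbound_minutes * outbound_rate)


# One number handling 2,000 inbound minutes a month.
print(f"${telephony_monthly_cost(2000):.2f}")  # $18.15
```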
LLM Token Costs
Every conversational AI response costs LLM tokens. On most platforms, you choose the underlying model (GPT-4o, Claude 3 Haiku, Gemini Flash) and pay token costs on top of the platform fee. A 4-minute call might generate 800 to 1,500 tokens of LLM usage.
At GPT-4o pricing ($5/million input tokens, $15/million output), 500 calls at 1,200 tokens each is 600K tokens, roughly $3 to $9/month depending on how those tokens split between input and output. Not huge, but it's a real cost that compounds at scale. Switch to a cheaper model (Haiku, Gemini Flash) and you can cut this by 80%.
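Because input and output tokens are priced differently, the monthly figure depends on the split between them, which the example above does not specify. The sketch below uses the quoted prices as defaults; the function name and the `output_share` parameter are assumptions for illustration.

```python
def llm_monthly_cost(total_tokens: int, output_share: float,
                     input_price_per_m: float = 5.0,
                     output_price_per_m: float = 15.0) -> float:
    """Monthly LLM cost in dollars for a given share of output tokens (0.0 to 1.0)."""
    if not 0.0 <= output_share <= 1.0:
        raise ValueError("output_share must be between 0 and 1")
    input_tokens = total_tokens * (1 - output_share)
    output_tokens = total_tokens * output_share
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


total = 500 * 1_200  # 500 calls at 1,200 tokens each = 600K tokens
for share in (0.0, 0.5, 1.0):
    print(f"output share {share:.0%}: ${llm_monthly_cost(total, share):.2f}")
# output share 0%: $3.00
# output share 50%: $6.00
# output share 100%: $9.00
```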
Voice Synthesis
High-quality voice synthesis from ElevenLabs or similar runs roughly $0.003 to $0.006 per 1,000 characters of output text. A 4-minute call generates maybe 600 to 900 characters of AI speech. At those figures, 500 calls is 300,000 to 450,000 characters, or roughly $1 to $3/month. Low, but worth knowing about.
Integration Work
This is where costs get real. If you want the AI to book appointments directly into your scheduling software, push leads into your CRM, or trigger follow-up workflows, that's custom integration work. Each integration is typically $500 to $2,000 in developer time per system, depending on API quality and complexity.
A business connecting to Salesforce, Calendly, and a custom EHR might spend $5,000 to $8,000 on integration development before the agent ever takes a call.
What Does "Good" Look Like at Each Price Point?
Low-Volume DIY
Works if you have very low call volume (under 200 calls/month), a developer in-house who can build and maintain the agent, and a simple use case (FAQ handling, basic call routing, not integrated booking). Don't plan on production-quality voice or reliable CRM integration at this tier.
Mid-Tier Managed or Hybrid
This is where most small businesses land when they go with a managed agency. You get a properly built agent, real integrations, and support. The agent handles inbound calls, books appointments or qualifies leads, and hands off to humans when needed. This tier typically pays for itself if you're recovering 5+ bookings a month that would otherwise be missed.
High-Volume or Complex
Enterprise deployments, multi-location businesses, or high-ticket verticals (medical, legal, real estate) where call quality and integration depth matter. At this level you're typically getting dedicated support, custom voice personas, multi-language capability, and reporting dashboards.
Typical ROI Timelines
This is the number that actually matters. Here's what payback looks like across different business types, assuming a mid-tier managed deployment:
- Medical clinic (average visit value: $200): recovering 10 missed appointments/month = $2,000/month recovered. Month 1 positive ROI.
- Law firm (average consultation value: $800): recovering 3 after-hours leads/month = $2,400/month. Month 1 positive ROI.
- Home services (average job value: $350): recovering 8 missed calls/month = $2,800/month. Month 1 positive ROI.
- E-commerce (average order value: $120): recovering 20 support calls that would have churned = $2,400 retained. Payback in month 1 or 2.
The pattern is consistent. Businesses missing at least 10% of their inbound calls with an average ticket over $150 typically see month-one positive ROI on a managed deployment.
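The recovered-revenue figures above are average value times recovered bookings, and the break-even point is the monthly cost divided by average value, rounded up. The sketch below reproduces the four scenarios; the class, function, and the $1,500 monthly cost are hypothetical placeholders, since actual retainers are quoted per deployment.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    avg_value: int            # dollars per visit, consultation, job, or order
    recovered_per_month: int  # bookings or calls the agent recovers

    def recovered_revenue(self) -> int:
        return self.avg_value * self.recovered_per_month


def break_even_bookings(monthly_cost: float, avg_value: float) -> int:
    """Smallest number of recovered bookings that covers the monthly cost."""
    return math.ceil(monthly_cost / avg_value)


SCENARIOS = [
    Scenario("Medical clinic", 200, 10),
    Scenario("Law firm", 800, 3),
    Scenario("Home services", 350, 8),
    Scenario("E-commerce", 120, 20),
]

# Placeholder cost for illustration only; substitute your actual quote.
EXAMPLE_MONTHLY_COST = 1500

for s in SCENARIOS:
    needed = break_even_bookings(EXAMPLE_MONTHLY_COST, s.avg_value)
    print(f"{s.name}: ${s.recovered_revenue():,}/month recovered, "
          f"break-even at {needed} bookings")
```

Swapping in your own average ticket and quoted cost shows how many recovered calls a deployment needs before it pays for itself.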
When DIY Actually Makes Sense
There are legitimate cases for building on Vapi or Retell yourself:
- → You have an in-house developer with AI/voice experience who wants a project
- → Your use case is simple, low-stakes, and unlikely to change (a static FAQ bot, for example)
- → You're an early-stage startup with no call volume yet and you want to experiment
- → You already have a tech infrastructure team that can absorb the maintenance cost
If none of those fit and you're a business that needs the phone answered reliably and integrated with real systems, DIY is almost always more expensive than it looks once you price in the full cost of your time and developer hours.
What to Ask Before Signing Anything
Whether you're evaluating a platform or an agency, these are the questions worth asking:
- → What does total monthly cost look like at my actual call volume (include telephony, tokens, voice)?
- → What integrations are included vs. billed separately?
- → Who handles it when the agent fails or gives a wrong answer?
- → How long does the build take before the agent is live?
- → What's the minimum contract term?
- → Do you provide call analytics and recording?
Any reputable provider answers all of these without hesitation. Vague answers on total cost, or a list of add-ons that weren't mentioned upfront, are a warning sign.
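When comparing answers to the total-cost question, it can help to put each line item in one place and sum them. The sketch below is illustrative: the keys are hypothetical labels, the platform and telephony values reuse the 500-call, 2,000-minute example from earlier, and the token figure is an assumed value that depends on the model and the input/output split.

```python
def total_monthly_cost(line_items: dict[str, float]) -> float:
    """Sum every recurring line item, in dollars per month."""
    return sum(line_items.values())


quote = {
    "platform_usage": 180.00,  # 2,000 minutes at $0.09/min
    "telephony": 18.15,        # one number plus 2,000 inbound minutes
    "llm_tokens": 6.00,        # assumed; varies with model and token split
}

print(f"${total_monthly_cost(quote):.2f}")  # $204.15
```

Voice synthesis, amortized integration work, and maintenance hours would be added as further entries once a provider has quoted them.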
Bottom Line
Voice AI agent cost in 2026 isn't a single number. It's a function of your call volume, your technical resources, the complexity of your use case, and how much of the work you want to own vs. outsource.
For most SMBs (clinics, law firms, home service companies, real estate offices), a fully managed agency ends up cheaper than DIY once you account for build time, maintenance, and the opportunity cost of calls you keep missing while you're still figuring out the infrastructure.
The question isn't "how much does voice AI cost?" It's "how much does it cost compared to what you're losing without it?"