Enterprise Voice AI: STT, TTS Agent APIs | Deepgram
Enterprise voice AI platform with STT and TTS APIs for dynamic conversational experiences.
About Enterprise Voice AI: STT, TTS Agent APIs | Deepgram
Deepgram offers an enterprise voice AI platform with accurate, cost-effective APIs for Speech-to-Text (STT), Text-to-Speech (TTS), and voice agents. Available in real time and in batch, in the cloud and self-hosted, it lets teams extract insights from audio and build conversational experiences, improving operations and customer satisfaction.
Plans and Pricing
⚠️ Values estimated by AI; confirm on the official site
Free
$200 of credit
- Access all endpoints in public models
- Up to 100 concurrent requests for the REST API (Speech to Text)
- Up to 150 concurrent connections for the WSS API (Speech to Text)
- Up to 5 concurrent requests for Deepgram Whisper Cloud (Speech to Text)
- Up to 45 concurrent requests for the REST API + WSS API (Text to Speech)
- Up to 45 concurrent connections for the WSS API (Voice Agent API)
Growth
$4k+/year
- Save up to 20% with pre-paid credits for the year. Credits are redeemed against actual usage.
- Access all endpoints in public models
- Up to 100 concurrent requests for the REST API (Speech to Text)
- Up to 225 concurrent connections for the WSS API (Speech to Text)
- Up to 5 concurrent requests for Deepgram Whisper Cloud (Speech to Text)
- Up to 60 concurrent requests for the REST API + WSS API (Text to Speech)
Enterprise
Contact for pricing
- For businesses with large volumes, data or deployment requirements, or support needs.
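The "Up to N" figures in the plan lists read like per-plan concurrency caps. Under that assumption, a client can stay below a cap by gating its requests with a semaphore. The sketch below is illustrative only: `run_limited` and the `job` callables are hypothetical names, not part of any Deepgram SDK, and no real API call is made.

```python
import asyncio
from typing import Awaitable, Callable, List, TypeVar

T = TypeVar("T")


async def run_limited(jobs: List[Callable[[], Awaitable[T]]], limit: int) -> List[T]:
    """Run all jobs, never allowing more than `limit` to be in flight at once.

    `limit` is assumed to be a plan's concurrency cap, for example 100 for
    the Speech to Text REST API on the Free plan as listed above.
    """
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job: Callable[[], Awaitable[T]]) -> T:
        # Wait for a free slot, run the job, then release the slot.
        async with semaphore:
            return await job()

    # gather() preserves the order of the input jobs in its result list.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

Each `job` would wrap one transcription or synthesis request; setting `limit` slightly below the plan's cap leaves headroom for other clients sharing the same project.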
Flux (Speech to Text - Streaming)
$0.0077/min (Pay As You Go), $0.0065/min (Growth)
- Conversational speech recognition for real-time voice agents with built-in turn detection, natural interruption handling, and ultra-low latency.
Nova-3 (Monolingual) (Speech to Text - Streaming)
$0.0077/min (Pay As You Go), $0.0065/min (Growth)
- Our highest performing model. Recommended for most use cases, especially audio with multiple languages, background noise, crosstalk and far field audio.
Nova-3 (Multilingual) (Speech to Text - Streaming)
$0.0092/min (Pay As You Go), $0.0078/min (Growth)
- Our highest performing model. Recommended for most use cases, especially audio with multiple languages, background noise, crosstalk and far field audio.
Nova-1 & 2 (Speech to Text - Streaming)
$0.0058/min (Pay As You Go), $0.0047/min (Growth)
- Recommended for use with non-English transcription.
Enhanced (Speech to Text - Streaming)
$0.0165/min (Pay As You Go), $0.0136/min (Growth)
- Recommended for lower word error rates than Base, high accuracy timestamps, and use cases that require keyword boosting.
Base (Speech to Text - Streaming)
$0.0145/min (Pay As You Go), $0.0105/min (Growth)
- Recommended for large transcription volumes and high accuracy timestamps.
Flux (Speech to Text - Pre-Recorded)
$0.0077/min (Pay As You Go)
- Conversational speech recognition for real-time voice agents with built-in turn detection, natural interruption handling, and ultra-low latency.
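To compare the two rate tiers, the per-minute streaming prices listed above can be turned into a rough cost estimate. This is a minimal sketch: the rates are copied from this page (which flags them as AI-estimated), the helper names `estimate_cost` and `growth_saving_pct` are made up for illustration, and actual billing rules such as rounding or minimum charges are not modeled.

```python
# Streaming Speech to Text rates in USD per minute, as listed on this page.
RATES = {
    "Flux": {"payg": 0.0077, "growth": 0.0065},
    "Nova-3 (Monolingual)": {"payg": 0.0077, "growth": 0.0065},
    "Nova-3 (Multilingual)": {"payg": 0.0092, "growth": 0.0078},
    "Nova-1 & 2": {"payg": 0.0058, "growth": 0.0047},
    "Enhanced": {"payg": 0.0165, "growth": 0.0136},
    "Base": {"payg": 0.0145, "growth": 0.0105},
}


def estimate_cost(model: str, minutes: float, plan: str = "payg") -> float:
    """Return the estimated cost in USD for `minutes` of audio, rounded to cents."""
    return round(RATES[model][plan] * minutes, 2)


def growth_saving_pct(model: str) -> float:
    """Return the percentage saved per minute on Growth relative to Pay As You Go."""
    rate = RATES[model]
    return round(100 * (1 - rate["growth"] / rate["payg"]), 1)


if __name__ == "__main__":
    for name in RATES:
        payg = estimate_cost(name, 10_000, "payg")
        growth = estimate_cost(name, 10_000, "growth")
        print(f"{name}: ${payg} vs ${growth} ({growth_saving_pct(name)}% lower)")
```

For example, 10,000 minutes of Flux streaming comes to $77.00 at the Pay As You Go rate and $65.00 at the Growth rate under these listed prices.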
