
Enterprise Voice AI: STT, TTS Agent APIs | Deepgram

Freemium · Audio and Voice

Enterprise voice AI platform with STT and TTS APIs for dynamic conversational experiences.

About Enterprise Voice AI: STT, TTS Agent APIs | Deepgram

Deepgram offers an enterprise voice AI platform with accurate, cost-effective APIs for Speech-to-Text (STT), Text-to-Speech (TTS), and voice agents. The APIs run in real time or in batch, in the cloud or self-hosted. They are used to extract insights from audio and to build conversational experiences, with the aim of streamlining operations and improving customer satisfaction.

Key Features

Accurate, cost-effective APIs for speech-to-text and text-to-speech conversion
Support for voice agents and dynamic conversational experiences
Real-time and batch processing for flexibility
Cloud and self-hosted deployment options for different needs
Ability to extract valuable insights from audio data
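
As a rough illustration of what calling the speech-to-text API for pre-recorded audio might look like, the sketch below builds (but does not send) an HTTP request using only the Python standard library. The endpoint URL, the `Token` authorization scheme, and the `model` query parameter are assumptions based on Deepgram's public documentation and are not stated on this page; the function name `build_transcription_request` and the `audio/wav` content type are hypothetical choices made for the example. Check the official API reference before using it.

```python
import urllib.request
from urllib.parse import urlencode

# Assumed endpoint for pre-recorded transcription; confirm in the official docs.
DEEPGRAM_LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_transcription_request(
    audio: bytes, api_key: str, model: str = "nova-3"
) -> urllib.request.Request:
    """Build, without sending, a POST request that submits raw audio bytes."""
    query = urlencode({"model": model})  # assumed query parameter name
    return urllib.request.Request(
        url=f"{DEEPGRAM_LISTEN_URL}?{query}",
        data=audio,
        method="POST",
        headers={
            # Assumed auth scheme: the literal word "Token" followed by the key.
            "Authorization": f"Token {api_key}",
            # Assumes the audio is a WAV file; adjust for other formats.
            "Content-Type": "audio/wav",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request. Keeping construction separate from sending makes the request easy to inspect and test without network access or a real API key.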

Use Cases

Build virtual assistants and chatbots with advanced voice understanding
Improve customer service with voice agents that respond in real time
Analyze large volumes of audio to identify customer trends and insights
Develop interactive voice applications for different industries and needs
Streamline business operations by automating voice interactions

Plans and Pricing

⚠️ Prices estimated by AI; confirm on the official website

Free

$200 of credit

  • Access all endpoints in public models
  • Up to 100 concurrent requests for the REST API (Speech to Text)
  • Up to 150 concurrent requests for the WSS API (Speech to Text)
  • Up to 5 concurrent requests for Deepgram Whisper Cloud (Speech to Text)
  • Up to 45 concurrent requests for the REST API + WSS API (Text to Speech)
  • Up to 45 concurrent requests for the WSS API (Voice Agent API)

Growth

$4k+/year

  • Save up to 20% with prepaid credits for the year. Credits are redeemed against actual usage.
  • Access all endpoints in public models
  • Up to 100 concurrent requests for the REST API (Speech to Text)
  • Up to 225 concurrent requests for the WSS API (Speech to Text)
  • Up to 5 concurrent requests for Deepgram Whisper Cloud (Speech to Text)
  • Up to 60 concurrent requests for the REST API + WSS API (Text to Speech)

Enterprise

Contact sales

  • For businesses with large volumes, data or deployment requirements, or support needs.

Flux (Speech to Text - Streaming)

$0.0077/min (Pay As You Go)
$0.0065/min (Growth)

  • Conversational speech recognition for real-time voice agents with built-in turn detection, natural interruption handling, and ultra-low latency.

Nova-3 (Monolingual) (Speech to Text - Streaming)

$0.0077/min (Pay As You Go)
$0.0065/min (Growth)

  • Our highest performing model. Recommended for most use cases, especially audio with multiple languages, background noise, crosstalk and far field audio.

Nova-3 (Multilingual) (Speech to Text - Streaming)

$0.0092/min (Pay As You Go)
$0.0078/min (Growth)

  • Our highest performing model. Recommended for most use cases, especially audio with multiple languages, background noise, crosstalk and far field audio.

Nova-1 & 2 (Speech to Text - Streaming)

$0.0058/min (Pay As You Go)
$0.0047/min (Growth)

  • Recommended for use with non-English transcription.

Enhanced (Speech to Text - Streaming)

$0.0165/min (Pay As You Go)
$0.0136/min (Growth)

  • Recommended for lower word error rates than Base, high accuracy timestamps, and use cases that require keyword boosting.

Base (Speech to Text - Streaming)

$0.0145/min (Pay As You Go)
$0.0105/min (Growth)

  • Recommended for large transcription volumes and high accuracy timestamps.

Custom (Speech to Text - Streaming)

Contact sales

Flux (Speech to Text - Pre-Recorded)

$0.0077/min (Pay As You Go)

  • Conversational speech recognition for real-time voice agents with built-in turn detection, natural interruption handling, and ultra-low latency.
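
The per-minute rates above translate into a monthly bill by simple multiplication. The sketch below is a minimal illustration that uses the AI-estimated Nova-3 streaming prices from this listing. The dictionary keys (`"payg"`, `"growth"`, and the model labels) and both function names are hypothetical, chosen only for this example, and the rates should be confirmed on the official pricing page.

```python
from decimal import Decimal

# Per-minute streaming rates in USD, copied from the listing above (AI-estimated).
RATES = {
    ("nova-3-monolingual", "payg"): Decimal("0.0077"),
    ("nova-3-monolingual", "growth"): Decimal("0.0065"),
    ("nova-3-multilingual", "payg"): Decimal("0.0092"),
    ("nova-3-multilingual", "growth"): Decimal("0.0078"),
}


def estimate_cost(model: str, plan: str, minutes: int) -> Decimal:
    """Return the estimated cost in USD, rounded to cents."""
    return (RATES[(model, plan)] * minutes).quantize(Decimal("0.01"))


def minutes_covered(credit_usd: str, model: str, plan: str) -> int:
    """Return how many whole minutes a credit balance pays for."""
    return int(Decimal(credit_usd) / RATES[(model, plan)])


payg = estimate_cost("nova-3-monolingual", "payg", 10_000)      # Decimal("77.00")
growth = estimate_cost("nova-3-monolingual", "growth", 10_000)  # Decimal("65.00")
print(f"Pay As You Go: ${payg}, Growth: ${growth}, difference: ${payg - growth}")
```

At these rates, 10,000 minutes of Nova-3 (Monolingual) streaming comes to $77.00 on Pay As You Go and $65.00 on Growth. By the same arithmetic, `minutes_covered("200", "nova-3-monolingual", "payg")` shows that the $200 free credit covers about 25,974 minutes. `Decimal` is used instead of floats so that the small per-minute rates do not accumulate rounding error.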

Tags

virtual assistant, text-to-speech, transcription, voice (TTS)
