
Enterprise Voice AI: STT, TTS Agent APIs | Deepgram

Paid · Productivity and Automation

Enterprise voice AI: real-time STT, TTS, and virtual agents for insights and conversations.

About Enterprise Voice AI: STT, TTS Agent APIs | Deepgram

Deepgram offers an enterprise voice AI platform with Speech-to-Text (STT), Text-to-Speech (TTS), and voice agent APIs. It is available in real time and in batch, in the cloud or self-hosted, and is aimed at improving operations and customer satisfaction by turning audio into insights and conversational experiences.

Key Features

Accurate Speech-to-Text (STT) APIs
Text-to-Speech (TTS) APIs with natural-sounding voices
Robust conversational AI agents
Real-time and batch processing
Cloud and self-hosted deployment options
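
As a rough illustration of the Speech-to-Text API listed above, the sketch below builds (but does not send) a pre-recorded transcription request using only Python's standard library. It assumes Deepgram's `https://api.deepgram.com/v1/listen` endpoint with `Token` authorization and the `nova-3` model name; the helper `build_transcription_request`, the placeholder key, and the sample audio URL are hypothetical, and all of these details should be checked against the official documentation.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint for pre-recorded Speech-to-Text; confirm in the official docs.
DEEPGRAM_LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_transcription_request(
    api_key: str, audio_url: str, model: str = "nova-3"
) -> urllib.request.Request:
    """Build a POST request asking the service to transcribe a hosted audio file.

    Hypothetical helper: it only constructs the request and sends nothing.
    """
    query = urllib.parse.urlencode({"model": model})
    body = json.dumps({"url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        f"{DEEPGRAM_LISTEN_URL}?{query}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder key and audio URL, for illustration only.
    req = build_transcription_request("YOUR_API_KEY", "https://example.com/call.wav")
    print(req.get_method(), req.full_url)
    # To actually send it (requires a valid key and network access):
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp))
```

Keeping request construction separate from sending makes it possible to inspect the URL, headers, and body before any credits are spent.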

Use Cases

Transcribe customer service calls for analysis
Create interactive voice assistants for automation
Generate audio content from text for podcasts or audiobooks
Extract insights and sentiment from voice conversations
Enable voice control in applications and devices

Plans and Pricing

⚠️ Prices estimated by AI; confirm on the official site

Free

$200 of credit

  • Access all endpoints in public models
  • Up to 100 for the REST API (Speech to Text)
  • Up to 150 for the WSS API (Speech to Text)
  • Up to 5 for Deepgram Whisper Cloud (Speech to Text)
  • Up to 45 for the REST API + WSS API (Text to Speech)
  • Up to 45 for the WSS API (Voice Agent API)

Growth

$4k+/year

  • Save up to 20% with pre-paid credits for the year. Credits are redeemed against actual usage.
  • Access all endpoints in public models
  • Up to 100 for the REST API (Speech to Text)
  • Up to 225 for the WSS API (Speech to Text)
  • Up to 5 for Deepgram Whisper Cloud (Speech to Text)
  • Up to 60 for the REST API + WSS API (Text to Speech)

Enterprise

Price on request

  • For businesses with large volumes, data or deployment requirements, or support needs.

Flux (Speech to Text - Streaming)

$0.0077/min (Pay As You Go)

  • Conversational speech recognition for real-time voice agents with built-in turn detection, natural interruption handling, and ultra-low latency.

Flux (Speech to Text - Streaming)

$0.0065/min (Growth)

  • Conversational speech recognition for real-time voice agents with built-in turn detection, natural interruption handling, and ultra-low latency.

Nova-3 (Monolingual) (Speech to Text - Streaming)

$0.0077/min (Pay As You Go)

  • Our highest performing model. Recommended for most use cases, especially audio with multiple languages, background noise, crosstalk and far field audio.

Nova-3 (Monolingual) (Speech to Text - Streaming)

$0.0065/min (Growth)

  • Our highest performing model. Recommended for most use cases, especially audio with multiple languages, background noise, crosstalk and far field audio.

Nova-3 (Multilingual) (Speech to Text - Streaming)

$0.0092/min (Pay As You Go)

  • Our highest performing model. Recommended for most use cases, especially audio with multiple languages, background noise, crosstalk and far field audio.

Nova-3 (Multilingual) (Speech to Text - Streaming)

$0.0078/min (Growth)

  • Our highest performing model. Recommended for most use cases, especially audio with multiple languages, background noise, crosstalk and far field audio.

Nova-1 & 2 (Speech to Text - Streaming)

$0.0058/min (Pay As You Go)

  • Recommended for use with non-English transcription.

Nova-1 & 2 (Speech to Text - Streaming)

$0.0047/min (Growth)

  • Recommended for use with non-English transcription.

Enhanced (Speech to Text - Streaming)

$0.0165/min (Pay As You Go)

  • Recommended for lower word error rates than Base, high accuracy timestamps, and use cases that require keyword boosting.

Enhanced (Speech to Text - Streaming)

$0.0136/min (Growth)

  • Recommended for lower word error rates than Base, high accuracy timestamps, and use cases that require keyword boosting.

Base (Speech to Text - Streaming)

$0.0145/min (Pay As You Go)

  • Recommended for large transcription volumes and high accuracy timestamps.

Base (Speech to Text - Streaming)

$0.0105/min (Growth)

  • Recommended for large transcription volumes and high accuracy timestamps.

Custom (Speech to Text - Streaming)

Price on request

Flux (Speech to Text - Pre-Recorded)

$0.0077/min (Pay As You Go)

  • Conversational speech recognition for real-time voice agents with built-in turn detection, natural interruption handling, and ultra-low latency.
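
To compare the Pay As You Go and Growth streaming rates listed above, a small calculation can help. The sketch below is only an illustration: the rate table copies the per-minute figures from this listing (which are flagged as AI-estimated), and the model keys and the `estimate_cost` helper are hypothetical names introduced here.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-minute streaming rates in USD, copied from the listing above.
# The listing marks these as AI-estimated, so treat them as placeholders.
RATES_PER_MINUTE = {
    "flux": {"payg": Decimal("0.0077"), "growth": Decimal("0.0065")},
    "nova-3-monolingual": {"payg": Decimal("0.0077"), "growth": Decimal("0.0065")},
    "nova-3-multilingual": {"payg": Decimal("0.0092"), "growth": Decimal("0.0078")},
    "nova-1-2": {"payg": Decimal("0.0058"), "growth": Decimal("0.0047")},
    "enhanced": {"payg": Decimal("0.0165"), "growth": Decimal("0.0136")},
    "base": {"payg": Decimal("0.0145"), "growth": Decimal("0.0105")},
}


def estimate_cost(minutes: int, model: str, plan: str) -> Decimal:
    """Return the estimated cost in USD, rounded to the nearest cent."""
    rate = RATES_PER_MINUTE[model][plan]
    return (rate * minutes).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    minutes = 10_000
    for model in RATES_PER_MINUTE:
        payg = estimate_cost(minutes, model, "payg")
        growth = estimate_cost(minutes, model, "growth")
        print(f"{model}: PAYG ${payg}, Growth ${growth}, difference ${payg - growth}")
```

At these listed rates, 10,000 minutes of Nova-3 (Monolingual) streaming comes to $77.00 on Pay As You Go versus $65.00 on Growth. `Decimal` is used instead of floats so that fractions of a cent do not accumulate rounding error.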

Tags

virtual assistant, audio generation, voice
