Whisper API

Whisper API, also known as Audio Scripter, transcribes audio into text and provides scripting features for working with the resulting transcripts.

About

Whisper API lets developers integrate audio transcription into their applications through a small HTTP interface. Whether you're building a voice recognition system, developing a podcasting platform, or enhancing accessibility features, Whisper API converts audio files into editable text.

Whisper API uses machine learning models to transcribe audio accurately, even for challenging recordings. Developers can adjust transcription settings, such as the model, the sampling temperature, and the no-speech threshold shown in the example request on this page, to optimize results for different languages, accents, and audio quality levels. Whisper API supports various audio formats, including MP3, WAV, and more, so it works with a wide range of media sources.
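As a rough illustration of what customizing these settings might look like, the sketch below builds a request body in Python from the fields that appear in the curl example later on this page. The helper name build_settings and the client-side check for unknown keys are assumptions made for illustration, not part of the API. The default values are simply copied from that example, since the API's own defaults are not documented here.

```python
import json

# Values copied from the curl example on this page. The API's real defaults
# are not documented here, so treat these as a starting point only.
EXAMPLE_DEFAULTS = {
    "model": "large-v3",
    "translate": False,
    "temperature": 0,
    "transcription": "plain text",
    "suppress_tokens": "-1",
    "logprob_threshold": -1,
    "no_speech_threshold": 0.6,
    "condition_on_previous_text": True,
    "compression_ratio_threshold": 2.4,
    "temperature_increment_on_fallback": 0.2,
}


def build_settings(audio_url, **overrides):
    """Return a request body dict (hypothetical helper, not part of the API)."""
    unknown = set(overrides) - set(EXAMPLE_DEFAULTS)
    if unknown:
        # Catch typos locally instead of sending an unrecognized field.
        raise ValueError(f"unknown setting(s): {sorted(unknown)}")
    return {"audio": audio_url, **EXAMPLE_DEFAULTS, **overrides}


if __name__ == "__main__":
    body = build_settings("https://example.com/speech.wav", translate=True)
    print(json.dumps(body, indent=2))
```

Here translate=True overrides only that one field; every other value stays as it is in the example request.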

Beyond transcription, Whisper API offers scripting features for working with transcribed text. Developers can segment audio transcripts into logical sections, add timestamps for easy navigation, and annotate content with metadata for organization and analysis. With these capabilities, developers can build interactive experiences on top of the information extracted from audio content.

Whisper API is built with scalability and reliability in mind, allowing developers to process large volumes of audio data efficiently. As the curl examples on this page show, processing is asynchronous: you submit an audio URL, receive a request ID, and then fetch the result using that ID. With flexible pricing plans and straightforward integration options, developers can focus on building audio-driven applications without managing the complexities of transcription technology themselves.

Whether you're developing voice-enabled applications, conducting research, or creating multimedia content, Whisper API provides a foundation for working with audio data through its transcription and scripting features.


Curl Requests and Responses

Submit the audio for processing

curl -X 'POST' \
  'https://api.magicapi.dev/api/v1/magicapi/whisper/whisper' \
  -H 'accept: application/json' \
  -H 'x-magicapi-key: API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
  "audio": "https://replicate.delivery/mgxm/e5159b1b-508a-4be4-b892-e1eb47850bdc/OSR_uk_000_0050_8k.wav",
  "model": "large-v3",
  "translate": false,
  "temperature": 0,
  "transcription": "plain text",
  "suppress_tokens": "-1",
  "logprob_threshold": -1,
  "no_speech_threshold": 0.6,
  "condition_on_previous_text": true,
  "compression_ratio_threshold": 2.4,
  "temperature_increment_on_fallback": 0.2
}'

Response:

{
  "request_id": "REQUEST_ID"
}
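
The same request can be sketched in Python using only the standard library. This is a minimal sketch rather than an official client: the URL and headers are taken from the curl command above, while the function names (build_submit_request, parse_request_id, submit) are hypothetical. Building the request is kept separate from sending it so the request can be inspected without making a network call.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
SUBMIT_URL = "https://api.magicapi.dev/api/v1/magicapi/whisper/whisper"


def build_submit_request(api_key, payload):
    """Build (but do not send) the POST request from the curl example."""
    return urllib.request.Request(
        SUBMIT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "accept": "application/json",
            "x-magicapi-key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_request_id(response_body):
    """Extract the request_id field from the JSON response body."""
    return json.loads(response_body)["request_id"]


def submit(api_key, payload, timeout=30):
    """Send the request and return the request_id (makes a network call)."""
    request = build_submit_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_request_id(response.read())
```

A call such as submit(API_KEY, {"audio": AUDIO_URL, "model": "large-v3"}) would return the request ID used in the next step. Whether the API accepts a body with only some of the fields from the curl example is not stated here, so sending the full set is the safer choice.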

Get the result

curl -X 'GET' \
  'https://api.magicapi.dev/api/v1/magicapi/whisper/predictions/REQUEST_ID' \
  -H 'accept: application/json' \
  -H 'x-magicapi-key: API_KEY'

Response:

{
  "status": "succeeded",
  "result": "RESULT_URL"
}
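
Because the result is fetched in a separate call, a client will typically poll this endpoint until the status is "succeeded". The sketch below shows one way to do that in Python with the standard library. Only the "succeeded" status appears in this document; the "failed" and "canceled" states, the polling interval, and the function names are assumptions for illustration. The fetch step is passed in as a callable so the loop can be exercised without network access.

```python
import json
import time
import urllib.request

# Endpoint prefix taken from the curl example above.
PREDICTIONS_URL = "https://api.magicapi.dev/api/v1/magicapi/whisper/predictions/"

# Only "succeeded" appears in this document; these failure states are assumed.
ASSUMED_FAILURE_STATES = {"failed", "canceled"}


def fetch_prediction(api_key, request_id, timeout=30):
    """Perform the GET request from the curl example and decode the JSON."""
    request = urllib.request.Request(
        PREDICTIONS_URL + request_id,
        headers={"accept": "application/json", "x-magicapi-key": api_key},
        method="GET",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())


def wait_for_result(fetch, interval=2.0, max_attempts=30, sleep=time.sleep):
    """Call fetch() until status is "succeeded", then return the result field."""
    for attempt in range(max_attempts):
        prediction = fetch()
        status = prediction.get("status")
        if status == "succeeded":
            return prediction["result"]
        if status in ASSUMED_FAILURE_STATES:
            raise RuntimeError(f"transcription ended with status {status!r}")
        if attempt < max_attempts - 1:
            # Any other status is treated as "still running"; wait and retry.
            sleep(interval)
    raise TimeoutError(f"no result after {max_attempts} attempts")
```

Typical usage would be wait_for_result(lambda: fetch_prediction(API_KEY, request_id)). Judging by the RESULT_URL placeholder in the response above, the returned value is a URL from which the transcription can be downloaded.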
