Audio

Aize Platform provides text-to-speech (TTS) and speech-to-text (transcription) endpoints that are fully compatible with the OpenAI Audio API, so the official OpenAI client works once its base URL points at Aize.

Text to Speech (TTS)

Generate spoken audio from text.

Endpoint

POST https://api.aize.dev/v1/audio/speech

Example

python

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.aize.dev/v1"
)

# Stream the generated audio to disk as it arrives.
with client.audio.speech.with_streaming_response.create(
    model="tts-1",
    voice="alloy",
    input="The quick brown fox jumped over the lazy dog."
) as response:
    response.stream_to_file("output.mp3")

curl

curl https://api.aize.dev/v1/audio/speech \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "The quick brown fox jumped over the lazy dog.",
    "voice": "alloy"
  }' \
  --output speech.mp3

Supported Voices

alloy
echo
fable
onyx
nova
shimmer
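
It can be convenient to validate the voice name before a request is sent. The following is a minimal sketch under stated assumptions: `SUPPORTED_VOICES` and `build_speech_request` are hypothetical names, not part of the OpenAI SDK or the Aize API, and the set of voices simply mirrors the list above.

```python
# Hypothetical helper: not part of the OpenAI SDK or the Aize API.
# The voice names are copied from the "Supported Voices" list.
SUPPORTED_VOICES = frozenset({"alloy", "echo", "fable", "onyx", "nova", "shimmer"})


def build_speech_request(text: str, voice: str = "alloy", model: str = "tts-1") -> dict:
    """Return keyword arguments suitable for client.audio.speech.create()."""
    if voice not in SUPPORTED_VOICES:
        raise ValueError(
            f"Unsupported voice {voice!r}; choose one of {sorted(SUPPORTED_VOICES)}"
        )
    if not text.strip():
        raise ValueError("input text must not be empty")
    return {"model": model, "voice": voice, "input": text}
```

The returned dictionary can be unpacked into the SDK call, for instance `client.audio.speech.create(**build_speech_request("Hello", voice="nova"))`.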

Transcription (Speech to Text)

Transcribe audio files into text.

Endpoint

POST https://api.aize.dev/v1/audio/transcriptions

Example

python

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.aize.dev/v1"
)

# Open the file in binary mode; the context manager closes it after the upload.
with open("speech.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file
    )

print(transcript.text)

curl

curl https://api.aize.dev/v1/audio/transcriptions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F file="@/path/to/file/audio.mp3" \
  -F model="whisper-1"

Supported Formats

mp3
mp4
mpeg
mpga
m4a
wav
webm

Max file size: 25 MB
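
Checking a file locally before uploading can avoid a failed request. The sketch below is one possible pre-flight check, not something the platform provides: `check_audio_file`, `SUPPORTED_EXTENSIONS`, and `MAX_FILE_SIZE_BYTES` are hypothetical names, the check looks only at the file extension rather than the actual audio encoding, and it assumes "25 MB" means 25 * 1024 * 1024 bytes.

```python
from pathlib import Path

# Extensions copied from the "Supported Formats" list.
SUPPORTED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}
# Assumption: "25 MB" is interpreted as 25 * 1024 * 1024 bytes.
MAX_FILE_SIZE_BYTES = 25 * 1024 * 1024


def check_audio_file(path: str) -> Path:
    """Raise ValueError if the file looks unsuitable for transcription."""
    audio_path = Path(path)
    # Compare the extension case-insensitively, without the leading dot.
    extension = audio_path.suffix.lower().lstrip(".")
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported format {extension!r}")
    size = audio_path.stat().st_size
    if size > MAX_FILE_SIZE_BYTES:
        raise ValueError(
            f"File is {size} bytes; the limit is {MAX_FILE_SIZE_BYTES} bytes"
        )
    return audio_path
```

A caller might run `check_audio_file("speech.mp3")` first and pass the returned path to `open(..., "rb")` for the transcription request. Files over the limit would need to be split or compressed beforehand.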
