POST /api/v1/lightning/get_text
Convert speech to text
curl --request POST \
  --url https://waves-api.smallest.ai/api/v1/lightning/get_text \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/octet-stream'
{
  "status": "success",
  "transcription": "Hello world.",
  "word_timestamps": [
    {
      "word": "Hello",
      "start": 0,
      "end": 0.5,
      "speaker": "speaker_0"
    },
    {
      "word": "world.",
      "start": 0.6,
      "end": 0.9,
      "speaker": "speaker_0"
    }
  ],
  "utterances": [
    {
      "text": "Hello world.",
      "start": 0,
      "end": 0.9,
      "speaker": "speaker_0"
    }
  ],
  "age": "adult",
  "gender": "male",
  "emotions": {
    "happiness": 0.8,
    "sadness": 0.15,
    "disgust": 0.02,
    "fear": 0.03,
    "anger": 0.05
  },
  "metadata": {
    "filename": "audio.mp3",
    "duration": 1.7,
    "fileSize": 1000000
  }
}
The ASR POST API allows you to convert speech to text using two different input methods:
  1. Raw Audio Bytes (application/octet-stream) - Send raw audio data with all parameters as query parameters
  2. Audio URL (application/json) - Provide only a URL to an audio file in the JSON body, with all other parameters as query parameters
Both methods use the Lightning ASR model, which supports 30+ languages and detects the language automatically when language is set to multi.

Authentication

This endpoint requires authentication using a Bearer token in the Authorization header:
Authorization: Bearer YOUR_API_KEY

Input Methods

Choose the input method that best fits your use case:
Method    | Content Type             | Use Case                                   | Parameters
Raw Bytes | application/octet-stream | Streaming audio data, real-time processing | Query parameters
Audio URL | application/json         | Remote audio files, webhook processing     | Query parameters

Code Examples

Method 1: Raw Audio Bytes (application/octet-stream)

curl --request POST \
  --url "https://waves-api.smallest.ai/api/v1/lightning/get_text?model=lightning&language=en&word_timestamps=true&diarize=true&age_detection=true&gender_detection=true&emotion_detection=true" \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/octet-stream' \
  --data-binary '@/path/to/your/audio.wav'

Method 2: Audio URL (application/json)

curl --request POST \
  --url "https://waves-api.smallest.ai/api/v1/lightning/get_text?model=lightning&language=en&word_timestamps=true&diarize=true&age_detection=true&gender_detection=true&emotion_detection=true" \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
    "url": "https://example.com/audio.mp3"
  }'
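
For callers who are not using curl, the two calls above can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the helper names build_request and transcribe are hypothetical, and the request shape is inferred from the curl examples on this page (raw bytes or a JSON body holding only url, with every other option in the query string).

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://waves-api.smallest.ai/api/v1/lightning/get_text"


def build_request(api_key, params, audio_bytes=None, audio_url=None):
    """Build (but do not send) a transcription request for either input method.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Exactly one of the two inputs must be supplied.
    if (audio_bytes is None) == (audio_url is None):
        raise ValueError("pass exactly one of audio_bytes or audio_url")

    if audio_bytes is not None:
        # Method 1: the raw audio bytes are the request body.
        body = audio_bytes
        content_type = "application/octet-stream"
    else:
        # Method 2: the JSON body carries only the audio URL.
        body = json.dumps({"url": audio_url}).encode("utf-8")
        content_type = "application/json"

    # With both methods, every other option travels in the query string.
    url = ENDPOINT + "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": content_type,
        },
    )


def transcribe(request, timeout=60):
    """Send the request and decode the JSON response (performs a network call)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Building the request separately from sending it makes it possible to inspect the headers and URL before any audio leaves the machine. A call would look like transcribe(build_request("YOUR_API_KEY", {"model": "lightning", "language": "en"}, audio_url="https://example.com/audio.mp3")).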

Supported Languages

The Lightning ASR model supports automatic language detection and transcription across 30+ languages. For the full list of supported languages, please check ASR Supported Languages.
Set the language query parameter to the ISO 639-1 code of the input audio. Use multi to enable automatic language detection across the supported languages. The default is en (English).
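
A client can check the language value before sending a request. The sketch below is one possible approach: the codes are copied from the language options listed under Query Parameters on this page, while the function name resolve_language and the case-insensitive matching are assumptions made for illustration, not documented API behavior.

```python
# Codes copied from the language options listed under Query Parameters.
SUPPORTED_LANGUAGES = frozenset({
    "it", "es", "en", "pt", "hi", "de", "fr", "uk", "ru", "kn", "ml",
    "pl", "mr", "gu", "cs", "sk", "te", "or", "nl", "bn", "lv", "et",
    "ro", "pa", "fi", "sv", "bg", "ta", "hu", "da", "lt", "mt",
})
AUTO_DETECT = "multi"    # enables automatic language detection
DEFAULT_LANGUAGE = "en"  # documented default


def resolve_language(code=None):
    """Return a value suitable for the language query parameter.

    Hypothetical helper; lowercasing the input is a client-side convenience.
    """
    if code is None:
        return DEFAULT_LANGUAGE
    normalized = code.strip().lower()
    if normalized == AUTO_DETECT or normalized in SUPPORTED_LANGUAGES:
        return normalized
    raise ValueError(f"unsupported language code: {code!r}")
```

Rejecting an unknown code locally avoids uploading audio only to have the request fail on a parameter error.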

Authorizations

Authorization
string, header, required

Query Parameters

model
enum<string>, required
Available options: lightning

language
enum<string>, default: en
Available options: it, es, en, pt, hi, de, fr, uk, ru, kn, ml, pl, mr, gu, cs, sk, te, or, nl, bn, lv, et, ro, pa, fi, sv, bg, ta, hu, da, lt, mt, multi

webhook_url
string<uri>

webhook_extra
string

word_timestamps
boolean, default: false

diarize
boolean, default: false

age_detection
enum<string>, default: false
Available options: true, false

gender_detection
enum<string>, default: false
Available options: true, false

emotion_detection
enum<string>, default: false
Available options: true, false
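
Because every option is sent as a query parameter, it can help to collect them in one place and serialize them together. The following is a minimal sketch under stated assumptions: the class name TranscriptionOptions is hypothetical, the field defaults mirror the defaults listed above, and flags are written as lowercase true/false to match the curl examples on this page.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlencode


@dataclass
class TranscriptionOptions:
    """Hypothetical container for the query parameters listed above."""

    model: str = "lightning"
    language: str = "en"
    word_timestamps: bool = False
    diarize: bool = False
    age_detection: bool = False
    gender_detection: bool = False
    emotion_detection: bool = False
    webhook_url: Optional[str] = None
    webhook_extra: Optional[str] = None

    def to_query(self) -> str:
        """Serialize the options into a URL-encoded query string."""
        pairs = []
        for name, value in vars(self).items():
            if value is None:
                continue  # leave unset optional parameters out entirely
            if isinstance(value, bool):
                # Lowercase literals, as used in the curl examples.
                value = "true" if value else "false"
            pairs.append((name, value))
        return urlencode(pairs)
```

The resulting string can be appended to the endpoint URL after a ?. Sending flags that are explicitly false is assumed to be equivalent to omitting them, since false is the documented default.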

Body

Response

status
string

transcription
string

audio_length
number

word_timestamps
object[]

utterances
object[]

age
enum<string>
Available options: infant, teenager, adult, old

gender
enum<string>
Available options: male, female

emotions
object

metadata
object
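
Once the response has been decoded from JSON, its fields can be read like any dictionary. The sketch below uses a trimmed copy of the sample response shown at the top of this page. The helper names are hypothetical, and the fallback for a missing speaker field is an assumption, since this page does not state which fields are omitted when a feature is disabled.

```python
# Trimmed copy of the sample response shown on this page.
SAMPLE_RESPONSE = {
    "status": "success",
    "transcription": "Hello world.",
    "word_timestamps": [
        {"word": "Hello", "start": 0, "end": 0.5, "speaker": "speaker_0"},
        {"word": "world.", "start": 0.6, "end": 0.9, "speaker": "speaker_0"},
    ],
    "utterances": [
        {"text": "Hello world.", "start": 0, "end": 0.9, "speaker": "speaker_0"},
    ],
    "emotions": {
        "happiness": 0.8, "sadness": 0.15, "disgust": 0.02,
        "fear": 0.03, "anger": 0.05,
    },
}


def format_utterances(response):
    """Render each utterance as '[start-end] speaker: text'."""
    lines = []
    for utt in response.get("utterances", []):
        # Assumption: speaker may be absent when diarization is off.
        speaker = utt.get("speaker", "unknown")
        lines.append(f"[{utt['start']:.2f}-{utt['end']:.2f}] {speaker}: {utt['text']}")
    return lines


def words_by_speaker(response):
    """Group the words from word_timestamps under each speaker label."""
    grouped = {}
    for item in response.get("word_timestamps", []):
        grouped.setdefault(item.get("speaker", "unknown"), []).append(item["word"])
    return grouped


def dominant_emotion(response):
    """Return the emotion with the highest score, or None if there are none."""
    emotions = response.get("emotions") or {}
    if not emotions:
        return None
    return max(emotions, key=emotions.get)
```

Using .get with defaults keeps the helpers usable when optional features such as word_timestamps or emotion_detection were not requested.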