Lightning

Convert speech to text

curl --request POST \
  --url https://waves-api.smallest.ai/api/v1/lightning/get_text \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/octet-stream'

{
  "status": "success",
  "transcription": "Hello world.",
  "word_timestamps": [
    {
      "word": "Hello",
      "start": 0,
      "end": 0.5,
      "speaker": "speaker_0"
    },
    {
      "word": "world.",
      "start": 0.6,
      "end": 0.9,
      "speaker": "speaker_0"
    }
  ],
  "utterances": [
    {
      "text": "Hello world.",
      "start": 0,
      "end": 0.9,
      "speaker": "speaker_0"
    }
  ],
  "age": "adult",
  "gender": "male",
  "emotions": {
    "happiness": 0.8,
    "sadness": 0.15,
    "disgust": 0.02,
    "fear": 0.03,
    "anger": 0.05
  },
  "metadata": {
    "filename": "audio.mp3",
    "duration": 1.7,
    "fileSize": 1000000
  }
}

POST /api/v1/lightning/get_text

The ASR POST API allows you to convert speech to text using two different input methods:

1. Raw Audio Bytes (application/octet-stream) - Send the raw audio data as the request body and pass all parameters as query parameters.
2. Audio URL (application/json) - Send a JSON body containing only the URL of an audio file and pass all other parameters as query parameters.

Both methods use our Lightning ASR model with automatic language detection across 30+ languages.

Authentication

This endpoint requires authentication using a Bearer token in the Authorization header:

Authorization: Bearer YOUR_API_KEY
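
The header above can be assembled in code without hard-coding the key. The following is a minimal sketch, assuming the key is stored in an environment variable; the variable name `SMALLEST_API_KEY` and the helper `auth_headers` are hypothetical and not part of the API.

```python
import os


def auth_headers(content_type: str, env_var: str = "SMALLEST_API_KEY") -> dict:
    """Build request headers with the Bearer token read from the environment.

    The environment variable name is an assumption made for this sketch.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": content_type,
    }
```

Reading the key at call time, rather than at import time, keeps the failure close to the request that needs it.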

Input Methods

Choose the input method that best fits your use case:

Method	Content Type	Use Case	Parameters
Raw Bytes	`application/octet-stream`	Streaming audio data, real-time processing	Query parameters
Audio URL	`application/json`	Remote audio files, webhook processing	Query parameters

Code Examples

Method 1: Raw Audio Bytes (application/octet-stream)

curl --request POST \
  --url "https://waves-api.smallest.ai/api/v1/lightning/get_text?model=lightning&language=en&word_timestamps=true&diarize=true&age_detection=true&gender_detection=true&emotion_detection=true" \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/octet-stream' \
  --data-binary '@/path/to/your/audio.wav'
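
For readers calling the endpoint from Python, the request above might be prepared roughly as follows. This is a sketch using only the standard library; the function name `build_bytes_request` is hypothetical, and the content type follows the `application/octet-stream` value named in the method heading. The function only builds the request, so nothing is sent until `urlopen` is called.

```python
import urllib.parse
import urllib.request

API_URL = "https://waves-api.smallest.ai/api/v1/lightning/get_text"


def build_bytes_request(audio: bytes, token: str, language: str = "en") -> urllib.request.Request:
    """Prepare (but do not send) a POST whose body is the raw audio bytes."""
    # All options travel in the query string; the body carries audio only.
    query = urllib.parse.urlencode({
        "model": "lightning",
        "language": language,
        "word_timestamps": "true",
    })
    return urllib.request.Request(
        f"{API_URL}?{query}",
        data=audio,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/octet-stream",
        },
    )


# Usage (performs a real network call, so it is left commented out):
# with open("/path/to/your/audio.wav", "rb") as f:
#     req = build_bytes_request(f.read(), token="<token>")
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```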

Method 2: Audio URL (application/json)

curl --request POST \
  --url "https://waves-api.smallest.ai/api/v1/lightning/get_text?model=lightning&language=en&word_timestamps=true&diarize=true&age_detection=true&gender_detection=true&emotion_detection=true" \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
    "url": "https://example.com/audio.mp3"
  }'
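
The URL-based variant differs only in the body and content type. A sketch under the same assumptions (standard library only, hypothetical function name `build_url_request`, request built but not sent):

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://waves-api.smallest.ai/api/v1/lightning/get_text"


def build_url_request(audio_url: str, token: str, **params: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST that points the API at a remote audio file."""
    # Only the audio URL goes in the JSON body; every other option is a query parameter.
    query = urllib.parse.urlencode({"model": "lightning", **params})
    body = json.dumps({"url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        f"{API_URL}?{query}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Usage (performs a real network call, so it is left commented out):
# req = build_url_request("https://example.com/audio.mp3", token="<token>", language="en")
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read()))
```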

Supported Languages

The Lightning ASR model supports automatic language detection and transcription across 30+ languages. For the full list of supported languages, please check ASR Supported Languages.

Specify the language of the input audio using its ISO 639-1 code. Use multi to enable automatic language detection from the supported list. The default is en (English).
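
A client can catch unsupported codes before making a request. The sketch below copies the codes from the `language` options listed under Query Parameters; the helper name `resolve_language` and the lowercase normalization are assumptions of this example, not documented API behavior.

```python
from typing import Optional

# ISO 639-1 codes listed for the `language` query parameter.
SUPPORTED_LANGUAGES = frozenset(
    "it es en pt hi de fr uk ru kn ml pl mr gu cs sk te or nl bn "
    "lv et ro pa fi sv bg ta hu da lt mt".split()
)


def resolve_language(code: Optional[str]) -> str:
    """Return a value suitable for the `language` parameter.

    None falls back to the documented default (en); "multi" requests
    automatic language detection.
    """
    if code is None:
        return "en"
    normalized = code.strip().lower()
    if normalized == "multi" or normalized in SUPPORTED_LANGUAGES:
        return normalized
    raise ValueError(f"Unsupported language code: {code!r}")
```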

Authorizations

Authorization
string, header, required

Query Parameters

model
enum<string>, required
Available options: lightning

language
enum<string>, default: en
Available options: it, es, en, pt, hi, de, fr, uk, ru, kn, ml, pl, mr, gu, cs, sk, te, or, nl, bn, lv, et, ro, pa, fi, sv, bg, ta, hu, da, lt, mt, multi

webhook_url
string<uri>

webhook_extra
string

word_timestamps
boolean, default: false

diarize
boolean, default: false

age_detection
enum<string>, default: false
Available options: true, false

gender_detection
enum<string>, default: false
Available options: true, false

emotion_detection
enum<string>, default: false
Available options: true, false
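
One practical detail when building the query string from Python: `urllib.parse.urlencode` renders `True` as `True`, while the options above are written in lowercase. The sketch below converts booleans explicitly; the helper name `build_query` is hypothetical, and whether the server also accepts other spellings is not stated in this reference.

```python
from urllib.parse import urlencode


def build_query(model: str = "lightning", **options) -> str:
    """Encode query parameters, writing booleans as lowercase true/false."""
    params = {"model": model}
    for name, value in options.items():
        if value is None:
            continue  # omit unset options so the server-side default applies
        if isinstance(value, bool):
            params[name] = "true" if value else "false"
        else:
            params[name] = str(value)
    return urlencode(params)
```

For example, `build_query(language="hi", diarize=True)` yields `model=lightning&language=hi&diarize=true`.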

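Body

application/octet-stream: the raw audio bytes
application/json: an object with a single url field pointing to the audio file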

Response

status
string

transcription
string

audio_length
number

word_timestamps
object[]

utterances
object[]

age
enum<string>
Available options: infant, teenager, adult, old

gender
enum<string>
Available options: male, female

emotions
object

metadata
object
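
To illustrate how these fields fit together, the sketch below condenses a decoded response into a short summary. It is based on the sample response shown at the top of this page and assumes that optional fields such as `word_timestamps` and `emotions` may be absent when the corresponding query parameter was not enabled; the function name `summarize` is hypothetical.

```python
def summarize(response: dict) -> dict:
    """Condense a decoded get_text response into text, per-speaker words, and top emotion."""
    # Group words by speaker label, preserving the order in which they were spoken.
    words_by_speaker = {}
    for item in response.get("word_timestamps", []):
        speaker = item.get("speaker", "unknown")
        words_by_speaker.setdefault(speaker, []).append(item["word"])

    # Pick the emotion with the highest score, if any scores were returned.
    emotions = response.get("emotions") or {}
    dominant = max(emotions, key=emotions.get) if emotions else None

    return {
        "text": response.get("transcription", ""),
        "speakers": {s: " ".join(w) for s, w in words_by_speaker.items()},
        "dominant_emotion": dominant,
    }
```

Applied to the sample response, this returns the transcription `Hello world.`, a single speaker `speaker_0`, and `happiness` as the dominant emotion.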
