POST /api/generate/submit
Submit Get Timestamped Lyrics Task
curl --request POST \
  --url https://api.vidgo.ai/api/generate/submit \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "get-timestamped-lyrics",
  "callback_url": "https://your-domain.com/callback",
  "input": {
    "task_id": "task-unified-1757165031-uyujaw3d",
    "audio_id": "e231xxxx-xxxx-xxxx-xxxx-xxxx8cadc7dc"
  }
}
'
{
  "code": 200,
  "data": {
    "aligned_words": [
      {
        "word": "[Verse] Waggin'",
        "success": true,
        "start_s": 1.36,
        "end_s": 1.79,
        "palign": 0
      }
    ],
    "waveform_data": [
      0,
      1,
      0.5,
      0.75
    ],
    "hoot_cer": 0.38,
    "is_streamed": false
  }
}
  1. Submission returns a task_id. If you provided a callback_url, a POST request is sent to that URL when the task status becomes finished or failed.
  2. Whether or not you provided a callback_url, you can retrieve the result through the unified Query Music Detail endpoint.
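
To make the callback flow concrete, here is a minimal sketch of a receiver for the POST sent to callback_url, using only the Python standard library. The callback payload schema is not documented on this page, so the sketch only assumes the body is a JSON object and logs it. The names decode_callback, CallbackHandler, and serve, and the port 8000, are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def decode_callback(raw_body: bytes) -> dict:
    """Decode a callback body as JSON. No particular field layout is assumed."""
    payload = json.loads(raw_body.decode("utf-8"))
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object in the callback body")
    return payload


class CallbackHandler(BaseHTTPRequestHandler):
    """Hypothetical handler for the POST request sent to callback_url."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = decode_callback(self.rfile.read(length))
        except ValueError:
            # Covers malformed JSON, bad UTF-8, and non-object bodies.
            self.send_response(400)
            self.end_headers()
            return
        print("callback received:", payload)
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Block forever, handling callbacks on the given port (assumed value)."""
    HTTPServer(("", port), CallbackHandler).serve_forever()
```

Calling serve() starts the listener. In practice the URL registered as callback_url must be publicly reachable and route to this handler.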

Usage Guide

  • This endpoint retrieves synchronized lyrics with precise timestamps for generated audio tracks
  • Use this to create karaoke-style displays, subtitles, or lyric visualizations
  • Requires a completed music generation task with vocals
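
For a karaoke-style display, the core operation is finding which word is being sung at a given playback time. The sketch below assumes the aligned_words entries are sorted by start_s (the page does not state this) and that spans do not overlap. The function name active_word is hypothetical.

```python
from bisect import bisect_right


def active_word(aligned_words, t):
    """Return the entry whose [start_s, end_s] span contains time t, else None.

    Assumes aligned_words is sorted by start_s.
    """
    starts = [w["start_s"] for w in aligned_words]
    # Index of the last word that starts at or before t.
    i = bisect_right(starts, t) - 1
    if i >= 0 and t <= aligned_words[i]["end_s"]:
        return aligned_words[i]
    return None
```

A player would call active_word on every time update and highlight the returned entry. For long tracks the starts list can be computed once and reused instead of being rebuilt per call.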

Parameter Details

  • Required parameters:
    • task_id: The unique identifier from a previous music generation task
    • audio_id: The specific audio track identifier from the task result
  • Both identifiers are obtained from the response of music generation endpoints or their callbacks
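
The request from the curl sample can be assembled in Python roughly as follows. This is a sketch using only the standard library; the helper name build_submit_request is hypothetical, while the URL, headers, and body fields are taken from the sample request on this page.

```python
import json
import urllib.request

SUBMIT_URL = "https://api.vidgo.ai/api/generate/submit"


def build_submit_request(api_key, task_id, audio_id, callback_url=None):
    """Build (but do not send) the POST request for a timestamped lyrics task."""
    body = {
        "model": "get-timestamped-lyrics",
        "input": {"task_id": task_id, "audio_id": audio_id},
    }
    # callback_url is optional, so it is included only when provided.
    if callback_url:
        body["callback_url"] = callback_url
    return urllib.request.Request(
        SUBMIT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen sends the request. Keeping construction separate from sending makes the payload easy to inspect or unit test without network access.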

Developer Notes

  • This endpoint only works with audio tracks that contain vocals
  • The hoot_cer (Character Error Rate) value indicates alignment precision - lower values mean better accuracy
  • Use waveform_data for creating audio visualizations alongside the lyrics
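
One possible way to use waveform_data for a visualization is to reduce it to a fixed number of peak values and draw one bar per value. The sketch below assumes the samples are plain numbers, as in the sample response; the page does not document their range or sample rate, so the bars are scaled relative to the largest peak. The function names bucket_peaks and render_bars are hypothetical.

```python
def bucket_peaks(samples, buckets):
    """Reduce samples to at most `buckets` values, keeping each bucket's peak."""
    if buckets <= 0:
        raise ValueError("buckets must be positive")
    if not samples:
        return []
    n = len(samples)
    buckets = min(buckets, n)  # never create empty buckets
    peaks = []
    for b in range(buckets):
        lo = b * n // buckets
        hi = (b + 1) * n // buckets
        peaks.append(max(abs(s) for s in samples[lo:hi]))
    return peaks


def render_bars(peaks):
    """Render peaks as a one-line text bar chart, scaled to the largest peak."""
    blocks = " ▁▂▃▄▅▆▇█"
    top = max(peaks, default=0) or 1  # avoid dividing by zero on silence
    return "".join(blocks[round(p / top * (len(blocks) - 1))] for p in peaks)
```

A graphical client would draw rectangles from the bucket_peaks output instead of text; the text rendering is only a quick way to check the data.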

Response Fields

  • aligned_words (array): List of lyric words with timing information
    • word (string): The lyric text, may include section markers like [Verse], [Chorus]
    • start_s (number): Word start time in seconds
    • end_s (number): Word end time in seconds
    • success (boolean): Whether the word was successfully aligned
    • palign (number): Alignment confidence score
  • waveform_data (array): Numerical data for audio waveform visualization
  • hoot_cer (number): Alignment precision score (Character Error Rate)
  • is_streamed (boolean): Indicates if the audio is a streamed track
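
The fields above can be read into typed records as sketched below. Because word may carry a section marker such as [Verse], the sketch splits a leading bracketed marker from the lyric text. It assumes the marker, when present, appears at the start of the word as in the sample response. The names AlignedWord and parse_aligned_words are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# A leading "[...]" marker followed by optional whitespace.
SECTION_RE = re.compile(r"^\[([^\]]+)\]\s*")


@dataclass
class AlignedWord:
    text: str               # lyric text with any section marker removed
    section: Optional[str]  # e.g. "Verse", or None if no marker was present
    start_s: float
    end_s: float
    success: bool
    palign: float


def parse_aligned_words(response: dict) -> List[AlignedWord]:
    """Convert data.aligned_words from a response into AlignedWord records."""
    words = []
    for item in response["data"]["aligned_words"]:
        raw = item["word"]
        match = SECTION_RE.match(raw)
        section = match.group(1) if match else None
        text = raw[match.end():] if match else raw
        words.append(
            AlignedWord(
                text=text,
                section=section,
                start_s=item["start_s"],
                end_s=item["end_s"],
                success=item["success"],
                palign=item["palign"],
            )
        )
    return words
```

Separating the marker lets a display show section headings on their own line while highlighting only the sung text.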

Authorizations

  • Authorization (string, header, required): All API endpoints require Bearer Token authentication

Get your API Key:

Visit the API Key Management Page to get your API Key

Add it to the request header:

Authorization: Bearer VIDGO_API_KEY

Body

application/json
  • model (enum<string>, required): API model identifier. Must be get-timestamped-lyrics for this endpoint.
    • Available options: get-timestamped-lyrics
    • Example: "get-timestamped-lyrics"

  • input (object, required): Input parameters for retrieving timestamped lyrics

  • callback_url (string<uri>): Webhook callback URL for result notifications
    • Example: "https://your-domain.com/callback"

Response

Timestamped lyrics retrieved successfully

  • code (integer, required): HTTP status code
    • Example: 200

  • data (object, required)