How to transcribe your files to JSON


Step 1: Create an account and get an API Key

Go to the API Key docs to learn how to get your API Key. You will need it to authenticate your requests to the Whisper API.

Step 2: Transcribe your audio files

We are going to use the following audio file for this example:

https://files.whisper-api.com/example.mp4

Using the API

To transcribe the above file using the API via cURL, you need to set the format option to json in the request body as follows:

Terminal window
curl \
  -H "X-API-Key: YOUR_API_KEY" \
  -F "language=en" \
  -F "format=json" \
  -F "model_size=large-v2" \
  -F "url=https://files.whisper-api.com/example.mp4" \
  https://api.whisper-api.com/transcribe

We’ve also added the model_size parameter to specify the model size and the language parameter to specify the language of the audio file. The url parameter specifies the URL of the audio file you want to transcribe.

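To transcribe a local file instead of a URL, replace the url parameter with the file parameter, set to the path of the file on your local machine. With cURL, prefix the path with @ so that the contents of the file are uploaded rather than the path itself. For example (the path is a placeholder):

Terminal window
curl \
  -H "X-API-Key: YOUR_API_KEY" \
  -F "language=en" \
  -F "format=json" \
  -F "model_size=large-v2" \
  -F "file=@/path/to/example.mp4" \
  https://api.whisper-api.com/transcribe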
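
As a rough JavaScript equivalent of the local-file request, the sketch below reads the file from disk and attaches its bytes under the file field. It assumes Node.js 18 or later (for the global fetch, FormData and Blob). The helper names buildUploadForm and transcribeLocalFile and the example path are illustrative, not part of the Whisper API.

```javascript
// Sketch only: assumes Node.js 18+ (global fetch, FormData and Blob).
// buildUploadForm and transcribeLocalFile are hypothetical helper names.

// Build the multipart form; "file" carries the file bytes instead of "url".
function buildUploadForm(fileBytes, fileName) {
  const form = new FormData();
  form.append("language", "en");
  form.append("format", "json");
  form.append("model_size", "large-v2");
  // The third argument sets the filename sent alongside the bytes.
  form.append("file", new Blob([fileBytes]), fileName);
  return form;
}

async function transcribeLocalFile(filePath, apiKey) {
  // Dynamic imports keep the sketch usable from CommonJS and ES modules.
  const { readFile } = await import("node:fs/promises");
  const { basename } = await import("node:path");

  const form = buildUploadForm(await readFile(filePath), basename(filePath));
  const response = await fetch("https://api.whisper-api.com/transcribe", {
    method: "POST",
    headers: { "X-API-Key": apiKey },
    body: form,
  });
  if (!response.ok) {
    throw new Error(`HTTP error! status: ${response.status}`);
  }
  return response.json();
}

// Usage (placeholder path and key):
// transcribeLocalFile("/path/to/example.mp4", "YOUR_API_KEY").then(console.log);
```

Keeping the form construction separate from the network call makes it easy to check the fields before anything is sent.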

Why do we pass the parameters this way?

The API accepts its parameters as multipart/form-data, a format that lets you send files and other fields to a server in a single request.

All modern programming languages have features/libraries that support this format. For example, in JavaScript, we can make a similar request using the Fetch API and FormData as follows:

const apiKey = "YOUR_API_KEY";
const apiUrl = "https://api.whisper-api.com/transcribe";
const fileUrl = "https://files.whisper-api.com/example.mp4";

// Same fields as the -F options in the cURL example
const formData = new FormData();
formData.append("language", "en");
formData.append("format", "json");
formData.append("model_size", "large-v2");
formData.append("url", fileUrl);

fetch(apiUrl, {
  method: "POST",
  headers: {
    "X-API-Key": apiKey,
  },
  body: formData,
})
  .then((response) => {
    if (!response.ok) {
      throw new Error(`HTTP error! status: ${response.status}`);
    }
    return response.json();
  })
  .then((data) => {
    console.log("Success:", data);
  })
  .catch((error) => {
    console.error("Error:", error);
  });
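
To see the multipart/form-data encoding at work, the sketch below builds the same request as a Request object without sending it and prints the Content-Type header that the runtime derives from the FormData body. It assumes a browser or Node.js 18 or later; buildTranscribeRequest is a hypothetical helper name used only for this illustration.

```javascript
// Sketch only: assumes a runtime with global FormData and Request
// (modern browsers, Node.js 18+). Nothing is sent over the network.
function buildTranscribeRequest(apiKey, fileUrl) {
  const form = new FormData();
  form.append("language", "en");
  form.append("format", "json");
  form.append("model_size", "large-v2");
  form.append("url", fileUrl);

  return new Request("https://api.whisper-api.com/transcribe", {
    method: "POST",
    headers: { "X-API-Key": apiKey },
    body: form,
  });
}

const request = buildTranscribeRequest(
  "YOUR_API_KEY",
  "https://files.whisper-api.com/example.mp4"
);

// Prints a value that starts with "multipart/form-data; boundary=";
// the boundary string itself is generated by the runtime.
console.log(request.headers.get("content-type"));
```

This is also why the fetch example does not set a Content-Type header by hand: the boundary is generated together with the body, and a manually written header would not match it.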

Get the transcription result

Once we send the request above, we will get the following JSON response:

{
  "task_id": "3e16cc10-2a2a-46ae-8454-1b86cfe02d5a",
  "status": "queued",
  "result": null,
  "language": "en",
  "format": "json"
}

This means that the transcription has been submitted and is currently being processed. The task_id is a unique identifier for the transcription task. You can use this ID to check the status of the transcription like so:

Terminal window
curl \
  -H "X-API-Key: YOUR_API_KEY" \
  https://api.whisper-api.com/status/3e16cc10-2a2a-46ae-8454-1b86cfe02d5a
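
A single status check may still report queued, so in practice the check is usually repeated until the task finishes. The sketch below polls the status endpoint at a fixed interval. It assumes a finished task reports the status completed, as in the example response in this guide. Other terminal statuses (for example a failure state) are not documented here, so the loop simply gives up after a fixed number of attempts. The function name waitForTranscription, the option names, and the default interval are assumptions made for this illustration.

```javascript
// Sketch only: assumes Node.js 18+ or a browser (global fetch).
// waitForTranscription and its options are hypothetical, not part of the API.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function waitForTranscription(taskId, apiKey, options = {}) {
  // fetchImpl can be swapped out, e.g. to exercise the loop without a network.
  const { intervalMs = 5000, maxAttempts = 60, fetchImpl = fetch } = options;
  const statusUrl = `https://api.whisper-api.com/status/${taskId}`;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await fetchImpl(statusUrl, {
      headers: { "X-API-Key": apiKey },
    });
    if (!response.ok) {
      throw new Error(`HTTP error! status: ${response.status}`);
    }

    const task = await response.json();
    if (task.status === "completed") {
      return task;
    }

    // Wait before the next check, but not after the final attempt.
    if (attempt < maxAttempts) {
      await sleep(intervalMs);
    }
  }
  throw new Error(`Task ${taskId} was not completed after ${maxAttempts} checks`);
}

// Usage (placeholder key):
// waitForTranscription("3e16cc10-2a2a-46ae-8454-1b86cfe02d5a", "YOUR_API_KEY")
//   .then((task) => console.log(task.result));
```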

If the transcription is complete, you will get a response like this, with status set to completed:

{
  "task_id": "3e16cc10-2a2a-46ae-8454-1b86cfe02d5a",
  "status": "completed",
  "result": [
    {
      "start": 0.0,
      "end": 4.8,
      "text": "Hello guys! So today we're going to...."
    }
  ],
  "language": "en",
  "format": "json"
}

The transcription result will be in the result field, which is an array of objects. Each object contains the start and end time of the transcription segment, as well as the text of the transcription.
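
As an illustration of working with this structure, the sketch below turns the result array into timestamped lines and into a single block of text. It assumes start and end are expressed in seconds, which the example values suggest but this guide does not state explicitly. The helper names formatTimestamp, segmentsToLines and segmentsToText are made up for this example.

```javascript
// Sketch only: helper names are hypothetical, and start/end are
// assumed to be seconds.

// Format a number of seconds as MM:SS.mmm
function formatTimestamp(seconds) {
  // Round to whole milliseconds first so the seconds part never shows 60.000.
  const totalMs = Math.round(seconds * 1000);
  const minutes = Math.floor(totalMs / 60000);
  const secs = ((totalMs % 60000) / 1000).toFixed(3);
  return `${String(minutes).padStart(2, "0")}:${secs.padStart(6, "0")}`;
}

// One line per segment, prefixed with its time range.
function segmentsToLines(segments) {
  return segments.map(
    (s) => `[${formatTimestamp(s.start)} - ${formatTimestamp(s.end)}] ${s.text.trim()}`
  );
}

// The whole transcription as a single string.
function segmentsToText(segments) {
  return segments.map((s) => s.text.trim()).join(" ");
}

const exampleResult = [
  { start: 0.0, end: 4.8, text: "Hello guys! So today we're going to...." },
];

console.log(segmentsToLines(exampleResult).join("\n"));
// [00:00.000 - 00:04.800] Hello guys! So today we're going to....
```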

Using the Whisper API Dashboard

If you prefer to use the Whisper API dashboard, you can do so by following these steps:

  1. Go to the Whisper API dashboard.

  2. Click on the “Upload” button to upload your audio/video file.

  3. Select the language of the audio file and the model size you want to use.

[Screenshot: Transcription modal]

  4. Click on the “Start Transcription” button to start the transcription process.

  5. Wait for the file to be uploaded and the transcription to be completed. You will see a progress bar indicating the status of the transcription.

[Screenshot: Transcription progress bar]

  6. Once the transcription is complete, you will see a “Download” button next to your transcription. Click on it and select the “Export as JSON” option to download the transcription result in JSON format.

[Screenshot: Transcription download button]

Conclusion

And that’s it! You have successfully transcribed your audio file to JSON using the Whisper API. You can now use this JSON data in your application as needed.
