How to handle large audio files with Whisper API in openai-python without hitting memory limits? #2547
Hi,
Replies: 3 comments
Think I need anything from the center
If you're running into memory or timeout issues with the Whisper API when transcribing long recordings (1+ hours), avoid sending the entire file in a single request. Large uploads increase both memory usage and the chance of hitting request timeouts.
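As a quick pre-check, you can test whether a file even needs splitting before uploading. This is a minimal sketch; the helper name is illustrative, and the 25 MB figure is the Whisper API's documented per-request upload limit:

```python
import os

# The Whisper API rejects uploads larger than 25 MB per request
MAX_WHISPER_BYTES = 25 * 1024 * 1024

def needs_chunking(file_path: str) -> bool:
    """Return True if the file exceeds the Whisper API upload limit."""
    return os.path.getsize(file_path) > MAX_WHISPER_BYTES
```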
The Whisper API has a 25 MB file size limit per request. For long recordings (1+ hour), you need to split the audio into chunks before sending. Here's a reliable approach using pydub:

```python
from pydub import AudioSegment
from openai import OpenAI
import io

client = OpenAI()

def transcribe_large_file(file_path: str, chunk_length_ms: int = 10 * 60 * 1000) -> str:
    """Transcribe a large audio file by splitting it into chunks.

    Args:
        file_path: Path to the audio file
        chunk_length_ms: Chunk size in milliseconds (default 10 minutes)
    """
    audio = AudioSegment.from_file(file_path)
    full_transcript = []
    for i in range(0, len(audio), chunk_length_ms):
        chunk = audio[i:i + chunk_length_ms]
        # Export chunk to an in-memory buffer (avoids writing temp files to disk)
        buf = io.BytesIO()
        chunk.export(buf, format="mp3", bitrate="64k")  # compress to stay under 25MB
        buf.seek(0)
        buf.name = "chunk.mp3"  # the SDK uses the name to infer the file type
        transcript = client.audio.transcriptions.create(
            model="whisper-1",
            file=buf,
            response_format="text",
        )
        full_transcript.append(transcript)
        print(f"Transcribed chunk {i // chunk_length_ms + 1}")
    return " ".join(full_transcript)

result = transcribe_large_file("long_recording.wav")
```

One thing to keep in mind: `AudioSegment.from_file` still decodes the whole recording into memory. If even that is too much, split the file on disk with ffmpeg instead:

```shell
# Split a file into 10-minute chunks without loading it into memory
ffmpeg -i long_recording.wav -f segment -segment_time 600 -c:a libmp3lame -b:a 64k chunk_%03d.mp3
```

Then iterate over the chunk files and transcribe each one.
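That final iteration step can be sketched like this. The helper names are illustrative, and the glob pattern assumes the `chunk_%03d.mp3` naming produced by the ffmpeg command above:

```python
from pathlib import Path

def sorted_chunks(directory: str) -> list[Path]:
    # ffmpeg's zero-padded %03d numbering makes lexicographic order match playback order
    return sorted(Path(directory).glob("chunk_*.mp3"))

def transcribe_chunks(directory: str = ".") -> str:
    from openai import OpenAI  # requires the openai package and OPENAI_API_KEY set
    client = OpenAI()
    parts = []
    for chunk_path in sorted_chunks(directory):
        with chunk_path.open("rb") as f:
            parts.append(client.audio.transcriptions.create(
                model="whisper-1",
                file=f,
                response_format="text",
            ))
    return " ".join(parts)
```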