Can FFmpeg handle AI-generated videos?

Yes, with pre-validation. AI-generated files can use non-standard codecs, so we run an ffprobe check before every processing step.

How much does it cost vs cloud services?

FFmpeg is free (LGPL). At 1,500+ files/day we save over 80% vs AWS MediaConvert.

How does it integrate with Node.js?

We run FFmpeg as a child process with spawn(). The command is generated dynamically, and BullMQ handles the job queue.


Software & AI | July 12, 2025

FFmpeg in the real world: how we use it in LetsAI to process AI video and audio

ffmpeg video audio letsai transcoding automation

FFmpeg is one of those tools everyone knows by name but few run in production. We put it at the center of LetsAI's pipeline for transcoding, format conversion and compositing. Here is how it handles around 1,500 files per day.


Why FFmpeg and not cloud services

Our first temptation was a managed service such as AWS MediaConvert or Cloudflare Stream. Then we did the math: at LetsAI's file volume, the cost would be unsustainable. A 30-second AI video at 1080p weighs 50-80MB. Hundreds of generations per day add up to terabytes per month. FFmpeg runs on our own servers, costs nothing in licenses (LGPL), and covers everything we need. The trade-off is that you must know how to configure it, and the docs aren't user-friendly. But after a few weeks we had a solid pipeline, and it hasn't failed in 18 months of production.
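The volume side of that math can be sketched in a few lines. This is only a rough estimate under loud assumptions: it takes the article's figure of 1,500 files per day and treats every file as a 1080p video in the 50-80MB range, which overestimates because some files are audio. The function name `monthlyVolumeTB` is hypothetical, and no cloud pricing is modeled.

```typescript
// Rough monthly data volume in decimal terabytes (1 TB = 1,000,000 MB).
// Assumption: every file has the same average size, which is a simplification.
function monthlyVolumeTB(
  filesPerDay: number,
  avgFileMB: number,
  daysPerMonth: number = 30,
): number {
  return (filesPerDay * avgFileMB * daysPerMonth) / 1000000;
}

// Lower and upper bound for 1,500 files per day at 50-80MB each.
const lowTB = monthlyVolumeTB(1500, 50); // 2.25
const highTB = monthlyVolumeTB(1500, 80); // 3.6

console.log(`Estimated input volume: ${lowTB}-${highTB} TB per month`);
```

Under those assumptions the raw inputs alone come to roughly 2-4 TB per month, before counting the converted outputs.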

The pipeline: from raw AI file to final format

When an AI provider returns video or audio, the format is almost always different from what's needed. One model generates WebM, but the user wants MP4. Another generates WAV at 48kHz when we need MP3 at 44.1kHz. The pipeline:

1. Automatic analysis with ffprobe (codec, resolution, bitrate, duration)
2. Audio normalization: loudness at -14 LUFS (a common streaming target), silence removal
3. Video transcoding: H.264 for universal compatibility
4. Thumbnail generation: frame at 1/3 of the duration, 640x360
5. Output in the requested formats (MP4, WebM, MP3, WAV, FLAC)

Everything runs in a separate BullMQ worker. Average time: 8-15 seconds for a 30-second video.
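The analysis, transcoding and thumbnail steps above can be sketched in TypeScript as follows. This is a minimal sketch, not LetsAI's actual code: the helper names (`summarizeProbe`, `buildMp4Args`, `buildThumbnailArgs`) and the encoder settings (`-crf 23`, `-preset medium`, AAC audio, single-pass `loudnorm`) are illustrative assumptions. It assumes ffprobe was run with `-v error -print_format json -show_format -show_streams` and that its output has already been parsed as JSON. Silence removal is left out.

```typescript
// Subset of the ffprobe JSON fields used here (ffprobe reports many more).
interface ProbeStream {
  codec_type?: string;
  codec_name?: string;
  width?: number;
  height?: number;
}
interface ProbeResult {
  streams?: ProbeStream[];
  format?: { duration?: string; bit_rate?: string };
}
interface MediaSummary {
  durationSec: number;
  hasVideo: boolean;
  hasAudio: boolean;
  videoCodec: string | undefined;
  width: number | undefined;
  height: number | undefined;
}

// Step 1: reduce the ffprobe output to the facts the later steps need.
// Throws when no stream or no positive duration is reported, which is one
// possible symptom of a truncated file.
function summarizeProbe(probe: ProbeResult): MediaSummary {
  const streams = probe.streams ?? [];
  const video = streams.find((s) => s.codec_type === "video");
  const audio = streams.find((s) => s.codec_type === "audio");
  const durationSec = Number(probe.format?.duration); // ffprobe emits a string
  if (!video && !audio) throw new Error("no audio or video stream found");
  if (!Number.isFinite(durationSec) || durationSec <= 0) {
    throw new Error("ffprobe reported no usable duration");
  }
  return {
    durationSec,
    hasVideo: video !== undefined,
    hasAudio: audio !== undefined,
    videoCodec: video?.codec_name,
    width: video?.width,
    height: video?.height,
  };
}

// Steps 2-3: H.264 video plus audio normalized to -14 LUFS, in an MP4 container.
function buildMp4Args(input: string, output: string, media: MediaSummary): string[] {
  const args = ["-y", "-i", input];
  if (media.hasVideo) {
    args.push("-c:v", "libx264", "-preset", "medium", "-crf", "23", "-pix_fmt", "yuv420p");
  }
  if (media.hasAudio) {
    // -ar fixes the output sample rate after the loudnorm filter.
    args.push("-af", "loudnorm=I=-14:TP=-1.5:LRA=11", "-ar", "44100", "-c:a", "aac");
  }
  args.push("-movflags", "+faststart", output);
  return args;
}

// Step 4: a single 640x360 frame taken at one third of the duration.
function buildThumbnailArgs(input: string, output: string, media: MediaSummary): string[] {
  const seekSec = (media.durationSec / 3).toFixed(3);
  return ["-y", "-ss", seekSec, "-i", input, "-frames:v", "1", "-vf", "scale=640:360", output];
}
```

Each function returns an argument array rather than a shell string, so it can be passed directly to `spawn("ffmpeg", args)` without quoting problems.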

Concatenation: merging AI clips into a coherent video

One of the most requested features is generating multiple short clips and merging them. It sounds simple, but in practice it's a nightmare when clips have different codecs, resolutions or frame rates. FFmpeg has the concat filter, but it only works if the inputs share the same specs, and in the real world they don't. Our solution: normalize every clip first (same resolution, codec, frame rate), then concatenate. We crossfade between scenes with xfade; 0.5 seconds is enough for continuity. For audio: loudness normalization, crossfade between tracks, final mix. The resulting FFmpeg command runs to 20-30 lines and is generated dynamically in Node.js.
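A minimal sketch of how such a command could be generated is shown below. It is not the production generator: the function name `buildConcatFilter`, the 1920x1080 / 30 fps target and the label names are assumptions, and it assumes every clip has both a video and an audio stream and is longer than the crossfade. Each clip is first normalized to the same size, frame rate and pixel format, then chained with `xfade` (video) and `acrossfade` (audio). Loudness normalization is omitted to keep it short.

```typescript
interface ConcatPlan {
  filterComplex: string; // value for -filter_complex
  videoOut: string; // label to pass to -map for video
  audioOut: string; // label to pass to -map for audio
}

function buildConcatFilter(
  durationsSec: number[],
  fadeSec: number = 0.5,
  width: number = 1920,
  height: number = 1080,
  fps: number = 30,
): ConcatPlan {
  if (durationsSec.length === 0) throw new Error("at least one clip is required");
  if (durationsSec.some((d) => d <= fadeSec)) {
    throw new Error("every clip must be longer than the crossfade");
  }

  const parts: string[] = [];

  // Normalize every input so xfade and acrossfade see matching streams.
  // Note: a plain scale stretches clips that have a different aspect ratio.
  durationsSec.forEach((_, i) => {
    parts.push(`[${i}:v]scale=${width}:${height},setsar=1,fps=${fps},format=yuv420p[v${i}]`);
    parts.push(`[${i}:a]aformat=sample_rates=44100:channel_layouts=stereo[a${i}]`);
  });

  let videoOut = "v0";
  let audioOut = "a0";
  let elapsedSec = 0; // length of the merged result so far

  durationsSec.forEach((duration, i) => {
    if (i === 0) {
      elapsedSec = duration;
      return;
    }
    // The fade starts fadeSec before the end of what has been merged so far.
    const offset = (elapsedSec - fadeSec).toFixed(3);
    parts.push(
      `[${videoOut}][v${i}]xfade=transition=fade:duration=${fadeSec}:offset=${offset}[vx${i}]`,
    );
    parts.push(`[${audioOut}][a${i}]acrossfade=d=${fadeSec}[ax${i}]`);
    videoOut = `vx${i}`;
    audioOut = `ax${i}`;
    elapsedSec += duration - fadeSec; // each crossfade overlaps the two clips
  });

  return {
    filterComplex: parts.join(";"),
    videoOut: `[${videoOut}]`,
    audioOut: `[${audioOut}]`,
  };
}

// Assemble the full argument list for spawn("ffmpeg", args).
function buildConcatArgs(inputs: string[], durationsSec: number[], output: string): string[] {
  if (inputs.length !== durationsSec.length) {
    throw new Error("one duration is required per input");
  }
  const plan = buildConcatFilter(durationsSec);
  const args = ["-y"];
  inputs.forEach((path) => args.push("-i", path));
  args.push("-filter_complex", plan.filterComplex);
  args.push("-map", plan.videoOut, "-map", plan.audioOut);
  args.push("-c:v", "libx264", "-c:a", "aac", output);
  return args;
}
```

The key detail is the `offset` arithmetic: every crossfade shortens the merged timeline by the fade duration, so the offsets have to be accumulated rather than computed from the raw clip lengths.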

Common mistakes and how we solved them

FFmpeg is powerful, but it punishes mistakes. Ours:

Memory: FFmpeg can eat all available RAM. We use -threads 2 and a limited -bufsize, with a maximum of 2GB per worker.
Corrupted files: AI providers sometimes return truncated files. We check with ffprobe first; if the file is corrupted, we discard it and notify.
Timeout: any transcoding that runs over 60 seconds gets killed. Better to fail fast than block the queue.
Permissions: the worker runs as a dedicated user with access only to the working directory.

Every error became an automated CI test. Today: 1,500 files/day, error rate under 0.3%.
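The timeout rule above can be sketched with Node's `spawn()`. This is an illustrative sketch, not the production worker: the function name `runProcess` and the `RunResult` shape are assumptions, and only the 60-second default comes from the article. The child is killed with SIGKILL once the deadline passes, and the tail of stderr is kept for error reports.

```typescript
import { spawn } from "node:child_process";

interface RunResult {
  ok: boolean; // true only for exit code 0 without a timeout
  timedOut: boolean;
  exitCode: number | null;
  stderrTail: string; // last part of stderr, where FFmpeg writes its logs
}

const DEFAULT_TIMEOUT_MS = 60000; // the 60-second limit from the article

function runProcess(
  command: string,
  args: string[],
  timeoutMs: number = DEFAULT_TIMEOUT_MS,
): Promise<RunResult> {
  return new Promise<RunResult>((resolve, reject) => {
    const child = spawn(command, args, { stdio: ["ignore", "ignore", "pipe"] });
    let stderrTail = "";
    let timedOut = false;

    // Fail fast: kill the process instead of letting it block the queue.
    const timer = setTimeout(() => {
      timedOut = true;
      child.kill("SIGKILL");
    }, timeoutMs);

    child.stderr?.on("data", (chunk: Buffer) => {
      // Keep only the last 2000 characters so memory use stays bounded.
      stderrTail = (stderrTail + chunk.toString()).slice(-2000);
    });

    // Emitted when the process could not be started (for example, binary not found).
    child.on("error", (err: Error) => {
      clearTimeout(timer);
      reject(err);
    });

    child.on("close", (code: number | null) => {
      clearTimeout(timer);
      resolve({ ok: !timedOut && code === 0, timedOut, exitCode: code, stderrTail });
    });
  });
}
```

In a worker this might be called as `runProcess("ffmpeg", args)`, with `-threads 2` included among the output options in `args`. A result with `timedOut` set to true would then mark the job as failed rather than retrying it indefinitely.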

Securvita S.r.l. — i3k.eu