A Python script that converts MP3 audio files to MP4 videos optimized for social media platforms like YouTube Shorts, Instagram Reels, and TikTok. Features include:
- Portrait format (9:16 aspect ratio) optimized for mobile viewing
- Auto-generated captions using OpenAI's Whisper speech recognition
- Brand logos automatically added to videos
- Ocean-themed gradient background fitting for marine biology content
- Transparent background support for overlay videos
- Batch processing with skip/force options
- Converts MP3 files to MP4 videos in portrait format (1080x1920)
- Automatically downloads and adds RummerLab and PhysioShark logos
- Generates captions from audio using Whisper AI
- Creates beautiful ocean-themed gradient backgrounds
- Processes multiple files in batch
- Skips already converted files (unless forced)
- Comprehensive logging and error handling
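The skip/force behaviour listed above can be sketched as a small planning step. This is a minimal illustration, not the script's actual code: the function name `plan_conversions` and its return shape are assumptions made for the example.

```python
from pathlib import Path


def plan_conversions(input_dir, output_dir, force=False):
    """Split the MP3 files in input_dir into (to_convert, skipped) lists.

    Each entry is an (mp3_path, mp4_path) pair. A file is skipped when its
    target MP4 already exists and force is False.
    Hypothetical helper: the real script may organise this differently.
    """
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    to_convert, skipped = [], []
    # Note: this pattern is case-sensitive on most Linux filesystems.
    for mp3 in sorted(input_dir.glob("*.mp3")):
        target = output_dir / (mp3.stem + ".mp4")
        if target.exists() and not force:
            skipped.append((mp3, target))
        else:
            to_convert.append((mp3, target))
    return to_convert, skipped
```

Calling `plan_conversions("input", "output", force=True)` would return every MP3 in the first list, which mirrors what the `-f` flag is described as doing.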
- Clone or download this repository
    git clone <repository-url>
    cd mp3-mp4
- Install Python dependencies
    pip install -r requirements.txt
- Install FFmpeg (required by MoviePy)
  - Windows: Download from the FFmpeg website and add to PATH
  - macOS: brew install ffmpeg
  - Linux: sudo apt install ffmpeg (Ubuntu/Debian) or sudo yum install ffmpeg (CentOS/RHEL)
- Configure environment variables (optional)
    python setup_env.py
  Or manually copy env.template to .env and edit the values.
- Create input and output folders (automatically created if they don't exist)
    mp3-mp4/
    ├── input/   # Place your MP3 files here
    ├── output/  # Converted MP4 files will be saved here
    └── mp3_to_mp4_converter.py
- Place your MP3 files in the input folder
- Run the converter
    python mp3_to_mp4_converter.py
# Use custom input/output folders
python mp3_to_mp4_converter.py -i /path/to/input -o /path/to/output
# Force conversion (overwrite existing files)
python mp3_to_mp4_converter.py -f
# Combine options
python mp3_to_mp4_converter.py -i /path/to/input -o /path/to/output -f
Command-line options:
- -i, --input: Input folder containing MP3 files (default: "input")
- -o, --output: Output folder for MP4 files (default: "output")
- -f, --force: Force conversion even if output file already exists
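For readers who want to see how these three options map onto Python's standard `argparse` module, here is a minimal sketch. The flags and defaults come from the list above; the function name `build_parser` and the help strings are assumptions, and the real script may define its parser differently.

```python
import argparse


def build_parser():
    """Build a parser with the three documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Convert MP3 files to portrait MP4 videos."
    )
    parser.add_argument("-i", "--input", default="input",
                        help="Input folder containing MP3 files")
    parser.add_argument("-o", "--output", default="output",
                        help="Output folder for MP4 files")
    parser.add_argument("-f", "--force", action="store_true",
                        help="Convert even if the output file already exists")
    return parser


# Passing an explicit list keeps the example independent of sys.argv.
args = build_parser().parse_args(["-i", "talks", "-f"])
print(args.input, args.output, args.force)  # talks output True
```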
- Resolution: 1080x1920 (portrait format)
- Frame rate: 30 FPS
- Codec: H.264 video, AAC audio
- Aspect ratio: 9:16 (optimized for mobile/social media)
- Background: Ocean-themed gradient (blue tones) or transparent
- Logos: RummerLab (top-left) and PhysioShark (top-right)
- Captions: White text with black outline, positioned at bottom
- Transparency: Optional alpha channel support for overlay videos
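As an illustration of the gradient background, the sketch below computes one RGB colour per row by linear interpolation between a top and a bottom colour. It uses only the standard library; the function name `gradient_rows` and the two blue tones are assumptions chosen for the example, and the actual script may build its frames another way (for instance with NumPy arrays passed to MoviePy).

```python
def gradient_rows(height, top_rgb, bottom_rgb):
    """Return a list of `height` RGB tuples fading from top_rgb to bottom_rgb."""
    if height < 1:
        raise ValueError("height must be at least 1")
    rows = []
    for y in range(height):
        # t runs from 0.0 at the top row to 1.0 at the bottom row.
        t = y / (height - 1) if height > 1 else 0.0
        rows.append(tuple(
            round(top + (bottom - top) * t)
            for top, bottom in zip(top_rgb, bottom_rgb)
        ))
    return rows


# Hypothetical ocean-like colours; the project's real values are not stated here.
rows = gradient_rows(1920, (0, 105, 148), (0, 30, 60))
```

Repeating each row colour across the 1080-pixel width would give a full 1080x1920 background frame.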
The script automatically downloads logos from:
- RummerLab: https://rummerlab.com/images/rummerlab_logo_transparent.png
- PhysioShark: https://physioshark.org/images/logo-physioshark-project.png
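A download step that tolerates failures might look like the following sketch. It relies only on `urllib.request` from the standard library; the function name `fetch_logo`, the caching behaviour, and the timeout value are assumptions for illustration and are not taken from the script itself.

```python
import urllib.request
from pathlib import Path


def fetch_logo(url, dest, timeout=10):
    """Download url to dest and return the path, or None if the download fails.

    An existing file at dest is reused rather than downloaded again.
    Hypothetical helper: the real script's download logic is not shown here.
    """
    dest = Path(dest)
    if dest.exists():
        return dest
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            data = response.read()
    except (OSError, ValueError):
        # URLError and HTTPError are OSError subclasses; ValueError covers
        # malformed URLs. Returning None lets the caller continue without a logo.
        return None
    dest.write_bytes(data)
    return dest
```

A caller could then add the logo only when the return value is not `None`, so a failed download does not abort the conversion.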
The converter supports environment variables for configuration. Create a .env file or use the setup script:
python setup_env.py
Supported variables include:
- OPENAI_API_KEY: Your OpenAI API key for Whisper transcription
- VIDEO_WIDTH: Video width (default: 480)
- VIDEO_HEIGHT: Video height (default: 854)
- ENABLE_CAPTIONS: Enable/disable captions (true/false)
- ENABLE_LOGOS: Enable/disable logos (true/false)
- BG_TOP_RED/GREEN/BLUE: Background gradient top color (RGB)
- BG_BOTTOM_RED/GREEN/BLUE: Background gradient bottom color (RGB)
- TRANSPARENT_BACKGROUND: Enable transparent background (true/false)
See env.template for all available options.
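The sketch below shows one way such variables could be read with sensible fallbacks. The variable names and the width/height defaults come from the list above; the helper names, the set of accepted "true" spellings, and the boolean defaults are assumptions, so check env.template for the values the project actually uses.

```python
import os

TRUE_VALUES = {"1", "true", "yes", "on"}  # assumed spellings of "true"


def env_bool(name, default, env=None):
    """Read a true/false variable, falling back to default when unset or empty."""
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() in TRUE_VALUES


def env_int(name, default, env=None):
    """Read an integer variable, falling back to default when unset or invalid."""
    env = os.environ if env is None else env
    raw = env.get(name)
    try:
        return int(raw) if raw is not None else default
    except ValueError:
        return default


def load_settings(env=None):
    """Collect a few settings into a dict (boolean defaults are assumptions)."""
    return {
        "width": env_int("VIDEO_WIDTH", 480, env),
        "height": env_int("VIDEO_HEIGHT", 854, env),
        "captions": env_bool("ENABLE_CAPTIONS", True, env),
        "logos": env_bool("ENABLE_LOGOS", True, env),
        "transparent": env_bool("TRANSPARENT_BACKGROUND", False, env),
    }
```

Passing a plain dict as `env` makes the functions easy to try out without touching the real environment, for example `load_settings({"VIDEO_WIDTH": "1080", "VIDEO_HEIGHT": "1920"})`.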
The script uses OpenAI's Whisper model to:
- Transcribe the audio content (local model or API)
- Segment the text into readable chunks
- Time-sync captions with the audio
- Display captions as white text with a black outline at the bottom of the frame
- Local Whisper: Uses local model (requires more disk space)
- OpenAI API: Uses cloud API (requires API key, faster)
- Disabled: Skip captions entirely
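To make the "segment into readable chunks" and "time-sync" steps concrete, here is a minimal sketch that groups timed words into caption chunks of limited length. It assumes word-level timings are already available as `(text, start, end)` tuples; how the script actually obtains and segments Whisper's output is not stated here, and the function name `chunk_captions` and the 32-character limit are illustrative choices.

```python
def chunk_captions(words, max_chars=32):
    """Group (text, start, end) word tuples into caption chunks.

    Each chunk is (caption_text, start_time, end_time), where the times are
    taken from the first and last word in the chunk.
    """
    chunks = []
    current, start, end = [], None, None
    for text, word_start, word_end in words:
        candidate = " ".join(current + [text])
        # Close the current chunk if adding this word would make it too long.
        if current and len(candidate) > max_chars:
            chunks.append((" ".join(current), start, end))
            current, start = [], None
        if start is None:
            start = word_start
        current.append(text)
        end = word_end
    if current:
        chunks.append((" ".join(current), start, end))
    return chunks


words = [("Sharks", 0.0, 0.4), ("breathe", 0.4, 0.9),
         ("through", 0.9, 1.2), ("gills", 1.2, 1.6)]
print(chunk_captions(words, max_chars=15))
# [('Sharks breathe', 0.0, 0.9), ('through gills', 0.9, 1.6)]
```

Each resulting tuple carries the display window for one caption, which is the information a video library needs to show the text at the right moment.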
To create videos with transparent backgrounds (useful for overlays):
- Set the environment variable:
    export TRANSPARENT_BACKGROUND=true
- Or modify config.json:
    { "transparent_background": true }
- Output format: Videos will be saved as .mov files with alpha channel support
Note: Transparent videos are ideal for:
- Overlaying on other videos
- Creating video effects
- Professional video editing workflows
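Since transparent output is saved as .mov rather than .mp4, the output filename depends on the transparency setting. A minimal sketch of that choice follows; the function name `output_path` is hypothetical and the script may derive filenames differently.

```python
from pathlib import Path


def output_path(mp3_path, output_dir, transparent=False):
    """Return the target video path: .mov for transparent output, else .mp4."""
    suffix = ".mov" if transparent else ".mp4"
    return Path(output_dir) / (Path(mp3_path).stem + suffix)
```

For example, `output_path("input/interview1.mp3", "output", transparent=True)` yields a path ending in `interview1.mov`.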
- FFmpeg not found
  - Install FFmpeg and ensure it's in your system PATH
  - Restart your terminal after installation
- Missing dependencies
    pip install -r requirements.txt
- Whisper model download issues
  - The script will continue without captions if Whisper fails to load
  - Check your internet connection for initial model download
- Logo download failures
  - The script will continue without logos if downloads fail
  - Check your internet connection and logo URLs
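The first two issues above can be checked before running a conversion. The sketch below is a hypothetical pre-flight helper, not part of the script: it uses `shutil.which` to look for executables on PATH and `importlib.util.find_spec` to test whether top-level Python modules are importable. The default names (`ffmpeg`, `moviepy`) are taken from the installation notes; adjust them to match requirements.txt.

```python
import importlib.util
import shutil


def find_missing(executables=("ffmpeg",), modules=("moviepy",)):
    """Return the names of executables and top-level modules that cannot be found."""
    missing = [exe for exe in executables if shutil.which(exe) is None]
    missing += [mod for mod in modules if importlib.util.find_spec(mod) is None]
    return missing


problems = find_missing()
if problems:
    print("Missing prerequisites:", ", ".join(problems))
```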
- First run: May take longer as Whisper model downloads (~1GB)
- Large files: Consider processing during off-peak hours
- Storage: Ensure sufficient disk space for video output
- Memory: Video processing can be memory-intensive
# 1. Install dependencies
pip install -r requirements.txt
# 2. Place MP3 files in input folder
cp /path/to/your/interviews/*.mp3 input/
# 3. Run conversion
python mp3_to_mp4_converter.py
# 4. Check output folder for MP4 files
ls output/
mp3-mp4/
├── input/ # MP3 input files
│ ├── interview1.mp3
│ ├── interview2.mp3
│ └── ...
├── output/ # MP4 output files
│ ├── interview1.mp4
│ ├── interview2.mp4
│ ├── rummerlab_logo.png # Downloaded logos
│ └── physioshark_logo.png
├── mp3_to_mp4_converter.py # Main script
├── requirements.txt # Python dependencies
└── README.md # This file
This project is open source. Feel free to modify and distribute as needed.
For issues or questions, please check the troubleshooting section above or create an issue in the repository.