Gemini TTS uses a dedicated config file instead of Azure's .env or Polly's
AWS config path. Keeping Gemini auth and region settings in their own file
makes provider-specific failures much easier to trace.
- Create a local .gemini.env file in INI format.
- Create or choose a Google Cloud service account JSON file with access to
  Cloud Text-to-Speech and the aiplatform.endpoints.predict permission.
- Open Tools > Settings.
- Set TTS Provider to Gemini TTS.
- Point Gemini Config File at the dedicated config file.
- Choose a Gemini model, language code, voice, and optional style prompt.
- Use Test Gemini TTS.
[GEMINI]
project_id = YOUR_GOOGLE_CLOUD_PROJECT_ID
service_account_json = C:\path\to\service-account.json
region = global

project_id and service_account_json are required. region defaults to
global if omitted.
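
As a rough illustration of the rules above, a config file in this shape could be read with Python's standard configparser module. This is only a sketch, not the app's actual loader: the function name load_gemini_config and the use of ValueError are assumptions made for the example.

```python
import configparser
from pathlib import Path

# Keys the section above describes as required.
REQUIRED_KEYS = ("project_id", "service_account_json")


def load_gemini_config(path):
    """Return the [GEMINI] settings as a dict (hypothetical helper)."""
    # interpolation=None keeps any '%' characters in paths literal.
    parser = configparser.ConfigParser(interpolation=None)
    # Reading the text ourselves makes a missing file raise FileNotFoundError.
    parser.read_string(Path(path).read_text(encoding="utf-8"))

    if not parser.has_section("GEMINI"):
        raise ValueError(f"{path}: missing [GEMINI] section")
    section = parser["GEMINI"]

    missing = [
        key for key in REQUIRED_KEYS
        if not section.get(key, fallback="").strip()
    ]
    if missing:
        raise ValueError(f"{path}: missing required key(s): {', '.join(missing)}")

    return {
        "project_id": section["project_id"].strip(),
        "service_account_json": section["service_account_json"].strip(),
        # region falls back to "global" when omitted or left blank.
        "region": section.get("region", fallback="").strip() or "global",
    }
```

Calling load_gemini_config on a file that sets only project_id and service_account_json would return a dict whose region is "global".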
The app currently exposes these Gemini-TTS model choices:
- gemini-2.5-flash-tts
- gemini-2.5-pro-tts
Based on the current Google Cloud Gemini-TTS docs:
- the text field can be up to 4,000 bytes
- the prompt field can be up to 4,000 bytes
- text + prompt can be up to 8,000 bytes total
If those limits are exceeded, the app raises a clear Gemini-specific error instead of sending a bad request.
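
A pre-flight check along these lines could enforce the limits before any request is sent. This is a minimal sketch under stated assumptions: sizes are measured as UTF-8 bytes, and the names GeminiTTSLimitError and check_gemini_tts_limits are hypothetical rather than the app's real identifiers.

```python
# Limits quoted above from the Google Cloud Gemini-TTS docs.
MAX_TEXT_BYTES = 4000
MAX_PROMPT_BYTES = 4000
MAX_COMBINED_BYTES = 8000


class GeminiTTSLimitError(ValueError):
    """Hypothetical error type for oversized Gemini TTS input."""


def check_gemini_tts_limits(text, prompt=""):
    """Return (text_bytes, prompt_bytes) or raise GeminiTTSLimitError."""
    # Assumption: byte counts are taken over the UTF-8 encoding.
    text_bytes = len(text.encode("utf-8"))
    prompt_bytes = len(prompt.encode("utf-8"))

    if text_bytes > MAX_TEXT_BYTES:
        raise GeminiTTSLimitError(
            f"Gemini TTS text is {text_bytes} bytes; limit is {MAX_TEXT_BYTES}"
        )
    if prompt_bytes > MAX_PROMPT_BYTES:
        raise GeminiTTSLimitError(
            f"Gemini TTS prompt is {prompt_bytes} bytes; limit is {MAX_PROMPT_BYTES}"
        )
    # With the current numbers this cannot fire once both checks above pass;
    # it is kept so the combined limit still holds if a per-field limit changes.
    if text_bytes + prompt_bytes > MAX_COMBINED_BYTES:
        raise GeminiTTSLimitError(
            f"Gemini TTS text + prompt is {text_bytes + prompt_bytes} bytes; "
            f"limit is {MAX_COMBINED_BYTES}"
        )
    return text_bytes, prompt_bytes
```

Because the limits are in bytes rather than characters, non-ASCII text reaches them sooner: a string of 2,001 "é" characters is 4,002 bytes in UTF-8 and would be rejected by this check.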
- Gemini TTS is prompt-controlled and does not use SSML in this app path.
- Gemini-TTS supports style prompts, single-speaker output, and multi-speaker output at the provider capability level.
- Region/model availability depends on the configured Google Cloud region.