tts voice free download

Showing 64 open source projects for "tts voice"

View related business solutions

Python Clear Filters & Widen Search

Cut Data Warehouse Costs up to 54% with BigQuery
Migrate from Snowflake, Databricks, or Redshift with free migration tools. Exabyte scale without the Exabyte price.

BigQuery delivers up to 54% lower TCO than cloud alternatives. Migrate from legacy or competing warehouses using free BigQuery Migration Service with automated SQL translation. Get serverless scale with no infrastructure to manage, compressed storage, and flexible pricing—pay per query or commit for deeper discounts. New customers get $300 in free credit.

Try BigQuery Free
AI-generated apps that pass security review
Stop waiting on engineering. Build production-ready internal tools with AI—on your company data, in your cloud.

Retool lets you generate dashboards, admin panels, and workflows directly on your data. Type something like “Build me a revenue dashboard on my Stripe data” and get a working app with security, permissions, and compliance built in from day one. Whether on our cloud or self-hosted, create the internal software your team needs without compromising enterprise standards or control.

Try Retool free
1

Voice-Pro

Comprehensive Gradio WebUI for audio processing

Voice-Pro is the best gradio WebUI for transcription, translation and text-to-speech. It can be easily installed with one click. Create a virtual environment using Miniconda, running completely separate from the Windows system (fully portable). Supports real-time transcription and translation, as well as batch mode.

1 Review

Downloads: 16 This Week

Last Update: 2025-12-05
See Project
2

Qwen3-TTS

Qwen3-TTS is an open-source series of TTS models

...The project includes pre-trained models and inference scripts that let users synthesize speech locally or integrate TTS into larger pipelines such as voice assistants, accessibility tools, or multimedia generation workflows. Because it’s part of the broader Qwen ecosystem, it benefits from the model’s understanding of linguistic nuances, enabling more accurate pronunciation, prosody, and contextual delivery than many traditional TTS systems. Developers can customize voice output parameters like speed, pitch, and volume, and combine the TTS stack with other AI components.

Downloads: 3 This Week

Last Update: 2026-02-06
See Project
3

Spark TTS

Spark-TTS Inference Code

...The project supports zero-shot voice cloning, meaning it can imitate a new speaker’s voice without dedicated training for that specific voice, and works across languages, including English and Chinese, even in cross-lingual code-switching scenarios. Spark-TTS allows users to control speech characteristics like gender, pitch, and speaking rate to customize synthesized output and support virtual speaker creation.

Downloads: 2 This Week

Last Update: 2026-02-04
See Project
4

Sopro TTS

A lightweight text-to-speech model with zero-shot voice cloning

Sopro TTS is an open-source text-to-speech (TTS) project that implements a lightweight model capable of producing speech from text with zero-shot voice cloning, meaning it can mimic a speaker’s voice from only a few seconds of reference audio. Built with a 169 million-parameter architecture that uses dilated convolutions and cross-attention layers instead of large Transformer stacks, it achieves relatively fast real-time performance even on CPUs (about a 0.25 real-time factor measured on an M3 base). ...

Downloads: 0 This Week

Last Update: 2026-02-06
See Project
Our Free Plans just got better! | Auth0
With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.

Try free now
5

Kitten TTS

State-of-the-art TTS model under 25MB

KittenTTS is an open-source, ultra-lightweight, and high-quality text-to-speech model featuring just 15 million parameters and a binary size under 25 MB. It is designed for real-time CPU-based deployment across diverse platforms. Ultra-lightweight, model size less than 25MB. CPU-optimized, runs without GPU on any device. High-quality voices, several premium voice options available. Fast inference, optimized for real-time speech synthesis.

Downloads: 8 This Week

Last Update: 1 day ago
See Project
6

GLM-TTS

Controllable & emotion-expressive zero-shot TTS

GLM-TTS is an advanced text-to-speech synthesis system built on large language model technologies that focuses on producing high-quality, expressive, and controllable spoken output, including features like emotion modulation and zero-shot voice cloning. It uses a two-stage architecture where a generative LLM first converts text into intermediate speech token sequences and then a Flow-based neural model converts those tokens into natural audio waveforms, enabling rich prosody and voice character even for unseen speakers. ...

Downloads: 0 This Week

Last Update: 2026-01-20
See Project
7

edge-tts

Use Microsoft Edge's online text-to-speech service from Python

edge-tts is a Python module and command-line tool that gives you direct access to Microsoft Edge’s online text-to-speech service without needing the Edge browser, Windows, or any API key. It wraps the same cloud voices used by Edge, exposing them through a simple CLI (edge-tts, edge-playback) and a Python API, so you can script high-quality speech generation in your own applications.

Downloads: 18 This Week

Last Update: 2025-12-12
See Project
8

GPT-SoVITS

1 min voice data can also be used to train a good TTS model

GPT‑SoVITS is a state-of-the-art voice conversion and TTS system that enables zero‑shot and few‑shot synthesis based on a short vocal sample (e.g., 5 seconds). It supports cross‑lingual speech synthesis across English, Chinese, Japanese, Korean, Cantonese, and more. It's powered by VITS architecture enhanced for few‑sample adaptation and real‑time usability.

Downloads: 49 This Week

Last Update: 2025-07-29
See Project
9

clone-voice

A sound cloning tool with a web interface, using your voice

Clone-voice is a local voice-cloning tool that lets you synthesize speech in any target voice or convert one recording into another voice using the same timbre. It is built around Coqui’s XTTS-v2 model, so it inherits multilingual support and modern neural TTS quality while wrapping it in a user-friendly desktop workflow. The app is designed to be very easy to use: you download a precompiled package, double-click app.exe, and it launches a browser-based web interface where you control cloning and synthesis. ...

Downloads: 12 This Week

Last Update: 2025-11-28
See Project
Ship AI Apps Faster with Vertex AI
Go from idea to deployed AI app without managing infrastructure. Vertex AI offers one platform for the entire AI development lifecycle.

Ship AI apps and features faster with Vertex AI—your end-to-end AI platform. Access Gemini 3 and 200+ foundation models, fine-tune for your needs, and deploy with enterprise-grade MLOps. Build chatbots, agents, or custom models. New customers get $300 in free credit.

Try Vertex AI Free
10

Coqui TTS

A deep learning toolkit for Text-to-Speech, battle-tested in research

TTS is a library for advanced Text-to-Speech generation. It's built on the latest research, was designed to achieve the best trade-off among ease-of-training, speed and quality. TTS comes with pre-trained models, tools for measuring dataset quality and is already used in 20+ languages for products and research projects. High-performance Deep Learning models for Text2Speech tasks. Text2Spec models (Tacotron, Tacotron2, Glow-TTS, SpeedySpeech). Speaker Encoder to compute speaker embeddings...

Downloads: 34 This Week

Last Update: 2023-12-12
See Project
11

MetaVoice-1B

Foundational model for human-like, expressive TTS

...With that scale and dataset volume, MetaVoice aims to push the boundary of what open-source TTS models can achieve: high fidelity, natural prosody, and robustness even for edge cases. As a foundational model, it can serve as the backbone for downstream tasks — such as voice generation, voice cloning, speech generation for virtual agents, or even audio production pipelines.

Downloads: 0 This Week

Last Update: 2025-11-28
See Project
12

VideoChat

Real-time voice interactive digital human

...It is built as a Gradio Python demo, exposing a web interface where users can talk to an animated avatar that lip-syncs to synthesized speech while responding intelligently. The system is customizable: you can define your own avatar appearance and voice, and it supports voice cloning so you can generate a new voice from a short 3–10 second reference sample. The tech stack integrates FunASR for speech recognition, Qwen for language understanding, multiple TTS engines like GPT-SoVITS, CosyVoice, or edge-tts, and MuseTalk for talking-head generation.

Downloads: 0 This Week

Last Update: 2025-12-18
See Project
13

OpenAI-Compatible Edge-TTS API

Free, high-quality text-to-speech API endpoint to replace OpenAI

OpenAI-Compatible Edge-TTS API is a local, OpenAI-compatible text-to-speech API that uses edge-tts—Microsoft Edge’s online TTS service—as the backend. The project emulates the /v1/audio/speech endpoint used by OpenAI, so any client that can talk to the OpenAI TTS API can be redirected to this service with minimal changes. It exposes parameters for input text, voice selection, audio format, and playback speed, mirroring the OpenAI interface while mapping popular OpenAI voice names to equivalent Edge voices. ...

Downloads: 1 This Week

Last Update: 2025-11-28
See Project
14

Orpheus TTS

Towards Human-Sounding Speech

Orpheus TTS is a state-of-the-art open-source text-to-speech system built on a Llama-3B backbone, treating speech synthesis as a large language model problem instead of a traditional TTS pipeline. It is designed to produce human-like speech with natural intonation, emotion, and rhythm, targeting quality comparable to or better than many closed-source systems. The project ships both pretrained and finetuned English models, as well as a family of multilingual models released as a research...

Downloads: 0 This Week

Last Update: 2025-12-05
See Project
15

IndexTTS2

Industrial-level controllable zero-shot text-to-speech system

...The system supports zero-shot voice cloning — meaning it can mimic a target speaker’s voice from a short reference sample — making it versatile for multi-voice uses. Compared to many open-source TTS tools, IndexTTS emphasizes efficiency and controllability: it offers faster inference, simpler training pipelines, and controllable speech parameters (like duration, pitch, and prosody), which is critical for production use.

Downloads: 1 This Week

Last Update: 2025-11-27
See Project
16

ebook2audiobook

Generate audiobooks from e-books, voice cloning & 1107+ languages

ebook2audiobook is a tool to convert legally obtained eBooks (non-DRM) into fully narrated audiobooks, complete with chapters and metadata. It automates the pipeline: it reads the eBook file, splits it into appropriate segments (chapters, paragraphs), uses text-to-speech (TTS) models to synthesize audio, optionally applies voice cloning, and outputs a final audiobook — ideal for people who prefer listening over reading, or for accessibility purposes. The tool supports a wide array of underlying TTS backends (XTTSv2, Bark, VITS, Fairseq, Tacotron2, YourTTS and more), which gives flexibility depending on hardware availability, voice preference, and language. ...

Downloads: 35 This Week

Last Update: 2026-02-02
See Project
17

Bailing

Bailing is a voice dialogue robot similar to GPT-4o

Bailing is an open-source voice-dialogue assistant designed to deliver natural voice-based conversations by combining automatic speech recognition (ASR), voice activity detection (VAD), a large language model (LLM), and text-to-speech (TTS) in a single pipeline. Its goal is to offer a “voice-first” chat experience similar to what one might expect from a system like GPT-4o, but fully open and deployable by users.

Downloads: 0 This Week

Last Update: 2025-11-28
See Project
18

Luna AI

Virtual AI anchor that combines state-of-the-art technology

...The project supports multiple rendering backends for the avatar, such as Live2D, Unreal Engine (UE), and “xuniren,” and can output to streaming platforms like Bilibili, Douyin, Kuaishou, WeChat Channels, Pinduoduo, Douyu, YouTube, Twitch, and TikTok. For voice, it integrates with numerous TTS engines (Edge-TTS, VITS-Fast, ElevenLabs, VALL-E-X, OpenVoice, GPT-SoVITS, Azure TTS, fish-speech, ChatTTS, CosyVoice, F5-TTS, MultiTTS, MeloTTS, and others), and can optionally pass the output through voice conversion systems like so-vits-svc or DDSP-SVC to change timbre.

Downloads: 5 This Week

Last Update: 2025-11-28
See Project
19

LuxTTS

A high-quality rapid TTS voice cloning model

LuxTTS is an open-source text-to-speech (TTS) system focused on delivering high-quality, rapid voice synthesis and voice cloning that runs extremely fast and efficiently on consumer hardware. It implements a lightweight architecture based on ZipVoice and optimized sampling techniques so that it can generate speech at speeds up to roughly 150 times real-time on a single GPU and faster than real-time on CPU, all while producing audio at high fidelity with 48 kHz quality. ...

Downloads: 9 This Week

Last Update: 6 days ago
See Project
20

Chatterbox

SoTA open-source TTS

Chatterbox is Resemble AI's first production-grade open source TTS model. Licensed under MIT, Chatterbox has been benchmarked against leading closed-source systems like ElevenLabs and is consistently preferred in side-by-side evaluations. Whether you're working on memes, videos, games, or AI agents, Chatterbox brings your content to life. It's also the first open source TTS model to support emotion exaggeration control, a powerful feature that makes your voices stand out. Try it now on our...

Downloads: 9 This Week

Last Update: 2025-06-25
See Project
21

Bolna

Conversational voice AI agents

Bolna is an end-to-end open-source platform for building conversational voice AI agents, enabling developers to create voice-first conversational assistants efficiently.

Downloads: 4 This Week

Last Update: 23 hours ago
See Project
22

MegaTTS 3

Official PyTorch Implementation

MegaTTS3 is an open-source text-to-speech (TTS) and voice-cloning system from ByteDance that aims to deliver high-quality, expressive speech synthesis, including zero-shot voice cloning of previously unseen speakers. Its backbone is a lightweight diffusion-transformer (on the order of ~0.45 B parameters), which enables efficient inference while still producing high-fidelity audio.

Downloads: 0 This Week

Last Update: 2025-12-02
See Project
23

Step-Audio

Open-source framework for intelligent speech interaction

...It also provides a “generative data engine,” which can produce synthetic speech data (cloning voices, varying style) to support TTS training.

Downloads: 0 This Week

Last Update: 7 days ago
See Project
24

ElevenLabs Python

The official Python SDK for the ElevenLabs API

elevenlabs-python is the official Python SDK for the ElevenLabs API, giving developers a convenient way to access ElevenLabs’ high-quality, lifelike voices. The library wraps the HTTP API into a typed Python client, so you can perform text-to-speech, streaming, voice cloning, voice management, and agents-related operations with simple method calls. It exposes ElevenLabs’ main models such as Eleven Multilingual v2, Eleven Flash v2.5, and Eleven Turbo v2.5, each targeting different trade-offs...

Downloads: 3 This Week

Last Update: 20 hours ago
See Project
25

CosyVoice

Multi-lingual large voice generation model, providing inference

CosyVoice is a multilingual large voice generation model that offers a full-stack solution for training, inference, and deployment of high-quality TTS systems. The model supports multiple languages, including Chinese, English, Japanese, Korean, and a range of Chinese dialects such as Cantonese, Sichuanese, Shanghainese, Tianjinese, and Wuhanese. It is designed for zero-shot voice cloning and cross-lingual or mix-lingual scenarios, so a single reference voice can be used to synthesize speech across languages and in code-switching contexts. ...

Downloads: 0 This Week

Last Update: 2025-11-30
See Project