Converse with large language models using speech.
- Open: Powered by state-of-the-art open-source speech processing models.
- Efficient: Light enough to run on consumer hardware, with low latency.
- Self-hosted: Entire pipeline runs offline, limited only by compute power.
- Modular: Switching LLM providers is as simple as changing an environment variable.
![Sage architecture](https://github.com/farshed/sage/raw/main/assets/architecture-dark.png?raw=true)
- For text generation, you can either self-host an LLM using Ollama or opt for a third-party provider. This can be configured using a `.env` file in the project root.
- If you're using Ollama, add the `OLLAMA_MODEL` variable to the `.env` file to specify the model you'd like to use. (Example: `OLLAMA_MODEL=deepseek-r1:7b`)
- Among the third-party providers, Sage supports the following out of the box:
  - Deepseek
  - OpenAI
  - Anthropic
  - Together.ai
- To use a provider, add a `<PROVIDER>_API_KEY` variable to the `.env` file. (Example: `OPENAI_API_KEY=xxxxxxxxxxxxxxxxxxxxxxx`)
- To choose which model should be used for a given provider, use the `<PROVIDER>_MODEL` variable. (Example: `DEEPSEEK_MODEL=deepseek-chat`)
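For reference, a minimal `.env` could look like the sketch below. The key and model values are illustrative placeholders taken from the examples above; use whichever provider variables apply to your setup.

```sh
# .env (project root), illustrative values only

# Option A: self-hosted text generation via Ollama
OLLAMA_MODEL=deepseek-r1:7b

# Option B: a third-party provider instead (API key, plus <PROVIDER>_MODEL to pick the model)
# DEEPSEEK_API_KEY=xxxxxxxxxxxxxxxxxxxxxxx
# DEEPSEEK_MODEL=deepseek-chat
```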
Next, you have two choices: run Sage as a Docker container (the easy way) or natively (the hard way). Note that running it with Docker may carry a performance penalty; Whisper inference is 4-5x slower than it is natively. Both paths are summarized in the command sketch after the list below.
- With Docker: Install Docker and start the daemon. Download the following files and place them inside a `models` directory at the project root. Run `bun docker-build` to build the image, then `bun docker-run` to spin up a container. The UI is exposed at `http://localhost:3000`.
- Without Docker: Install Bun, Rust, OpenSSL, LLVM, Clang, and CMake. Make sure all of these are accessible via `$PATH`. Then run `setup-unix.sh` or `setup-win.bat` depending on your platform. This will download the required model weights and compile the binaries Sage needs. Once finished, start the project with `bun start`. The first run on macOS is slow (~20 minutes on an M1 Pro), since the ANE service compiles the Whisper CoreML model to a device-specific format. Subsequent runs are faster.
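For quick reference, the two paths condense to the commands below, using only the commands named above. This assumes the prerequisites are installed and, for the Docker path, that the model files are already in `models`.

```sh
# Docker path: build the image, then start a container (UI at http://localhost:3000)
bun docker-build
bun docker-run

# Native path: download model weights and compile binaries, then start Sage
./setup-unix.sh      # or setup-win.bat on Windows
bun start
```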
Roadmap:
- Make it easier to run (Dockerize?)
- CUDA support
- Allow custom Ollama endpoint
- Multilingual support
- Allow Whisper configuration
- Allow customization of system prompt
- Optimize the pipeline
- Release as a library?