Steno

AI-Powered Desktop Transcription Application

A modern, cross-platform desktop application for high-accuracy speech-to-text transcription, powered by OpenAI Whisper and built with Tauri + React.

Download · Documentation · Report Bug · Request Feature · Chinese (中文)

Overview

Steno is a desktop application that converts speech to text with OpenAI Whisper models, run through whisper.cpp. It supports both real-time recording and file-based transcription, and handles multiple languages and audio formats.

Key Capabilities

Advanced Audio Processing - Support for MP3, WAV, FLAC, OGG, AAC, M4A, WMA with intelligent format conversion
Real-time Transcription - Live audio capture with simultaneous speech recognition
Multi-language Support - Optimized for Chinese, English, and Indonesian with automatic language detection
AI Model Flexibility - Choose from Tiny (39MB), Base (74MB), or Large v3 (1.5GB) Whisper models
Native macOS Performance - Optimized desktop application for macOS
Apple Silicon Optimization - Metal GPU acceleration on M1/M2/M3 Macs

Installation

System Requirements

Platform	Version	Architecture	Memory	Storage
macOS	11.0+	Apple Silicon (M1/M2/M3)	4GB+	200MB+

Quick Install

macOS

# Download and open the disk image
curl -L -o Steno.dmg https://github.com/xazaj/Steno/releases/download/v1.0.0/Steno_1.0.0_aarch64.dmg
open Steno.dmg

# Drag Steno.app into /Applications, then clear the quarantine flag
# and apply an ad-hoc signature (required for unsigned apps)
xattr -rd com.apple.quarantine "/Applications/Steno.app"
codesign --force --deep --sign - "/Applications/Steno.app"

First Launch Setup

Model Selection: Choose your preferred Whisper model based on your needs:
- Tiny Model (39MB) - Fast processing, basic accuracy
- Base Model (74MB) - Balanced performance and quality
- Large v3 Model (1.5GB) - Highest accuracy, slower processing
Permissions: Grant microphone access for real-time transcription features

Usage

Basic Workflow

Audio Input → Processing → AI Recognition → Text Output → Export

Transcription Modes

Mode	Use Case	Input	Features
File Mode	Batch processing	Audio files	Drag & drop, batch queue, format conversion
Real-time	Live recording	Microphone	Live preview, speaker detection, instant results
Long Audio	Extended content	Large files	Smart chunking, progress tracking, memory optimization
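
This README does not describe how Long Audio mode splits a file. As a rough sketch of one common approach, the function below cuts a recording into fixed-length windows that overlap slightly, so that words at a boundary appear in both chunks. The name `chunk_ranges` and the window and overlap values are illustrative assumptions, not Steno's implementation.

```shell
# Sketch only: print "start end" (in whole seconds) for fixed-length chunks
# that overlap by a few seconds.
# Usage: chunk_ranges TOTAL_SECONDS CHUNK_SECONDS OVERLAP_SECONDS
chunk_ranges() {
  local total="$1" chunk="$2" overlap="$3"
  local start=0 end

  # The step between chunk starts must be positive or the loop never advances.
  if [ "$chunk" -le "$overlap" ]; then
    echo "chunk length must exceed overlap" >&2
    return 1
  fi

  while [ "$start" -lt "$total" ]; do
    end=$((start + chunk))
    # Clamp the final chunk to the end of the audio.
    if [ "$end" -gt "$total" ]; then end=$total; fi
    echo "$start $end"
    # Stop once a chunk reaches the end of the audio.
    if [ "$end" -ge "$total" ]; then break; fi
    # The next chunk starts slightly before the previous one ended.
    start=$((end - overlap))
  done
}
```

For a 70-second file with 30-second chunks and a 5-second overlap, `chunk_ranges 70 30 5` prints the ranges `0 30`, `25 55`, and `50 70`.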

Advanced Features

Smart Prompts - Context-aware templates for meetings, interviews, medical, and technical content
Speaker Diarization - Automatic identification and separation of different speakers
Export Options - Multiple formats including TXT, SRT, JSON, and Markdown
Search & Organization - Tag-based categorization with powerful filtering capabilities
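
Of the export formats listed above, SRT has the most rigid layout: a numbered cue, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, the text, and a blank separator line. The sketch below shows that layout. `srt_timestamp` and `srt_cue` are hypothetical helper names, and Steno's own exporter may differ in details.

```shell
# Format a millisecond offset as an SRT timestamp (HH:MM:SS,mmm).
# Usage: srt_timestamp MILLISECONDS
srt_timestamp() {
  local ms="$1"
  printf '%02d:%02d:%02d,%03d' \
    $((ms / 3600000)) $((ms / 60000 % 60)) $((ms / 1000 % 60)) $((ms % 1000))
}

# Print one SRT cue: index, time range, text, then a blank separator line.
# Usage: srt_cue INDEX START_MS END_MS TEXT
srt_cue() {
  printf '%d\n%s --> %s\n%s\n\n' \
    "$1" "$(srt_timestamp "$2")" "$(srt_timestamp "$3")" "$4"
}
```

For example, `srt_cue 1 0 2500 "Hello"` prints cue 1 with the time range `00:00:00,000 --> 00:00:02,500`.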

Data Storage

Application Data Location

~/Library/Application Support/com.steno.app/
├── database/
│   └── steno.db                    # SQLite database (transcription records & settings)
├── models/                         # Whisper AI model files
│   ├── ggml-tiny.bin              # Tiny model (~39MB)
│   ├── ggml-base.bin              # Base model (~142MB)
│   └── ggml-large-v3.bin          # Large v3 model (~1.55GB)
├── audio/
│   ├── uploads/                    # User uploaded audio files
│   └── temp/                       # Temporary audio files
└── logs/                          # Application logs
    └── app.log
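
Given the layout above, a quick way to see which models have been downloaded is to test for each file under `models/`. This is a minimal sketch: `list_models` is a hypothetical helper, and `STENO_DATA_DIR` is an override variable invented here for illustration, not a setting Steno reads.

```shell
# Default to the data location documented in this README.
# STENO_DATA_DIR is a hypothetical override used only by this sketch.
STENO_DATA_DIR="${STENO_DATA_DIR:-$HOME/Library/Application Support/com.steno.app}"

# Report whether each model file from the documented layout is present.
list_models() {
  local name
  for name in ggml-tiny.bin ggml-base.bin ggml-large-v3.bin; do
    if [ -f "$STENO_DATA_DIR/models/$name" ]; then
      echo "$name: installed"
    else
      echo "$name: missing"
    fi
  done
}
```

Calling `list_models` prints one line per model file, marked `installed` or `missing`.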

Data Management

# View application data
open "$HOME/Library/Application Support/com.steno.app/"

# Backup database
cp "$HOME/Library/Application Support/com.steno.app/database/steno.db" ~/Desktop/steno_backup.db

# Clean temporary files
rm -rf "$HOME/Library/Application Support/com.steno.app/audio/temp/"
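
The backup command above overwrites the same file each time. As a hedged alternative, the sketch below writes a timestamped copy so that earlier backups are kept. `backup_db` is a hypothetical helper, not something shipped with Steno, and it assumes the database path shown under Application Data Location.

```shell
# Sketch: copy the database to a timestamped file and print the new path.
# Usage: backup_db DATA_DIR DEST_DIR
backup_db() {
  local src="$1/database/steno.db" dest_dir="$2" dest

  if [ ! -f "$src" ]; then
    echo "database not found: $src" >&2
    return 1
  fi

  mkdir -p "$dest_dir" || return 1
  # Example file name: steno_20250101_120000.db
  dest="$dest_dir/steno_$(date +%Y%m%d_%H%M%S).db"
  cp "$src" "$dest" && echo "$dest"
}
```

Quit Steno before copying so the SQLite file is not being written during the copy. A typical call would be `backup_db "$HOME/Library/Application Support/com.steno.app" "$HOME/Desktop"`.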

Development

Prerequisites

Rust 1.70+
Node.js 18+
Git
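
To confirm the prerequisites before building, the installed versions can be compared against the minimums listed above. The following is a sketch under the assumption that each tool prints its version on the first line of `--version`; the helper names `extract_version`, `version_ge`, and `check_tool` are made up for this example.

```shell
# Pull the first dotted number out of a version banner,
# e.g. "rustc 1.75.0 (...)" or "v18.19.0".
extract_version() {
  printf '%s\n' "$1" | sed -E 's/^[^0-9]*([0-9]+(\.[0-9]+)*).*$/\1/'
}

# version_ge HAVE WANT: succeed if dotted version HAVE is at least WANT.
# Compares up to three components numerically; missing ones count as 0.
version_ge() {
  local have="$1" want="$2" h w n
  for n in 1 2 3; do
    h=${have%%.*}; w=${want%%.*}
    if [ "$h" -gt "$w" ]; then return 0; fi
    if [ "$h" -lt "$w" ]; then return 1; fi
    # Drop the leading component; treat a missing one as 0.
    case $have in *.*) have=${have#*.} ;; *) have=0 ;; esac
    case $want in *.*) want=${want#*.} ;; *) want=0 ;; esac
  done
  return 0
}

# check_tool NAME MINIMUM: report whether NAME is installed and new enough.
check_tool() {
  local name="$1" min="$2" found
  if ! command -v "$name" >/dev/null 2>&1; then
    echo "$name: not found (need $min+)"
    return 1
  fi
  found=$(extract_version "$("$name" --version 2>/dev/null | head -n 1)")
  case "$found" in
    [0-9]*) ;;
    *) echo "$name: could not parse version"; return 1 ;;
  esac
  if version_ge "$found" "$min"; then
    echo "$name: $found ok"
  else
    echo "$name: $found is older than $min"
    return 1
  fi
}
```

Running `check_tool rustc 1.70 && check_tool node 18` reports each tool in turn and stops at the first one that is missing or too old.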

Local Development

# Clone repository
git clone https://github.com/xazaj/Steno.git
cd Steno

# Install dependencies
npm install

# Start development server
npm run tauri:dev

Build Commands

# Development with hot reload
npm run tauri:dev

# Production build
npm run tauri:build

# Platform-specific builds
npm run build:mac-m1      # Apple Silicon

Project Architecture

steno/
├── src/                  # React frontend
│   ├── components/       # UI components
│   ├── hooks/           # Custom React hooks
│   ├── types/           # TypeScript definitions
│   └── utils/           # Utility functions
├── src-tauri/           # Rust backend
│   ├── src/             # Core application logic
│   ├── lib/             # whisper.cpp integration
│   └── capabilities/    # Tauri security permissions
├── docs/                # Documentation
└── models/              # AI model storage

Technology Stack

Layer	Technologies
Frontend	React 18, TypeScript, Tailwind CSS, Vite
Backend	Rust, Tauri 2.0, whisper.cpp, SQLite
Audio Processing	Symphonia, CPAL, WebRTC VAD, RustFFT
AI Models	OpenAI Whisper (Tiny, Base, Large v3)

Contributing

We welcome contributions from the community. Please read our Contributing Guidelines before getting started.

Development Workflow

Fork the repository
Create a feature branch (git checkout -b feature/new-feature)
Commit your changes (git commit -m 'feat: add new feature')
Push to the branch (git push origin feature/new-feature)
Open a Pull Request

Code Standards

Follow Conventional Commits for commit messages
Run tests before submitting: npm test && cargo test
Ensure code formatting: npm run lint && cargo fmt
Add documentation for new features
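
As an illustration of the commit message rule above, the check below tests whether a subject line has the Conventional Commits shape `type(optional scope)!: description`. The function name `is_conventional` is hypothetical, and the list of types is the commonly used set rather than anything this project prescribes (the specification itself only defines `feat` and `fix`).

```shell
# is_conventional SUBJECT: succeed if SUBJECT looks like a Conventional Commit,
# i.e. "type: description", "type(scope): description", or with "!" before ":".
# The accepted type list is an assumption; adjust it to the project's needs.
is_conventional() {
  printf '%s\n' "$1" | grep -Eq \
    '^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)(\([^)]+\))?!?: .+'
}
```

In a Git `commit-msg` hook, Git passes the path of the message file as the first argument, so the check could be called as `is_conventional "$(head -n 1 "$1")"`.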

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

OpenAI Whisper - State-of-the-art speech recognition models
whisper.cpp - High-performance C++ implementation
Tauri - Modern desktop application framework
Symphonia - Professional audio decoding library

Support

Bug Reports: GitHub Issues
Feature Requests: GitHub Issues
Documentation: Project Wiki
Discussions: GitHub Discussions

Made with care by the Steno development team
