anshuman-dev/llm-speech

Voice Claude

A Chrome/Edge extension that adds voice input and output to Claude.ai, enabling hands-free interaction with Claude directly on the web interface.

Features

  • Voice Input: Click the mic button to speak your prompts
  • Smart Notifications: Get notified when Claude responds
  • Voice Output: Hear Claude's responses read aloud
  • Auto-speak Mode: Optionally auto-play responses without confirmation
  • Native Integration: Works seamlessly with Claude.ai's existing interface

How It Works

  1. Voice Input: Extension adds a microphone button to the Claude.ai chat interface
  2. Speech-to-Text: Your speech is transcribed and filled into the prompt box
  3. Send: Submit your message as normal
  4. Response Detection: Extension detects when Claude finishes responding
  5. Notification: You get a notification asking if you want to hear the response
  6. Voice Output: Click "Yes, let's hear" to have the response read aloud
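
Steps 1 and 2 can be sketched as follows. This is a minimal illustration, not the extension's actual code: the helper names `finalTranscript`, `mergeIntoPrompt`, and `startDictation` are hypothetical, and only the standard Web Speech API result shape (an array-like of results, each with `isFinal` and ranked alternatives) is assumed.

```javascript
// Sketch: collect the final transcript from a Web Speech API result list.
// `results` mirrors SpeechRecognitionEvent.results: an array-like of results,
// each with an `isFinal` flag and alternatives at numeric indexes.
function finalTranscript(results) {
  let text = "";
  for (let i = 0; i < results.length; i++) {
    const result = results[i];
    if (result.isFinal) {
      text += result[0].transcript; // index 0 is the top-ranked alternative
    }
  }
  return text.trim();
}

// Hypothetical helper: append the transcript to whatever is already typed.
function mergeIntoPrompt(existing, transcript) {
  if (!transcript) return existing;
  return existing.trim() ? `${existing.trimEnd()} ${transcript}` : transcript;
}

// Browser-only wiring; returns null where the API is missing (e.g. Node).
function startDictation(onText) {
  const Recognition =
    globalThis.SpeechRecognition || globalThis.webkitSpeechRecognition;
  if (!Recognition) return null;
  const recognition = new Recognition();
  recognition.interimResults = false; // only deliver final results
  recognition.onresult = (event) => onText(finalTranscript(event.results));
  recognition.start();
  return recognition;
}
```

Keeping the transcript handling in plain functions means it can be exercised without a microphone; how the text is then written into Claude.ai's prompt box depends on the page markup and is left out here.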

Installation

Chrome/Edge

  1. Clone or download this repository
  2. Open Chrome/Edge and go to chrome://extensions/ or edge://extensions/
  3. Enable Developer mode (toggle in top right)
  4. Click Load unpacked
  5. Select the project directory
  6. Navigate to https://claude.ai and start chatting with voice

Configuration

Click the extension icon to access settings:

  • Auto-speak responses: Toggle to automatically speak responses without confirmation
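
One way such a setting could be persisted is sketched below. The key name `autoSpeak` and the helper names are assumptions for illustration; `storageArea` stands for an object that behaves like `chrome.storage.local` with promise-returning `get` and `set`, which is how that API works in Manifest V3 extensions. Whether this project uses that storage area is not stated here.

```javascript
// Hypothetical defaults; the key name `autoSpeak` is an assumption.
const DEFAULT_SETTINGS = Object.freeze({ autoSpeak: false });

// Pure helper: overlay stored values on the defaults, ignoring unknown keys
// and values of the wrong type.
function normalizeSettings(stored) {
  const settings = { ...DEFAULT_SETTINGS };
  if (stored && typeof stored.autoSpeak === "boolean") {
    settings.autoSpeak = stored.autoSpeak;
  }
  return settings;
}

// `storageArea` is expected to behave like chrome.storage.local:
// get(keys) resolves to an object and set(items) resolves when saved.
async function loadSettings(storageArea) {
  const stored = await storageArea.get(Object.keys(DEFAULT_SETTINGS));
  return normalizeSettings(stored);
}

async function saveAutoSpeak(storageArea, enabled) {
  await storageArea.set({ autoSpeak: Boolean(enabled) });
}
```

In an extension popup this might be called as `loadSettings(chrome.storage.local)`; passing the storage area in as a parameter keeps the helpers testable with an in-memory stand-in.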

Usage

  1. Go to https://claude.ai
  2. Look for the microphone button in the chat input area
  3. Click the mic button and speak your question
  4. Your speech will be transcribed into the input box
  5. Send the message as usual (auto-send after speech is planned for Phase 2 of the roadmap)
  6. When Claude responds, you'll see a notification
  7. Click "Yes, let's hear" to hear the response via text-to-speech
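
The final step can be sketched with the browser's `speechSynthesis` queue. Splitting a long response into shorter utterances is a design choice assumed here for illustration, not something this README describes; `splitIntoUtterances`, `speak`, and the 200-character default are hypothetical.

```javascript
// Split text into chunks of whole sentences, each at most `maxLength`
// characters where possible. A single sentence longer than `maxLength`
// is kept intact as its own chunk.
function splitIntoUtterances(text, maxLength = 200) {
  const normalized = text.replace(/\s+/g, " ").trim();
  const sentences = normalized.match(/[^.!?]+[.!?]*\s*/g) || [];
  const chunks = [];
  let current = "";
  for (const sentence of sentences) {
    if (current && (current + sentence).length > maxLength) {
      chunks.push(current.trim());
      current = "";
    }
    current += sentence;
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks;
}

// Browser-only: queue each chunk for playback and return how many were queued.
// Returns 0 where speech synthesis is unavailable (e.g. Node).
function speak(text) {
  if (typeof speechSynthesis === "undefined") return 0;
  const chunks = splitIntoUtterances(text);
  speechSynthesis.cancel(); // drop anything still queued from a previous response
  for (const chunk of chunks) {
    speechSynthesis.speak(new SpeechSynthesisUtterance(chunk));
  }
  return chunks.length;
}
```

Because `speechSynthesis.speak` queues utterances in order, the chunks play back to back, and a later `speechSynthesis.cancel()` can stop playback between sentences.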

Technical Details

  • Speech Recognition: Web Speech API (browser native)
  • Text-to-Speech: Web Speech API SpeechSynthesis interface (browser native)
  • Response Detection: MutationObserver monitors DOM for new Claude messages
  • No API Key Required: Works with your existing Claude.ai session
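
Response detection with a MutationObserver might look like the sketch below. The quiet-period heuristic (treat the response as finished once no DOM changes arrive for `quietMs`) is an assumption made for illustration, not necessarily how this extension decides; `createCompletionDetector` and `watchForResponse` are hypothetical names, and locating the message container on Claude.ai is left to the caller because the page markup is not described here.

```javascript
// Fires `onComplete` once no mutation has been reported for `quietMs`.
// Timer functions are injectable so the logic can be tested without waiting.
function createCompletionDetector({
  quietMs = 1500,
  onComplete,
  setTimer = setTimeout,
  clearTimer = clearTimeout,
}) {
  let timer = null;
  return {
    // Call on every observed mutation; restarts the quiet-period countdown.
    notifyMutation() {
      if (timer !== null) clearTimer(timer);
      timer = setTimer(() => {
        timer = null;
        onComplete();
      }, quietMs);
    },
    // Abandon a pending countdown without firing `onComplete`.
    cancel() {
      if (timer !== null) clearTimer(timer);
      timer = null;
    },
  };
}

// Browser-only wiring. `container` is whichever element holds the messages.
// Returns null where MutationObserver is unavailable (e.g. Node).
function watchForResponse(container, onComplete) {
  if (typeof MutationObserver === "undefined") return null;
  const detector = createCompletionDetector({ onComplete });
  const observer = new MutationObserver(() => detector.notifyMutation());
  observer.observe(container, {
    childList: true,
    subtree: true,
    characterData: true,
  });
  return { observer, detector };
}
```

A streamed response produces a burst of mutations, so restarting the countdown on each one avoids announcing a response that is still being written.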

Browser Compatibility

  • Chrome: Full support
  • Edge: Full support
  • ⚠️ Firefox: Limited (speech synthesis is available, but speech recognition is not enabled by default)
  • Safari: Not supported
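
Support can also be checked at runtime rather than inferred from the browser name. The sketch below assumes only the standard Web Speech API globals, including Chromium's prefixed `webkitSpeechRecognition`; the function name `detectSpeechSupport` is hypothetical.

```javascript
// Report which halves of the Web Speech API are exposed on `scope`
// (defaults to the global object, i.e. `window` in a page).
function detectSpeechSupport(scope = globalThis) {
  const recognition =
    typeof scope.SpeechRecognition === "function" ||
    typeof scope.webkitSpeechRecognition === "function";
  const synthesis =
    typeof scope.speechSynthesis === "object" &&
    scope.speechSynthesis !== null &&
    typeof scope.SpeechSynthesisUtterance === "function";
  return { recognition, synthesis };
}
```

A content script could use the result to hide the mic button when `recognition` is false while still offering read-aloud when `synthesis` is true.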

Development Roadmap

Phase 1 (Current)

  • ✅ Voice input integration
  • ✅ Response detection
  • ✅ Notification system
  • ✅ TTS playback

Phase 2 (Coming Soon)

  • Auto-speak toggle (implemented, needs testing)
  • Auto-send after speech
  • Custom voice selection
  • Keyboard shortcuts

Phase 3 (Future)

  • Conversation history playback
  • Multiple language support
  • Better TTS (ElevenLabs/OpenAI integration)
  • Mobile support

Privacy

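  • The extension itself sends no data anywhere; the only traffic it relies on is your normal Claude.ai session
  • No API keys or external accounts required
  • Speech recognition uses the browser's built-in Web Speech API; note that some browsers, including Chrome, implement recognition with a server-based engine, so audio may be sent to the browser vendor's speech service
  • All settings are stored locally in the browser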

Contributing

Contributions welcome! Feel free to open issues or submit pull requests.

License

MIT

Troubleshooting

Mic button not appearing?

  • Make sure you're on claude.ai
  • Refresh the page
  • Check the extension is enabled

Speech recognition not working?

  • Allow microphone permissions when prompted
  • Check your microphone is working
  • Try Chrome/Edge for best support

Responses not being detected?

  • Make sure you're using the latest version
  • Check browser console for errors
  • Report issues on GitHub
