Inspiration
Last summer, a couple of us were on engineering internships where we got to sit in on real incident calls. The pattern was always the same: something breaks, six people jump on a call, someone says "can you check the connection pool config?" and nobody writes it down. The Jira tickets get created an hour later from memory. Half the call is spent figuring out who's doing what instead of actually fixing the problem.
We looked at what tools exist for this and found a gap. PagerDuty handles alerting. Otter and Fireflies transcribe meetings. Jira tracks issues. But nothing sits inside the incident call itself and works alongside the team while the outage is happening. All the coordination between "the alert fired" and "we're debugging" happens manually.
We'd been wanting to build something with real-time audio and LLMs at a hackathon, and this felt like the right problem. An AI that joins the call, listens, tracks tasks, investigates the codebase, and answers questions so engineers can stay focused on the actual fix.
What it does
Sprynt joins your engineering incident calls as a bot participant. It transcribes everything in real time with speaker labels using Deepgram Nova-3, pulls action items out of conversation and puts them on a live task board, and catches verbal reassignments ("actually give that to Sarah") to update the board on the spot. When someone says "hey Sprynt," it answers by posting to the meeting chat and speaking out loud via ElevenLabs TTS.
About 30 seconds into the call, Sprynt automatically investigates your GitHub repo. It fetches the file tree, pulls recent commits and diffs, and uses an LLM to rank the most likely suspect files with confidence scores and specific line ranges. Results show up in a Monaco editor panel with the suspect lines highlighted red.
Approved tasks sync to Jira with one click, complete with ADF-formatted descriptions and the transcript excerpt that triggered the task. Everything streams to a 3-panel dashboard (transcript, task board, deep dive) over WebSocket, with a tabbed layout on mobile.
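As a rough sketch of what the one-click sync sends to Jira (the ADF structure is from Jira Cloud's REST API; the project key, issue type, and task values here are placeholders, not our actual config):

```python
def build_jira_issue(task_title: str, excerpt: str) -> dict:
    """Build a Jira issue payload with an ADF-formatted description
    that embeds the transcript excerpt that triggered the task."""
    return {
        "fields": {
            "project": {"key": "INC"},        # placeholder project key
            "issuetype": {"name": "Task"},
            "summary": task_title,
            # Atlassian Document Format: a versioned doc node
            # containing paragraph nodes of text
            "description": {
                "type": "doc",
                "version": 1,
                "content": [
                    {"type": "paragraph",
                     "content": [{"type": "text", "text": task_title}]},
                    {"type": "paragraph",
                     "content": [{"type": "text",
                                  "text": f"Transcript excerpt: {excerpt}"}]},
                ],
            },
        }
    }
```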
How we built it
The backend is Python/FastAPI with async WebSocket rooms scoped per incident. We use Skribby's API to get a bot into Google Meet, Zoom, or Teams, and Skribby connects to Deepgram Nova-3 for real-time transcription with speaker diarization. Every finalized transcript chunk gets sent to an LLM through a multi-provider abstraction layer (Gemini, Claude, GPT-4o) for task extraction. The deep dive agent runs a 6-step pipeline: fetch the repo tree, pull recent commits with diffs, send everything to an LLM for ranking, then fetch suspect file contents and identify specific line ranges.
The frontend is Next.js 14 with Zustand for state management. Voice Q&A works by detecting wake phrases in the transcript, gathering incident context from the database, generating an answer, posting it to the meeting chat via Skribby, and synthesizing audio through ElevenLabs that gets uploaded to S3 and played back through a sequential audio queue. GitHub and Jira both connect through OAuth 2.0, and Supabase handles auth and Postgres.
Three engineers split the work by domain: one owned backend core (FastAPI, DB, auth, models), one owned the entire frontend, and one owned integrations (GitHub, Jira, S3, LLM clients). We defined file ownership explicitly and coordinated at the boundaries around WebSocket message contracts and API schemas, which kept merge conflicts close to zero across 4 sprint cycles.
Challenges we ran into
Our first task state machine design auto-synced extracted tasks to Jira after a 15-second stabilization window. In practice, the LLM pulled a task out of nearly every sentence where someone mentioned doing something, and 15 seconds later it was a Jira ticket nobody asked for. We scrapped auto-promote entirely and switched to a two-column board (Proposed / Synced) where nothing touches Jira until a human clicks Approve.
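The replacement state machine is simple by design: nothing reaches Jira without an explicit human action. A sketch of the idea (the `DISMISSED` state is our illustrative guess at a third state; the writeup only names the two board columns):

```python
from enum import Enum

class TaskState(Enum):
    PROPOSED = "proposed"    # extracted by the LLM, shown on the board
    SYNCED = "synced"        # human clicked Approve; pushed to Jira
    DISMISSED = "dismissed"  # human rejected the extraction

# No transition into SYNCED exists except via approve(); the old
# 15-second auto-promote timer is gone.
TRANSITIONS = {
    TaskState.PROPOSED: {TaskState.SYNCED, TaskState.DISMISSED},
    TaskState.SYNCED: set(),
    TaskState.DISMISSED: set(),
}

def approve(state: TaskState) -> TaskState:
    if TaskState.SYNCED not in TRANSITIONS[state]:
        raise ValueError(f"cannot sync a task in state {state.value}")
    return TaskState.SYNCED
```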
Skribby's WebSocket protocol didn't match what we assumed. We built the listener expecting event type "transcript" with data.text and data.speaker, but Skribby actually sends type: "ts" with data.transcript and data.speaker_name. The listener connected fine but silently ignored every event because none matched. On top of that, the transcription model name we used was wrong ("deepgram-nova3-realtime" instead of "deepgram-realtime-v3"), and Skribby returned a 422 until we fixed it.
Other bugs that cost us time: Python's default log level is WARNING, so all our logger.info() calls were silently suppressed until we added one line to main.py. The LLM sometimes returned prose instead of JSON for suspect line identification, crashing with JSONDecodeError. And our resolved_at timestamp column was timezone-naive while our Python code was writing timezone-aware datetimes, which broke asyncpg in two separate files before we caught the pattern.
Accomplishments that we're proud of
The full pipeline works end-to-end in real time. Someone speaks on a call, it shows up in the transcript feed, gets parsed into a task on the board, and that task can be pushed to Jira with one click. All streaming, all live.
The deep dive agent actually finds the right file. We tested it by describing an incident verbally ("the database connection pool keeps exhausting") and it identified the file where max_overflow had been changed in a recent commit, down to the specific line range. Saying "hey Sprynt, what changed in the last deploy?" during a live call and getting an answer spoken back through TTS was a good moment too.
Ripping out the auto-promote task system mid-build and replacing it with an approval-based flow was the right decision even though it meant rewriting the state machine, the API endpoints, and the frontend board layout in one sitting.
What we learned
Real-time async systems need attention at every layer. You can't fire off LLM calls and assume they come back in order. Partial transcripts from Deepgram need careful handling to avoid visual jitter in the UI. WebSocket reconnection needs exponential backoff. Browser autoplay policies will silently block your TTS audio if you don't account for them.
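The reconnection lesson in particular is worth a few lines. Our frontend does this in TypeScript; sketched here in Python for consistency, with illustrative base/cap values, exponential growth plus full jitter so reconnecting clients don't all hammer the server in sync:

```python
import random

def backoff_delays(base: float = 0.5, cap: float = 30.0, attempts: int = 8):
    """Yield reconnect delays: exponential growth with full jitter,
    capped so a long outage doesn't push waits out to minutes."""
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        yield random.uniform(0, ceiling)
```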
Read the API docs carefully. Two of our biggest time sinks (wrong Skribby event format, wrong transcription model name) came from assumptions about the API instead of checking the actual spec first. Also, LLMs are surprisingly good at pulling structured tasks out of messy human speech, even with crosstalk and half-finished sentences.
Three people can build a full-stack real-time system at a hackathon if you split ownership cleanly and define the contracts between domains up front. Every file had exactly one owner, and that's why we had almost no merge conflicts.
What's next for Sprynt
Runbook generation after each incident based on what was investigated and what fixed it. Incident similarity matching so Sprynt can tell you "this looks like the outage from two weeks ago, here's what resolved it." Slack integration for posting task updates and summaries to incident channels automatically.
Longer term: multi-repo support for investigating across microservices, continuous monitoring mode where Sprynt watches alerting tools and starts pre-investigation before the call even begins, and eventually an agent that can suggest and execute rollbacks on its own.
Built With
- alembic
- amazon-web-services
- anthropic
- deepgram
- docker
- elevenlabs
- fastapi
- github
- jira
- monaco
- next.js
- oauth
- postgresql
- python
- s3
- shadcn
- skribby
- sonnet
- sqlalchemy
- supabase
- tailwind
- typescript
- websockets
- zustand