Lecture Processor

Automatically extract slides and transcribe audio from lecture videos.

Quick Start (Recommended)

Use the Google Colab notebook for the easiest experience - no installation required:

Open In Colab

The Colab notebook provides a user-friendly interface with:

  • Form-based configuration (no command line needed)
  • Automatic GPU acceleration for Whisper
  • One-click download of results
  • Google Drive integration for easy file access

Features

  • Smart Slide Detection: Grid-based peripheral uniformity analysis distinguishes actual slides from UI elements
  • Letterbox Detection: Automatically detects and crops letterbox bars from recordings
  • Sensitive Change Detection: Catches slide animations and newly appearing text
  • Fast Transcription: Uses the Deepgram API by default (with local Whisper as a fallback)
  • Batch Processing: Process entire folders of lecture videos
  • PDF Output: All slides compiled into a single PDF for easy LLM intake

Prerequisites

  • Python 3.9+
  • FFmpeg (for audio extraction)
  • Deepgram API key (free tier available at deepgram.com)

Installation

git clone https://github.com/graysonmc/Lecture-Processor.git
cd Lecture-Processor

# Create virtual environment (using Anaconda)
conda create -n lecture-processor python=3.11
conda activate lecture-processor

# Or using venv
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Setup

Create a .env file in the project directory with your Deepgram API key:

DEEPGRAM_API_KEY=your-api-key-here

Get a free API key at https://deepgram.com ($200 free credits)
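
For reference, a minimal sketch of how the key can be read from .env, assuming the project uses python-dotenv (the variable name matches the example above; the actual loading code in process_lecture.py may differ):

from dotenv import load_dotenv  # assumes python-dotenv is installed
import os

load_dotenv()  # reads .env from the current directory
api_key = os.getenv("DEEPGRAM_API_KEY")
if not api_key:
    raise SystemExit("DEEPGRAM_API_KEY is not set - see Setup above")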

Local Usage (Advanced)

For users who prefer running locally instead of Colab:

Batch Processing

Put your lecture videos in ~/Desktop/Raw Lectures/, then run:

python process_lecture.py --batch

Output goes to ~/Desktop/Processed Lectures/<video_name>/:

  • slides.pdf - All slides in one PDF
  • transcript.txt - Full transcription

Single Video

python process_lecture.py lecture.mp4

Options

# Slides only (skip transcription)
python process_lecture.py --batch --slides-only

# Custom output directory
python process_lecture.py --batch --output-dir /path/to/output

# Adjust slide detection sensitivity (lower = more sensitive)
python process_lecture.py --batch --threshold 20

# Use local Whisper instead of Deepgram
python process_lecture.py --batch --whisper-model base
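
For context, --threshold is a sensitivity knob on frame-to-frame change: a new slide is captured when the difference between frames exceeds the threshold, so a lower value triggers on smaller changes. A minimal sketch of the idea (OpenCV, the mean-absolute-difference metric, and the default value here are assumptions, not the actual implementation):

import cv2
import numpy as np

def is_new_slide(prev_frame, frame, threshold=25):
    # Compare grayscale versions of consecutive frames.
    prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(prev_gray, gray)
    # Lower threshold = smaller changes count as a new slide (more sensitive).
    return float(np.mean(diff)) > threshold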

How It Works

Slide Detection

  1. Letterbox Detection: Requires black bars on the sides, a positive signal that the frame is a recorded slide
  2. Rectangle Mapping: Detects the slide boundaries within the letterboxed region
  3. Grid Analysis: Maps an 18x12 grid onto the detected slide region
  4. Peripheral Uniformity: Checks that the edge and corner cells have a consistent white background (see the sketch after this list)
  5. Change Detection: Sensitive to new text appearing, which catches animated slides
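
A condensed sketch of steps 3 and 4, assuming OpenCV; the grid size matches the description above, but the function name and thresholds are illustrative rather than the actual implementation:

import cv2
import numpy as np

def passes_peripheral_uniformity(slide_bgr, cols=18, rows=12,
                                 white_min=200, max_std=10):
    # Map the grid onto the slide region and sample only the border cells.
    gray = cv2.cvtColor(slide_bgr, cv2.COLOR_BGR2GRAY)
    h, w = gray.shape
    border_means = []
    for r in range(rows):
        for c in range(cols):
            if r in (0, rows - 1) or c in (0, cols - 1):
                cell = gray[r * h // rows:(r + 1) * h // rows,
                            c * w // cols:(c + 1) * w // cols]
                border_means.append(cell.mean())
    border_means = np.array(border_means)
    # A real slide has a bright, uniform periphery (white background);
    # UI elements and webcam overlays usually break this.
    return border_means.min() >= white_min and border_means.std() <= max_std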

Output Structure

~/Desktop/Processed Lectures/
├── Lecture-01/
│   ├── slides.pdf
│   └── transcript.txt
├── Lecture-02/
│   ├── slides.pdf
│   └── transcript.txt
└── ...
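
For reference, slides.pdf is simply the captured slide images in order. A minimal sketch of that compilation step, assuming Pillow and PNG frames (the helper name and file layout are illustrative):

from pathlib import Path
from PIL import Image

def compile_pdf(slide_dir, out_path="slides.pdf"):
    # Load captured frames in filename order and write one multi-page PDF.
    frames = [Image.open(p).convert("RGB")
              for p in sorted(Path(slide_dir).glob("*.png"))]
    if not frames:
        raise SystemExit(f"no slides found in {slide_dir}")
    frames[0].save(out_path, save_all=True, append_images=frames[1:])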

Running Tests

pip install pytest
pytest tests/ -v

Troubleshooting

  • No slides detected: The video may not have letterbox bars, or the slides may have dark backgrounds
  • Too many duplicates: Raise the threshold with --threshold 30 (higher = less sensitive)
  • Missing slides: Lower the threshold with --threshold 15 (lower = more sensitive)
  • Transcription fails: Check that your DEEPGRAM_API_KEY is set correctly (see the check below)
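
A quick check that the key is actually visible to the script (this assumes the project loads .env via python-dotenv, as sketched in Setup):

python -c "from dotenv import load_dotenv; import os; load_dotenv(); print(bool(os.getenv('DEEPGRAM_API_KEY')))"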

License

MIT
