Professional AI-Powered PowerPoint Generation Workbench
Transform documents into visual slides using Google Gemini 3 Pro, Nano Banana Pro, and Veo
Supports local AI (Ollama + ComfyUI) as an alternative
- Features
- Technical Architecture
- Quick Start
- Project Structure
- Core Features
- Development Guide
- Electron Desktop Application
- Configuration
- API Integration
- License
- Gemini 3 Pro (Thinking Mode): Intelligent text analysis, automatic slide outline generation
- Nano Banana Pro: High-quality image generation (supports 1K/2K/4K resolution)
- Veo 3.1 Fast: Cinematic video background generation
- Local AI Support: Supports Ollama (text generation) and ComfyUI (image generation) as alternatives
- Dual Style Modes: Concise mode and Detailed mode
- Custom Style Prompts: Support for custom design requirements
- Real-time Preview: Live editing and preview of slides
- Responsive Layout: Adapts to various screen sizes
- Complete Multi-language Coverage: All UI elements support multiple languages
- Supported Languages:
- English 🇺🇸
- Simplified Chinese 🇨🇳
- Traditional Chinese 🇹🇼
- Dynamic Language Switching: Switch languages in real-time without restarting the application
- Flag Icons: Language selector displays corresponding flag icons for intuitive visual recognition
- Dark Mode: Complete dark theme support
- Drag-and-Drop Editing: Intuitive slide editing interface
- Speaker Notes: Add speaker scripts for each slide
- Batch Generation: Generate multiple slides at once
- File Upload Support:
- Text files (.txt, .md, .json, .csv)
- Image files (.jpg, .jpeg, .png, .gif, .webp)
- PDF files
- Excel files (.xlsx, .xls) automatically converted to CSV format
- Word and PPT files need to be converted to PDF before uploading
- Smart Navigation: Scroll to switch slides, page number display
- Cross-platform Support: Windows, macOS, Linux
- Local Data Storage: Securely save API Key and configuration using SQLite
- Offline Functionality: Supports local AI services (Ollama + ComfyUI)
- Native Experience: Standalone desktop application, no browser required
- Vue 3.5: Using Composition API and latest Vue features
- TypeScript 5.8: Complete type safety
- Vite 6.2: Lightning-fast development experience
- Vue Router 4.5: Client-side routing
- Pinia 2.3: State management
- xlsx: Excel file parsing and conversion
- mammoth: Word document parsing (reserved)
- Tailwind CSS v4: Modern utility-first CSS framework
- Lucide Vue Next: Beautiful icon library
- Custom Theme System: Supports dark/light mode switching
- Google GenAI SDK: Official Gemini API integration
- Local AI Support: Ollama (text generation) and ComfyUI (image generation)
- Multi-model Support: Text, image, and video generation
- Search Grounding: Real-time search enhancement
- Electron 33.0: Cross-platform desktop application framework
- SQLite (sql.js): Local database storage
- electron-builder: Application packaging and distribution
- SOLID Principles: Follows object-oriented design principles
- camelCase Naming: Unified variable naming convention
- ESLint 9.15: Strict code quality checks
- No any Types: Complete TypeScript type definitions
- Node.js: >= 18.0.0
- npm: >= 9.0.0
- API Keys (optional):
- Gemini API Key: Get from Google AI Studio (required when using Google AI)
- Local AI (optional): Install Ollama and ComfyUI
- Clone the repository
git clone <repository-url>
cd powerpoint-workbench
- Install dependencies
npm install
- Configure environment variables (optional)
If you need to use Google Gemini API, create a .env.local file and set your API key:
GEMINI_API_KEY=your_api_key_here
Or use Local AI:
- Ensure Ollama is running on http://localhost:11434
- Ensure ComfyUI is running on http://localhost:8188
- Select "Local AI" as the provider in the application settings
- Start the development server
Web Application Mode:
Method 1: Using npm command
npm run dev
Method 2: Using Windows batch script (recommended for Windows users)
# Double-click to run or execute in command line
dev.bat
The application will start at http://localhost:5173 (Vite default port)
Electron Desktop Application Mode:
npm run electron:dev
This command will:
- Build Electron main process and preload scripts
- Start Vite development server (http://localhost:5173)
- Wait for the server to be ready, then launch the Electron application
Launch Electron only (requires npm run dev to be running first):
npm run electron
- Build production version
Web Application:
npm run build
Electron Desktop Application:
npm run electron:build
This will:
- Build Electron main process files to dist-electron/
- Build Vue application to dist/
- Preview production version
Web Application:
npm run preview
Package Electron Application:
npm run electron:dist
This will create distributable installers, output to dist-electron/ directory:
- Windows: NSIS installer (.exe)
- macOS: DMG file
- Linux: AppImage file
The project is configured with GitHub Actions for automated build and release workflow. To publish a new version:
- Update version number: Update the version field in package.json (e.g., 0.1.4)
- Create Git tag:
git add .
git commit -m "chore: bump version to 0.1.4"
git tag v0.1.4
git push origin main
git push origin v0.1.4
- Automatic build and release: After pushing the tag, GitHub Actions will automatically:
- Build the application on Windows, macOS, and Linux
- Create a GitHub Release
- Upload installers for all platforms
- Manual trigger: You can also manually trigger the build from the GitHub Actions page
Published installers can be downloaded from the GitHub Releases page.
Note:
- Tag format must be v* (e.g., v0.1.4)
- Releases will automatically read update content from CHANGELOG.md
- If code signing is needed, configure certificates in GitHub Secrets
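Before pushing a tag, it can help to confirm that it has the expected shape and agrees with the version in package.json. The following is a minimal sketch of such a check; isReleaseTag and tagMatchesVersion are hypothetical helpers, not part of the project's build scripts, and the regular expression is deliberately stricter than the v* pattern because it assumes plain MAJOR.MINOR.PATCH versions.

```typescript
// Hypothetical pre-release check; not part of the project's build scripts.

/** Returns true when the tag looks like v<major>.<minor>.<patch>, e.g. v0.1.4. */
function isReleaseTag(tag: string): boolean {
  return /^v\d+\.\d+\.\d+$/.test(tag);
}

/** Returns true when the tag is exactly "v" followed by the package.json version. */
function tagMatchesVersion(tag: string, packageVersion: string): boolean {
  return isReleaseTag(tag) && tag.slice(1) === packageVersion;
}
```

A check like this could run locally before git tag, so that a mismatched tag never reaches the release workflow.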
powerpoint-workbench/
├── electron/ # Electron desktop application
│ ├── main.ts # Electron main process (window management, app lifecycle)
│ ├── preload.ts # Preload script (secure API exposure)
│ └── database.ts # SQLite database operations
├── src/ # Source code directory
│ ├── components/ # Vue components
│ │ ├── ExportModal.vue # Export modal component
│ │ ├── GenerateAllModal.vue # Batch generation modal
│ │ ├── SettingsModal.vue # Settings modal component
│ │ ├── SlidePreview.vue # Slide preview component
│ │ └── TextEditorModal.vue # Text editor modal
│ ├── composables/ # Composable functions
│ │ ├── useI18n.ts # Internationalization composable
│ │ └── useTheme.ts # Theme switching composable
│ ├── i18n/ # Internationalization configuration
│ │ ├── index.ts # Internationalization entry
│ │ ├── languages.ts # Language configuration
│ │ └── locales/ # Translation files
│ │ ├── en.json # English translations
│ │ ├── zh-CN.json # Simplified Chinese translations
│ │ └── zh-TW.json # Traditional Chinese translations
│ ├── flag/ # Flag icon components
│ │ └── FlagIcons.vue # Flag icon library (200+ countries/regions)
│ ├── pages/ # Page components
│ │ ├── Editor.vue # Editor main page
│ │ └── Home.vue # Home page
│ ├── prompts/ # AI prompts
│ │ └── index.ts # Prompt configuration
│ ├── services/ # Service layer
│ │ ├── databaseService.ts # Database service (Electron)
│ │ ├── exportService.ts # Export service
│ │ ├── geminiService.ts # Gemini API service
│ │ └── localAiService.ts # Local AI service
│ ├── utils/ # Utility functions
│ │ └── ipChecker.ts # IP detection utility
│ ├── stores/ # Pinia state management
│ │ └── projectStore.ts # Project state store
│ ├── types/ # TypeScript type definitions
│ │ └── index.ts # Type definitions
│ ├── assets/ # Static assets
│ │ └── main.css # Main stylesheet
│ ├── App.vue # Application root component
│ ├── main.ts # Application entry
│ ├── router.ts # Router configuration
│ └── constants.ts # Constant definitions
├── dist-electron/ # Electron build output
│ ├── main.js # Built main process
│ └── preload.js # Built preload script
├── dist/ # Vue application build output
├── scripts/ # Build scripts
│ └── build-electron.js # Electron build script
├── index.html # HTML entry file
├── vite.config.ts # Vite configuration
├── tsconfig.json # TypeScript configuration
├── tsconfig.app.json # Application TypeScript configuration
├── tsconfig.node.json # Node TypeScript configuration
├── eslint.config.js # ESLint configuration
├── .stylelintrc.json # Stylelint configuration
├── .vscode/ # VS Code configuration
│ └── settings.json # Workspace settings (includes i18n Ally configuration)
├── .i18n-ally.yml # i18n Ally plugin configuration
├── dev.bat # Windows development server startup script
├── package.json # Project dependencies
├── metadata.json # Project metadata
└── README.md # Project documentation
- electron/: Electron desktop application related files
  - main.ts: Electron main process, responsible for window management and application lifecycle
  - preload.ts: Preload script, securely exposes Node.js APIs to renderer process
  - database.ts: SQLite database operations for local configuration storage
- src/components/: Reusable Vue components
- src/composables/: Vue 3 Composition API composable functions
- src/pages/: Page-level components
- src/services/: API calls and business logic
  - databaseService.ts: Database service wrapper, supports Electron and Web modes
- src/stores/: Pinia state management
- src/i18n/: Internationalization configuration and translation files
- src/flag/: Flag icon component library, contains 200+ country/region flag icons
- src/types/: TypeScript type definitions
- src/utils/: Utility functions (e.g., IP detection)
- dist-electron/: Electron build output directory
- scripts/: Build and utility scripts
// 1. Input text content or upload files
const sourceText = "Your presentation content...";
// Or upload files (supports text, images, PDF, Excel)
const files: File[] = [/* uploaded files */];
// 2. Use Gemini 3 Pro to generate outline
// Supports text string or file array
const slides = await generateOutline(
apiKey,
sourceText, // or files
pageCount,
style,
customPrompt
);
// 3. Generate visual elements for each slide
for (const slide of slides) {
const image = await generateFullSlideImage(
apiKey,
slide,
customStylePrompt,
'2K'
);
}
- Application root component
- Router view container
- Project home page
- Text input and file upload
- Supports multiple file formats (text, images, PDF, Excel)
- Excel files automatically converted to CSV
- Project configuration settings
- Supports Google AI and Local AI selection
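The upload rules described for the home page can be expressed as a lookup from file extension to handling category. The sketch below is illustrative only: classifyUpload, UploadKind, and the extension map are hypothetical names, and the Word/PowerPoint extensions (.doc, .docx, .ppt, .pptx) are an assumption, since the README only says that those files must be converted to PDF first.

```typescript
// Hypothetical classifier; the project's real upload handling may differ.
type UploadKind =
  | "text"
  | "image"
  | "pdf"
  | "excel" // converted to CSV before use
  | "needsPdfConversion" // Word/PPT: convert to PDF, then upload
  | "unsupported";

const EXTENSION_KINDS = new Map<string, UploadKind>([
  ["txt", "text"], ["md", "text"], ["json", "text"], ["csv", "text"],
  ["jpg", "image"], ["jpeg", "image"], ["png", "image"], ["gif", "image"], ["webp", "image"],
  ["pdf", "pdf"],
  ["xlsx", "excel"], ["xls", "excel"],
  // Assumed extensions for "Word and PPT files"
  ["doc", "needsPdfConversion"], ["docx", "needsPdfConversion"],
  ["ppt", "needsPdfConversion"], ["pptx", "needsPdfConversion"],
]);

/** Maps a file name to its handling category using the lower-cased extension. */
function classifyUpload(fileName: string): UploadKind {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0 || dot === fileName.length - 1) {
    return "unsupported"; // no extension at all
  }
  const extension = fileName.slice(dot + 1).toLowerCase();
  return EXTENSION_KINDS.get(extension) ?? "unsupported";
}
```

Using a Map rather than a plain object avoids accidental matches on inherited property names such as "constructor".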
- Slide editor main interface
- Three-column layout: thumbnails, canvas, properties panel
- Real-time preview and editing functionality
- Scroll to switch slides
- Page number display (current page/total pages)
- Gemini API wrapper
- Supports text generation, image generation, video generation
- Multi-modal file processing (text, images, PDF)
- Excel files automatically converted to CSV
- Error handling and retry logic
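The README does not show how the retry logic is implemented. One common approach is exponential backoff, sketched below under that assumption; withRetry, backoffDelayMs, and RetryOptions are hypothetical names and the default values are illustrative, not taken from geminiService.ts.

```typescript
// Hypothetical retry helper using exponential backoff; an assumption, not the project's code.
interface RetryOptions {
  maxAttempts: number; // total attempts, including the first one
  baseDelayMs: number; // delay before the second attempt
}

/** Delay before the attempt that follows failed attempt number `attempt` (1-based). */
function backoffDelayMs(attempt: number, baseDelayMs: number): number {
  return baseDelayMs * 2 ** (attempt - 1);
}

/** Runs `task`, retrying on rejection until it succeeds or attempts run out. */
async function withRetry<T>(
  task: () => Promise<T>,
  options: RetryOptions = { maxAttempts: 3, baseDelayMs: 500 },
): Promise<T> {
  const attempts = Math.max(1, options.maxAttempts);
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      if (attempt < attempts) {
        const delayMs = backoffDelayMs(attempt, options.baseDelayMs);
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError; // every attempt failed; surface the last error
}
```

A call such as withRetry(() => generateOutline(apiKey, sourceText, pageCount, style, customPrompt)) would wrap an API call without changing its signature.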
- Local AI service wrapper
- Ollama API integration (text generation)
- ComfyUI API integration (image generation)
Using Pinia for global state management:
- projectStore: Project configuration, slide data, and uploaded file management
- useI18n: Multi-language state (composable function)
- useTheme: Theme switching state (composable function)
Start development environment:
npm run electron:dev
This command will:
- Build Electron main process and preload scripts
- Start Vite development server (http://localhost:5173)
- Wait for the server to be ready, then launch the Electron application
Launch Electron only (requires npm run dev to be running first):
npm run electron
Build production version:
npm run electron:build
This will:
- Build Electron main process files to dist-electron/
- Build Vue application to dist/
Package distributable application:
npm run electron:dist
This will create distributable installers, output to dist-electron/ directory:
- Windows: NSIS installer (.exe)
- macOS: DMG file
- Linux: AppImage file
Window Configuration:
Window size and behavior are configured in electron/main.ts:
- Default size: 1400x900
- Minimum size: 1000x600
- Title bar style: macOS uses hiddenInset, other platforms use default
- Menu bar: Auto-hide (production mode)
- Developer tools: Enabled in development mode, disabled in production mode
Build Configuration:
electron-builder configuration is in the build field of package.json:
- App ID: com.gemini.ppt.workbench
- Product name: Gemini PPT Workbench
- Output directory: dist-electron
Database Storage:
- Database type: SQLite (using sql.js)
- Database location: app.getPath('userData')/app.db
  - Windows: %APPDATA%\gemini-ppt-workbench\app.db
  - macOS: ~/Library/Application Support/gemini-ppt-workbench/app.db
  - Linux: ~/.config/gemini-ppt-workbench/app.db
- Stored content: API Key, proxy configuration, local AI configuration, etc.
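To locate the database file outside the application, for example when backing it up or debugging, the per-platform paths above can be reproduced with a small helper. This is a sketch only: defaultDatabasePath is hypothetical, it assumes the default directories listed above, and it ignores overrides such as XDG_CONFIG_HOME on Linux. Inside the app, app.getPath('userData') remains the authoritative source.

```typescript
// Hypothetical helper mirroring the default locations listed above.
type SupportedPlatform = "win32" | "darwin" | "linux";

interface PathEnvironment {
  appData?: string; // value of %APPDATA% on Windows
  home?: string;    // user home directory on macOS/Linux
}

const APP_DIRECTORY = "gemini-ppt-workbench";
const DATABASE_FILE = "app.db";

function defaultDatabasePath(platform: SupportedPlatform, env: PathEnvironment): string {
  if (platform === "win32") {
    if (!env.appData) throw new Error("appData is required on Windows");
    return `${env.appData}\\${APP_DIRECTORY}\\${DATABASE_FILE}`;
  }
  if (!env.home) throw new Error("home is required on macOS and Linux");
  const base =
    platform === "darwin"
      ? `${env.home}/Library/Application Support`
      : `${env.home}/.config`;
  return `${base}/${APP_DIRECTORY}/${DATABASE_FILE}`;
}
```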
- Development Mode: Electron connects to Vite development server, supports hot reload
- Production Mode: Electron loads packaged static files
- Security:
- Context isolation enabled
- Node.js integration disabled (renderer process)
- Uses preload script to securely expose APIs
- Developer tools and shortcuts (Alt, Ctrl+Shift+I, F12) disabled in production mode
- Network Requests: All API calls (Gemini, local AI services) work normally in Electron
- Data Persistence: Configuration is automatically saved to local SQLite database
Electron window is blank:
- Ensure Vite development server is running (development mode)
- Ensure npm run electron:build has been run (production mode)
- Check console for error messages
Build failure:
- Ensure all dependencies are installed: npm install
- Check Node.js version (recommended >= 18.0.0)
- Check for TypeScript type errors
- Check build script output for error messages
Port is occupied:
- If port 5173 is occupied, you can:
  - Windows: Use netstat -ano | findstr :5173 to find the process, then use taskkill /PID <PID> /F to terminate
  - macOS/Linux: Use lsof -ti:5173 | xargs kill -9 to terminate the process occupying the port
- Or modify vite.config.ts to use a different port
Development mode startup failure:
- Check if other Vite instances are running
- Ensure dist-electron/ directory has correct permissions
- Check terminal output for detailed error messages
- Build script now automatically checks ports and provides clear error messages
Packaging failure:
- Ensure npm run electron:build has been run first
- Check if build configuration in package.json is correct
- Check if icon file exists (if custom icon is specified)
Database issues:
- Check application data directory permissions
- Confirm SQLite WASM files are loaded correctly
- Check console for error messages
- Variable Naming: Use camelCase
const slideData: SlideData = {...};
const apiKey: string = "...";
- Type Definitions: Do not use any
// ❌ Wrong
const data: any = fetchData();
// ✅ Correct
const data: SlideData[] = fetchData();
- SOLID Principles
- Single Responsibility Principle (SRP)
- Open-Closed Principle (OCP)
- Liskov Substitution Principle (LSP)
- Interface Segregation Principle (ISP)
- Dependency Inversion Principle (DIP)
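As one illustration of how these principles might apply here, the choice between Google AI and Local AI lends itself to dependency inversion: callers depend on an interface, and each backend implements it. The sketch below is hypothetical throughout. OutlineProvider, the stub classes, and selectProvider are invented for illustration and return placeholder titles instead of calling Gemini or Ollama.

```typescript
// Hypothetical abstraction over text backends; stubs stand in for real API calls.
type ProviderKind = "google" | "local";

interface OutlineProvider {
  readonly name: string;
  generateOutline(sourceText: string, pageCount: number): Promise<string[]>;
}

/** Produces placeholder slide titles so the sketch runs without network access. */
function makePlaceholderTitles(provider: string, sourceText: string, pageCount: number): string[] {
  const excerpt = sourceText.slice(0, 20);
  return Array.from({ length: pageCount }, (_, index) => `[${provider}] Slide ${index + 1}: ${excerpt}`);
}

class StubGeminiProvider implements OutlineProvider {
  readonly name = "gemini";
  async generateOutline(sourceText: string, pageCount: number): Promise<string[]> {
    return makePlaceholderTitles(this.name, sourceText, pageCount);
  }
}

class StubOllamaProvider implements OutlineProvider {
  readonly name = "ollama";
  async generateOutline(sourceText: string, pageCount: number): Promise<string[]> {
    return makePlaceholderTitles(this.name, sourceText, pageCount);
  }
}

/** The only place that knows about concrete providers. */
function selectProvider(kind: ProviderKind): OutlineProvider {
  return kind === "google" ? new StubGeminiProvider() : new StubOllamaProvider();
}

/** Callers depend on the interface, so adding a backend does not change this function. */
async function buildOutline(provider: OutlineProvider, sourceText: string, pageCount: number): Promise<string[]> {
  return provider.generateOutline(sourceText, pageCount);
}
```

With this shape, supporting another backend means adding one class and one branch in selectProvider, which also keeps the design open for extension.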
- Define types in src/types/index.ts
- Implement business logic in src/services/
- Create Vue components in src/components/ or src/pages/
- Create composable functions in src/composables/ (if needed)
- Update translation files in src/i18n/locales/
- Run npm run lint to check code quality
# Development mode (with hot reload)
npm run dev
# Or use Windows batch script
dev.bat
# Type checking
npx tsc --noEmit
# Code checking
npm run lint
import { fileURLToPath, URL } from 'node:url'
import { defineConfig } from 'vite'
import vue from '@vitejs/plugin-vue'
import tailwindcss from '@tailwindcss/vite'
export default defineConfig({
  plugins: [
    vue(),
    tailwindcss(),
  ],
  resolve: {
    alias: {
      // Map '@' to the src/ directory
      '@': fileURLToPath(new URL('./src', import.meta.url))
    }
  }
})
- Target: ES2022
- Module: ESNext
- JSX: preserve (Vue SFC)
- Strict Mode: Enabled
Use @/ as an alias for src/:
import SlidePreview from '@/components/SlidePreview.vue'
import { generateOutline } from '@/services/geminiService'
import { useProjectStore } from '@/stores/projectStore'
The project is pre-configured with the i18n Ally plugin. Configuration files are located at:
- .vscode/settings.json: VS Code workspace settings
- .i18n-ally.yml: i18n Ally specific configuration
Main configuration items:
# Translation file paths
localesPaths:
- src/i18n/locales
# Key style: nested (dot-separated)
keystyle: nested
# Supported languages
locales:
- en
- zh-CN
- zh-TW
# Source language and display language
sourceLanguage: en
displayLanguage: zh-CN
To customize the configuration, edit the .i18n-ally.yml file. For detailed configuration options, refer to the i18n Ally documentation.
Used for intelligent analysis and outline generation:
import { GoogleGenAI } from '@google/genai'
const ai = new GoogleGenAI({ apiKey })
const response = await ai.models.generateContent({
model: 'gemini-3-pro-preview',
contents: text,
config: {
systemInstruction: prompt,
thinkingConfig: { thinkingBudget: 32768 },
responseMimeType: 'application/json',
},
})
Used for high-quality image generation:
const response = await ai.models.generateContent({
model: 'gemini-3-pro-image-preview',
contents: prompt,
config: {
imageConfig: {
aspectRatio: '16:9',
imageSize: '2K',
},
},
})
Used for video background generation:
const operation = await ai.models.generateVideos({
model: 'veo-3.1-fast-generate-preview',
prompt: prompt,
config: {
numberOfVideos: 1,
resolution: '1080p',
aspectRatio: '16:9'
}
})
Used for local text generation and outline generation:
import { generateOutlineWithOllama } from '@/services/localAiService'
const slides = await generateOutlineWithOllama({
endpoint: 'http://localhost:11434',
model: 'llama3.2',
text: sourceText,
count: pageCount,
style: SlideStyle.CONCISE
})
Used for local image generation:
import { generateImageWithComfyUI } from '@/services/localAiService'
const imageUrl = await generateImageWithComfyUI({
endpoint: 'http://localhost:8188',
workflowId: 'workflow_name',
prompt: visualPrompt
})
The project uses the latest features of Tailwind CSS v4:
- CDN Integration: Fast loading via CDN
- Dark Mode: class strategy
- Custom Scrollbar: Optimized for dark/light modes
import { useTheme } from '@/composables/useTheme'
const { theme, toggleTheme } = useTheme()
// Switch theme
toggleTheme()
- en: English
- zh-CN: Simplified Chinese
- zh-TW: Traditional Chinese
The project has complete multi-language support, including:
- ✅ Application title and navigation
- ✅ All UI elements on the home page
- ✅ File upload prompts and labels
- ✅ All interface elements in the editor
- ✅ Settings panel
- ✅ Export and generation functions
- ✅ Error messages and status information
- ✅ Language selector (with flag icons)
The project is configured with the i18n Ally plugin, providing a powerful internationalization development experience:
- Open the extensions panel in VS Code/Cursor (Ctrl+Shift+X)
- Search for "i18n Ally"
- Click install and reload the window
- Hover Preview: Hover over translation keys in code to view translations in all languages
- Inline Editing: Edit translations directly in code without opening JSON files
- Missing Detection: Automatically detect and mark missing translation keys
- Usage Tracking: Show where translation keys are used in code
- Quick Refactoring: Support batch replacement and refactoring of translation calls
- Multi-language Comparison: View translations in all languages side by side
The project includes the following configuration files:
- .vscode/settings.json: VS Code workspace settings
- .i18n-ally.yml: i18n Ally specific configuration
Configuration is optimized for the project structure:
- Translation file path: src/i18n/locales
- Key style: nested (dot-separated, e.g., "app.title")
- Framework support: Vue 3 + generic mode
- Source language: en
- Display language: zh-CN
- Add new language code to Language enum in src/types/index.ts
- Add language configuration to languageConfig in src/i18n/languages.ts
- Create corresponding JSON translation file in src/i18n/locales/ directory
- Add new language to SUPPORTED_LANGUAGES in src/constants.ts
- Update locales and localeDisplayNames configuration in .i18n-ally.yml
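The steps above touch three identifiers: Language, languageConfig, and SUPPORTED_LANGUAGES. Their actual shapes are not shown in this README, so the sketch below is an assumption about how they might fit together; the LanguageConfigEntry fields and the resolveLanguage helper are hypothetical.

```typescript
// Assumed shapes; the real definitions live in src/types, src/i18n, and src/constants.
enum Language {
  EN = "en",
  ZH_CN = "zh-CN",
  ZH_TW = "zh-TW",
}

interface LanguageConfigEntry {
  label: string;      // name shown in the language selector
  localeFile: string; // translation file under src/i18n/locales/
}

// Record<Language, ...> makes the compiler reject an enum member without a config entry.
const languageConfig: Record<Language, LanguageConfigEntry> = {
  [Language.EN]: { label: "English", localeFile: "en.json" },
  [Language.ZH_CN]: { label: "Simplified Chinese", localeFile: "zh-CN.json" },
  [Language.ZH_TW]: { label: "Traditional Chinese", localeFile: "zh-TW.json" },
};

const SUPPORTED_LANGUAGES: Language[] = [Language.EN, Language.ZH_CN, Language.ZH_TW];

/** Returns the matching supported language, falling back to English. */
function resolveLanguage(requested: string): Language {
  const match = SUPPORTED_LANGUAGES.find((language) => language === requested);
  return match ?? Language.EN;
}
```

Typing languageConfig as a Record keyed by the enum means that forgetting the second step after completing the first shows up as a type error rather than a missing label at runtime.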
This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).
The AGPL-3.0 License allows you to:
- ✅ Commercial use
- ✅ Modify
- ✅ Distribute
- ✅ Private use
- ✅ Patent use
Requirements:
- ✅ License and copyright notice must be included
- ✅ State changes made to the code
- ✅ Disclose source code (copyleft)
- ✅ Same license must be used for derivative works
- ✅ Network use: If you modify and run the program on a server, you must provide source code to all users who interact with it remotely
Limitations:
- ❌ Liability disclaimer
- ❌ No warranty
Note: AGPL-3.0 is specifically designed for network server software. If you modify this program and make it available over a network, you must provide the source code to all users who interact with it remotely.
For detailed terms, please refer to the LICENSE file.
Issues and Pull Requests are welcome!
Before submitting a PR, please ensure:
- ✅ Code passes ESLint checks (npm run lint)
- ✅ Styles pass Stylelint checks (npm run lint:style)
- ✅ All type definitions are correct, no any types
- ✅ Follow project code standards (camelCase naming, SOLID principles)
- ✅ Add necessary comments and documentation
- ✅ Update relevant internationalization translation files
For questions or suggestions, please contact us through:
- Submit an Issue
- Send an email to the project maintainer
Built with ❤️ and ☕
Powered by Google Gemini AI and Cursor