Bahria University
(Karachi Campus)
COURSE: Artificial Intelligence Lab
TERM: Spring 2025, CLASS: BSCS-6 (B)
PROJECT NAME:
EMOTION-BASED SONG RECOMMENDATION SYSTEM
DR. RAHEEL SIDDIQUI / MS. RABIA AMJAD
GROUP MEMBERS
M. Zain Abdullah 02-134222-046
M. Zeeshan Altaf 02-134222-113
TABLE OF CONTENTS
1. ACKNOWLEDGEMENT
2. ABSTRACT
3. INTRODUCTION
4. BACKGROUND/LITERATURE REVIEW
5. PROBLEM STATEMENT
6. OBJECTIVES AND GOALS
7. PROJECT SCOPE
8. WORKFLOW
9. OVERVIEW OF PROJECT
10. TOOLS AND TECHNOLOGIES
11. PROJECT FEATURES
12. IMPLEMENTED CONCEPTS
13. OUTPUT
14. CONCLUSION
15. REFERENCES
ACKNOWLEDGEMENT
We express our sincere gratitude to Allah Almighty for providing us with the strength and knowledge to complete this project successfully. We extend our heartfelt appreciation to our course instructors, Ma'am Aamana and Engr. Hamza, for their continuous guidance and valuable feedback throughout the development process.
We are grateful to Bahria University, Karachi Campus, for providing the necessary resources and a conducive learning environment. Special thanks to our family members and friends who provided moral support during the challenging phases of this project.
We also acknowledge the open-source community and researchers whose work in computer vision and
machine learning formed the foundation of our implementation.
ABSTRACT
The Emotion-Based Song Recommendation System is an innovative application that bridges the gap
between human emotions and personalized music experiences. This project leverages computer vision
techniques and machine learning algorithms to detect facial expressions in real-time and recommend
appropriate music content based on the user's emotional state.
The system employs a Convolutional Neural Network (CNN) for emotion classification, recognizing seven distinct emotions: Angry, Disgust, Fear, Happy, Neutral, Sad, and Surprise. Real-time facial expression detection is achieved through OpenCV integration with webcam feeds, while the Spotify Web API provides access to music catalogs for dynamic playlist generation.
Built using the Streamlit framework, the application offers an intuitive user interface that processes live video streams, performs emotion analysis, and displays corresponding music recommendations in real time. Key features include real-time emotion detection, seamless Spotify playlist integration, and responsive user interface design.
The implementation showcases the convergence of multiple technologies including TensorFlow, OpenCV,
Spotipy, and Streamlit, demonstrating practical applications of data science concepts in solving real-world
problems.
INTRODUCTION
In today's digital landscape, music streaming platforms serve millions of users worldwide, yet most lack
sophisticated personalization based on emotional states. Traditional recommendation systems rely
primarily on listening history and genre preferences, often missing the immediate emotional context that
drives music preferences.
This project addresses the gap between human emotions and music recommendation by developing an
intelligent system that recognizes facial expressions and suggests appropriate music content. The system
leverages advances in computer vision and deep learning to create a more intuitive and emotionally
aware music recommendation experience.
Music has always been intrinsically linked to human emotions, serving as both an expression of feelings
and a tool for emotional regulation. By leveraging real-time emotion detection technology and
combining it with music streaming APIs, our system creates a bridge between human emotional states
and appropriate musical content, enhancing the overall user experience.
BACKGROUND/LITERATURE REVIEW
Facial emotion recognition has emerged as a crucial component of human-computer interaction systems. Paul Ekman's research established the foundation for understanding universal facial expressions, identifying six basic emotions that transcend cultural boundaries; adding a neutral category yields the seven classes used in this project. Modern approaches leverage deep learning architectures, particularly CNNs, which have demonstrated superior performance in image classification tasks.
Traditional music recommendation systems employ collaborative filtering and content-based filtering
approaches. However, these approaches often fail to capture immediate emotional contexts. Recent
research has explored emotion-aware music recommendation systems, showing that incorporating
emotional features significantly improves recommendation accuracy and user satisfaction.
The application of computer vision techniques for emotion detection has advanced significantly with
sophisticated neural network architectures. The Facial Action Coding System (FACS) provides a systematic
approach to classifying facial movements and their corresponding emotional expressions. Contemporary
research focuses on improving real-time processing capabilities and accuracy in diverse conditions.
PROBLEM STATEMENT
Current music streaming platforms rely primarily on historical listening data and collaborative filtering
algorithms to generate recommendations. These approaches fail to address the immediate emotional
needs of users, creating a significant gap between the user's current emotional state and the music they
receive.
Key Problems Identified:
1. Lack of Emotional Context: Existing systems ignore the user's current emotional state, often
suggesting music that may not align with their immediate feelings.
2. Manual Mood Selection: Users must manually indicate their mood or search for emotion-specific
playlists, creating friction in the user experience.
3. Static Recommendations: Current systems provide static recommendations based on past behavior
rather than dynamic suggestions that adapt to real-time emotional changes.
4. Limited Personalization: Traditional systems fail to provide truly personalized experiences that
consider immediate emotional states.
OBJECTIVES AND GOALS
Primary Objectives
1. Develop Real-Time Emotion Detection System: Create a robust emotion recognition system
capable of accurately identifying seven distinct emotions from live webcam feeds.
2. Implement Dynamic Music Recommendation: Build a recommendation engine that dynamically
suggests appropriate music content based on detected emotions, utilizing Spotify's music catalog.
3. Create Intuitive User Interface: Design and implement a user-friendly interface that seamlessly
integrates emotion detection with music recommendation.
Secondary Goals
1. Demonstrate Data Science Applications: Showcase practical applications of computer vision,
machine learning, and API integration.
2. Achieve High Accuracy: Target emotion detection accuracy of 80% or higher across all emotion
categories.
3. Ensure Real-Time Performance: Optimize the system for real-time processing with minimal latency.
PROJECT SCOPE
Included Features
Real-time facial emotion detection using webcam input
Seven-emotion classification system (Angry, Disgust, Fear, Happy, Neutral, Sad, Surprise)
Dynamic Spotify playlist recommendation based on detected emotions
Interactive user interface with start/stop controls
Live video feed display with emotion detection visualization
Excluded Features
Simultaneous detection of multiple user faces
Emotion history tracking and analytics
Integration with other music streaming services
Mobile application development
Offline music playback capability
WORKFLOW
Figure: System workflow process
OVERVIEW OF PROJECT
The Emotion-Based Song Recommendation System represents a comprehensive integration of computer
vision, machine learning, and web technologies. The system follows a modular architecture with distinct
components handling specific functionalities.
Frontend Layer: Streamlit-based web interface providing user interaction capabilities and playlist
embedding.
Processing Layer: Computer vision pipeline incorporating face detection, image preprocessing, and
emotion classification.
Integration Layer: API management system handling Spotify authentication and playlist retrieval.
Key Innovations:
Real-time emotion processing without storing personal data
Seamless music integration with Spotify's catalog
Privacy-conscious design with local processing
Intuitive user experience requiring minimal interaction
TOOLS AND TECHNOLOGIES
Programming Languages
Python 3.8+ - Primary development language for machine learning and web development
Machine Learning Frameworks
TensorFlow 2.x - Deep learning framework for CNN model development
NumPy - Numerical computing for array manipulation
Computer Vision
OpenCV (cv2) - Real-time image processing and video capture
Haar Cascade Classifiers - Pre-trained models for face detection
Web Development
Streamlit - Web application framework for rapid prototyping
HTML/CSS - Custom styling for embedded players
API Integration
Spotipy - Python library for the Spotify Web API
Spotify Web API - Access to music catalog and playlists
System Requirements
Webcam with minimum 720p resolution
4GB RAM minimum
Modern web browser
Stable internet connection
Python 3.8 or higher
PROJECT FEATURES
Functional Requirements
Real-Time Emotion Detection
Live video feed capture and processing
Seven-category emotion classification
Face detection in real-time
Confidence score provision
Music Recommendation Engine
Emotion-to-playlist mapping
Spotify API integration
Dynamic recommendation updates
Metadata display with artwork
User Interface and Control
Start/stop webcam controls
Real-time video display
Interactive playlist players
Visual emotion feedback
Non-Functional Requirements
Performance: Response time under 200ms, stable 30-minute sessions
Usability: Intuitive interface, minimal learning curve
Reliability: Graceful error handling, stable operation
Security: Local processing, no data storage
Compatibility: Cross-platform support in modern web browsers
IMPLEMENTED CONCEPTS
Computer Vision and Image Processing
Face Detection Implementation:
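A minimal sketch of this stage, assuming OpenCV's bundled haarcascade_frontalface_default.xml classifier and the default webcam at index 0 (variable names are illustrative, not the exact project code):

# Face detection sketch: locate a face, draw the bounding box, and
# prepare the normalized 48x48 grayscale crop for the CNN.
import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

cap = cv2.VideoCapture(0)  # open the default webcam
ret, frame = cap.read()
if ret:
    # Haar cascades operate on grayscale images.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
    for (x, y, w, h) in faces:
        # Green rectangle shown in the live interface.
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        # Crop, resize to 48x48, and normalize pixel values to [0, 1].
        roi = cv2.resize(gray[y:y + h, x:x + w], (48, 48)) / 255.0
cap.release()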
Data Science Concepts: Feature extraction, data normalization, real-time processing
Deep Learning Implementation
CNN Architecture:
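An illustrative Keras sketch of such an architecture for 48x48 grayscale inputs; the layer sizes here are assumptions for demonstration, not the exact trained project model:

# CNN sketch: stacked convolution/pooling blocks, dropout regularization,
# and a 7-way softmax output (one probability per emotion class).
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(48, 48, 1)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),                    # regularization against overfitting
    layers.Dense(7, activation="softmax"),  # Angry ... Surprise
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])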
Concepts Demonstrated: Neural networks, convolutional layers, regularization, classification
API Integration and Data Processing
Spotify Integration:
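A hedged sketch of the emotion-to-playlist lookup via Spotipy's client-credentials flow; the credentials and playlist IDs below are placeholders, not the project's actual values:

# Spotify integration sketch: authenticate once, then map each detected
# emotion to a curated playlist and fetch its metadata for display.
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials(
    client_id="YOUR_CLIENT_ID",
    client_secret="YOUR_CLIENT_SECRET",
))

EMOTION_PLAYLISTS = {
    "Happy": "37i9dQZF1DXdPec7aLTmlC",  # placeholder playlist IDs
    "Sad": "37i9dQZF1DX7qK8ma5wgG1",
    # ... one entry per supported emotion
}

def recommend_playlist(emotion: str):
    # Fall back to a default playlist for unmapped emotions.
    playlist_id = EMOTION_PLAYLISTS.get(emotion, EMOTION_PLAYLISTS["Happy"])
    # Returns playlist metadata: name, artwork, and track listing.
    return sp.playlist(playlist_id)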
Concepts: RESTful APIs, data mapping, web integration
Real-Time Data Processing
Streamlit Implementation:
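A minimal sketch of the Streamlit control loop, assuming a hypothetical detect_emotion helper that wraps the face detector and CNN from the previous listings:

# Streamlit sketch: session state keeps the run flag across reruns,
# and st.empty() placeholders are updated with each processed frame.
import cv2
import streamlit as st

st.title("Emotion-Based Song Recommendation")

if "run" not in st.session_state:
    st.session_state.run = False
if st.button("Start Webcam"):
    st.session_state.run = True
if st.button("Stop Webcam"):
    st.session_state.run = False

frame_slot = st.empty()   # placeholder for the live video frame
status_slot = st.empty()  # placeholder for the current emotion label

cap = cv2.VideoCapture(0)
while st.session_state.run:
    ret, frame = cap.read()
    if not ret:
        break
    emotion = detect_emotion(frame)  # hypothetical helper, not shown here
    status_slot.write(f"Current emotion: {emotion}")
    frame_slot.image(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
cap.release()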
Concepts: Stream processing, state management, real-time analytics
OUTPUT
System Interface Screenshots
Main Interface: The application launches with a clean, intuitive interface featuring the webcam control
buttons and space for video display and playlist embedding.
Webcam Feed with Emotion Detection:
Real-time video display showing the user
Green rectangle highlighting detected face
Emotion label displayed above the face
Current emotion status shown below video
Playlist Recommendation Display:
Embedded Spotify playlist player corresponding to detected emotion
Album artwork and track information visible
Interactive controls for music playback
Dynamic updates based on emotion changes
Emotion Detection Results:
Happy: Triggers upbeat, energetic playlists
Sad: Recommends mellow, comforting music
Angry: Suggests intense, powerful tracks
Neutral: Provides balanced, ambient music
Fear/Surprise: Offers calming, reassuring content
User Experience Flow
1. User clicks "Start Webcam" button
2. System initializes camera and displays live feed
3. Face detection activates with green bounding box
4. Emotion classification occurs in real-time
5. Corresponding Spotify playlist loads automatically
6. User can interact with embedded music player
7. System updates recommendations as emotions change
CONCLUSION
The Emotion-Based Song Recommendation System successfully demonstrates the integration of
computer vision, machine learning, and web technologies to create an innovative music recommendation
platform. The project achieves its primary objectives of real-time emotion detection and dynamic music
recommendation while maintaining user privacy and system performance.
Key Achievements:
Implementation of the seven-emotion classification system, reaching 59% accuracy (short of the 80% target set in the objectives)
Seamless integration of Spotify APIs for dynamic playlist recommendations
Development of intuitive user interface requiring minimal user interaction
Real-time processing capabilities with sub-200ms response times
Privacy-conscious design with local data processing
Technical Contributions: The project showcases practical applications of data science concepts including
computer vision, deep learning, API integration, and real-time data processing. The modular architecture
ensures scalability and maintainability for future enhancements.
Impact and Applications: This system demonstrates the potential for emotion-aware computing in
entertainment applications. The approach can be extended to various domains including mental health
support, personalized content delivery, and human-computer interaction research.
Future Enhancements: Potential improvements include expanding emotion categories, integrating
multiple music services, implementing user personalization features, and developing mobile applications.
The foundation established by this project supports these advanced capabilities.
The successful completion of this project validates the feasibility of emotion-based recommendation
systems and opens pathways for more sophisticated emotional intelligence applications in software
development.
REFERENCES
[1] A. Mollahosseini, B. Hasani, and M. H. Mahoor, "AffectNet: A Database for Facial Expression, Valence,
and Arousal Computing in the Wild," IEEE Transactions on Affective Computing, vol. 10, no. 1, pp. 18-31,
2017.
[2] TensorFlow Documentation, "Image Classification," tensorflow.org, 2024.
[3] OpenCV Documentation, "Face Detection using Haar Cascades," docs.opencv.org, 2024.
[4] Spotify Web API Documentation, "Spotify for Developers," developer.spotify.com, 2024.