Emotion Based Smart Music Player
An emotion-based smart music player personalizes the music experience according to the user’s current emotional state. This project explores several techniques to achieve this:
• Facial Expression Recognition: by capturing the user’s face through a camera and employing machine learning algorithms (often Convolutional Neural Networks), the system identifies emotions such as happiness, sadness, anger, or neutrality.
• Speech Emotion Recognition: analyzing the user’s speech patterns (independent of the meaning of the words) allows the system to gauge emotions.
• Music Emotion Classification: music is categorized by the emotions it typically evokes, through audio feature extraction and machine learning.
• Automatic playlist generation: based on the user’s detected emotion, the system selects songs categorized to match that feeling.
• Music recommendation: the player suggests songs that could elevate the user’s mood or complement their current emotional state.
This project offers a novel and potentially therapeutic approach to music listening, catering to the user’s emotional well-being and enhancing the overall music experience.
Contents
1 Introduction
1.1 Background
1.2 Motivation
1.3 Objectives
1.4 Scope
2 System Overview
3 Proposed Algorithms
3.1.2 Strides
3.1.3 Padding
4 System Requirements
4.3.1 Python
4.4.2 Technical Feasibility
5 System Analysis
5.1 Purpose
5.2 Scope
6 System Designs
7 Modules
8 System Implementation
9 System Testing
9.2 Verification
9.3 Validation
9.5.1 Unit Testing
9.8 Usability
9.8.1 Robustness
9.8.2 Security
9.8.3 Reliability
9.8.4 Compatibility
9.8.5 Flexibility
9.8.6 Safety
9.9.1 Portability
9.9.2 Performance
9.9.3 Accuracy
9.9.4 Maintainability
10 Conclusion
CHAPTER-1
1 Introduction
The link between music and human emotions has long been acknowledged, with music having the power to elicit a wide range of emotional states. Recently, facial emotion recognition technology has garnered considerable interest in music recommendation systems. A facial emotion-driven music recommendation system is an inventive solution that tailors music suggestions to the listener’s facial expressions: it assesses the user’s face in real time and proposes music that best suits their current emotional state. This study delves into the creation and deployment of such a system, covering its core technology, potential advantages, and the challenges that must be tackled to enhance its efficiency. The objective of this research is to offer a thorough insight into the capabilities of such a system, along with its potential applications in the music industry and beyond.

The growing accessibility of advanced facial recognition technology and the expanding array of music streaming services position this technology as a promising avenue for personalized music recommendations. By integrating facial emotion recognition into music recommendation systems, we can make the listening experience more personalized and emotionally engaging. The technology can also have practical applications in other fields such as healthcare, education, and entertainment. However, as with any emerging technology, there are ethical, privacy, and data security concerns that need to be addressed. This research aims to identify these concerns and provide recommendations to ensure the safe and ethical use of the technology, and ultimately to contribute to the development of facial emotion-based music recommendation systems and to provide insights for future research in this field.

To achieve these objectives, we review and analyze previous research on facial emotion recognition technology and music recommendation systems, and explore the machine learning and deep learning algorithms commonly used to build facial emotion recognition models. Additionally, we conduct an empirical study, collecting data from participants to evaluate the effectiveness of the proposed system. The findings can have significant implications for the music industry, giving music streaming services and music marketers new ways to personalize their offerings, increase listener engagement, enhance user experience, and ultimately boost revenue. The technology also has practical applications in healthcare, where it can be used to monitor patients’ emotional state and provide personalized therapy. Overall, this study aims to further the understanding of the potential applications of facial emotion recognition in music and other fields, paving the way for further research and development in this promising area of technology.
1.1 Background
Facial expression-based music player systems combine music playback with computer vision. Advances in technology, particularly in computer vision and machine learning, have made it practical to recognize a listener’s emotional state from a camera feed. Emotional engagement is central to user experience, and integrating facial expression recognition into music playback enables a more personalized and immersive listening experience.
1.2 Motivation
Facial expression-based music systems promise benefits in several domains, including entertainment, therapy, education, and marketing. Interest in emotional intelligence and its applications in technology continues to grow, underscoring the significance of understanding and responding to human emotions in interactive systems.
1.3 Objectives
The project aims to design and implement a real-time facial expression recognition system integrated with a music player that dynamically selects and plays music based on the user’s emotional state. Achieving these objectives could contribute to advancements in human-computer interaction and user experience design.
1.4 Scope
The scope of the project covers the technologies, methodologies, and applications considered, together with the limitations and constraints, such as hardware requirements, available datasets, and time constraints, that set realistic expectations for the project’s outcomes.
CHAPTER-2
2 System Overview
The facial emotion-based song recommendation system consists of three main components: the face
recognition system, the emotion classification system and the song recommendation system. The facial
emotion recognition system captures a video of the listener’s face and extracts facial features such as
eye movement, eyebrow position, and mouth shape. These features are then used to determine the
listener’s emotional state using a trained machine learning model. The song recommendation system
uses a recommendation algorithm to generate song recommendations based on the listener’s emotional
state. The algorithm considers the emotional characteristics of songs, such as tempo, rhythm, and
melody, to generate recommendations that align with the listener’s emotional state. The
recommendations can be personalized based on the listener’s musical preferences and past listening
history.
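To make the interaction between these three components concrete, the following minimal sketch outlines one possible end-to-end loop. It is an illustrative skeleton with placeholder logic, not the project’s actual implementation; the emotion labels, song file names, and helper functions are assumptions.

```python
import random
import cv2

EMOTIONS = ["happy", "sad", "angry", "neutral"]  # assumed label set

def detect_emotion(frame):
    """Placeholder: a trained model would classify the face here."""
    return random.choice(EMOTIONS)

def recommend_songs(emotion):
    """Placeholder: look up songs pre-labelled with the matching mood."""
    library = {"happy": ["song_a.mp3"], "sad": ["song_b.mp3"],
               "angry": ["song_c.mp3"], "neutral": ["song_d.mp3"]}
    return library[emotion]

def run_player_loop():
    cap = cv2.VideoCapture(0)   # open the default webcam
    ret, frame = cap.read()     # grab one frame of the listener's face
    cap.release()
    if ret:
        emotion = detect_emotion(frame)
        for song in recommend_songs(emotion):
            print("Now playing:", song)  # a real player would start playback

if __name__ == "__main__":
    run_player_loop()
```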
The face detection component uses OpenCV to locate the listener’s face and identify features such as the eyes, nose, and mouth. OpenCV provides a variety of functions for image processing, including face detection, which can be used to identify the face in an image. Once the face has been detected, it is possible to extract various features such as the position of the eyes, mouth, and nose, as well as the shape of the face. From this information, the emotion that the person is experiencing can be inferred. For example, if the corners of the mouth are turned upward, the person is likely experiencing happiness; if the eyebrows are furrowed and the mouth is turned downward, the person may be experiencing sadness.
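As a concrete sketch of this face detection step, the snippet below uses the frontal-face Haar cascade that ships with OpenCV to locate faces and crop the regions that would be passed on to the emotion classifier; the image path is a placeholder.

```python
import cv2

# Load OpenCV's pre-trained frontal-face Haar cascade (bundled with OpenCV).
cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
face_cascade = cv2.CascadeClassifier(cascade_path)

def detect_faces(image_path):
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # cascades expect grayscale
    # Returns a list of (x, y, w, h) bounding boxes for detected faces.
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        face_roi = gray[y:y + h, x:x + w]  # crop the face for the classifier
        print("Face at", (x, y, w, h), "ROI shape:", face_roi.shape)
    return faces

# detect_faces("listener.jpg")  # hypothetical input image
```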
An advantage of the CNN algorithm for emotion classification is that it can capture both local and global features of the facial expression. For example, the algorithm can learn to recognize specific patterns in the position of the eyes or mouth, as well as broader patterns in the overall shape of the face. Another advantage of CNNs is that they can be trained using transfer learning techniques. Transfer learning involves taking a pre-trained CNN model that has already been trained on a large dataset of images and fine-tuning it on a smaller dataset of facial expressions for emotion classification. This approach reduces the amount of data required for training and can improve the accuracy of the model. Despite the effectiveness of CNNs for emotion classification, this approach still has challenges and limitations. For example, CNNs may be less effective at classifying subtle or nuanced emotions. Additionally, the accuracy of the model can be affected by variations in lighting conditions, facial occlusions, and individual differences in facial expressions. Overall, the emotion classification module using a CNN algorithm is a powerful approach for accurately classifying the listener’s current emotional state based on facial features. With continued research and development, this approach has the potential to significantly enhance the music listening experience through personalized song recommendations based on the user’s emotional state.
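A common way to realize the transfer learning idea described above is to take an ImageNet pre-trained backbone and attach a new classification head for the emotion classes. The Keras sketch below uses MobileNetV2 as one plausible choice; the backbone, the 96 × 96 input size, and the seven-class label set are assumptions, not the project’s confirmed configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_EMOTIONS = 7  # e.g. angry, disgust, fear, happy, sad, surprise, neutral (assumed)

# Pre-trained backbone with ImageNet weights, classification head removed.
base = tf.keras.applications.MobileNetV2(
    input_shape=(96, 96, 3), include_top=False, weights="imagenet")
base.trainable = False  # freeze the backbone; train only the new head first

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.3),
    layers.Dense(NUM_EMOTIONS, activation="softmax"),  # emotion probabilities
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_faces, train_labels, epochs=5)  # fine-tune on the face dataset
```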
Figure 5: Model predicting the angry emotion in a face.
Once the music classifier has been trained, it can be used to classify new music samples based on their audio features. When a user’s facial emotion is recognized by the facial emotion recognition system, the music classifier can recommend music that matches the user’s emotional state.
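A minimal sketch of such a music classifier is shown below, assuming librosa for feature extraction and scikit-learn for the model. The feature set (tempo plus mean MFCCs), the mood labels, and the file names are illustrative assumptions, not the project’s confirmed design.

```python
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier

def extract_features(audio_path):
    """Summarize a track as a small feature vector: tempo + mean MFCCs."""
    y, sr = librosa.load(audio_path, duration=30)       # first 30 s is enough
    tempo, _ = librosa.beat.beat_track(y=y, sr=sr)      # rhythm/tempo estimate
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # timbre descriptors
    return np.hstack([tempo, mfcc.mean(axis=1)])

# Hypothetical training data: file paths plus mood labels assigned in advance.
# paths, moods = ["a.mp3", "b.mp3"], ["happy", "sad"]
# X = np.array([extract_features(p) for p in paths])
# clf = RandomForestClassifier().fit(X, moods)
# clf.predict([extract_features("new_song.mp3")])  # mood for a new track
```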
CHAPTER-3
3 Proposed Algorithms
3.1 CNN Algorithm
A Convolutional Neural Network (CNN) is one of the main approaches to image classification and image recognition in neural networks. Scene labeling, object detection, and face recognition are some of the areas where convolutional neural networks are widely used. A CNN takes an image as input and classifies it under a certain category, such as dog, cat, lion, or tiger. The computer sees the image as an array of pixels whose size depends on the image resolution: h × w × d, where h is the height, w is the width, and d is the depth (number of channels). For example, a 6 × 6 RGB image is a 6 × 6 × 3 array, while a 4 × 4 grayscale image is a 4 × 4 × 1 array. In a CNN, each input image passes through a sequence of convolution layers with filters (also known as kernels), pooling layers, and fully connected layers. Finally, a softmax function is applied to classify the object with probabilistic values between 0 and 1.
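The sequence just described (convolution → pooling → fully connected → softmax) can be written down directly. The following Keras sketch is a minimal illustration for grayscale face images; the 48 × 48 input size and seven output classes are assumed rather than taken from the project.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Minimal CNN: conv -> pool -> conv -> pool -> flatten -> dense -> softmax.
model = models.Sequential([
    layers.Input(shape=(48, 48, 1)),               # h x w x d grayscale input (assumed size)
    layers.Conv2D(32, (3, 3), activation="relu"),  # learn local filters/kernels
    layers.MaxPooling2D((2, 2)),                   # downsample feature maps
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),          # fully connected layer
    layers.Dense(7, activation="softmax"),         # class probabilities in [0, 1]
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```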
Figure 8: How feature learning and classification work.
3.1.1 Convolution
Convolution is a mathematical operation that takes two inputs: an image matrix and a kernel (or filter). Consider a 5 × 5 image whose pixel values are 0 or 1, and a 3 × 3 filter matrix.
Figure 10: Matrix Multiplication.
Convolving the 5 × 5 image matrix with the 3 × 3 filter matrix produces an output called the “feature map”.
Convolving an image with different filters can perform operations such as blurring, sharpening, and edge detection.
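The feature-map computation can be reproduced in a few lines of NumPy. The sketch below convolves an example 5 × 5 binary image with a 3 × 3 filter (both matrices are made-up values) and prints the resulting feature map; the stride parameter is included for the next subsection.

```python
import numpy as np

# Example 5x5 image with pixel values 0/1 and a 3x3 filter (illustrative values).
image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]])
kernel = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]])

def convolve2d(img, k, stride=1):
    """Valid (no-padding) 2-D convolution: slide k over img, sum the products."""
    out_h = (img.shape[0] - k.shape[0]) // stride + 1
    out_w = (img.shape[1] - k.shape[1]) // stride + 1
    out = np.zeros((out_h, out_w), dtype=img.dtype)
    for i in range(out_h):
        for j in range(out_w):
            patch = img[i*stride:i*stride + k.shape[0],
                        j*stride:j*stride + k.shape[1]]
            out[i, j] = np.sum(patch * k)  # elementwise multiply, then sum
    return out

print(convolve2d(image, kernel))            # 3x3 feature map
print(convolve2d(image, kernel, stride=2))  # 2x2 map when the stride is 2
```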
3.1.2 Strides
Stride is the number of pixels by which the filter shifts over the input matrix. When the stride is 1, we move the filter one pixel at a time; when the stride is 2, we move the filter two pixels at a time (as in the NumPy sketch above). The following figure shows how convolution works with a stride of 2.
Figure 12: Striding.
3.1.3 Padding
Padding plays a crucial role in building a convolutional neural network. Without padding, the image shrinks after every convolution, so a network with hundreds of layers would end up producing a very small image. For example, convolving a grayscale image with a 3 × 3 filter reduces each spatial dimension by two pixels; adding a border of zero-valued pixels (padding) around the input preserves the original size.
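As a general note (standard convolution arithmetic, not specific to this project), the output size is determined by the input size n, filter size f, padding p, and stride s:

n_out = floor((n + 2p − f) / s) + 1

For example, a 6 × 6 input with a 3 × 3 filter, no padding (p = 0), and stride 1 yields a 4 × 4 output, while p = 1 (“same” padding for a 3 × 3 filter) keeps the output at 6 × 6.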
CHAPTER-4
4 System Requirements
• RAM: 2 GB
4.3.1 Python
Python is open-source software available under the GNU General Public License (GPL). The following features give a quick overview of the language:
1. Easy to learn: Python has few keywords, a simple structure, and a clearly defined syntax, which allows a student to pick up the language quickly.
2. Easy to read: Python code is clearly defined and visible to the eyes.
3. Easy to maintain: Python source code is fairly easy to maintain.
4. A broad standard library: the bulk of Python’s library is very portable and cross-platform compatible on UNIX, Windows, and Macintosh.
5. Interactive mode: Python supports an interactive mode that allows interactive testing and debugging of snippets of code.
6. Portable: Python can run on a wide variety of hardware platforms and has the same interface on all platforms.
7. Extendable: low-level modules can be added to the Python interpreter, enabling programmers to customize their tools to be more efficient.
8. Databases: Python provides interfaces to all major commercial databases.
9. GUI programming: Python supports GUI applications that can be created and ported to many system calls, libraries, and window systems, such as Windows MFC, Macintosh, and the X Window System of Unix.
10. Scalable: Python provides a better structure and support for large programs than shell scripting.
4.4 Feasibility Study
A feasibility study seeks to determine the resources required to provide an information systems solution, the costs and benefits of such a solution, and its overall feasibility. The goal of the feasibility study is to consider alternative information systems solutions, evaluate their feasibility, and propose the alternative most suitable to the organization. The feasibility of a proposed solution is evaluated in terms of its components.
CHAPTER-5
5 System Analysis
5.1 Purpose
The purpose of this document is to describe a real-time facial expression-based music recommender system built with machine learning algorithms. In detail, this document provides a general description of the project, including user requirements, product perspective, an overview of requirements, and general constraints. In addition, it provides the specific requirements and functionality needed for this project, such as the interface, functional requirements, and performance requirements.
5.2 Scope
The scope of this SRS document persists for the entire life cycle of the project. The document defines the final state of the software requirements agreed upon by the customers and designers. At the end of project execution, all functionality should be traceable from the SRS to the product. The document describes the functionality, performance, constraints, interface, and reliability for the entire life cycle of the project.
The proposed system plays songs that satisfy the mood of the user, helping the user play songs automatically according to their mood. The image of the user is captured by the web camera, and the images are saved. The images are first converted from RGB to binary format; this way of representing the data is called the feature-point detection method. This step can also be performed using the Haar cascade technology provided by OpenCV. The music player is developed using a Java program; it manages the database and plays the song according to the mood of the user.
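The RGB-to-binary conversion mentioned above can be sketched with OpenCV as a grayscale conversion followed by thresholding; the file names and the threshold value of 127 are illustrative assumptions.

```python
import cv2

img = cv2.imread("capture.jpg")               # frame saved from the webcam
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # RGB -> single-channel grayscale
# Pixels above 127 become white (255), the rest black (0): a binary image.
_, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
cv2.imwrite("capture_binary.png", binary)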
CHAPTER-6
6 System Designs
6.1 Input Design
The input design is the link between the information system and the user. It comprises the specifications and procedures for data preparation: the steps necessary to put transaction data into a usable form for processing, whether by having the computer read data from a written or printed document or by having people key the data directly into the system. The design of input focuses on controlling the amount of input required, controlling errors, avoiding delay, avoiding extra steps, and keeping the process simple. The input is designed to provide security and ease of use while retaining privacy. Input design considered the following things:
• Methods for preparing input validations and the steps to follow when errors occur.
6.2 Output Design
Efficient and intelligent output design improves the system’s relationship with the user and supports decision-making. The output of an information system should accomplish one or more of the following objectives:
• Convey information about past activities, current status, or projections of the future.
• Trigger an action.
• Confirm an action.
6.3 Data Flow Diagram
1. The data flow diagram (DFD) is one of the most important modeling tools. It is used to model the system components: the system process, the data used by the process, the external entities that interact with the system, and the information flows in the system.
2. A DFD shows how information moves through the system and how it is modified by a series of transformations. It is a graphical technique that depicts information flow and the transformations applied as data moves from input to output.
3. A DFD is also known as a bubble chart. It may be used to represent a system at any level of abstraction, and may be partitioned into levels that represent increasing information flow and functional detail.
6.4 UML Diagrams
UML stands for Unified Modeling Language. UML is a standardized general-purpose modeling language in the field of object-oriented software engineering. The standard is managed, and was created by, the Object Management Group. The goal is for UML to become a common language for creating models of object-oriented computer software. In its current form, UML comprises two major components: a meta-model and a notation. In the future, some form of method or process may also be added to, or associated with, UML. The Unified Modeling Language is a standard language for specifying, visualizing, constructing, and documenting the artifacts of software systems, as well as for business modeling and other non-software systems. UML represents a collection of best engineering practices that have proven successful in the modeling of large and complex systems. UML is a very important part of developing object-oriented software and the software development process, and it uses mostly graphical notations to express the design of software projects.
6.5 Use Case Diagram
A use case diagram in the Unified Modeling Language (UML) is a type of behavioral diagram defined by
and created from a Use-case analysis. Its purpose is to present a graphical overview of the functionality
provided by a system in terms of actors, their goals (represented as use cases), and any dependencies
between those use cases. The main purpose of a use case diagram is to show what system functions are
performed for which actor. Roles of the actors in the system can be depicted.
Figure 18: Activity Diagram
CHAPTER-7
7 Modules
• Data Collection Module
CHAPTER-8
8 System Implementation
8.1 System Architecture
Describing the overall features of the software involves defining the requirements and establishing the high-level architecture of the system. During architectural design, the various web pages and their interconnections are identified and designed. The major software components are identified and decomposed into processing modules and conceptual data structures, and the interconnections among the modules are identified. The following modules are identified in the proposed system.
CHAPTER-9
9 System Testing
9.1 Test Plan
Software testing is the process of evaluating a software item to detect differences between given input and expected output, and to assess the features of the software item. Testing assesses the quality of the product and should be carried out during the development process. In other words, software testing is a verification and validation process.
9.2 Verification
Verification is the process of making sure that the product satisfies the conditions imposed at the start of the development phase; in other words, that the product behaves the way we want it to.
9.3 Validation
Validation is the process of making sure that the product satisfies the specified requirements at the end of the development phase; in other words, that the product is built as per customer requirements. The types of testing considered for this system are listed below:
• Unit Testing
• Integration Testing
• Functional Testing
• System Testing
• Stress Testing
• Performance Testing
• Usability Testing
• Acceptance Testing
• Regression Testing
• Beta Testing
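As an example of how unit testing from the list above might look for this system, the following pytest sketch checks a toy emotion-to-playlist mapping; the function and its behaviour are hypothetical stand-ins for the real recommender.

```python
# test_recommender.py -- run with `pytest` (hypothetical module and names).
import pytest

EMOTIONS = {"happy", "sad", "angry", "neutral"}

def recommend_songs(emotion):
    """Toy stand-in for the real recommender: maps an emotion to a playlist."""
    library = {"happy": ["upbeat.mp3"], "sad": ["mellow.mp3"],
               "angry": ["calm.mp3"], "neutral": ["ambient.mp3"]}
    if emotion not in library:
        raise ValueError(f"unknown emotion: {emotion}")
    return library[emotion]

def test_every_emotion_has_a_playlist():
    for emotion in EMOTIONS:
        assert len(recommend_songs(emotion)) > 0

def test_unknown_emotion_is_rejected():
    with pytest.raises(ValueError):
        recommend_songs("confused")
```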
9.6 Requirement Analysis
Requirement analysis, also called requirements engineering, is the process of determining user expectations for a new or modified product. It encompasses the tasks that determine the need for analyzing, documenting, validating, and managing software or system requirements. The requirements should be documentable, actionable, measurable, testable, and traceable to identified business needs or opportunities, and defined to a level of detail sufficient for system design.
9.8 Usability
It specifies how easy the system must be to use. Queries can be asked in any format, short or long; the Porter stemming algorithm produces the desired response for the user.
9.8.1 Robustness
It refers to a program that performs well not only under ordinary conditions but also under unusual conditions. It is the ability of the system to cope with errors from irrelevant queries during execution.
9.8.2 Security
The state of providing protected access to resources is security. The system prevents unauthorized users from accessing it, thereby providing high security.
9.8.3 Reliability
It is the probability of how often the software fails. The measurement is often expressed as MTBF (Mean Time Between Failures). This requirement is needed to ensure that processes work correctly and completely without being aborted. The system can handle any load, survive failures, and is even capable of working around them.
9.8.4 Compatibility
It is supported by all recent versions of major web browsers. Using any web server, such as localhost, gives the system a real-time experience.
9.8.5 Flexibility
The flexibility of the project is provided in such a way that it has the ability to run in different environments and be executed by different users.
9.8.6 Safety
Safety is a measure taken to prevent trouble. Every query is processed in a secured manner without letting others know one’s personal information.
9.9 Non-Functional Requirements
9.9.1 Portability
It is the usability of the same software in different environments. The project can be run on any operating system.
9.9.2 Performance
These requirements determine the resources required, time interval, throughput and everything that
deals with the performance of the system.
9.9.3 Accuracy
The results of a query are very accurate, and information is retrieved at high speed. The degree of security provided by the system is high and effective.
9.9.4 Maintainability
Maintainability defines how easy it is to maintain the system: to analyse, change, and test the application. Maintenance of this project is simple, as further updates can be made easily without affecting its stability.
CHAPTER-10
10 Conclusion
• The development of the emotion-based music player represents a significant milestone in the
intersection of technology and human emotion. Throughout this project, we have delved into the
complexities of emotion recognition, music analysis, and user experience design to create a
unique platform that enhances the way we interact with music.
• By leveraging cutting-edge technologies such as machine learning and signal processing, we have
successfully engineered an intelligent system capable of interpreting users’ emotions and curating
personalized playlists accordingly. Through rigorous experimentation and iterative refinement,
we have validated the efficacy and accuracy of our emotion recognition algorithms, ensuring a
seamless and immersive user experience.
• Moreover, the integration of user feedback and usability testing has been integral to the iterative
design process, enabling us to tailor the music player to meet the diverse needs and preferences of
our users. By prioritizing user-centric design principles, we have ensured that the emotion-based
music player not only performs optimally but also resonates deeply with its users on an emotional
level.
• Looking ahead, the potential applications of this technology are vast and diverse. From enhancing
mood regulation and emotional well-being to revolutionizing the way we consume and engage
with music, the emotion-based music player opens up new avenues for exploration and
innovation. As technology continues to evolve and our understanding of human emotion deepens,
the possibilities for further advancements in this field are limitless.
• In conclusion, the development of the emotion-based music player represents a pioneering effort
to harness the power of technology to enhance our emotional experiences and enrich our lives
through music. As we continue to refine and expand upon this technology, we are poised to unlock
new realms of creativity, expression, and emotional connection through music.