
Proceedings of the 7th International Conference on Intelligent Computing and Control Systems (ICICCS-2023)

IEEE Xplore Part Number: CFP23K74-ART; ISBN: 979-8-3503-9725-3

A Novel Emotion based Music Recommendation System using CNN

2023 7th International Conference on Intelligent Computing and Control Systems (ICICCS) | 979-8-3503-9725-3/23/$31.00 ©2023 IEEE | DOI: 10.1109/ICICCS56967.2023.10142330

Tejaswini Priyanka, V Y Reshma Reddy, Dharani Vajja, G Ramesh, S Gomathy
Department of Computer Science and Engineering
Gokaraju Rangaraju Institute of Engineering and Technology, Hyderabad, India

Abstract— Music has a unique emotional connection with humans. It is a means of connecting individuals from all over the world. On the other hand, it is highly difficult to generalize music and claim that everyone would prefer and enjoy the same type. Emotion-based music selection is important because it can assist humans in reducing stress. Its major purpose is to accurately predict the user's emotions and play songs depending on the user's preferences. Using Human-Computer Interaction (HCI), the proposed bot recognizes human emotions from facial expressions. Another significant challenge is the extraction of facial features from the user's face. The proposed CNN algorithm is utilized in the proposed model to properly capture and recognize the user's face from the live webcam stream and to detect emotions based on facial features such as the lips and eyes. Also, an additional option is provided for people to make a choice manually.

Keywords— Face Recognition, Image Processing, Emotion Detection, Music, and Mood Detection.

I. INTRODUCTION

Music and emotion have a strong connection; each can be influenced by the other. A common way for people to express their emotions is through facial expressions. At the same time, certain music can change a person's emotional state. Emotion-based music recommendation is much needed, as it helps people relieve stress and listen to relaxing music that suits their current emotions. The primary objective of this research work is to capture users' emotions through facial expressions. Here, the music player intends to capture human emotions using the computer's webcam. The proposed application takes a picture of the user, and then image processing techniques extract the features of the face and attempt to recognize the emotion that the person is expressing. [1]

The goal of researchers is to equip computers with remarkable perceptual skills, enabling them to collaborate with humans as close companions. This involves expanding the range of abilities of computers to facilitate human-like interaction, such as identifying human traits, speaking, listening, and even inferring human emotions. By using sophisticated video cameras and microphones, researchers are implementing a non-invasive method of detecting user behavior through enhanced sensory abilities. This method enables computers to comprehend a user's intentions, gaze direction, and even perceive their physical or emotional state. The ultimate objective of the emotion recognition system is to accurately recognize human emotions. [2]

When users have hundreds of songs, it becomes challenging to manually create and organize playlists. Additionally, it can be hard to keep track of all the songs, and unused ones may unnecessarily occupy device memory, necessitating the user to remove them manually. Users also have to choose songs manually based on their preferences and mood, and rearranging and playing music can be difficult if the play style varies. To address these issues, the project has incorporated machine learning techniques, involving facial scans and feature tracking [11], to determine the user's emotion and offer a personalized playlist based on it. The emotion recognition module plays a vital role in identifying the user's emotions and provides entertainment in the form of emotion-based music. The application comprises three primary modules: questionnaire, mood detection, and music recommendation.

II. LITERATURE SURVEY

Charles Darwin, an eminent scientist, recognized facial expressions as an indicator of human emotions, intentions, and ideas. Rosalind Picard presented the importance of emotions to the computing community in 1997. Affective computing has two components: one to enable computers to recognize emotions, and another to allow them to categorize those emotions. Facial emotions are important in the decision-making process, and emotion recognition is an important step towards developing an adaptable computer system.

Efforts have been directed towards developing a smart,

adaptable computer system capable of detecting a user's emotional state. Incorporating emotions into computers can also increase computer users' productivity. Dryer and Horowitz (1997) revealed that people with similar or complementary personalities collaborate more effectively. Furthermore, Dryer (1999) demonstrated that people perceive their computers as having personalities. Therefore, it is vital to develop computers that can work well with their users.

In 2010, Renu Nagpal, Pooja Nagpal, and Sumeet Kaur introduced a novel method for emotion detection in heavily corrupted noisy environments by combining Mutation Bacteria Foraging Optimization (MBFO) and an Adaptive Median Filter (AMF) in a cascading manner. The approach involved eliminating noise from the image using MBFO and AMF, followed by identifying local, global, and statistical features from the image. The researchers discovered that the proposed technique was suitable for identifying emotions in the presence of salt-and-pepper noise levels as high as 90%. Future research may entail utilizing the same technique to detect emotions in the presence of other forms of noise.

III. PROPOSED MODEL

The proposed MRS (Music Recommendation System) application incorporates an emotion detection module as its primary feature. The emotion detection module is crucial in identifying the user's emotions, which in turn provides entertainment in the form of music according to their mood. The application comprises three main modules: questionnaire, mood detection, and music recommendation. Upon opening the application, the user is presented with two options: selecting a mood from a questionnaire and clicking on the recommend button to send the message, or clicking on the "detect emotion" button, which triggers the chatbot application to start the emotion detection process.

Fig. 2. The seven basic emotions

IV. IMPLEMENTATION

Implementation steps:
1. Open Anaconda.
2. Select the project environment.
3. Launch Spyder.
4. Run app.py.
5. Go to Microsoft Edge and type localhost:5000.
6. Click sign up and register yourself.
7. Using the registered email, the user can sign in and fill out the form on the next page.
8. Click the capture button (sit straight and keep your face in the middle; otherwise it will show "Please look steadily").
9. The maximum emotion captured will be shown on the next webpage; if it is wrong, a questionnaire is also available to determine the emotional state.
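The mood detection and music recommendation modules reduce to two small steps once a per-frame emotion label is available: take the "maximum" (most frequent) emotion over the captured frames, then look up a playlist for it. A minimal sketch of that idea follows; the helper names, labels, and playlist titles are our own placeholders, not from the paper:

```python
from collections import Counter

# Hypothetical mood-to-playlist mapping; the paper plays Spotify playlists
# per detected emotion but does not list the actual playlist names.
PLAYLISTS = {
    "happy": "Upbeat Hits",
    "sad": "Comfort Songs",
    "surprised": "Fresh Finds",
    "neutral": "Daily Mix",
}

def maximum_emotion(frame_labels):
    """Return the emotion seen most often across captured webcam frames
    (the 'maximum emotion'); ties go to the label seen first."""
    return Counter(frame_labels).most_common(1)[0][0] if frame_labels else None

def recommend(emotion, fallback="Daily Mix"):
    """Map the detected emotion to a playlist, with a safe default."""
    return PLAYLISTS.get(emotion, fallback)

frames = ["neutral", "happy", "happy", "surprised", "happy"]
mood = maximum_emotion(frames)      # "happy" wins 3 of 5 frames
print(mood, "->", recommend(mood))  # happy -> Upbeat Hits
```

The fallback playlist covers moods the mapping does not know, mirroring the questionnaire safety net in step 9.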

METHODOLOGY
A. OPENCV [12]
OpenCV is a large open-source image processing library. In OpenCV, "CV" is short for Computer Vision, a computational field that first surfaced in the 1950s, when neural networks were used to locate the edges of objects, subsequently progressing to handwritten text, speech, and language. OpenCV currently includes datasets and pretrained models for face recognition. Computer vision is the field of study that helps computer systems interpret photographs and video recordings. OpenCV supports Python, C++, Java, and other programming languages. [7]
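As a sketch of how the face-capture step could look with OpenCV's bundled Haar cascade (the paper does not show its code; the function names and the largest-face heuristic here are our assumptions):

```python
def largest_face(boxes):
    """Pick the largest (x, y, w, h) detection, assumed to be the user's face."""
    return max(boxes, key=lambda b: b[2] * b[3]) if boxes else None

def capture_face():
    """Grab one webcam frame and return the largest detected face box.

    cv2 is imported lazily so the geometric helper above stays usable
    even where OpenCV is not installed.
    """
    import cv2
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    cam = cv2.VideoCapture(0)          # live webcam stream
    ok, frame = cam.read()
    cam.release()
    if not ok:
        return None
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return largest_face([tuple(f) for f in faces])
```

Selecting the largest box is one simple way to enforce the "put your face in the middle" instruction from the implementation steps: the nearest, most central face dominates the frame.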
B. CNN (CONVOLUTIONAL NEURAL NETWORK) [13, 14, 15]

Convolutional Neural Networks (CNNs), an advanced form of Artificial Neural Networks (ANNs), can be trained to produce the desired results. The CNN model used here recognizes the user's entire face, takes it as input, and eliminates the noise/error from the deeper facial patterns. Based on the estimated values, exact predictions are made.

Fig. 1. Flowchart of the proposed system [16]
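The CNN's final classification stage is essentially a dense layer followed by softmax over the seven basic emotions (Fig. 2). A plain-Python sketch, with random weights standing in for a trained model:

```python
import math
import random

EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(features, weights, biases):
    """Dense layer + softmax: score each emotion, return the most probable."""
    scores = [sum(w * f for w, f in zip(ws, features)) + b
              for ws, b in zip(weights, biases)]
    probs = softmax(scores)
    return EMOTIONS[probs.index(max(probs))]

random.seed(0)
features = [0.2, 0.7, 0.1, 0.9]                    # stand-in pooled CNN features
weights = [[random.uniform(-1, 1) for _ in features] for _ in EMOTIONS]
biases = [0.0] * len(EMOTIONS)
print(classify(features, weights, biases))         # one of the seven labels
```

In a trained model the weights come from backpropagation; the point here is only the shape of the classification head.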

593
Authorized licensed use limited to: Amrita School of Engineering. Downloaded on September 04,2023 at 04:00:33 UTC from IEEE Xplore. Restrictions apply.
The next phase includes fine-tuning the analysis of face locations. At each of the three layers, different convolutional layers are combined to improve accuracy and efficiency. Before collecting information and developing predictions for images it has never seen, the model is trained with a huge number of photos. [5] A CNN is divided into two sections:

First, hidden layers and feature extraction: in this segment, the network performs a series of convolutions and pooling operations to recognize the facial features.

Second, classification: the obtained features are highlighted and classified.

C. MAX POOLING
Max pooling is the technique of selecting the maximum element from the feature map. As a result, the max pooling layer generates a feature map containing the most prominent features of the previous feature map. The input image should be 28 × 28 pixels in size. When zooming into a receptive field of 5 × 5, a few features in this image will be recorded. Max pooling is recommended when discarding a few features is acceptable. Since the images will have edges and gradients, max pooling cannot be applied in the early stages of a convolutional neural network. [6]

D. CONVOLUTION
Convolution is a numerical method that permits two sets of data to be combined. In a CNN, convolution is applied to the input to filter the data and assemble a feature map. To perform convolution, the kernel iteratively traverses the input image, performing matrix multiplication element by element. If the CNN has many layers, the training stage will consume more processing time. To train the neural network properly, convolution requires a huge dataset.

E. EMOTION RECOGNITION
Emotion recognition refers to the task of identifying human emotions. Individuals' ability to recognize the emotions of others varies widely. Emotion recognition is one of several facial-analysis tasks. Face recognition software is employed to allow adaptation in order to explore and deal with a human's face. AI recognizes various appearances in order to connect them with more information. This may be used for a variety of purposes and allows professionals to recognize an individual's sentiments. [9]

F. FACIAL EXTRACTION
Feature extraction is the most important stage in pattern recognition and data mining. Using specific rules, the significant feature subset is extracted from the initial data at this step. Since not all extracted features contribute significantly to the structure, it is preferable to eliminate the unsuitable parts of the feature space [10]. It has also been found that only a few features are generally required and selected. As a result, a massive amount of data can be reduced so that the system works computationally faster on a relatively smaller set. Thus, trained feature selection is a critical step towards achieving efficient face recognition and biometric verification. The primary aim of feature extraction is to reduce machine training and computational complexity in order to achieve dimensionality reduction [4].

G. ARTIFICIAL INTELLIGENCE
Artificial Intelligence (AI) mimics human understanding in computers, allowing them to think like people and replicate their behaviours when programmed to do so. The features of AI mimic features of the human brain, such as reasoning and problem-solving. Although multidisciplinary research covers a variety of opinions, advances in AI are prompting a paradigm shift in almost all fields.

The figure below shows the structure of the CNN.
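The convolution and max pooling operations of subsections C and D can be sketched in plain Python at toy sizes (a real model would use a deep learning framework, and these helper names are ours):

```python
def conv2d(image, kernel):
    """Slide the kernel across the image (stride 1, no padding), doing
    element-by-element multiplication and summing to build a feature map."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(len(image[0]) - kw + 1)]
            for i in range(len(image) - kh + 1)]

def max_pool(fmap, size=2):
    """Keep only the maximum value in each size-by-size window (stride = size)."""
    return [[max(fmap[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]

# A 2x2 vertical-edge kernel on a 4x4 image with a dark-to-bright boundary:
image = [[0, 0, 9, 9],
         [0, 0, 9, 9],
         [0, 0, 9, 9],
         [0, 0, 9, 9]]
kernel = [[-1, 1],
          [-1, 1]]
fmap = conv2d(image, kernel)
print(fmap)            # [[0, 18, 0], [0, 18, 0], [0, 18, 0]] - edge lights up
print(max_pool(fmap))  # [[18]] - pooling keeps the strongest response
```

The feature map responds only where the dark-to-bright edge sits, and pooling condenses that response, which is exactly the feature-extraction behaviour the text describes.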

Fig. 3. CNN structure

In the proposed model, the CNN algorithm has been used for accurately detecting the user's face from the live webcam feed and for detecting the emotion being expressed by the user from the facial features. Traditionally, the Haar Cascade algorithm is used to detect emotions from facial features extracted from the user, but research suggests that a CNN can perform both detection and extraction of facial features using concepts like convolution and max pooling. Hence, a CNN has been used.

V. RESULT

Fig. 4. "Neutral" mood

Fig. 4 shows the detection of emotion through the webcam

when the user clicks capture and recommend. Users can confirm that the emotion was correctly detected, or else recapture it. The neutral emotion is detected when there is no change in the facial features.

Fig. 5. "Surprised" mood

In Fig. 5, the "Surprised" emotion is detected based on a few changes in facial features such as the eyebrows, mouth, and nose. For the respective moods detected, a list of songs from Spotify playlists is played in the Spotify app.

A. Metrics

● Accuracy

Accuracy = (TP + TN) / (TP + FP + TN + FN)

When the "detect emotion" option is selected, songs appropriate for the user's emotion are played with an accuracy of approximately 96%. The accuracy of the proposed CNN algorithm is 97.21%.

● Loss

Loss = 0.0215

● F1

The F1 score is the harmonic mean of precision and recall; it is used when Type 1 errors should be avoided more than Type 2 errors.

Precision = TP / (TP + FP) = 100
Recall = TP / (TP + FN) = 97.18
F1 Score = 2 × Precision × Recall / (Precision + Recall) = 98.57

VI. CONCLUSION AND FUTURE SCOPE

This study has successfully developed an automatic facial expression recognition system for the purpose of creating an emotion-based music player. Facial emotion analysis has been extensively researched and applied, beginning with psychological research. Manual face analysis previously employed by psychologists has now been replaced by suitable computer software. Various image processing algorithms have been developed to meet the demands of the facial emotion recognition system. This project not only covers the theoretical foundation, but also provides a framework for designing and implementing an emotion-based music player. The proposed system processes videos, extracts facial expressions, recognizes basic emotions, and displays a list of songs based on the emotions. Simple methods were used to develop the proposed music emotion recognition system. It extracts a person's facial expressions, such as happiness, anger, surprise, and neutrality, to offer music suitable for the individual's emotion. Although the system cannot handle major head rotations and obstructions, it allows head movements. Future work will be dedicated to improving the recognition rate of the system.

ACKNOWLEDGMENT

We are very grateful to our project guide Mr. G Ramesh for his extensive patience and guidance throughout our project work. We express our immense gratitude to all who have directly or indirectly contributed to our needs at the right time for the development and success of our project work.

REFERENCES

[1] Dr. Vishwanath Kharad, "Research Paper on the Chatbot Development for Educational Institute," Social Science Research Network (SSRN), 2020.
[2] Abhigya Verma, Chandana Kuntala, Pragya Khatri, Sristi, Sukhmani Kaur, A. K. Mohapatra, Shweta Singhal, "University Chatbot System using NLP," Social Science Research Network (SSRN), 2022.
[3] Payal Jain, "College Enquiry Chat Bot using Iterative Model," International Journal of Scientific Engineering and Research (IJSER), 2019.
[4] Harshala Gaude, Vedika Patil, Prachi Vishe, Sonali Kolpe, "College Enquiry Chat-Bot System," International Journal of Engineering Research & Technology (IJERT), 2020.
[5] Saksham Saraswat, Siddhartha Mishra, Vikas Mani, Shristi Priya, "GALGOBOT – The College Companion Chatbot," International Conference on Intelligent Computing and Control Systems (ICICCS), 2021.
[6] Neelkumar P. Patel, Devangi R. Parikh, Darshan A. Patel, Ronak R. Patel, "AI and Web-Based Human-Like Interactive University Chatbot (UNIBOT)," International Conference on Electronics, Communication, and Aerospace Technology (ICECA), 2019.
[7] Udhayakumar Shanmugam, Sowjanya Mani, Sneha Sivakumar, Rajeswari, "Human-Computer text conversation through NLP in Tamil using Intent Recognition," International Conference on Vision Towards Emerging Trends in Communication and Networking (ViTECoN), 2019.
[8] Kumar, S.K., Reddy, P.D.K., Ramesh, G., Maddumala, V.R., "Image transformation technique using steganography methods using LWT technique," Traitement du Signal, Vol. 36, No. 3, pp. 233-237, 2019. https://doi.org/10.18280/ts.360305
[9] Y. Sara, J. Dumne, A. Reddy Musku, D. Devarapaga and R. Gajula, "A Deep Learning Facial Expression Recognition based Scoring System for Restaurants," 2022 International Conference on Applied Artificial Intelligence and Computing (ICAAIC), Salem, India, 2022, pp. 630-634, doi: 10.1109/ICAAIC53929.2022.9793219.
[10] Somasekar, J., Ramesh, G., "Beneficial Image Preprocessing by Contrast Enhancement Technique for SEM Images," IJEMS, Vol. 29(6), December 2022, NIScPR-CSIR, India.
[11] G. Ramesh et al., "Feature Selection Based Supervised Learning Method for Network Intrusion Detection," International Journal of Recent Technology and Engineering (IJRTE), ISSN: 2277-3878, Volume-8, Issue-1, May 2019.
[12] Ramesh, G., Anugu, A., Madhavi, K., Surekha, P., "Automated Identification and Classification of Blur Images, Duplicate Images Using Open CV," in Luhach, A.K., Jat, D.S., Bin Ghazali, K.H., Gao, X.Z., Lingras, P. (eds), Advanced Informatics for Computing Research, ICAICR 2020, Communications in Computer and Information Science, vol 1393, Springer, Singapore, 2021. https://doi.org/10.1007/978-981-16-3660-8_52

[13] Rahul Chauhan, Kamal Kumar Ghanshala, R. C. Joshi, "Convolutional Neural Network (CNN) for Image Detection and Recognition," (ICSCCC 2018).
[14] Sivakumar Depuru, Anjana Nandam, S. Sivanantham, K. Amala, V. Akshaya, M. Saktivel, "Convolutional Neural Network based Human Emotion Recognition System: A Deep Learning Approach," 2022 Smart Technologies, Communication and Robotics (STCR), pp. 1-4, 2022.
[15] Natarajan, V.A., Kumar, M.S., Patan, R., Kallam, S., Mohamed, M.Y.N., "Segmentation of Nuclei in Histopathology Images using Fully Convolutional Deep Neural Architecture," 2020 International Conference on Computing and Information Technology (ICCIT 1441), pp. 1-7, IEEE, September 2020.
[16] Roy, Dharmendra, Anjali CH, G. Kavya Sri, B. Tharun, K. Venu Gopal, "Music Recommendation Based on Current Mood Using AI & ML," 2023.

