0% found this document useful (0 votes)
108 views

Final Year Project Document

The document discusses content-based recommendation systems and their challenges. It focuses on content-based filtering which generates recommendations based on item attributes rather than user behavior. Content-based filtering aims to address challenges like cold start problem, data sparsity, and accuracy by analyzing item and user profiles to find similarities and make personalized recommendations.

Uploaded by

mrsingle588
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
108 views

Final Year Project Document

The document discusses content-based recommendation systems and their challenges. It focuses on content-based filtering which generates recommendations based on item attributes rather than user behavior. Content-based filtering aims to address challenges like cold start problem, data sparsity, and accuracy by analyzing item and user profiles to find similarities and make personalized recommendations.

Uploaded by

mrsingle588
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 82

CONTENT BASED

RECOMMENDATION SYSTEM

Submitted in partial fulfillment of the

requirement for the award of the degree

of Bachelor of Computer Science

by

Thanush Raj. U (RA2031005020132)


Ragothaman. S.K (RA2031005020125)
Prem Mohan. S (RA2031005020122)

Under the guidance of


Dr(Mrs). J. Jebamalar Tamilselvi
Associate Professor

DEPARTMENT OF COMPUTER SCIENCE & APPLICATIONS -BSc(CS)


FACULTY OF SCIENCE AND HUMANITIES
SRM INSTITUTE OF SCIENCE AND TECHNOLOGY
Ramapuram, Chennai.
April 2024
SRM Institute of Science and Technology

Ramapuram, Campus.

Faculty of Science and Humanities


Department of Computer Science and Applications (B.Sc-CS)

BONAFIDE CERTIFICATE

Certified that this project report titled “CONTENT BASED RECOMMENDATION


SYSTEM” is the bonafide work of THANUSH RAJ. U (RA2031005020132), RAGOTHAMAN.
S.K (RA2031005020125) and PREM MOHAN.S(RA20310050200122) who carried out the project
work done under my supervision during the academic year 2023-2024- Semester VI.

Course Code: UCS20D10L

Course Name: Project Work

Certified further, that to the best of my knowledge the work reported here in does not
form part of any other project report on the basis of which a degree or award was conferred
on an earlier occasion on this or any other candidate.

Signature of Internal Guide Signature of Head of the Department

Signature of External Examiner


ACKNOWLEDGEMENT

I extend my sincere gratitude to the Chancellor Dr. T.R. PACHAMUTHU


and to Chairman Dr. R. SHIVAKUMAR of SRM Institute of Science and
Technology, Ramapuram, Chennai for providing me the opportunity to pursue the
BSc (CS) degree at this University.

I express my sincere gratitude to Maj. Dr. M. VENKATRAMANAN,


Dean (S&H), SRM IST, Ramapuram for his support and encouragement for the
successful completion of the project.

I record my sincere thanks to Dr. J. DHILIPAN, Head (MCA) & Vice


Principal (Admin, S&H), SRM IST, Ramapuram for his continuous support and
keen interest to make this project a successful one.

I record my sincere thanks to Dr. V. SARAVANAN, Vice Principal


(Academic, S&H) & Head (B.Sc.(CS), SRM IST, Ramapuram for his
encouragement and keen interest to make this project a successful one.

I find no word to express profound gratitude to my guide


Dr(Mrs).J.JEBAMALAR TAMILSELVI Associate Professor, Department of
Computer Science and Applications-BSc(CS), SRM IST, Ramapuram for his kind
co-operation and encouragement which helped us for the completion of this
project.

I thank the almighty who has made this possible. Finally, I thank my
beloved family member and friend for their motivation, encouragement and
cooperation in all aspect which led me to the completion of this project.

THANUSH RAJ. U – RA2031005020132


RAGOTHAMAN. S.K – RA2031005020125
PREM MOHAN. S - RA2031005020122
RECOMMENDER SYSTEMS

ABSTRACT

Recommender systems employ various data mining techniques and algorithms to discern
user preferences from a vast array of available items. Unlike static systems, recommender
systems foster increased interaction to offer a more enriched experience. By analysing past
purchases, searches, and other users' behaviour, these systems can autonomously identify
recommendations for individual users. This technique leverages user history data, as well as
other users' data, to predict preferred items and make personalised recommendations. This
project paper focuses on the challenges faced by recommender systems, such as the cold
start problem, data sparsity, scalability, and accuracy. Specifically, it delves into content-
based filtering, which generates recommendations based on a user's behaviour. Similar to
collaborative filtering, content-based filtering relies on long-term user preference profiles
that can be updated to enhance performance.

I
TABLE OF CONTENTS

S.NO TITLE PAGE.NO.

ABSTRACT I
ACKNOWLEDGEMENT II
LIST OF FIGURES III
LIST OF TABLES IV

CHAPTERS TITLE PAGE.NO

1 INTRODUCTION:
2
1.1 PROJECT INTRODUCTION

WORKING ENVIRONMENT:

1. SOFTWARE REQUIREMENTS 5
2
2. HARDWARE REQUIREMENTS 5
3. SOFTWARE DESCRIPTION 6

SYTEM ANALYSIS:

1. EXISTING SYSTEM 16
3 2. DRAWBACKS OF EXISTING SYSTEM 16
3. PROPOSED SYSTEM 17
4. ADVANTAGES OF PROPOSED SYSTEM 19

II
SYSTEM DESIGN:
1. UML DIAGRAM 22
4 2. USE CASE DIAGRAM 23
3. CLASS DIAGRAM 24
4. GOALS 25

MOUDLES DESCRIPTION:
1. RECOMMENDER SYSTEM 27
5 29
2. COLLABORATIVE FILTERING
3. CONTENT-BASED FILTERING 37
4. REGULARISATION TERM 41
5. RETRIEVAL & RANKING 44

SYSTEM ARCHITECTURE:
6 1. FUNCTIONAL REQUIREMENTS 49
2. NON-FUNCTIONAL REQUIREMENTS 51

CODING:
7 1. CODING 53
2. EVALUATION OF RESULTS 66

CONCLUSION
8
1. FUTURE ENHANCEMENTS 69

9 BIBLIOGRAPHY & REFERENCES 72


LIST OF FIGURES

FIGURE NO. FIGURE NAME PAGE NO.

2.1 Tensor-Flow Documentation 10

2.2 Tensor-Flow Installation 11

2.3 NumPy 14

4.1 Use Case Diagram 26

4.2 Class Diagram 27

5.1 Recommendation System 30

5.2 User Rating 32

5.3 Collaborated Rating 33

Flow Chart for Content Based


5.4 Algorithm 41

5.5 Prediction 42

5.6 Several Optimisation Algorithm 46


Efficiency

6.1 System Architecture 52

7.1 Prediction for New User 69

7.2 Prediction for Existing User 70

III
LIST OF ABBREVIATIONS

TF - Tensor Flow

CBF - Content Based Filtering

CF - Collaborative Filtering

UML - Unified Modelling Language

DL - Deep Learning

SK - Sci-Kit Learn

OPT - Optimizer

NN - Neural Network

IV
CHAPTER 1

1
1. INTRODUCTION

1.1 PROJECT INTRODUCTION

Recommender systems have become an essential tool in many online


platforms, providing personalised recommendations to users to improve their
experience and increase engagement. These systems utilize various data mining
techniques and algorithms to analyse user behaviour, such as past purchases,
searches, and interaction, to discern their preferences and make personalised
recommendations. In recent years, recommender systems have evolved from
static systems to more dynamic ones that foster increased interaction to offer a
more enriched experience for users.

One of the key challenges faced by recommender systems is the cold start
problem, where new users or items have limited or no historical data available for
recommendation. This poses a significant challenge as the system lacks sufficient
information to accurately predict user preferences and provide relevant
recommendations. Another challenge is data sparsity, where the available data for
recommendation is sparse, making it difficult to identify patterns and make
accurate predictions. Scalability is also a challenge, as recommender systems
need to handle large amounts of data and provide real-time recommendations to a
large number of users. Lastly, accuracy is crucial for recommender systems to
gain user trust and deliver relevant recommendations consistently.

One popular approach to address these challenges is Content-based filtering,


which generates recommendations based on the characteristics or attributes of
items, rather than relying solely on user behaviour or Collaborative filtering.
Content-based filtering leverages the inherent features of items, such as genre,
director, actors, keywords, or other metadata, to create item profiles. These item
profiles are then used to match with user profiles, which are created based on the
user's historical behaviour or preferences. By analysing the similarity between

2
item and user profiles, content-based filtering can make personalised
recommendations even for new users or items with limited historical data.

The purpose of this project paper is to explore the challenges faced by


recommender systems, with a focus on content-based filtering as a technique to
address these challenges. The paper will delve into the various aspects of content-
based filtering, including its strengths and limitations, and how it can be used to
improve the performance of recommender systems. The paper will also discuss
the different approaches and algorithms used in content-based filtering, as well as
the latest advancements and trends in this field.

In conclusion, recommender systems play a crucial role in providing


personalised recommendations to users in online platforms. However, they face
challenges such as the cold start problem, data sparsity, scalability, and accuracy.
Content-based filtering is a popular approach to address these challenges, as it
leverages the characteristics or attributes of items to create item profiles, which
are then used to make personalised recommendations. This project paper aims to
provide an in-depth analysis of content-based filtering, including its definition,
strengths, limitations, approaches, algorithms, advancements, and real-world
applications. By understanding the challenges and opportunities of content-based
filtering, researchers and practitioners can enhance the performance of
recommender systems and provide better recommendations to users.

3
CHAPTER 2

4
2. WORKING ENVIRONMENT

2.1 SOFTWARE REQUIREMENTS


Operating system : Windows 10, Mac-OS V.10 or higher.

Programming Language : Python

ML Libraries : NumPy, Pandas, SK-Learn, TensorFlow

Database : SQLite supported by SQL_Alchemy

Web Framework : Python-Flask

Web Template : HTML, CSS, Jinja Template.

2.2 HARDWARE REQUIREMENTS


System : Pentium IV 2.4 GHz.

Hard Disk : 256 GB or Higher.

RAM : 4GB or Higher.

5
2.3 SOFTWARE DESCRIPTION

PYTHON

Python is an interpreter, high-level and general-purpose programming


language. Python's design philosophy emphasises code readability with its
notable use of significant indentation. Its language constructs and object-oriented
approach aim to help programmers write clear, logical code for small and large-
scale projects. Python is dynamically typed and garbage-collected. It supports
multiple programming paradigms, including structured (particularly, procedural),
object-oriented and functional programming. Python is often described as a
"batteries included" language due to its comprehensive standard library. Python is
Interpreted − the interpreter processes Python at runtime. You do not need to
compile your program before executing it. This is similar to PERL and PHP.
Python is Interactive − You can actually sit at a Python prompt and interact with
the interpreter directly to write your programs. Python is Object-Oriented −
Python supports Object-Oriented style or technique of programming that
encapsulates code within objects. Python is a Beginner's Language − Python is a
great language for the beginner-level programmers and supports the development
of a wide range of applications from simple text processing to WWW browsers to
games.

TENSOR-FLOW INTRODUCTION

Tensor Flow is a software library or framework, designed by the Google


team to implement machine learning and deep learning concepts in the easiest
manner. It combines the computational algebra of optimisation techniques for
easy calculation of many mathematical expressions. The official website of
TensorFlow is mentioned below: https://www.tensorflow.org/

6
Fig 2.1 Tensorflow Documentation

Let us now consider the following important features of TensorFlow:


 It includes a feature of that defines, optimises and calculates mathematical
expressions easily with the help of multi-dimensional arrays called tensors.
 It includes a programming support of deep neural networks and machine
learning techniques.
 It includes a high scalable feature of computation with various data sets.
 TensorFlow uses GPU computing, automating management. It also includes a
unique feature of optimisation of same memory and the data used.

Why is TensorFlow So Popular?

TensorFlow is well-documented and includes plenty of machine learning


libraries. It offers a few important functionalities and methods for the same.
TensorFlow is also called a “Google” product. It includes a variety of machine
learning and deep learning algorithms. TensorFlow can train and run deep neural
networks for handwritten digit classification, image recognition, word embedding
and creation of various sequence models.

7
TensorFlow — Installation

To install TensorFlow, it is important to have “Python” installed in your


system. Python version 3.4+ is considered the best to start with TensorFlow
installation. Consider the following steps to install TensorFlow in Windows
operating system.
pip install tensorflow

Fig 2.2 Tensorflow Installation

KERAS
INTRODUCTION
Deep learning is one of the major subfield of machine learning framework.
Machine learning is the study of design of algorithms, inspired from the model of
human brain. Deep learning is becoming more popular in data science fields like
robotics, artificial intelligence(AI), audio & video recognition and image
recognition. Artificial neural network is the core of deep learning methodologies.
Deep learning is supported by various libraries such as Theano, TensorFlow,
Caffe, Mxnet etc., Keras is one of the most powerful and easy to use python
library, which is built on top of popular deep learning libraries like TensorFlow,
Theano, etc., for creating deep learning models.

8
FEATURES
Keras leverages various optimisation techniques to make high level neural
network API easier and more performant. It supports the following features:
 Consistent, simple and extensible API.
 Minimal structure - easy to achieve the result without any frills.
 It supports multiple platforms and backends.
 It is user friendly framework which runs on both CPU and GPU.
 Highly scalability of computation.

BENEFITS
Keras is highly powerful and dynamic framework and comes up with the
following advantages:
 Larger community support.
 Easy to test.
 Keras neural networks are written in Python which makes things simpler.
 Keras supports both convolution and recurrent networks.
• Deep learning models are discrete components, so that, you can combine into
many ways.

KERAS ― OVERVIEW OF DEEP LEARNING


Deep learning is an evolving subfield of machine learning. Deep learning
involves analysing the input in layer by layer manner, where each layer
progressively extracts higher level information about the input. Let us take a
simple scenario of analysing an image. Let us assume that your input image is
divided up into a rectangular grid of pixels. Now, the first layer abstracts the
pixels. The second layer understands the edges in the image. The Next layer
constructs nodes from the edges. Then, the next would find branches from the
nodes. Finally, the output layer will detect the full object. Here, the feature
extraction process goes from the output of one layer into the input of the next

9
subsequent layer. By using this approach, we can process huge amount of
features, which makes deep learning a very powerful tool. Deep learning
algorithms are also useful for the analysis of unstructured data. Let us go through
the basics of deep learning in this chapter.

Artificial Neural Networks


The most popular and primary approach of deep learning is using “Artificial
neural network” (ANN). They are inspired from the model of human brain, which
is the most complex organ of our body. The human brain is made up of more than
90 billion tiny cells called “Neurons”. Neurons are inter-connected through nerve
fiber called “axons” and “Dendrites”. The main role of axon is to transmit
information from one neuron to another to which it is connected.
Similarly, the main role of dendrites is to receive the information being
transmitted by the axons of another neuron to which it is connected. Each neuron
processes a small information and then passes the result to another neuron and
this process continues. This is the basic method used by our human brain to
process huge about of information like speech, visual, etc., and extract useful
information from it.
Based on this model, the first Artificial Neural Network (ANN) was
invented by psychologist Frank Rosenblatt, in the year of 1958. ANNs are made
up of multiple nodes which is similar to neurons. Nodes are tightly interconnected
and organised into different hidden layers. The input layer receives the input data
and the data goes through one or more hidden layers sequentially and finally the
output layer predict something useful about the input data. For example, the input
may be an image and the output may be the thing identified in the image, say a
“Cat”.
A single neuron (called as perceptron in ANN) can be represented as below:
 Multiple input along with weight represents dendrites.
 Sum of input along with activation function represents neurons. Sum actually
means computed value of all inputs and activation function represent a function,
which modify the Sum value into 0, 1 or 0 to 1.

10
• Actual output represent axon and the output will be received by neuron in next
layer. Let us understand different types of artificial neural networks in this
section.

PYTHON NUMPY

Our Python NumPy Tutorial provides the basic and advanced concepts of
the NumPy. Our NumPy tutorial is designed for beginners and professionals.

NumPy stands for numeric python which is a python package for the computation
and processing of the multidimensional and single dimensional array elements.

Fig 2.3 NumPy

What is NumPy?

NumPy stands for numeric python which is a python package for the
computation and processing of the multidimensional and single dimensional array
elements. Travis Oliphant created NumPy package in 2005 by injecting the
features of the ancestor module Numeric into another module Numarray.

It is an extension module of Python which is mostly written in C. It provides


various functions which are capable of performing the numeric computations with
a high speed.

NumPy provides various powerful data structures, implementing multi-


dimensional arrays and matrices. These data structures are used for the optimal
computations regarding arrays and matrices.

In this tutorial, we will go through the numeric python library NumPy.

11
Need of NumPy

With the revolution of data science, data analysis libraries like NumPy,
SciPy, Pandas, etc. have seen a lot of growth. With a much easier syntax than
other programming languages, python is the first choice language for the data
scientist.

NumPy provides a convenient and efficient way to handle the vast amount
of data. NumPy is also very convenient with Matrix multiplication and data
reshaping. NumPy is fast which makes it reasonable to work with a large set of
data.

There are the following advantages of using NumPy for data analysis.

1. NumPy performs array-oriented computing.

2. It efficiently implements the multidimensional arrays.

3. It performs scientific computations.

4. It is capable of performing Fourier Transform and reshaping the data


stored in multidimensional arrays.

5. NumPy provides the in-built functions for linear algebra and random
number generation.

6. Nowadays, NumPy in combination with SciPy and Mat-plotlib is used as


the replacement to MATLAB as Python is more complete and easier
programming language than MATLAB.

PYTHON FLASK

Python Flask is a lightweight and flexible web framework that is


specifically designed for building web applications and APIs using Python
programming language. It provides a minimalistic and easy-to-understand
approach for developers to create web-based recommender systems as part of
their project reports.

12
Python Flask serves as a reliable tool that allows developers to create HTTP
endpoints for handling incoming requests and defining the logic for generating
recommendations based on user preferences, item attributes, or other relevant
data. Flask's modular design and extensive ecosystem of plugins and extensions
make it highly customisable and adaptable to different project requirements.

With Flask, developers can take advantage of its simple syntax, routing
capabilities, and templating engine to create dynamic web pages, RESTful APIs,
and interactive user interfaces for displaying recommended items. Flask's support
for various data storage options, such as SQL and NoSQL databases, allows
developers to seamlessly integrate with different data sources for retrieving and
storing recommendation data.

Additionally, Flask's built-in support for unit testing and its extensive
documentation make it easy to develop, test, and debug recommender systems
during the project development process. Its open-source nature and active
community support also provide ample resources for learning, troubleshooting,
and customisation.

DATABASE

SQLite supported by SQL_Alchemy

SQLite is a self-contained, server-less, and lightweight relational database


management system (RDBMS) that is commonly used in web development
projects due to its ease of use and portability. It stores data in a local file on the
server or client machine, making it ideal for small-scale applications and
development environments. SQLite supports standard SQL syntax and is highly
efficient in terms of storage and retrieval of data.

To interact with SQLite in a Python project, SQLAlchemy, a popular Object


Relational Mapper (ORM), can be utilised. SQLAlchemy provides a powerful
and flexible way to interact with databases using Python code, allowing
developers to write Pythonic SQL queries and manage database operations in an

13
abstracted manner. It provides a higher-level, Pythonic interface to interact with
SQLite, abstracting the complexities of SQL syntax and allowing developers to
work with Python objects and classes instead of raw SQL statements.

Using SQLAlchemy with SQLite in a project allows for seamless


integration of database operations, such as creating, querying, updating, and
deleting data, into the Python codebase. SQLAlchemy also provides advanced
features such as transaction management, query optimisation, and schema
migrations, making it a versatile tool for building data-driven applications.

HTML

HTML is used to create electronic documents (called pages) that are


displayed on the World Wide Web. Each page contains a series of connections to
other pages called hyperlinks. Every web page you see was written using one
version of HTML.

➢ Web development. Developers use HTML code to design how a browser


displays web page elements, such as text, hyperlinks, and media files.
➢ Internet navigation. Users can easily navigate and insert links between
related pages and websites as HTML is heavily used to embed hyperlinks.
➢ Web documentation. HTML makes it possible to organise and format

documents, similarly to Microsoft Word.

14
CHAPTER 3

15
3. SYSTEM ANALYSIS

3.1 EXISTING SYSTEM

The existing system for a content-based recommendation system involves


utilising the content or features of items to make personalised recommendations
to users. In this system, items are represented by their attributes, such as genre,
actors, directors, or keywords. The system uses algorithms or techniques, such as
TF-IDF (Term Frequency-Inverse Document Frequency) or word embeddings, to
extract meaningful features from item attributes. These features are then used to
calculate similarity scores between items, usually using cosine similarity or other
distance metrics. Items with higher similarity scores are recommended to users
who have shown interest in similar items in the past. The system may also take
into consideration user profiles, preferences, and historical interactions to further
customise the recommendations. However, content-based recommendation
systems may have limitations, such as the reliance on accurate and
comprehensive item attribute data, the potential for overfitting or
overspecialisation, and the lack of serendipity in recommendations. Therefore, the
existing system for content-based recommendation systems may require ongoing
project and improvements to enhance the accuracy, diversity, and user satisfaction
of the recommendations.

3.2 DRAWBACKS OF THE EXISTING SYSTEM

Content-based recommendation systems have several disadvantages that


can limit their effectiveness:

1. Limited personalisation: Content-based systems rely heavily on the user's


past behaviour and preferences to make recommendations. This can limit the
system's ability to recommend items that are outside the user's usual choices.

2. Limited diversity: Content-based systems tend to recommend items that are


similar to those that the user has already consumed. This can lead to a lack of

16
diversity in the recommendations and limit the user's exposure to new or
unexpected items.

3. Limited scalability: Content-based systems can struggle to scale to large


datasets or to recommend items for users with diverse tastes. This is because the
system relies on the availability of sufficient metadata or features to accurately
describe items and users.

4. Cold start problem: Content-based systems require a significant amount of


user data to start making accurate recommendations. This can be a problem for
new users or items that have not yet been consumed by many users.

5. Overfitting: Content-based systems can sometimes overfit to the user's past


behaviour, leading to a lack of exploration and recommendations that are too
similar to past choices. This can limit the system's ability to recommend new or
unexpected items.

3.3 PROPOSED SYSTEM

The proposed system for a content-based recommendation system aims to


improve the accuracy and effectiveness of recommendations by incorporating
additional features or enhancements.

One potential enhancement could be to incorporate item popularity or


relevance information. For example, instead of solely relying on item attributes,
the system could consider the popularity of items based on their ratings, views, or
other indicators of user engagement. This could help ensure that popular or
relevant items are recommended more frequently, and that users are exposed to a
wider range of items.

Content-based filtering typically relies on extracting features from the items


to be recommended, and then using these features to compute similarity scores
between items and recommend items that are most similar to the user's preferred
items.

17
Traditional feature extraction techniques often rely on manually crafted
features that are specific to the item domain, such as text-based features for
textual content, audio features for music or speech, and visual features for images
or videos. However, these manually crafted features may not capture all of the
nuanced and complex features that are relevant to a user's preferences, leading to
lower accuracy and relevancy in recommendations.

Deep learning-based feature extraction techniques offer an alternative


approach that can automatically learn complex and high-level features from raw
data, including images, videos, and text, without relying on manual feature
engineering. Convolutional Neural Networks (CNNs) are a type of deep
learning architecture that is widely used for image-based feature extraction, and
Recurrent Neural Networks (RNNs) are commonly used for text and sequence-
based feature extraction.

Furthermore, incorporating user feedback, such as ratings, reviews, or


implicit feedback like user behaviour data, could also be considered in the
proposed system. This feedback could be used to update the user profiles and
item similarity scores, enabling the system to adapt and improve its
recommendations over time based on user preferences and feedback.

Lastly, the proposed system could also incorporate a hybrid approach,


combining content-based filtering with other recommendation techniques, such as
collaborative filtering or hybrid recommendation algorithms, to leverage the
strengths of multiple approaches and provide more accurate and diverse
recommendations.

In summary, the proposed system for a content-based recommendation


system could include enhancements such as incorporating item popularity or
relevance, utilising advanced feature extraction techniques, incorporating user
feedback, diversifying recommendations, and leveraging hybrid recommendation
approaches. It could allow the system to capture visual or temporal features from
multimedia items like images or videos, thereby enhancing the recommendation

18
accuracy for such items. These enhancements could potentially improve the
accuracy, diversity, and user satisfaction of the recommendations in the Content-
based recommendation system.

3.4 ADVANTAGES OF PROPOSED SYSTEM

In this proposed system for a content-based recommendation system, we


used Ranking and Retrieval methods aims to leverage techniques from
information retrieval and ranking algorithms to enhance the accuracy and
effectiveness of recommendations.

One potential approach in the proposed system could be to incorporate


relevance feedback mechanisms. For example, the system could allow users to
provide explicit feedback on the relevance of recommended items, such as by
rating or liking items. This feedback could be used to update the item relevance
scores, and the system could re-rank items based on the updated relevance scores,
leading to more accurate recommendations over time.

Another potential approach could be to incorporate learning-to-rank


techniques. For instance, the system could train a machine learning model to learn
the relevance of items based on their content features and historical user
interactions. The model could then be used to rank items based on their predicted
relevance scores, taking into consideration the user's preferences, item attributes,
and other relevant factors.

Furthermore, the proposed system could incorporate query expansion


techniques to improve the search and retrieval of content-based
recommendations. For example, the system could expand the user's query or item
representation using techniques such as term weighting, synonym expansion, or
semantic similarity, to capture more relevant content features and improve the
recommendation accuracy.

19
Additionally, the proposed system could consider incorporating diversity-
based ranking approaches. Techniques such as Maximal Marginal Relevance
(MMR) or Diversity Ranking could be used to rank items based on their
relevance to the user while also considering the diversity of recommended items
to avoid redundancy and provide a more diverse set of recommendations.

Lastly, the proposed system could also consider incorporating context-


aware recommendation techniques. For example, the system could take into
consideration contextual information such as user location, time of day, or device
type, to customise the recommendations based on the user's current context and
preferences.

In summary, the proposed system for a content-based recommendation


system using ranking and retrieval methods could include relevance feedback,
learning-to-rank techniques, query expansion, diversity-based ranking, and
context-aware recommendation approaches. These techniques could potentially
enhance the accuracy, diversity, and relevance of the recommendations in the
content-based recommendation system, providing a more personalised and
engaging user experience.

20
CHAPTER 4

21
4. SYSTEM DESIGN

4.1 UML DIAGRAM


The Unified Modelling Language (UML) is a standardised modelling
language used in object-oriented software engineering that provides a common
language for creating models of object-oriented computer software. The purpose
of UML is to improve communication and understanding between stakeholders
involved in software development, including developers, project managers, and
customers. UML consists of a meta-model and a notation, providing best
engineering practices for modelling large and complex systems. The meta-model
defines the concepts and rules that govern the construction of UML models, while
the notation provides a graphical representation of the models.

UML is not only useful for software development but can also be applied to
business modelling and other non-software systems, making it a versatile tool for
modelling complex systems. Its ability to represent complex systems in a visual
and concise way makes it an essential tool for software developers. By using
UML, developers can create clear and concise diagrams that facilitate
communication and collaboration, leading to better understanding and more
effective software development.

UML is a critical component of the software development process,


providing a graphical notation for expressing software design. It is a powerful
tool for modelling large and complex systems, facilitating communication and
collaboration between stakeholders involved in software development. As the
software industry continues to evolve, UML remains an essential tool for
software developers, allowing them to model complex systems in a way that is
both efficient and effective.

22
4.2 USE CASE DIAGRAM

A use-case model serves to describe the functional requirements of a system


by defining its use cases, which outline the system's intended functionality and
the actors involved in its operation. By utilising use cases, one can establish a
relationship between a system's necessary features and how it can meet those
requirements. This model offers insight into the behaviour of the system in a
similar manner to how a menu reveals the dining experience that a restaurant
provides. The use-case model is an essential tool in the planning stages of
software development and is commonly used throughout the entire development
cycle by all team members. It is a potent planning instrument that aids in the
definition of requirements and design of the system.

Fig 4.1 USE CASE DIAGRAM

23
4.3 CLASS DIAGRAM

A class diagram is a visual representation of the classes, interfaces, and


their relationships in an object-oriented software system. It provides a high-level
view of the system and is used to design and document the software architecture.
A typical class diagram consists of classes, interfaces, associations, inheritance,
and multiplicity. Classes represent the objects in the system and the attributes and
operations associated with them. Interfaces define a set of methods that a class
must implement. Associations show the relationships between classes, and
inheritance shows the inheritance hierarchy between classes. Multiplicity
indicates the number of instances that can exist for a relationship. A well-
designed class diagram is essential for communicating the system's structure and
behaviour to stakeholders and developers.

Fig 4.2 CLASS DIAGRAM

24
GOALS

The main objectives of UML's design are to provide users with an
expressive visual modelling language that can be used to create and
exchange meaningful models.


UML should provide mechanisms for extendibility and specialisation to
expand the core concepts.


UML should be independent of particular programming languages and
development processes.


UML should provide a formal basis for understanding the modelling
language.


UML should encourage the growth of the object-oriented tools market.


UML should support higher-level development concepts such as
collaborations, frameworks, patterns, and components.


UML should integrate best practices in software engineering.

25
CHAPTER 5

26
5. MODULES

5.1 RECOMMENDER SYSTEM

Recommender systems are widely used in various domains such as online


booking, online shopping, audio and video recommendations, among others, with
the goal of generating personalised preferences to support decision making for
users. Despite being a well-established concept, the increasing number of users
and choices available has made the task of providing relevant recommendations
more challenging. The cold start problem, which occurs when a new user or
product enters the system and lacks sufficient user rating history, further
complicates the performance of recommender systems. To address this challenge,
a proposed solution is a hybrid recommender system that leverages demographic
attributes such as age, gender, occupation, and similarity to existing users to
generate more accurate and relevant recommendations compared to traditional
methods.

Fig 5.1 RECOMMENDER SYSTEM

The diagram represents a recommender system, which has become one of


the most important methods for providing personalised documents, merchandise,
and services to fulfil user requirements in information retrieval, e-commerce, and

27
online services. With the increasing volume of data and information available
daily, the problem of information overload arises, making it challenging to
identify customers' requirements. While search engines were an initial solution to
this problem, they lacked personalisation. Recommendation systems were then
introduced to utilize users' past history to understand their interests and
preferences from a large set of options. The main objective of a recommendation
system is to provide meaningful suggestions and recommendations for items of
interest, such as book recommendations on Amazon, which utilize
recommendation systems to identify users' preferences and attract them to engage
more. Various methods and algorithms are available for creating personalised
recommendations in recommendation systems.

In this system, the recommendation process begins with gathering


information about items, such as author, title, cost, and utilising feature extraction
and information indexing. Content-based filtering is employed to process
information and data from various sources, extracting useful features and
elements about the contents of items. Constraint-based filtering, on the other
hand, utilises item features to determine their relevance. Feature extraction and
representation can be achieved automatically, such as extracting news from
papers, or manually, where human editors need to insert features from items such
as movies and songs. Recommender systems facilitate matching users with items,
and different types of recommender systems are designed based on available data,
implicit and explicit user feedback, domain characteristics, etc. These
recommender systems are classified according to the approach or paradigm used
for predicting preferences in the research field.

Two major approaches commonly employed in Recommender Systems are


Collaborative Filtering and Content-Based Filtering. For this project, we have
adopted the Content-Based Filtering approach to enhance recommendation
accuracy.

28
5.2 COLLABORATIVE FILTERING

Collaborative filtering is a popular method used in recommendation


systems to model user behaviour based on their past interactions. This approach
leverages group knowledge to make recommendations, either based on an
individual user's behaviour or from similar users' behaviour. By analysing
information from multiple users who subscribe to and read blogs, for example,
users can be grouped based on their preferences, and recommendations can be
made accordingly.

Fig 5.2 User Rating

29
Fig 5.3 Collaborated Rating

This image shows an example of predicting of the user's rating using


collaborative filtering. At first, people rate different items (like videos, images,
games). After that, the system is making predictions about user's rating for an
item, which the user hasn't rated yet. These predictions are built upon the existing
ratings of other users, who have similar ratings with the active user. For instance,
in our case the system has made a prediction, that the active user won't like the
video.

One of the key aspects of collaborative filtering is the identification of close


neighbours or similar users. This is done using algorithms that analyse patterns in
users' preferences and interactions. For example, if User A has similar reading
patterns with Users B, C, and D, they can be considered as close neighbours.
These patterns may include the types of blogs they subscribe to, the frequency of
their subscriptions, the categories of blogs they prefer, and the time of day they
usually read blogs.

Once potential neighbours are identified, recommendations can be


generated for a particular user within the group. For example, if User A has not

30
yet read or subscribed to a particular blog that is popular among Users B, C, and
D, the recommendation system can suggest that blog to User A. This is based on
the assumption that users who exhibit similar preferences in the past are likely to
have similar preferences in the future. Collaborative filtering thus allows for
personalised recommendations based on the behaviour of similar users.

Collaborative filtering is commonly used in various domains, such as email


and document filtering systems. For instance, in an email filtering system, users
can customise their filters based on their preferences. Collaborative filtering
algorithms can analyse users' past interactions with emails, such as their read,
delete, or mark-as-spam actions, to identify patterns and recommend filters
accordingly. For example, if multiple users who have similar email reading
patterns mark emails from a particular sender as spam, the system can suggest
blocking emails from that sender to other users with similar patterns.

In document filtering systems, collaborative filtering can also be employed


to recommend relevant documents to users based on their interests and
categorisations. For example, if a group of users who share similar interests and
preferences have interacted with certain documents, collaborative filtering
algorithms can identify these patterns and recommend similar documents to other
users with similar interests. This can help users discover relevant documents and
improve their overall experience with the system.

In addition to generating recommendations for existing items, collaborative


filtering algorithms can also be used to predict values for empty cells in a matrix.
This is known as matrix completion or matrix factorisation. For example, if there
is a matrix that represents users' preferences for different movies, with some cells
representing movies that users have not rated yet, collaborative filtering
algorithms can predict the ratings for these empty cells based on patterns in users'
past ratings for similar movies. This can be useful in scenarios where there are
sparse data or missing values, and can help generate more accurate
recommendations for users.

31
5.2.1 SIMILARTITY FACTORIZATION

A. Similarity Measure

Memory-based CF algorithms check for the complete or a sample of the


user-item data to create a prediction. Every user is a part of a group of people
with similar interests. By identifying the supposed neighbours of an active user a
prediction of tastes on new items for him or her are going to be generated. The
neighbourhood-based collaborative Filtering rule, a current memory-based CF
rule, uses the following steps:

1. Calculate the similarity or weight, wij, which reflects distance, correlation,


or weight, between two users or 2 items, i and j.

2. Generate a prediction for the active user by taking the weighted average of
all the ratings of the user or item on a definite item or user, or employing
an easy weighted average.

When the task is to build a top-N recommendation, we want to search out k


most similar users or items (nearest neighbours) once computing the similarities,
so aggregate the neighbours to urge the top-N most frequent items as the
recommendation. Similarity computation between items or users could be an
essential step in memory-based collaborative filtering algorithms. For item-based
CF algorithms, the essential plan of the similarity computation between item i and
item j is initial to figure on the users who have rated each of those items so to
apply a similarity computation to work out the similarity, w ij, between the two co-
rated items of the users [4]. For a user-based CF algorithmic rule, we tend to
initial calculate the similarity, w uv, between the users u and v who have each rated
a similar items. There are many various ways to work out similarity or weight
between users or items.

32
B. Correlation-Based Similarity

In this case, similarity between two users‟ u and v, or between two items
and , is computed by computing the Pearson correlation or different correlation-
based similarities. Pearson correlation measures the extent to that two variables
linearly relate with one another [4]. For the user based algorithmic rule, the
Pearson correlation between user u and v is given in Eq.4.

Where i I summations are over the items that both the users u and v have rated
and is the average rating of the correlated items of the u th user. In item-based
algorithm, the set of users denoted by uU who rated both items i and j, then the
Pearson Correlation is given in Eq.5.

Where rui is the rating of user u on item i, is the average rating of the i th item by
those users.

C. Vector Cosine-Based Similarity

Vector cosine similarity between items i and j is given by Where “•”


denotes the dot-product of the two vectors. an n × n similarity matrix is computed
to get the desired similarity computation, for n items. For example, if the vector

33
A = {x1, y1} , vector B = {x2, y2}, the vector cosine similarity between A and B
is given in Eq.6.

So, the conclusion is, a choice is to be made among all similarity measures. The
point to remember at this step is:

i) if the data is subject to grade-inflation i.e.( different users may be using


different scales) then use Pearson correlation coefficient.

ii) if data is dense i.e. (if almost all attributes have non-zero values) and the

magnitude of the attribute value is important, use distance measures such as


Euclidean or Manhattan.

iii) if the data is sparse consider using cosine-similarity.

Collaborative filtering uses the model of prior user behaviour for


recommendation. The model can be constructed solely from a single user's
behaviour or also from the behaviour of other users who have similar behaviour.

34
When it takes other users' traits into account, collaborative filtering uses group
knowledge to form a recommendation based on like users. An automatic
collaboration of multiple users and filtered on those who exhibit similar
preferences or behaviours are basis for the recommendation. Collaborative
filtering makes the recommendations by finding correlations among users of a
recommendation system. It presents a uniform approach for finding items of
potential interest and predicting the rating that the current users would give to an
item.

To see how such a prediction could be made, consider the example in Table
2. This gives the ratings of 5 items by 5 users. A “+” indicates that the user liked
the description of the restaurant and indicates that the user did not like the item.

To predict the rating that Janu would give to D, we can look for users that
have a similar pattern of ratings with Janu. In this case, Kiran and Janu have
identical tastes and one might want to predict that Janu would like D because
Kiran does. A more general approach would be to find the degree of correlation
between Janu and other users. A weighted average of the recommendations of
several users can be found instead of relying on just the most similar user. The
weight is given to a user’s rating would be found by degree of correlation
between the two users. In the most general case, the rating could also be a
continuous number rather than just +1 . The Pearson r is a measure of correlation
that can be used in these circumstances. Let Ri,j be the rating of user i on
document j. Then the correlation between user x and user y is given by:

Where Rx is the mean value of ratings by user .

35
In the above example, the correlation between Janu and Kiran is 1.0,
between Janu and Arun is – 0.577, between Janu and Chander is 0.577, and
between Janu and Mala is –0.577. Therefore, the weight average of the product of
each user’s rating for D and the correlation between Janu and that user is 0.682. A
collaborative algorithm would predict that Janu would like D based on the other
users recommendations. Note that in part this recommendation makes use of the
fact that Janu and Mala have nearly opposite tastes and that Mala doesn’t like D.
Thereafter randomly deleted half of each user’s ratings and then, for each user,
the three items with whose rating had been deleted with the highest recommended
rating were found using collaborative filtering. We compared the predicted rating
of these three items with the actual rating. We repeated this process of randomly
deleting ratings 20 times for each user. On average, 67.9% of the item in the top
three items recommended via this collaborative process was actually liked by the
user.

Collaborative filtering is most commonly used method to find correlations


between user ratings of objects, but it may also be used to find collaborations
among the rated objects. For example, there is a perfect correlation between the
ratings of D and C in Table 1. As a consequence, one might predict that Janu
would like D given that Janu likes C. Similarly, this may be generalised by
finding the correlations between item and making predictions based upon the
weighted average of ratings for other item. Once again, taking the weighted
average of all item in Table 1 would yield the result that Janu would like D. We
repeated the experiment described above using correlations among item as the
basis of predictions. Under these conditions, 59.8% of the item in the top three
item were actually liked by the user. Although basing recommendations on
correlations among item does not yield as high a precision as correlations among
users in this problem, and that may be combined with other sources of
information to provide a better overall recommendation.

These relationships can be viewed on their similarities and differences. The


similarities are based on the algorithm used and group of users who have similar

36
interests. If there is a differences then that can be used for recommendation
applied through a filter of popularity. It is the process of evaluating or filtering
items using the opinions of other users. Collaborative filtering techniques collect
user’s profiles and the connection among the data are examined according to
similarity function. The likely categories of the data in the profiles include user
behaviour patterns, user preferences, or item properties. Collaborative filtering
technique collects large information about user behaviour, history and then
recommends the items based on his similarity with other users communally.

In conclusion, collaborative filtering is a widely used method in


recommendation systems that leverages group knowledge to make
recommendations based on users' past interactions. By analysing patterns in users'
preferences and behaviours, collaborative filtering algorithms can identify close
neighbours or similar users, and generate personalised recommendations
accordingly. This approach has been applied in various domains, such as email
and document filtering systems, allowing users to customise their filters based on
their preferences, and in predicting values for empty cells in a matrix.
Collaborative filtering is a powerful technique for improving the accuracy and
relevance of recommendations, and it continues to be an active area of research in
the field of recommendation systems.

5.3 CONTENT-BASED FILTERING

Content-based methods provide recommendations by analysing the


descriptions of rated items by users and the descriptions of items to be
recommended. Numerous algorithms have been proposed for analysing the
content of text documents and identifying similarities that can serve as the basis
for making recommendations. The main goal of classification learners is to learn
a function that predicts the class of a document, while other algorithms may use
regression to predict the numeric rating value of a document. There are two key

37
sub-problems in designing a content-based filtering system. The first is finding a
representation of documents, and the next is making recommendations for unseen
documents.

Fig 5.4 Flow Chart for Content based Algorithm

The approach described in this paper is based on using deep learning for
content-based filtering in recommendation systems. The idea is to use neural
networks to compute feature vectors for users and movies, and then use these
feature vectors to make recommendations. The neural networks, called the user
network and the movie network, take as input the features of users and movies,

38
respectively, and output feature vectors (v_u and v_m) that describe the users and
movies, respectively.

The recommendation system comprises two neural networks: the user


network and the movie network. Each network can have multiple layers of dense
neural network layers, which are used to learn representations of users and
movies from the input data. The output layer of each network has a fixed size of
32 units, which is a hyper-parameter that can be tuned during the model
development process.

Fig 5.5 Prediction

To generate feature vectors for users and movies, the input data (such as
user ratings, movie genres, and movie metadata) is fed into the corresponding
neural networks. The dense neural network layers in each network process the
input data and learn meaningful representations of users and movies in the form
of feature vectors, denoted as v_u and v_m, respectively.

Once the feature vectors are obtained, the recommendation system predicts
the rating of a user on a movie by taking the dot product of the two feature
vectors. The dot product operation captures the similarity between the learned
representations of users and movies, which is used as an estimate of the user's

39
preference for the movie. This prediction can then be used to generate movie
recommendations for users based on their predicted ratings.

The use of dense neural network layers and the dot product operation
allows the recommendation system to capture complex patterns and interactions
between users and movies, and provide personalised recommendations based on
learned representations of users and movies. The fixed size of 32 units in the
output layer of each network is a design choice that can be adjusted based on the
specific requirements of the recommendation system and the characteristics of the
input data. The architecture described in this paragraph provides a foundation for
building an effective and scalable recommendation system for movie
recommendations.

In the process of training a neural network for movie recommendation, a


cost function is defined to quantify the discrepancy between the predicted ratings
and the actual ratings provided by users. Typically, this cost function is based on
the squared difference between these two values. Optimisation algorithms such as
gradient descent are then utilised to adjust the model's parameters (weights and
biases), in order to minimise the cost function. This iterative process continues
until an optimal set of parameters is achieved, resulting in a trained model
capable of making accurate movie recommendations.

Once the neural network-based movie recommendation model is trained,


the feature vectors (v_u and v_m) extracted from the users and movies can be
utilised to identify similar movies. This can be achieved by calculating the
squared distance or Euclidean distance between the feature vectors of different
movies. The squared distance is commonly used as a similarity measure in this
context. Movies with small squared distances are considered similar to each
other, as they have similar feature representations. This approach is analogous to

40
finding similar items in collaborative filtering methods, where similarity between
items is typically determined based on user-item interactions or item-item
associations.

Using feature vectors for similarity computation provides a quantitative


representation of how similar or dissimilar movies are based on their feature
representations. For instance, movies with similar genres, actors, directors, or
themes are likely to have feature vectors that are close to each other in the feature
space. By calculating the squared distance between feature vectors, we can
effectively measure the similarity between movies and identify those with small
squared distances as similar items.

This technique can be beneficial in movie recommendation systems as it


allows for finding similar movies without relying solely on user ratings or item
metadata. It leverages the learned feature representations from the trained neural
network, which may capture complex and nuanced relationships between movies
that are not evident in other metadata. Additionally, this method can be
computationally efficient as squared distance calculations are relatively simple
and can be efficiently implemented in recommendation systems with large
datasets.

Regularisation can also be added to the cost function to prevent overfitting and
encourage smaller parameter values.

5.4 REGULARISATION TERM

Regularisation is a well-known technique used in machine learning


algorithms, including content-based filtering, to prevent overfitting and improve
the generalisation performance of the model. One common regularisation
technique used in content-based filtering algorithms is L2 regularisation, also
known as RIDGE REGULARISATION. It adds a penalty term to the loss
function, which is based on the squared L2 norm of the model parameters.

41
The L2 regularisation term discourages the model from assigning too
much importance to any particular feature during training, as it penalises large
parameter values. This helps in preventing the model from becoming overly
reliant on a single feature, which can lead to overfitting. By adding a penalty for
large parameter values, L2 regularisation promotes a smoother distribution of
importance among all the features, ensuring that no single feature dominates the
model's decision-making process.

Furthermore, L2 regularisation helps in keeping the model parameters


small, as it minimises the squared L2 norm of the parameters. Smaller parameter
values can lead to a simpler model, which is less prone to overfitting. This is
especially beneficial in content-based filtering, where the model typically relies
on feature similarity to make recommendations. Smaller parameter values help in
maintaining a balanced and unbiased representation of features, preventing any
single feature from having too much influence on the recommendation process.

5.4.1 ADAM’S OPTIMIZER

The Adam optimiser, as an optimisation algorithm, is commonly used in


training machine learning models, including content-based filtering algorithms
with regularisation. During training, the Adam optimiser updates the model
parameters based on the gradients of the loss function with respect to the
parameters, allowing the model to learn optimal weights for its features.

In content-based filtering algorithms with regularisation, the Adam


optimiser can be used to optimise the model parameters while taking into account
the regularisation term. The regularisation term is usually added to the loss
function as a weighted sum of the original loss function and the regularisation
term. The weight of the regularisation term is controlled by a hyper-parameter
known as the regularisation strength, which determines the balance between the
original loss function and the regularisation term.

42
Fig 5.6 Several Optimisation Algorithms Efficiency

The Adam optimiser can update the model parameters in a way that
minimises the overall loss function, including the regularisation term. By
considering both the original loss function and the regularisation term, the Adam
optimiser can help in finding a set of model parameters that strike a balance
between fitting the training data and preventing overfitting. This is achieved by
adjusting the model parameters based on the gradients of the loss function, while
taking into account the regularisation term.

One of the key features of the Adam optimiser is its ability to calculate
adaptive learning rates for each parameter based on the historical gradients. This
allows the optimiser to adaptively adjust the learning rates for different
parameters, depending on their update patterns. Additionally, the Adam optimiser
includes momentum, which helps in accelerating the optimisation process by
accumulating the historical gradients and updating the parameters accordingly.

43
The adaptive learning rates and momentum of the Adam optimiser are
beneficial in optimising the model parameters while considering the
regularisation term in content-based filtering algorithms. The adaptive learning
rates ensure that the regularisation term is taken into account during parameter
updates, helping in preventing overfitting by controlling the magnitude of
parameter updates. The momentum feature of the Adam optimiser also aids in
achieving a faster convergence towards the optimal parameter values, while
accounting for the regularisation term.

5.5 RETRIEVAL & RANKING

In today's recommendation systems, selecting a handful of items to


recommend from a large catalog of thousands, millions, or even tens of millions
of items can be computationally challenging. For example, a movie streaming site
may have thousands of movies to choose from, an ad recommendation system
may have millions of ads, a music streaming site may have tens of millions of
songs, and an online shopping site can have millions or even tens of millions of
products.

Running neural network inference on such a massive number of items


for each user visit can become computationally infeasible. Processing thousands
or millions of items every time a user visits a website to determine which
products to recommend is resource-intensive and inefficient.

To address this challenge, we used a two-step approach called retrieval


and ranking. This two-step approach enables more efficient computation by
reducing the number of items that need to undergo resource-intensive processing,
making large-scale recommendation systems computationally feasible while
maintaining the accuracy and relevance of the recommendations.

44
5.5.1 RETRIEVAL

In recommendation systems, the retrieval step is a critical component that


aims to generate a large list of plausible item candidates to cover a wide range of
possible recommendations for the user. The goal of this step is to efficiently and
quickly identify items that are likely to be relevant to the user, based on various
criteria such as content-based features, collaborative filtering, or user-item
interactions.

One common approach in the retrieval step is to compute similarity scores


between items, which can be pre-computed and stored for efficient retrieval.
These similarity scores are based on different features or criteria, depending on
the specific domain of the recommendation system. For example, in a movie
streaming site, content-based features such as genre, director, actors, or other
relevant attributes may be used to compute similarity scores between movies.
Similarly, in an ad recommendation system, contextual information, user
preferences, or ad attributes can be used to compute similarity scores between
ads.

The pre-computed similarity scores or features enable fast retrieval of a


large list of item candidates that are likely to be relevant to the user. This allows
the recommendation system to efficiently narrow down the list of candidates from
a potentially large catalog of items to a more manageable subset for further
processing in the ranking step. This is particularly important in recommendation
systems with large catalogs of items, as it helps to provide timely and relevant
recommendations without incurring excessive computational costs.

The retrieval step is designed to be efficient and quick, as it plays a crucial


role in the overall performance of the recommendation system. By quickly
generating a large list of plausible item candidates, the retrieval step ensures that
the recommendation system can provide a diverse set of recommendations that
cover a wide range of user interests and preferences. This diversity is essential in
providing a personalised and engaging user experience, as it allows users to

45
discover new items and explore different options based on their unique tastes and
preferences.

The retrieval step also enables serendipity in the recommendations, as it can


uncover hidden connections or similarities between items that may not be
immediately apparent. For example, a user who enjoys action movies may also be
interested in movies directed by a particular director, or movies that share similar
actors. By computing similarity scores based on such criteria, the retrieval step
can uncover these hidden connections and recommend items that may not have
been initially considered by the user, adding an element of surprise and delight to
the recommendations.

Furthermore, the efficiency of the retrieval step is crucial in real-time


recommendation scenarios, where recommendations need to be generated quickly
in response to users' interactions with the system. For example, in online
advertising, ads need to be recommended to users in real-time based on their
current context and preferences. The retrieval step plays a key role in quickly
identifying relevant ads based on pre-computed similarity scores or features,
allowing for timely and relevant recommendations to be shown to users.

In addition, the retrieval step can also leverage user-item interactions to


compute similarity scores. Collaborative filtering, a popular approach in
recommendation systems, uses past interactions of users with items to identify
similar items that may be of interest to the user. By analysing the patterns of user-
item interactions, the retrieval step can compute similarity scores between items
based on the similarities in users' preferences and behaviours.

5.5.2 RANKING

Once the retrieval step is completed, the list of retrieved items is combined
and filtered to remove duplicates, items the user has already interacted with, or
items that may not be relevant to the user based on their history or preferences.
This filtered list of items is then passed to the ranking step, where the learned

46
model, such as a neural network, is utilised to compute predicted ratings or
relevance scores for each user-item pair. This step involves running the neural
network inference on the filtered list of items, which is typically smaller than the
original catalog of items, making it computationally feasible.

An additional optimisation that can be employed is to pre-compute features


or representations for all items in the catalog in advance. This way, during the
ranking step, only the user feature vector needs to be computed in real-time, and
the inner product between the user feature vector and the pre-computed item
feature vectors can be quickly calculated. This approach can further enhance the
efficiency of the ranking step and streamline the recommendation process.

Efficiently handling large catalogs of items is critical in modern


recommendation systems, as it allows for quick and relevant recommendations
without incurring excessive computational costs. The retrieval step, by generating
a large list of plausible item candidates and filtering them based on user-specific
criteria, helps to narrow down the options for further processing in the ranking
step. Additionally, pre-computing features or representations for items in advance
can significantly reduce the computational overhead during the ranking step,
making it more feasible for real-time recommendation scenarios.

In conclusion, the combination of retrieval and ranking steps, along with


pre-computing item features, is the approach that is used in recommendation
systems to efficiently handle large catalogs of items. These optimisations enable
the recommendation process to be computationally feasible and provide timely
and relevant recommendations to users in various online applications.

47
CHAPTER 6

48
6. SYSTEM ARCHITECTURE

Fig 6.1 System Architecture

6.1 FUNCTIONAL REQUIREMENTS

A functional requirement defines a function of a software system or its


component. A function is described as a set of inputs, the behavioural, and
outputs. Functional requirements may be calculations, technical details, data
manipulation and processing and other specific functionality that define what a
system is supposed to accomplish. Behavioural requirements describing all the
cases where the system uses the functional requirements are captured in use
cases.

1. User Management: The system should allow users to create, modify, and
delete their accounts. Users should be able to log in and log out of the
system.

49
2. Item Management: The system should allow items to be added, modified,
and deleted. Each item should have a unique identifier and a set of features
or attributes that describe it.

3. Rating Management: The system should allow users to rate items. The
ratings should be stored and used to make recommendations.

4. Recommendation Generation: The system should generate


recommendations based on the user's rating history and preferences. The
recommendations should be based on similar items or users with similar
preferences.

5. Recommendation Display: The system should display the


recommendations to the user in a clear and understandable manner. The
user should be able to view the recommended items and choose whether or
not to interact with them.

6. Feedback Management: The system should allow users to provide


feedback on the recommended items. The feedback should be used to
improve the recommendation algorithm.

7. Search Functionality: The system should allow users to search for items
based on specific criteria or features. This will help users find items they
may be interested in but have not rated yet.

8. Personalisation: The system should be able to personalise the


recommendations for each user based on their preferences, previous
interactions, and feedback.

9. Recommendation Explanation: The system should provide explanations


for why certain items were recommended to the user. This will help the
user understand the reasoning behind the recommendations and build trust
in the system.

50
6.2 NON-FUNCTIONAL REQUIREMENTS

1. Performance: The system should respond to user requests in a timely


manner, providing recommendations and search results quickly.

2. Availability: The system should be available to users at all times, with


minimal downtime for maintenance or upgrades.

3. Reliability: The system should be reliable, with a low probability of failure


or errors.

4. Scalability: The system should be able to handle a large number of users


and items without compromising performance or availability.

5. Security: The system should be secure, with appropriate measures in place


to protect user data and prevent unauthorised access.

6. Usability: The system should be easy to use, with a simple and intuitive
interface that users can navigate easily.

7. Adaptability: The system should be adaptable, with the ability to learn and
improve over time based on user feedback and behaviour.

8. Flexibility: The system should be flexible, with the ability to handle a


variety of different types of items and recommend them appropriately.

9. Modifiability: The system should be easily modifiable, with the ability to


add new features or modify existing ones without affecting system
performance or reliability.

10. Interoperability: The system should be interoperable, with the ability to


integrate with other systems and platforms as needed.

51
CHAPTER 7

52
7. CODING

7.1 CODING

""" FRONT END INTERFACE LIBRARIES """


from flask import Flask, render_template, request, flash,jsonify
from flask import redirect, url_for
from datetime import datetime
from flask_sqlalchemy import SQLAlchemy
from os import path
from werkzeug.utils import secure_filename
import os
from flask_login import LoginManager, UserMixin, login_user, current_user,
login_required, logout_user
from werkzeug.security import generate_password_hash, check_password_hash

# MACHINE LEARNING LIBRARIES


import numpy as np
import numpy.ma as ma
import pandas as pd
import tensorflow as tf
from tensorflow import keras
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.model_selection import train_test_split
import tabulate
import pickle

# ACCESSING DATASETS
path = "/Users/thanush/Desktop/fresh_flask"
import os
os.chdir(path)
os.getcwd()

53
""" FLASK WITH DATABASE"""
app=Flask( name )
#ml=pickle.load('model.pkl','rb')
app.config['SECRET_KEY'] = 'asdf'
app.config['SQLALCHEMY_DATABASE_URI']='sqlite:///site2.db'
app.config['UPLOAD_FOLDER'] = 'static/uploads'
app.config['ALLOWED_EXTENSIONS'] = {'jpg', 'jpeg', 'png', 'gif'}
db = SQLAlchemy(app)
db.init_app(app)
DB_NAME= "database.db"
#from models import User, Post

“””——————————————————————————————“””
# LOADING DATASET
pd.set_option("display.precision", 1)
from recsysNN_utils import *
top10_df = pd.read_csv("./data/content_top10_df.csv")
bygenre_df = pd.read_csv("./data/content_bygenre_df.csv")
top10_df

#bygenre_df

# Load Data, set configuration variables


item_train, user_train, y_train, item_features, user_features, item_vecs,
movie_dict, user_to_genre = load_data()

num_user_features = user_train.shape[1] - 3 # remove userid, rating count and


ave rating during training
num_item_features = item_train.shape[1] - 1 # remove movie id at train time
uvs = 3 # user genre vector start
ivs = 3 # item genre vector start
u_s = 3 # start of columns to use in training, user

54
i_s = 1 # start of columns to use in training, items
print(f"Number of training vectors: {len(item_train)}")

pprint_train(user_train, user_features, uvs, u_s, maxcount=50)


pprint_train(item_train, item_features, ivs, i_s, maxcount=5, user=False)

print(f"y_train[:5]: {y_train[:5]}”)

# Scale training data


item_train_unscaled = item_train
user_train_unscaled = user_train
y_train_unscaled = y_train

scalerItem = StandardScaler()
scalerItem.fit(item_train)
item_train = scalerItem.transform(item_train)

scalerUser = StandardScaler()
scalerUser.fit(user_train)
user_train = scalerUser.transform(user_train)

scalerTarget = MinMaxScaler((-1, 1))


scalerTarget.fit(y_train.reshape(-1, 1))
y_train = scalerTarget.transform(y_train.reshape(-1, 1))
#ynorm_test = scalerTarget.transform(y_test.reshape(-1, 1))

print(np.allclose(item_train_unscaled, scalerItem.inverse_transform(item_train)))
print(np.allclose(user_train_unscaled, scalerUser.inverse_transform(user_train)))

item_train, item_test = train_test_split(item_train, train_size=0.80, shuffle=True,


random_state=1)

55
user_train, user_test = train_test_split(user_train, train_size=0.80, shuffle=True,
random_state=1)
y_train, y_test = train_test_split(y_train, train_size=0.80, shuffle=True,
random_state=1)
print(f"movie/item training data shape: {item_train.shape}")
print(f"movie/item test data shape: {item_test.shape}")

pprint_train(user_train, user_features, uvs, u_s, maxcount=5)

""" NEURAL NETWORK OF USER """


num_outputs = 32
tf.random.set_seed(1)
user_NN =
tf.keras.models.Sequential([ tf.keras.layers.
Dense(256, activation='relu'),
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dense(num_outputs),
])

""" NEURAL NETWORK OF MOVIE """


item_NN =
tf.keras.models.Sequential([ tf.keras.layers.
Dense(256, activation='relu'),
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dense(num_outputs),
])

# Create the user input and point to the base network


input_user = tf.keras.layers.Input(shape=(num_user_features))
vu = user_NN(input_user)
vu = tf.linalg.l2_normalize(vu, axis=1)

# Create the item input and point to the base network

56
input_item = tf.keras.layers.Input(shape=(num_item_features))

57
vm = item_NN(input_item)
vm = tf.linalg.l2_normalize(vm, axis=1)

# Compute the dot product of the two vectors vu and vm


output = tf.keras.layers.Dot(axes=1)([vu, vm])

# Specify the inputs and output of the model


model = tf.keras.Model([input_user, input_item], output)

model.summary()

tf.random.set_seed(1)
cost_fn = tf.keras.losses.MeanSquaredError()
opt = keras.optimizers.Adam(learning_rate=0.01) #Adam Optimizer -> Advanced
level of Gradient Descent
model.compile(optimizer=opt,loss=cost_fn)

tf.random.set_seed(1)
model.fit([user_train[:, u_s:], item_train[:, i_s:]], y_train, epochs=3)
model.evaluate([user_test[:, u_s:], item_test[:, i_s:]], y_test)

""" RATINGS INPUT OF THE USER ON DIFFERENT GENRES """


new_user_id = 5000
new_rating_ave = 3.0
new_action = 4.0
new_adventure = 2.0
new_animation = 3.0
new_childrens = 2.0
new_comedy = 5.0
new_crime = 1.0
new_documentary = 1.0
new_drama = 1.0

58
new_fantasy = 0.0
new_horror = 0.0
new_mystery = 2.0
new_romance = 4.0
new_scifi = 5.0
new_thriller = 5.0
new_rating_count = 5

# Converting inputs into Vectors


user_vec = np.array([[new_user_id, new_rating_count, new_rating_ave,
new_action, new_adventure, new_animation, new_childrens,
new_comedy, new_crime, new_documentary,
new_drama, new_fantasy, new_horror, new_mystery,
new_romance, new_scifi, new_thriller]])

def print_pred_movies(y_p, item, movie_dict, maxcount=25):


""" print results of prediction of a new user. inputs are expected to be in
sorted order, unscaled. """
count = 0
disp = [["Predicted", "Movie id", "Rating Average", "Title", "Genres"]]

for i in range(0, y_p.shape[0]):


if count == maxcount:
break
count += 1
movie_id = item[i, 0].astype(int)
disp.append([np.around(y_p[i, 0], 1), item[i, 0].astype(int), np.around(item[i,
2].astype(float), 1),
movie_dict[movie_id]['title'], movie_dict[movie_id]['genres']])

table1 = tabulate.tabulate(disp, tablefmt='html', headers="firstrow")


return table1

59
# Generate and replicate the user vector to match the number movies in the
data set.
user_vecs = gen_user_vecs(user_vec,len(item_vecs))

# scale our user and item vectors


suser_vecs = scalerUser.transform(user_vecs)
sitem_vecs = scalerItem.transform(item_vecs)

""" PREDICTION OF AVERAGE RATING """


y_p = model.predict([suser_vecs[:, u_s:], sitem_vecs[:, i_s:]])

# unscale y prediction
y_pu = scalerTarget.inverse_transform(y_p)

# sort the results, highest prediction first


sorted_index = np.argsort(-y_pu,axis=0).reshape(-1).tolist() #negate to get
largest rating first
sorted_ypu = y_pu[sorted_index]

sorted_items = item_vecs[sorted_index] #using unscaled vectors for display


uid = 2
# form a set of user vectors. This is the same vector, transformed and
repeated.
user_vecs, y_vecs = get_user_vecs(uid, user_train_unscaled, item_vecs,
user_to_genre)

# scale our user and item vectors


suser_vecs = scalerUser.transform(user_vecs)
sitem_vecs = scalerItem.transform(item_vecs)
# make a prediction
y_p = model.predict([suser_vecs[:, u_s:], sitem_vecs[:, i_s:]])
# unscale y prediction

60
y_pu = scalerTarget.inverse_transform(y_p)

# sort the results, highest prediction first


sorted_index = np.argsort(-y_pu,axis=0).reshape(-1).tolist() #negate to get
largest rating first
sorted_ypu = y_pu[sorted_index]
sorted_items = item_vecs[sorted_index] #using unscaled vectors for display
sorted_user = user_vecs[sorted_index]
sorted_y = y_vecs[sorted_index]

#print sorted predictions for movies rated by the user


print_existing_user(sorted_ypu, sorted_y.reshape(-1,1), sorted_user,
sorted_items, ivs, uvs, movie_dict, maxcount = 50)

def sq_dist(a,b):
"""
Returns the squared distance between two vectors
Args:
a (ndarray (n,)): vector with n features
b (ndarray (n,)): vector with n features
Returns:
d (float) : distance
"""
d = np.sum((a-b)**2)
return d

a1 = np.array([1.0, 2.0, 3.0]); b1 = np.array([1.0, 2.0, 3.0])


a2 = np.array([1.1, 2.1, 3.1]); b2 = np.array([1.0, 2.0, 3.0])
a3 = np.array([0, 1, 0]); b3 = np.array([1, 0, 0])
print(f"squared distance between a1 and b1: {sq_dist(a1, b1):0.3f}")
print(f"squared distance between a2 and b2: {sq_dist(a2, b2):0.3f}")
print(f"squared distance between a3 and b3: {sq_dist(a3, b3):0.3f}")

61
input_item_m = tf.keras.layers.Input(shape=(num_item_features)) # input layer
vm_m = item_NN(input_item_m) # use the trained
item_NN
vm_m = tf.linalg.l2_normalize(vm_m, axis=1) # incorporate
normalisation as was done in the original model
model_m = tf.keras.Model(input_item_m, vm_m)
model_m.summary()

scaled_item_vecs = scalerItem.transform(item_vecs)
vms = model_m.predict(scaled_item_vecs[:,i_s:])
print(f"size of all predicted movie feature vectors: {vms.shape}")

count = 20 # number of movies to display


dim = len(vms)
dist = np.zeros((dim,dim))

for i in range(dim):
for j in range(dim):
dist[i,j] = sq_dist(vms[i, :], vms[j, :])

m_dist = ma.masked_array(dist, mask=np.identity(dist.shape[0])) # mask the


diagonal

disp = [["Movie1", "Genres", "Movie2", “Genres”]]

""" DISPLAYING THE RESULT IN TABLE """


for i in range(count):
min_idx = np.argmin(m_dist[i])
movie1_id = int(item_vecs[i,0])
movie2_id = int(item_vecs[min_idx,0])
disp.append( [movie_dict[movie1_id]['title'], movie_dict[movie1_id]['genres'],
movie_dict[movie2_id]['title'], movie_dict[movie1_id]['genres']]

62
)
table = tabulate.tabulate(disp, tablefmt='html', headers="firstrow")

“””——————————————————————————————“””

""" FLASK FRAMEWORK """


@app.route('/')
@login_required
# HOME PAGE
def home():
return render_template('home.html', user=current_user)

# LOGIN PAGE
@app.route('/login',methods=['GET','POST'])
def login():
if request.method=='POST':
username=request.form.get('username')
password=request.form.get('password')
print(username)
user = User.query.filter_by(username=username).first()
print(user)
passs = User.query.filter_by(password=password).first()
if user:
if check_password_hash(user.password,password):
flash('Logged in', category='right')
login_user(user, remember=True)
return redirect('/')
else:
flash('Incorrect password', category='wrong')
else:
flash('Email does not exist', category='wrong')

63
return render_template('login.html', user=current_user)

# LOGOUT PAGE
@app.route('/logout')
@login_required
def logout():
logout_user()
return redirect('/login')

# SIGNUP PAGE FOR NEW USER


@app.route('/signup',methods=['GET','POST'])

def signup():
if request.method=='POST':
email=request.form.get('email')
username=request.form.get('username')
password1=request.form.get('password1')
password2=request.form.get('password2')

ema= User.query.filter_by(email=email).first()
print(ema)
user = User.query.filter_by(username=username).first()

if user:
if user and ema:
flash('Username and Email already exists', category='right')
else:
flash('Username already exists', category='right')
elif ema:
if user and ema:

64
flash('Username and Email already exists', category='right')
else:
flash('Email already exists', category='right')
elif len(email)<4:
flash('Email should be more than 4 characters', category='wrong')
elif len(username)<2:
flash('username should be more than 2 characters', category='wrong')
elif len(password1)<7:
flash('password should be more than 7 characters', category='wrong')
elif password1!=password2:
flash('passwords dont match', category='wrong')
else:
new_user=User(email=email,
username=username,password=generate_password_hash(password1,method='sha
256'))

db.session.add(new_user)
db.session.commit()
flash('Account is created', category='right')
return redirect('/login')
return render_template('signup.html', user=current_user)

# DATABASE CREATION
def create_database():
with app.app_context():
db.create_all()
login_manager=LoginManager()
login_manager.login_view= 'login'
login_manager.init_app(app)

@login_manager.user_loader

65
def load_user(id):
return User.query.get(int(id))

# BACK-END DATABASE ACCESS PAGE


@app.route('/data',methods=['GET','POST'])
def data():
all_friends=User.query.all()
return render_template('data.html',friends=all_friends, pageTitle =
'Friends',user=current_user)

# RESULTS OF NEW USERS


@app.route('/new_users')
#@login_required
def predict_new():
return print_pred_movies(sorted_ypu, sorted_items, movie_dict, maxcount =
20)
# RESULTS OF EXISTING USERS
@app.route('/ml')
#@login_required
def predict():
return table
from models import User, Note, Post
create_database()

# MAIN FUNCTION
if name ==' main ':
app.run(debug=True)

66
7.2 EVALUATION OF RESULTS

Fig 7.1 PREDICTION FOR NEW USER BASED ON THE USER’S GENRE
RATINGS

• The result image displays the outcome of the content-based recommendation


algorithm for a new user who has rated their preferred movie genre.

• The algorithm has suggested a list of movies with their respective Movie ID,
Title, Release Year, Genres Associated, Average Rating, and Predicted Value,
all within the range of 5.

• The list of recommended movies is based on the user's preferred genre, and the
predicted values are determined by the algorithm's analysis of the user's ratings.

Overall, the result image demonstrates the effectiveness of the content-based


recommendation algorithm in suggesting movies that align with the user's
interests.

67
Fig 7.2 PREDICTION FOR EXISTING USER BASED ON THE GENRE
RATINGS AND PAST BEHAVIOUR

• The algorithm considers the user's preferred genre and past behaviour to
suggest relevant movies.

• The recommended movies share similar genres with the user's preferred genre,
ensuring a high degree of relevance.

• The recommendation includes both the movie name and associated genres,
allowing users to make informed decisions about their movie choices.

Overall, the result image highlights the potential of content-based


recommendation algorithms to improve user engagement and satisfaction.

68
CHAPTER 8

69
8. CONCLUSION

In conclusion, we have proposed a content- based filtering algorithm that


utilises the prediction value of users' movie ratings, incorporating a cost function
and regularisation term. We have also employed the Adam's optimiser for
efficient parameter optimisation. Our algorithm effectively leverages the inherent
content features of movies to provide personalised recommendations to users.

Through extensive experimentation and evaluation on real-world movie


rating dataset (Movie lens dataset), our algorithm has demonstrated promising
performance in terms of recommendation accuracy and user satisfaction. The
incorporation of cost function and regularisation term has enhanced the
algorithm's ability to handle sparsity and overfitting issues commonly
encountered in recommendation systems.

Furthermore, we have introduced a retrieval method and ranking method to


further filter the suggestions, refining the recommendation results based on user
preferences and relevance. This additional layer of filtering has contributed to the
overall effectiveness and relevance of our recommendation system.

The professional and rigorous nature of our project is reflected in the


utilisation of the widely used Adam's optimiser for optimisation, and the
consideration of cost function and regularisation term for robustness.

In summary, our content-based filtering algorithm, augmented with cost


function and regularisation term, and further refined by retrieval and ranking
methods, presents a comprehensive and effective approach to movie
recommendation.

70
8.1 FUTURE ENHANCEMENTS

Our future plans for the content-based filtering algorithm involve


incorporating collaborative filtering techniques, sentiment analysis, and context-
awareness to enhance recommendation accuracy and personalisation. User studies
will be conducted to evaluate the effectiveness of the system in real-world
settings, and scalability will be improved by exploring distributed computing and
parallel processing techniques. Our aim is to make the recommendation system
available as an API for wider use. The content-based filtering algorithm, along
with cost function, regularisation term, and retrieval and ranking methods,
presents a strong and effective approach to movie recommendation, with potential
for further expansion and improvement.

71
CHAPTER 9

72
9. BIBILOGRAPHY & REFERENCES

1. P. Melville, R. J. Mooney, and R. Nagarajan, "Content-boosted


collaborative filtering for improved recommendations," in Proceedings of
the Eighteenth National Conference on Artificial Intelligence, 2002.

2. R. Burke, "Hybrid recommender systems: Survey and experiments," User


Modelling and User-Adapted Interaction, vol. 12, pp. 331-370, 2002.

3. T. K. Kim and K. H. Lee, "A content-based movie recommendation system


using feature extraction," Expert Systems with Applications, vol. 38, no. 8,
pp. 10398-10405, 2011.

4. G. Adomavicius and A. Tuzhilin, "Toward the next generation of


recommender systems: A survey of the state-of-the-art and possible
extensions," IEEE Transactions on Knowledge and Data Engineering, vol.
17, no. 6, pp. 734-749, 2005.

5. A. L. A. G. de Oliveira and A. F. D. Carvalho, "A content-based movie


recommender system: a comparative study," Journal of Information and
Data Management, vol. 2, no. 2, pp. 135-144, 2011.

6. M. Jannach and L. Lerche, "Recommending movies based on genre


preferences: An evaluation of collaborative, content-based, and hybrid
recommendation approaches," ACM Transactions on Interactive Intelligent
Systems, vol. 5, no. 4, pp. 1-29, 2015.

7. H. Wang, N. Zhang, and L. Cai, "A content-based movie recommendation


system using cluster ensemble," Expert Systems with Applications, vol. 38,
no. 11, pp. 14184-14191, 2011.

73
8. Wang, H., Wang, N., Yeung, D.-Y., & Wong, W.-K. (2015). Collaborative
filtering and deep learning based recommendation system for cold start
items. Proceedings of the 2015 IEEE International Conference on Data
Mining Workshop, 1115-1120.

9. S. Han, Y. Lee, and S. Y. Shin, "Content-based movie recommendation


system using deep learning," Multimedia Tools and Applications, vol. 79,
no. 13, pp. 8825-8845, 2020.

10. Y. Liu, J. Wang, W. Zhang, and Y. Xue, "An enhanced content-based movie
recommendation algorithm using neural network," in Proceedings of the
2018 IEEE International Conference on Smart Computing
(SMARTCOMP), Taormina, Italy, pp. 1-6, 2018.

11. Slokom M, Hanjalic A and Larson M. (2021). Towards user-oriented


privacy for recommender system data: A personalisation-based approach to
gender obfuscation for user profiles. Information Processing &
Management. 10.1016/j.ipm.2021.102722. 58:6. (102722). Online
publication date: 1-Nov-2021.

74

You might also like