Data Driven Summary

The document outlines the structure and functionality of user interfaces and server components in software, alongside an overview of recommender systems, business process management, classification and clustering algorithms, and behavioral aspects of decision-making. It discusses various recommendation paradigms such as collaborative filtering and content-based methods, as well as biases in these systems. Additionally, it emphasizes the importance of improving decision quality through better information and debiasing techniques.

Uploaded by

d8dzk2dqb8

ui.

User Interface – Controls the layout, appearance and widgets for user input, and displays the output.
Ex: the title, page layout, text input, radio buttons, dropdown menus, graphics output, etc.

server.R

Server – the set of instructions that takes the input provided by the user, processes it and
produces the required output.

See the Notion notes on the DataCamp courses.

Slides 1 – Recommender Systems

Recommender Systems

Recommender systems (RS) help to match users with items.

Recommender systems are software agents that identify the interests and preferences of
individual consumers and make recommendations. They have the potential to support and
improve the quality of the decisions consumers make when searching for and selecting products
online.

Paradigms of recommender systems (think about Netflix example)

Recommender systems reduce information overload by estimating relevance

Personalized recommendations

Collaborative Filtering: "Tell me what's popular among my peers"

This approach is used, for example, for books or movies and uses the ratings from the users
to recommend items.

User-based nearest-neighbor collaborative filtering - The idea is that customers who had
similar tastes in the past will have similar tastes in the future. It assumes that user
preferences remain stable and consistent over time.

A popular similarity measure in user-based CF: Pearson correlation

Collaborative Filtering-methods do not require any information about the items.
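
As a rough illustration of the Pearson correlation between two users, here is a minimal sketch. The rating dictionaries, the user names and the choice to compute the means only over co-rated items are assumptions made for this example, not something the slides prescribe.

```python
from math import sqrt

def pearson_similarity(ratings_a, ratings_b):
    """Pearson correlation over the items that both users have rated."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0  # assumption: too little overlap is treated as "no similarity"
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    cov = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    var_a = sum((ratings_a[i] - mean_a) ** 2 for i in common)
    var_b = sum((ratings_b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0  # a user with constant ratings has no defined correlation
    return cov / sqrt(var_a * var_b)

# Hypothetical ratings on a 1-5 scale, keyed by item id.
alice = {"m1": 5, "m2": 3, "m3": 4}
bob = {"m1": 4, "m2": 2, "m3": 3, "m4": 5}
carol = {"m1": 1, "m2": 5, "m3": 2}

sim_bob = pearson_similarity(alice, bob)      # same rating pattern as alice
sim_carol = pearson_similarity(alice, carol)  # opposite rating pattern
```

Only ratings are used here, with no item attributes. Bob would be picked as a peer for Alice, and his rating of "m4" could then feed a recommendation for her.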

Content-based: "Show me more of what I've liked"

This approach needs some information about the available items, such as the genre, and about
the preferences of the user in order to recommend items that are "similar" to the user
preferences.

Compute the similarity of an unseen item with the user profile based on the keyword
overlap.
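
One possible way to score keyword overlap is the Dice coefficient, sketched below. The notes do not name a specific measure, so the coefficient, the keyword sets and the item titles are all assumptions chosen for illustration.

```python
def dice_overlap(profile_keywords, item_keywords):
    """Dice coefficient: 2 * |shared keywords| / (|profile| + |item|)."""
    profile, item = set(profile_keywords), set(item_keywords)
    if not profile or not item:
        return 0.0
    return 2 * len(profile & item) / (len(profile) + len(item))

# Hypothetical user profile built from keywords of items the user liked.
user_profile = {"space", "robots", "future", "adventure"}

# Hypothetical unseen items described by keywords.
unseen_items = {
    "Book A": {"space", "adventure", "aliens"},
    "Book B": {"romance", "history"},
}

scores = {title: dice_overlap(user_profile, kw) for title, kw in unseen_items.items()}
ranked = sorted(scores, key=scores.get, reverse=True)  # most similar item first
```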

Limitations:

• Keywords alone may not be sufficient to judge quality/relevance of a document or web page
• It should use other sources to learn the user preferences
• Algorithms tend to propose "more of the same", too similar news items

Case-based Recommendation

• Content-based Recommendation: items are represented in an unstructured or semi-structured
manner, using keyword-based content analysis techniques
• Case-based Recommendation: items are represented in a structured manner, using
attribute-value representation techniques

Operates through a process of remembering one or a small set of concrete cases and basing
decisions on comparisons between the new and old situations (medical diagnosis, legal
interpretation)

Knowledge-based: "Tell me what fits based on my needs"

based on explicitly defined set of recommendation rules

based on different types of similarity measures

Hybrid: combinations of various inputs and/or composition of different mechanisms

Biases in recommender systems

Decisions made by recommender systems are affected by various biases, originating from the
data, the algorithms, the presentation or the user.

Mitigating harmful biases: data rebalancing, regularization, reranking items in the
recommendation list, filtering items.

What are good metrics for evaluating a RecSys (recommender system)?

Precision: for example, the proportion of recommended movies that are actually good

Recall: the proportion of all good movies that are recommended

These metrics (Precision and Recall) usually trade off against each other: improving one tends
to lower the other
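
The two metrics, and the tension between them, can be sketched on a toy example. The movie ids and the two recommendation lists are made up for illustration.

```python
def precision_recall(recommended, relevant):
    """Return (precision, recall) for a recommendation list."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)  # recommended items that are actually good
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

good_movies = {"m1", "m2", "m3", "m4"}            # hypothetical ground truth
short_list = ["m1", "m2"]                          # cautious recommender
long_list = ["m1", "m2", "m3", "m5", "m6", "m7"]   # generous recommender

p_short, r_short = precision_recall(short_list, good_movies)  # 1.0, 0.5
p_long, r_long = precision_recall(long_list, good_movies)     # 0.5, 0.75
```

In this example the longer list finds more of the good movies (higher recall) but also includes more bad ones (lower precision).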

Slides 2 - Business Process Mining / Management (BPM)

BPM is a body of principles, methods and tools to design, analyze, execute and monitor
business processes.

"A set of logically related tasks performed to achieve a defined business outcome." Davenport
(1990)

A business process is a closed, repeating set of interconnected tasks performed by
(human or machine) agents in a temporal or causal order required to fulfill a business function,
the aim of which is to create value.

Example exercise 1:

A typical order-to-cash process is triggered by the receipt of a purchase order from a customer.
The purchase order has to be checked against the stock regarding the availability of the item(s)
requested. Depending on stock availability the purchase order may be confirmed or rejected. If
the purchase order is confirmed, an invoice is emitted and the goods requested are shipped.
The process completes when the order is archived or when the order is rejected.

Decision points (XOR-split) and merge points for alternative flows (XOR-join)
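
The control flow of exercise 1 can be mimicked in code to show what the XOR-split and XOR-join do. This is only a sketch: the activity names are invented, and running the invoice step before the shipping step is an assumption, since the exercise text does not fix their order.

```python
def order_to_cash(requested, stock):
    """Return the list of activities executed for one purchase order."""
    trace = ["Receive purchase order", "Check stock availability"]
    in_stock = all(stock.get(item, 0) >= qty for item, qty in requested.items())

    # XOR-split: exactly one of the two alternative branches is taken.
    if in_stock:
        trace += ["Confirm order", "Emit invoice", "Ship goods", "Archive order"]
    else:
        trace.append("Reject order")

    # XOR-join: the alternative branches merge here before the process ends.
    trace.append("End")
    return trace

confirmed = order_to_cash({"pen": 2}, {"pen": 5})
rejected = order_to_cash({"pen": 2}, {"pen": 1})
```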

New Solution
Example exercise 2: Order distribution process

A company has two warehouses, one in Amsterdam, the other in Hamburg, that store different
products. When an order is received, it is distributed across these warehouses: if some of the
relevant products are maintained in Amsterdam, a sub-order is sent there; likewise, if some
relevant products are maintained in Hamburg, a sub-order is sent there. Afterwards, the order
is registered and the process completes.
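
Unlike exercise 1, here one or both branches can be taken for the same order, which is often modelled with an inclusive (OR) split; that term does not appear in the notes. The sketch below uses a hypothetical product catalogue and invented activity names.

```python
# Hypothetical catalogue: which warehouse stores which product.
WAREHOUSE_OF = {
    "bike": "Amsterdam",
    "helmet": "Hamburg",
    "lock": "Amsterdam",
}

def distribute_order(products):
    """Split an order into sub-orders per warehouse and return the activity trace."""
    sub_orders = {}
    for product in products:
        sub_orders.setdefault(WAREHOUSE_OF[product], []).append(product)

    trace = ["Receive order"]
    # Inclusive choice: a branch fires only if that warehouse holds relevant products.
    for warehouse in ("Amsterdam", "Hamburg"):
        if warehouse in sub_orders:
            trace.append(f"Send sub-order to {warehouse}")
    # Merge: continue once every branch that was activated has been handled.
    trace += ["Register order", "End"]
    return sub_orders, trace

both, trace_both = distribute_order(["bike", "helmet"])
only_ams, trace_ams = distribute_order(["bike", "lock"])
```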

Example exercise 3:
Process Maps

A process map is a graph in which each activity is represented by a node, and an arc from
activity A to activity B means that A is directly followed by B in at least one trace in the
log.

The arcs in a process map can be annotated with:

• Absolute frequency: how many times does B directly follow A?
• Relative frequency: in what percentage of the times that A is executed is it directly
followed by B?
• Time: what is the average time between the occurrence of A and the occurrence of B?
➔ Celonis – exercises!
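
The arcs of a process map and their frequency annotations can be derived from an event log along these lines. The tiny log below is invented, and the time annotation is left out because this toy log carries no timestamps.

```python
from collections import Counter

def directly_follows(traces):
    """Count how often each activity is directly followed by another one."""
    absolute = Counter()     # (A, B) -> number of times B directly follows A
    occurrences = Counter()  # A -> number of times A is executed
    for trace in traces:
        occurrences.update(trace)
        absolute.update(zip(trace, trace[1:]))  # consecutive activity pairs
    relative = {(a, b): n / occurrences[a] for (a, b), n in absolute.items()}
    return absolute, relative

# Hypothetical log: each trace is the ordered list of activities of one case.
log = [
    ["Check stock", "Confirm order", "Ship goods"],
    ["Check stock", "Reject order"],
    ["Check stock", "Confirm order", "Ship goods"],
]

absolute, relative = directly_follows(log)
```

Each key of `absolute` corresponds to one arc in the map. For example, the arc from "Check stock" to "Confirm order" has absolute frequency 2 and relative frequency 2/3.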

Slides 3 - Classification and Clustering Algorithms

An algorithm is a sequence of instructions for solving a problem, i.e. for obtaining a required
output for any legitimate input in a finite time interval.

Classification

Classification is the problem of identifying to which (existing) class a new observation
belongs.

a. Random Forest (decision tree)

Multiple decision trees are constructed during the training phase. The final decision is the
majority vote of the trees.

Random Forest Strengths

• runs efficiently on large data sets
• short training time
• small risk of overfitting
• for large data, it produces highly accurate predictions
• can deal with small data or missing data

Random Forest Weaknesses

• fast training, but slow in making predictions (large number of decision trees)
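
The voting step of a random forest can be sketched as follows. This shows only how the trees' answers are combined, not how the trees are trained. The three "trees" are hand-written stand-in rules and the feature names are invented.

```python
from collections import Counter

def forest_predict(trees, x):
    """Ask every tree for a class and return the majority vote."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]

# Hypothetical stand-ins for trained decision trees: each maps features to a class.
trees = [
    lambda x: "spam" if x["links"] > 3 else "ham",
    lambda x: "spam" if x["caps_ratio"] > 0.5 else "ham",
    lambda x: "spam" if x["links"] > 1 and x["caps_ratio"] > 0.2 else "ham",
]

prediction = forest_predict(trees, {"links": 2, "caps_ratio": 0.6})
```

Every prediction has to consult all trees, which is why prediction time grows with the size of the forest.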

b. k-nearest neighbor (KNN)

A new object is classified by a majority vote of its neighbours' classes.

The object is assigned to the most common class among its K nearest neighbours.

The distance is measured by a distance function (for example, Euclidean distance).

KNN example – slides 16 to 19

If K is too small, the classifier is sensitive to noise points.

If K is too large, the neighbourhood may include a majority of points from other classes.

The rule of thumb is K < sqrt(n), where n is the number of examples.
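
A minimal KNN classifier, assuming Euclidean distance and a small made-up training set of 2-D points, could look like this. The value K = 3 respects the rule of thumb for n = 10 examples.

```python
from collections import Counter
from math import dist, sqrt

def knn_classify(training, query, k):
    """Assign the query point to the most common class among its k nearest neighbours."""
    neighbours = sorted(training, key=lambda example: dist(example[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical labelled examples: (point, class).
training = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.5), "A"), ((2.0, 2.5), "A"), ((1.0, 2.0), "A"),
    ((6.0, 6.0), "B"), ((6.5, 7.0), "B"), ((7.0, 6.5), "B"), ((6.0, 7.0), "B"), ((7.0, 7.5), "B"),
]

k = 3
rule_of_thumb_ok = k < sqrt(len(training))  # K < sqrt(n)

label_near_a = knn_classify(training, (2.0, 2.0), k)
label_near_b = knn_classify(training, (6.5, 6.5), k)
```

Using an odd K with two classes avoids tied votes.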

Clustering

The organization of unlabeled data into similarity groups called clusters. A cluster is a
collection of data items that are "similar" to one another and "dissimilar" to data items in
other clusters.

a. K-Means (partitional clustering algorithm)

Example on slides 30 to 35 -> Very important
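
Since the slide example is not reproduced in these notes, here is a small sketch of the usual K-Means loop (assign each point to the nearest centroid, then recompute the centroids). Seeding the centroids with the first k points, the fixed iteration count and the sample points are assumptions made to keep the example deterministic.

```python
from math import dist

def k_means(points, k, iterations=10):
    """Partition points into k clusters by alternating assignment and update steps."""
    centroids = list(points[:k])  # assumption: the first k points seed the centroids
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical unlabeled 2-D data with two visible groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = k_means(points, k=2)
```

K-Means is sensitive to the initial centroids, so practical implementations usually pick them randomly and run the algorithm several times.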

Slides 4 – Behavioral Aspects

Why decision support?

The primary role is to improve the quality of decision making. Better decisions can be achieved
by:

• Using more/better information in rational decisions, i.e. using evidence and facts
• Reducing irrational/biased decisions

Debiasing

• Identify the existence and nature of the potential bias. This includes understanding the
environment of the bias and the cognitive triggers of the bias;
• Consider alternative means for reducing or eliminating the bias;
• Monitor and evaluate the effectiveness of the debiasing technique chosen. The
possibility of negative side effects should be a particular concern.

Analysis Cycle

• Make the decision-maker articulate what they know about the decision.
• Encourage decision-makers to search for discrepant information or information that
challenges the adopted or preferred position.
• Offer ways to decompose the problem into more understandable subproblems or themes.
• Consider a wider set of decision situations or scenarios. Then, consider the nature of the
current situation in light of the expanded conception.
• Propose alternative formulations of the problem. For example, reformulate a production
problem as a marketing problem.

Analysis Cycle - Debiasing

• Identify the existence and nature of the potential bias.
• Identify the likely impact and the magnitude of the bias.
• Consider alternative means for reducing or eliminating the bias.
• Reassure the user that the presence of biases is not a criticism of their cognitive
abilities.
