Data Driven Summary

The document outlines the structure and functionality of user interfaces and server components in software, alongside an overview of recommender systems, business process management, classification and clustering algorithms, and behavioral aspects of decision-making. It discusses various recommendation paradigms such as collaborative filtering and content-based methods, as well as biases in these systems. Additionally, it emphasizes the importance of improving decision quality through better information and debiasing techniques.

Uploaded by

d8dzk2dqb8
Copyright
© All Rights Reserved

ui.R

User Interface – controls the layout, the appearance and the widgets for user input, and displays the output.
E.g.: the title, page layout, text input, radio buttons, dropdown menus, graphics output, etc.

server.R

Server – the set of instructions that takes the input provided by the user, processes it and produces the
required output.

See the Notion notes on the DataCamp courses

Slides 1 – Recommender Systems

Recommender Systems

Recommender systems (RS) help to match users with items.

Recommender systems are software agents that identify the interests and preferences of
individual consumers and make recommendations. They have the potential to support and
improve the quality of the decisions consumers make when searching for and selecting products
online.

Paradigms of recommender systems (think about Netflix example)

Recommender systems reduce information overload by estimating relevance

Personalized recommendations

Collaborative Filtering: "Tell me what's popular among my peers"

This approach is used, for example, for books or movies, and uses the ratings given by users
to recommend items.

User-based nearest-neighbor collaborative filtering: the underlying idea is that customers who
had similar tastes in the past will have similar tastes in the future. It assumes that user
preferences remain stable and consistent over time.

A popular similarity measure in user-based CF: Pearson correlation

Collaborative Filtering-methods do not require any information about the items.
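
As a minimal sketch of the Pearson correlation used in user-based CF, the function below compares two users over the items both have rated. The user names, movie titles and ratings are made up for illustration, and computing the means over the co-rated items only is an assumption (some variants use each user's overall mean instead).

```python
from math import sqrt


def pearson_similarity(ratings_a, ratings_b):
    """Pearson correlation between two users, using only co-rated items."""
    common = [item for item in ratings_a if item in ratings_b]
    if len(common) < 2:
        return 0.0  # not enough overlap to measure a correlation
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    cov = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    var_a = sum((ratings_a[i] - mean_a) ** 2 for i in common)
    var_b = sum((ratings_b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0  # a user who rates everything the same has no variance
    return cov / sqrt(var_a * var_b)


# Hypothetical ratings on a 1-5 scale.
alice = {"Matrix": 5, "Titanic": 1, "Alien": 4}
bob = {"Matrix": 4, "Titanic": 2, "Alien": 5}
carol = {"Matrix": 1, "Titanic": 5, "Alien": 2}

sim_alice_bob = pearson_similarity(alice, bob)      # positive: similar tastes
sim_alice_carol = pearson_similarity(alice, carol)  # negative: opposite tastes
```

Note that only the ratings are used: nothing about the movies themselves (genre, actors, ...) enters the computation, which is why CF needs no item information.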

Content-based: "Show me more of what I've liked"

This approach needs some information about the available items, such as the genre,
and about the preferences of the user, in order to recommend items that are "similar" to the
user's preferences.

Compute the similarity of an unseen item with the user profile based on the keyword
overlap.
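
A small sketch of the keyword-overlap idea, assuming the Dice coefficient as the overlap measure (the notes do not say which measure is used; Jaccard would work the same way). The profile keywords and item names are hypothetical.

```python
def keyword_overlap(item_keywords, profile_keywords):
    """Dice coefficient 2*|A ∩ B| / (|A| + |B|): 0 = no overlap, 1 = identical sets."""
    a, b = set(item_keywords), set(profile_keywords)
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical user profile built from keywords of items the user liked.
profile = {"space", "robots", "future", "aliens"}

# Hypothetical unseen items with their keywords.
unseen_items = {
    "Item A": {"space", "aliens", "war"},
    "Item B": {"romance", "paris"},
}

# Rank unseen items by how much their keywords overlap with the profile.
ranked = sorted(
    unseen_items,
    key=lambda name: keyword_overlap(unseen_items[name], profile),
    reverse=True,
)
```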

Limitations:

• Keywords alone may not be sufficient to judge the quality/relevance of a document or web page
• It should use other sources to learn the user preferences
• Algorithms tend to propose "more of the same", such as news items that are too similar

Case-based Recommendation

• Content-based Recommendation: unstructured or semi-structured representations, using
keyword-based content analysis techniques
• Case-based Recommendation: structured representations, using attribute-value
representation techniques

It operates by remembering one or a small set of concrete cases and basing
decisions on comparisons between the new and old situations (e.g. medical diagnosis, legal
interpretation).

Knowledge-based: "Tell me what fits based on my needs"

Based on an explicitly defined set of recommendation rules

Based on different types of similarity measures

Hybrid: combinations of various inputs and/or composition of different mechanisms


Biases in recommender systems

Decisions made by recommender systems are affected by various biases, originating from data,
algorithms, presentation or user.

Mitigating harmful biases: data rebalancing, regularization, reranking items in the
recommendation list, filtering items.

What are good metrics for evaluating a RecSys (recommender system)?

Precision: for example, the proportion of recommended movies that are actually good

Recall: the proportion of all good movies that are recommended

These metrics (precision and recall) typically trade off against each other: recommending more
items tends to raise recall but lower precision.
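
A minimal sketch of how precision and recall can be computed for one recommendation list. The movie identifiers and the set of "good" movies are invented for illustration.

```python
def precision_recall(recommended, relevant):
    """Return (precision, recall) for a recommendation list against the relevant items."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)  # recommended items that are actually good
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ground truth: the movies this user would actually consider good.
good_movies = {"A", "B", "C", "D"}

short_list = ["A", "B"]                      # few, safe recommendations
long_list = ["A", "B", "C", "X", "Y", "Z"]   # more recommendations, more misses

short_scores = precision_recall(short_list, good_movies)  # (1.0, 0.5)
long_scores = precision_recall(long_list, good_movies)    # (0.5, 0.75)
```

In this example the longer list finds more of the good movies (higher recall) but also contains more bad recommendations (lower precision).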

Slides 2 - Business Process Mining / Management (BPM)


BPM is a body of principles, methods and tools to design, analyze, execute and monitor
business processes.

Business process: “A set of logically related tasks performed to achieve a defined business outcome.” (Davenport, 1990)

A business process is a closed, repeating set of interconnected tasks performed by
(human or machine) agents in a temporal or causal order, required to fulfill a business function
whose aim is to create value.

Example exercise 1:

A typical order-to-cash process is triggered by the receipt of a purchase order from a customer.
The purchase order has to be checked against the stock regarding the availability of the item(s)
requested. Depending on stock availability the purchase order may be confirmed or rejected. If
the purchase order is confirmed, an invoice is emitted and the goods requested are shipped.
The process completes when the order has been archived or when the order has been rejected.

Decision points (XOR-split) and merge points of alternative flows (XOR-join)
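
The control flow of this order-to-cash exercise can be sketched in code, with an if/else playing the role of the XOR-split and the code after it playing the role of the XOR-join. The activity labels, the stock dictionary and the fixed order "invoice before shipping" are assumptions made for illustration; the exercise itself does not fix them.

```python
def order_to_cash(order, stock):
    """Return the trace (sequence of activities) executed for one purchase order."""
    trace = ["receive purchase order", "check stock availability"]
    in_stock = all(stock.get(item, 0) >= qty for item, qty in order.items())

    # XOR-split: exactly one of the two alternative branches is taken.
    if in_stock:
        trace += ["confirm order", "emit invoice", "ship goods", "archive order"]
    else:
        trace += ["reject order"]

    # XOR-join: the alternative branches merge here and the process completes.
    return trace


# Hypothetical stock and orders.
stock = {"pen": 5, "notebook": 0}
confirmed_trace = order_to_cash({"pen": 2}, stock)
rejected_trace = order_to_cash({"pen": 2, "notebook": 1}, stock)
```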

New Solution
Example exercise 2: Order distribution process

A company has two warehouses, one in Amsterdam, the other in Hamburg, that store different
products. When an order is received, it is distributed across these warehouses: if some of the
relevant products are maintained in Amsterdam, a sub-order is sent there; likewise, if some
relevant products are maintained in Hamburg, a sub-order is sent there. Afterwards, the order
is registered and the process completes.

Example exercise 3:
Process Maps

A process map is a graph in which each activity is represented by a node, and an
arc from activity A to activity B means that A is directly followed by B in at
least one trace in the log.

Arcs in a process map can be annotated with:

• Absolute frequency: how many times does B directly follow A?
• Relative frequency: in what percentage of the times A is executed is it
directly followed by B?
• Time: what is the average time between the occurrence of A and the occurrence of B?
➔ Celonis: exercises!
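
A minimal sketch of how the arcs of a process map and their absolute and relative frequencies could be derived from an event log. The log format (one list of activity names per trace) and the activity names are assumptions; the time annotation is left out because this toy log has no timestamps.

```python
from collections import Counter


def process_map(log):
    """Count directly-follows arcs (A, B) and activity executions over all traces."""
    arcs = Counter()
    activity_counts = Counter()
    for trace in log:
        activity_counts.update(trace)
        arcs.update(zip(trace, trace[1:]))  # consecutive pairs: A directly followed by B
    return arcs, activity_counts


def relative_frequency(arcs, activity_counts, a, b):
    """Share of the executions of A that are directly followed by B."""
    if activity_counts[a] == 0:
        return 0.0
    return arcs[(a, b)] / activity_counts[a]


# Hypothetical event log with three traces.
log = [
    ["receive", "check", "confirm", "ship"],
    ["receive", "check", "reject"],
    ["receive", "check", "confirm", "ship"],
]

arcs, counts = process_map(log)
absolute = arcs[("check", "confirm")]                               # 2
relative = relative_frequency(arcs, counts, "check", "confirm")     # 2 out of 3
```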

Slides 3 - Classification and Clustering Algorithms


An algorithm is a sequence of instructions for solving a problem, i.e. for obtaining a required
output for any legitimate input in a finite time interval.

Classification

Classification is the problem of identifying which (existing) class a new observation belongs
to.

a. Random Forest (decision tree)

Multiple decision trees are constructed during the training phase. The final decision is the one
made by the majority of the trees.

Random Forest Strengths

• runs efficiently on large data sets
• short training time
• small risk of overfitting
• for large data, it produces highly accurate predictions
• can deal with small data or missing data

Random Forest Weaknesses

• fast training, but slow in making predictions (large number of decision trees)
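
The majority vote of a Random Forest can be sketched as follows. This only illustrates the voting step: the three "trees" are hand-written stand-in rules with made-up thresholds, not trees trained on bootstrapped data as in a real Random Forest.

```python
from collections import Counter


def forest_predict(trees, observation):
    """Ask every tree for a class and return the class chosen by most trees."""
    votes = [tree(observation) for tree in trees]
    winner, _count = Counter(votes).most_common(1)[0]
    return winner


# Hypothetical stand-ins for trained decision trees.
def tree_income(x):
    return "approve" if x["income"] > 3000 else "reject"


def tree_debt(x):
    return "approve" if x["debt"] < 10000 else "reject"


def tree_age(x):
    return "approve" if x["age"] >= 25 else "reject"


trees = [tree_income, tree_debt, tree_age]
applicant = {"income": 3500, "debt": 12000, "age": 30}

decision = forest_predict(trees, applicant)  # two of three trees vote "approve"
```

The sketch also shows why prediction can be slow: every tree in the forest has to be evaluated before the vote can be counted.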

b. k-nearest neighbor (KNN)

A new object is classified by a majority vote of its neighbours' classes.

The object is assigned to the most common class among its K nearest neighbours.

The distance is measured by a distance function.


KNN example – slides 16 to 19

If K is too small, the classifier is sensitive to noise points.

If K is too large, the neighbourhood may include points that mostly belong to other classes.

The rule of thumb is K < sqrt(n), where n is the number of examples.
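
A minimal KNN sketch, assuming Euclidean distance as the distance function (the notes do not fix one) and a made-up two-class training set. K is chosen below sqrt(n), following the rule of thumb above.

```python
from collections import Counter
from math import dist, sqrt


def knn_classify(training, new_point, k):
    """Assign new_point to the most common class among its k nearest neighbours."""
    # Sort the labelled examples by Euclidean distance to the new point.
    neighbours = sorted(training, key=lambda example: dist(example[0], new_point))[:k]
    votes = Counter(label for _point, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical training data: (point, class label).
training = [
    ((1.0, 1.0), "red"), ((1.5, 2.0), "red"), ((2.0, 1.5), "red"),
    ((2.5, 2.5), "red"), ((1.0, 2.5), "red"),
    ((6.0, 6.0), "blue"), ((6.5, 7.0), "blue"), ((7.0, 6.5), "blue"),
    ((5.5, 6.5), "blue"), ((7.0, 7.5), "blue"),
]

k = 3  # n = 10 examples, so sqrt(n) is about 3.16 and K = 3 respects K < sqrt(n)

label_near_red = knn_classify(training, (2.0, 2.0), k)
label_near_blue = knn_classify(training, (6.0, 7.0), k)
```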


Clustering

The organization of unlabeled data into similarity groups called clusters. A cluster is a
collection of data items which are “similar” to each other and “dissimilar” to data items in
other clusters.

a. K-Means (partitional clustering algorithm)

Example on slides 30 to 35 -> very important
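
The notes only point to the slides for the K-Means example, so the following is a sketch of the standard K-Means iteration (assign each point to its nearest centroid, then move each centroid to the mean of its points) rather than a reproduction of the slide example. The data points and the initial centroids are made up, and passing the initial centroids in explicitly is a simplification to keep the run deterministic.

```python
from math import dist


def k_means(points, centroids, max_iterations=100):
    """Standard K-Means loop; returns the final centroids and clusters."""
    clusters = []
    for _ in range(max_iterations):
        # Assignment step: put every point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)

        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: the centroids no longer move
        centroids = new_centroids
    return centroids, clusters


# Hypothetical 2-D data with two visible groups, and K = 2 initial centroids.
points = [(1, 1), (1.5, 2), (1, 1.5), (8, 8), (9, 8.5), (8.5, 9)]
final_centroids, final_clusters = k_means(points, [(1, 1), (8, 8)])
```

The result depends on the initial centroids, which is why practical implementations usually try several random initializations.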


Slides 4 – Behavioral Aspects

Why decision support?

The primary role of decision support is to improve the quality of decision making. Better decisions can be achieved
by:

• Using more/better information in rational decisions, i.e. using evidence and facts
• Reducing irrational/biased decisions

Debiasing

• Identify the existence and nature of the potential bias. This includes understanding the
environment of the bias and the cognitive triggers of the bias;
• Consider alternative means for reducing or eliminating the bias;
• Monitor and evaluate the effectiveness of the debiasing technique chosen. The
possibility of negative side effects should be a particular concern.

Analysis Cycle

• Make the decision-maker articulate what they know about the decision.

• Encourage decision-makers to search for discrepant information or information that
challenges the adopted or preferred position.

• Offer ways to decompose the problem into more understandable subproblems or themes.

• Consider a wider set of decision situations or scenarios. Then, consider the nature of the
current situation in light of the expanded conception.

• Propose alternative formulations of the problem. For example, reformulate a production
problem as a marketing problem.

Analysis Cycle - Debiasing

• Identify the existence and nature of the potential bias.
• Identify the likely impact and the magnitude of the bias.
• Consider alternative means for reducing or eliminating the bias.
• Reassure the user that the presence of biases is not a criticism of their cognitive
abilities.
