
Dempster-Shafer Reasoning with Uncertainty

Theory and Applications

Huỳnh Văn Nam

Japan Advanced Institute of Science and Technology


1-1 Asahidai, Nomi, Ishikawa, 923-1292 Japan
Email: [email protected]

Ho Chi Minh City University of Technology (HCMUT), 22/02/2011

Part 2 – Combination and Applications
Evidence Combination and Conflict
Combination Rules: A Review
Conflict Revisited
Difference Between two BoEs
Discounting and Combination Solution
Applications to Ensemble Learning
Application to Ensemble Classification
Application to Ensemble Clustering
An Illustrative Application
Word Sense Disambiguation
Multi-Representation of Context
Discounting-and-Combination Method for WSD
Experimental Results

Combination of Evidence in D-S Theory

∎ Criticisms of the counterintuitive results produced by applying Dempster's combination rule to conflicting beliefs emerged soon after its inception.
∎ In Dempster's rule of combination, the combined mass assigned to the empty set, considered as the conflict, is redistributed proportionally among the other masses.
∎ Zadeh (1984) presented an example in which Dempster's rule of combination produces unsatisfactory results.
∎ Since then, many alternatives have been proposed in the literature.
∎ The study of combination rules in D-S theory when evidence is in conflict has recently re-emerged as an interesting topic, especially in data/information fusion applications.

Evidence Combination and Conflict

Dempster’s Rule and Conflict

∎ Let m1 and m2 be two mass functions defined on frame Θ.
∎ Denote by m⊕ = (m1 ⊕ m2) the combined mass function given by Dempster's rule of combination:

    m⊕(A) ≜ (1 / (1 − κ)) ∑_{B∩C=A} m1(B) × m2(C),   ∀A ⊆ Θ, A ≠ ∅

  where
    κ = ∑_{B∩C=∅} m1(B) × m2(C)

∎ κ can be interpreted as the combined mass assigned to the empty set before normalization. It is therefore also denoted by m⊕(∅) and conventionally considered as the degree of conflict.
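For concreteness, here is a minimal Python sketch of Dempster's rule, assuming mass functions are stored as dicts mapping frozenset focal elements to masses (all names here are illustrative, not from the slides):

```python
from itertools import product

def combine_dempster(m1, m2):
    """Dempster's rule: conjunctive combination followed by normalization.

    m1, m2: dicts mapping frozenset focal elements to masses summing to 1.
    """
    combined = {}
    kappa = 0.0  # mass falling on the empty set, i.e. the degree of conflict
    for (B, mB), (C, mC) in product(m1.items(), m2.items()):
        A = B & C
        if A:
            combined[A] = combined.get(A, 0.0) + mB * mC
        else:
            kappa += mB * mC
    if kappa >= 1.0:
        raise ValueError("total conflict: combination undefined")
    return {A: v / (1.0 - kappa) for A, v in combined.items()}
```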



Zadeh's example
∎ One doctor believes a patient has either meningitis, with probability 0.99, or a brain tumor, with probability only 0.01.
∎ A second doctor believes the patient suffers from concussion, with probability 0.99, and also allows a brain tumor with probability only 0.01.
Combining these two pieces of evidence with Dempster's rule yields

    m⊕(brain tumor) = Bel⊕(brain tumor) = 1

✗ This result implies complete support for the diagnosis of a brain tumor, which both doctors considered very unlikely.
⇒ Many alternative rules of combination have been developed.
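Running the combine_dempster sketch above on Zadeh's numbers reproduces the paradox:

```python
m1 = {frozenset({"meningitis"}): 0.99, frozenset({"tumor"}): 0.01}
m2 = {frozenset({"concussion"}): 0.99, frozenset({"tumor"}): 0.01}
m = combine_dempster(m1, m2)
# kappa = 0.9999, so after normalization the tiny shared mass becomes certainty:
print(m[frozenset({"tumor"})])  # 1.0 (up to floating-point rounding)
```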


Smets’s Rule of Combination

The transferable belief model [Smets & Kennes, Artif. Intell. 66 (1994)]:
∎ Justifies the use of belief functions to model subjective, personal beliefs.
∎ In general, in the definition of a mass function, the condition m(∅) = 0
is not required.

Smets's rule of combination (2005)
Let m1 and m2 be two mass functions defined on frame Θ. The so-called conjunctive rule of combination, denoted m∩ = m1 ∩ m2, is defined as

    m∩(A) ≜ ∑_{B∩C=A} m1(B) × m2(C),   ∀A ⊆ Θ


✓ Masses are not renormalized.
✓ The conflict is stored in the mass given to the empty set ⇒ the open world assumption, i.e. the "actual world" (the true value of variable X) might not be in Θ.
Zadeh's example revisited:
∎ m∩(brain tumor) = 0.0001
∎ m∩(meningitis) = 0
∎ m∩(concussion) = 0
∎ m∩(∅) = 0.9999
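A sketch of the unnormalized conjunctive rule, in the same dict representation as before (reusing the itertools.product import from the earlier sketch; the empty frozenset carries the conflict):

```python
def combine_conjunctive(m1, m2):
    """Smets's conjunctive rule: like Dempster's rule but without
    normalization; the conflicting mass stays on the empty set."""
    combined = {}
    for (B, mB), (C, mC) in product(m1.items(), m2.items()):
        A = B & C  # may be frozenset(), the empty set
        combined[A] = combined.get(A, 0.0) + mB * mC
    return combined
```

On Zadeh's example this yields combined[frozenset()] ≈ 0.9999, matching the values above.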


Yager’s Rule of Combination

∎ Yager’s solution (1987): the conflict is transferred to the universe Θ.


∎ Given two mass functions m1 and m2 on frame Θ, the mass function resulting from the application of Yager's combination rule is given by

    mY⊕(A) = ∑_{B∩C=A} m1(B) × m2(C),   ∀A ⊆ Θ, A ≠ ∅, A ≠ Θ
    mY⊕(∅) = 0
    mY⊕(Θ) = m1(Θ) × m2(Θ) + m⊕(∅)

  where
    m⊕(∅) = ∑_{B∩C=∅} m1(B) × m2(C)
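A corresponding sketch, reusing the dict representation (theta is the full frame):

```python
def combine_yager(m1, m2, theta):
    """Yager's rule: the conflicting mass is transferred to the whole frame."""
    theta = frozenset(theta)
    combined = {}
    conflict = 0.0
    for (B, mB), (C, mC) in product(m1.items(), m2.items()):
        A = B & C
        if A:
            combined[A] = combined.get(A, 0.0) + mB * mC
        else:
            conflict += mB * mC
    combined[theta] = combined.get(theta, 0.0) + conflict
    return combined
```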


✓ Yager's combination rule is commutative but not associative!
Zadeh's example again:
∎ mY⊕(brain tumor) = 0.0001
∎ mY⊕(∅) = 0
∎ mY⊕(Θ) = 0.9999
⇒ The conflict between the sources of information being combined is treated as ignorance.


Dubois and Prade’s Rule of Combination


∎ Let m1 and m2 be two mass functions on frame Θ. The disjunctive rule of combination, denoted m⊎ = (m1 ⊎ m2), is defined by

    m⊎(A) = ∑_{B∪C=A} m1(B) × m2(C),   ∀A ⊆ Θ

∎ Interpretation: only one of the two sources of evidence represented by m1 and m2 is fully reliable, but we do not know which one.
∎ Dubois and Prade (1988) also proposed a "hybrid" rule, intermediate between the conjunctive and disjunctive sums:

    mH(A) = ∑_{B∩C=A} m1(B) × m2(C) + ∑_{B∩C=∅, B∪C=A} m1(B) × m2(C)

  for any A ⊆ Θ with A ≠ ∅, and mH(∅) = 0.
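A sketch of the hybrid rule in the same representation: non-conflicting products go to the intersection, conflicting ones to the union:

```python
def combine_dubois_prade(m1, m2):
    """Dubois & Prade's hybrid rule: transfer each conflicting product
    m1(B)m2(C) (with B and C disjoint) to the union B | C."""
    combined = {}
    for (B, mB), (C, mC) in product(m1.items(), m2.items()):
        A = (B & C) or (B | C)  # intersection if non-empty, else union
        combined[A] = combined.get(A, 0.0) + mB * mC
    return combined
```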


∎ Dubois and Prade's "hybrid" rule is not associative, but it usually provides a good summary of partially conflicting items of evidence!
Solution to Zadeh's example:
∎ mH(brain tumor) = 0.0001
∎ mH({meningitis, brain tumor}) = 0.0099
∎ mH({concussion, brain tumor}) = 0.0099
∎ mH({meningitis, concussion}) = 0.9801
∎ mH(∅) = 0
∎ mH(Θ) = 0
⇒ A solution more flexible than Yager's for the transfer of the conflicting masses.


Remarks

∎ Many other suggestions have been made, creating a "jungle" of combination rules.
∎ A good survey of the combination rules and their applications:
[Sentz & Ferson, Combination of Evidence in Dempster-Shafer Theory, Sandia National Laboratories SAND 2002-0835, 2002]

Observation
∎ Most of these works began by analyzing counterintuitive examples arising from existing combination rules, and then proposed new rules giving more reasonable results in those particular situations.
∎ This approach may only yield solutions that are good 'locally', and consequently it is difficult to justify them theoretically.


m⊕(∅) as Conflict?

Liu [Artif. Intell. 170 (2006)] argued that the value m⊕(∅) cannot be used as a measure of conflict between two bodies of evidence; it only represents the mass of uncommitted belief resulting from the combination.
Example – Two identical mass functions
Consider two identical mass functions m1 = m2 on Θ = {θ1, . . . , θ5}:
∎ m1(θi) = m2(θi) = 0.2 for i = 1, . . . , 5
∎ Then m⊕(∅) = 0.8, which is quite high even though there is a total absence of conflict, as the two mass functions are identical.

Remark:
More generally, we always get m⊕(∅) > 0 for two identical mass functions whenever their focal elements form a partition of the frame!


Liu’s Criteria for Conflict


Two mass functions m1 and m2 are said to be in conflict if and only if

    m⊕(∅) > ε   and   difBetP(m1, m2) > ε

where ε ∈ [0, 1] is a threshold of conflict tolerance and difBetP(m1, m2) is defined by

    difBetP(m1, m2) = max_{A⊆Θ} ∣BetPm1(A) − BetPm2(A)∣

and is called the distance between betting commitments of the two mass functions.
+ A comprehensive analysis of combination rules and conflict management:
[P. Smets, Analyzing the combination of conflicting belief functions, Information Fusion 8 (2007)].
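A sketch of the pignistic transformation and Liu's difBetP, assuming m(∅) = 0 and a small frame (the max ranges over all 2^∣Θ∣ subsets):

```python
from itertools import chain, combinations

def betp(m, theta):
    """Pignistic transformation: spread each focal mass uniformly
    over the elements of the focal set."""
    p = {x: 0.0 for x in theta}
    for A, mass in m.items():
        for x in A:
            p[x] += mass / len(A)
    return p

def dif_betp(m1, m2, theta):
    """Liu's distance between betting commitments:
    max over non-empty A of |BetP_m1(A) - BetP_m2(A)|."""
    p1, p2 = betp(m1, theta), betp(m2, theta)
    subsets = chain.from_iterable(
        combinations(theta, r) for r in range(1, len(theta) + 1))
    return max(abs(sum(p1[x] for x in A) - sum(p2[x] for x in A))
               for A in subsets)
```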



Example: Consider the following pair of mass functions on the same frame Θ = {θi ∣ i = 1, . . . , 7}:

    m1({θ1, θ2, θ3, θ4}) = 1 and m2({θ4, θ5, θ6, θ7}) = 1

Then m⊕(∅) = 0, i.e., according to this measure the mass functions are not in conflict at all. However, using the second criterion we easily get

    difBetP(m1, m2) = 0.75

Note that m1 and m2 assign, by definition, the total mass exactly to {θ1, θ2, θ3, θ4} and {θ4, θ5, θ6, θ7}, respectively, and none to any of their proper subsets. So intuitively these two mass functions are partly in conflict. Such a partial conflict cannot be detected by means of m⊕(∅), but it is captured by difBetP(m1, m2), as shown above.


Distance Between two Mass Functions

∎ Let B1 = (Fm1, m1) and B2 = (Fm2, m2) be two bodies of evidence on the same frame Θ.
∎ The distance between m1 and m2, denoted d(m1, m2), is defined as follows:

    d(m1, m2) = max_{A⊆Θ} ∣m1(A) − m2(A)∣

∎ Obviously, d(m1, m2) = 0 if and only if m1 = m2.
This distance is considered a quantitative measure for judging the difference between the two bodies of evidence B1 and B2 (Huynh, 2009).
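In the dict representation this maximum only needs to range over the focal elements of either source, since both masses vanish elsewhere:

```python
def d_mass(m1, m2):
    """Max-norm distance between two mass functions."""
    focals = set(m1) | set(m2)
    return max(abs(m1.get(A, 0.0) - m2.get(A, 0.0)) for A in focals)
```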


Difference Between two BoEs

Denote by difF(m1, m2) the symmetric difference between the two families of focal elements Fm1 and Fm2, i.e.,

    difF(m1, m2) = (Fm1 ∖ Fm2) ∪ (Fm2 ∖ Fm1)

∎ If difF(m1, m2) = Fm1 ∪ Fm2, and A ∩ B = ∅ for any A ∈ Fm1 and B ∈ Fm2, then m⊕(∅) = 1 – full conflict.
∎ If difF(m1, m2) = ∅ and d(m1, m2) > 0, then qualitatively the two sources are not in conflict but have different preferences in distributing their masses over the focal elements.

⇒ This captures how much the two sources differ in their view of where the true hypothesis lies.


Let us denote

    dif(B1, B2) = ⟨d(m1, m2), difF(m1, m2)⟩

and call it the difference measure of the two bodies of evidence.
∎ The conflict between two bodies of evidence originates from either or both of d(m1, m2) (quantitative) and difF(m1, m2) (qualitative).


∎ Liu's criterion of using difBetP(m1, m2) is somewhat weaker than using the direct distance d(m1, m2).
Example: consider again the following pair of mass functions:

    m1({θ1, θ2, θ3, θ4}) = 1 and m2({θ4, θ5, θ6, θ7}) = 1

Then we have d(m1, m2) = 1 whilst difBetP(m1, m2) = 0.75.
∎ In addition, if m1 = m2 then difBetP(m1, m2) = 0, but the converse does not hold in general.


Quantifying Conflict

∎ We have recently argued that only a part of the value m⊕(∅) should be used to quantify a conflict, namely the part qualitatively stemming from difF(m1, m2).
∎ Let

    mcomb⊕(∅) = ∑_{A,B ∈ F1∩F2, A∩B=∅} m1(A) × m2(B)

∎ Clearly, mcomb⊕(∅) is a part of m⊕(∅), and it intuitively represents the mass of uncommitted belief resulting from the combination rather than a conflict.
∎ Therefore, the conflict is properly represented by the remainder of m⊕(∅), i.e.

    mconf⊕(∅) = m⊕(∅) − mcomb⊕(∅)
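A direct transcription of this split into code, in the dict representation used throughout (mcomb counts the disjoint products of focal elements common to both sources):

```python
def conflict_parts(m1, m2):
    """Split the conjunctive mass on the empty set into the 'uncommitted'
    part m_comb (from focal elements shared by both sources) and the
    remainder m_conf, following the definition above."""
    shared = set(m1) & set(m2)
    m_empty = 0.0
    m_comb = 0.0
    for (B, mB), (C, mC) in product(m1.items(), m2.items()):
        if not (B & C):
            m_empty += mB * mC
            if B in shared and C in shared:
                m_comb += mB * mC
    return m_comb, m_empty - m_comb  # (m_comb, m_conf)
```

For two identical uniform mass functions on five singletons this returns m_comb = 0.8 and m_conf = 0, matching the example on the next slide.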


Remark
With this formulation of conflict, the fact used to question the validity of Dempster's rule, namely that two identical probability measures are always conflicting, becomes inappropriate!

Example
Consider again the two identical mass functions on Θ = {θi ∣ i = 1, . . . , 5}: m1(θi) = m2(θi) = 0.2 for i = 1, . . . , 5. Then we get mcomb⊕(∅) = 0.8 and mconf⊕(∅) = 0, and hence no conflict at all appears between the two.

⇒ Generally, we always get mconf⊕(∅) = 0 whenever the two mass functions being combined are identical.


Zadeh's famous example revisited
Consider two mass functions m1 and m2 defined on Θ = {a, b, c} as:
∎ m1(a) = 0.99, m1(b) = 0.01
∎ m2(c) = 0.99, m2(b) = 0.01
Then we get mconf⊕(∅) = 0.98, which accurately reflects a very high conflict between the two sources of evidence.

Critical remark
With such a high conflict, assuming that both sources are still fully reliable and directly applying Dempster's rule to them (to get unsatisfactory results) seems irrational!


A Solution for Resolving Conflict

Main Idea
∎ According to Smets' two-level view of evidence (Smets, 1994), to make decisions based on evidence, the beliefs encoding the evidence must first be transformed into probabilities using the so-called pignistic transformation.
∎ Guided by this view, we propose to discount each mass function involved in a combination based upon how sure its decision is when it is used alone for decision making.
∎ More particularly, we provide a method for defining the discount rates of the mass functions being combined using the entropy of their corresponding pignistic probability functions.



Ambiguity Measure
∎ Let m1 and m2 be two mass functions on the frame Θ, and let BetPm1 and BetPm2 be the pignistic probability functions of m1 and m2, respectively.
∎ For i = 1, 2, we denote by

    H(mi) = − ∑_{θ∈Θ} BetPmi(θ) log2(BetPmi(θ))

  the Shannon entropy of the pignistic probability distribution BetPmi.
∎ This measure was used in Jousselme et al. (2006) as an ambiguity measure of belief functions.
∎ Clearly, H(mi) ∈ [0, log2(∣Θ∣)].



Entropy-based Discount Rate:
∎ The discount rate of BPA mi (i = 1, 2), denoted δ(mi), is defined by

    δ(mi) = H(mi) / log2(∣Θ∣)

∎ That is, the higher the uncertainty (in its decision) of a source of evidence, the higher the discount rate applied to it.
General Discounting and Combination Rule:

    m⊕ = m1^(1−δ(m1)) ⊕ m2^(1−δ(m2))

where ⊕ is a combination operator in general and mi^(1−δ(mi)) is the discounted mass function obtained from mi.
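A sketch of the two ingredients, reusing betp from the earlier sketch. Here discount(m, alpha, theta) keeps the fraction alpha of each mass and moves the rest to the whole frame, so the slide's mi^(1−δ(mi)) corresponds to discount(mi, 1 - delta, theta):

```python
import math

def discount(m, alpha, theta):
    """Classical discounting: retain a fraction alpha of each mass and
    transfer the remainder 1 - alpha to the whole frame theta."""
    theta = frozenset(theta)
    out = {A: alpha * v for A, v in m.items() if A != theta}
    out[theta] = alpha * m.get(theta, 0.0) + (1.0 - alpha)
    return out

def entropy_discount_rate(m, theta):
    """delta(m) = H(BetP_m) / log2(|Theta|): normalized pignistic entropy."""
    p = betp(m, theta)
    h = -sum(v * math.log2(v) for v in p.values() if v > 0)
    return h / math.log2(len(theta))
```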


Selecting a Combination Rule


The information from dif(B1 , B2 ) and mconf
⊕ (∅) can properly provide
helpful suggestions for conflict management on selecting appropriate
combination rules in some typical situations.

∎ If dif F (m1 , m2 ) = F1 ∪ F2 and A ∩ B = ∅ for any A ∈ Fm1 and


B ∈ Fm2 , we have mconf ⊕ (∅) = 1 and two sources are fully conflict.

ê A discounting and then combination strategy should be applied, where


different attitudes may suggest different combination rules for use.

∎ If dif F (m1 , m2 ) = ∅ and d(m1 , m2 ) > 0, we have mconf


⊕ (∅) = 0: Two
sources qualitatively are not in conflict but having different beliefs
attributed to focal elements.
ê A compromise attitude may suggest to use the trade-off rule (Dubois and
Prade, 1988), or its special case of averaging operator.


∎ If difF(m1, m2) ≠ ∅, then we have d(m1, m2) > 0.
⇒ If m⊕(Θ) = m1(Θ) × m2(Θ) > 0, the two sources may provide complementary information to each other, and Dempster's rule can be applied.
⇒ If m⊕(Θ) = 0, the two sources may be in partial conflict; then, depending on whether the value mconf⊕(∅) is tolerated and whether information on meta-belief is available, one may apply a discounting-then-combination strategy or a disjunctive combination rule.

✓ Reference: Huynh VN, Discounting and combination scheme in evidence theory for dealing with conflict in information fusion. MDAI 2009, Springer.

Applications to Ensemble Learning

D-S Theory and Its Applications

∎ D-S theory has been theoretically well studied and widely applied in areas such as
● Classification, Identification, Recognition
● Decision Making, Expert Systems
● Fault Detection and Failure Diagnosis
● Image Processing, Medical Applications
● Risk and Reliability
● Robotics, Multiple Sensors
● Signal Processing
● Etc.

+ A good collection of references to applications of D-S theory:


[Sentz & Ferson, Combination of Evidence in Dempster-Shafer Theory,
Sandia National Laboratories SAND 2002-0835, 2002]


Classifier Combination

Observation
As observed in studies of machine learning systems:
∎ the sets of patterns misclassified by different classification systems do not necessarily overlap;
∎ different classifiers potentially offer complementary information about the patterns to be classified.

Remark: This observation strongly motivated the interest in combining classifiers over the last two decades (Kittler et al., IEEE PAMI 1998).
Combination Scenarios:
∎ All classifiers use the same representation of the input
∎ Each classifier uses its own representation of the input


D-S theory in Classifier Combination

∎ The application of D-S theory to classifier combination has received attention since the early 1990s.
∎ In the context of a single-class classification problem, the frame of discernment is often modeled as the set of all possible classes that can be assigned to an input pattern.
∎ Given an input pattern, each individual classifier produces an output that is considered a source of information for classifying that pattern.
∎ The sources of information from all classifiers participating in the combination process are combined to make the final classification decision.


∎ Let C = {c1, c2, . . . , cM} be the set of classes – the frame of discernment of the problem.
∎ Assume that we have R classifiers: {ψ1, . . . , ψR}.
∎ For an input x, each classifier ψi produces an output ψi(x) defined as

    ψi(x) = [si1, . . . , siM]

  where sij indicates the degree of confidence or support for saying that "the pattern x is assigned to class cj according to classifier ψi."
⇒ Note that sij can be a binary value or a continuous numeric value, and its semantic interpretation depends on the type of learning algorithm used to build ψi.


Xu’s Combination Method


∎ Each individual classifier produces a crisp decision on classifying an input x, which is used as the evidence coming from the corresponding classifier.
∎ This evidence is then associated with prior knowledge, defined in terms of the classifier's performance indexes, to define its corresponding mass function.
∎ The performance indexes of a classifier are its recognition, substitution and rejection rates, obtained by testing the classifier on a test sample set.

✓ Reference: Xu et al., Several methods for combining multiple classifiers and their applications in handwritten character recognition. IEEE Trans. SMC 22 (1992).


∎ Let the recognition rate and substitution rate of ψi be ri and si (usually ri + si < 1, due to the rejection action), respectively.
∎ The mass function mi derived from ψi(x) is defined by:
1. If ψi rejects x, i.e. ψi(x) = [0, . . . , 0], then mi has the single focal element C, with mi(C) = 1.
2. If ψi(x) = [0, . . . , 0, sij = 1, 0, . . . , 0], then mi({cj}) = ri, mi(¬{cj}) = si, where ¬{cj} = C ∖ {cj}, and mi(C) = 1 − ri − si.
∎ In a similar way one obtains all mi (i = 1, . . . , R) from the R classifiers ψi (i = 1, . . . , R).
∎ Dempster's rule is then applied to combine these mi's into a combined m = m1 ⊕ . . . ⊕ mR, which is used to make the final decision on the classification of x.
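A sketch of Xu's mass assignment (names hypothetical; r_rate and s_rate are the classifier's recognition and substitution rates):

```python
from functools import reduce

def xu_mass(decision, r_rate, s_rate, classes):
    """Mass function from one crisp classifier decision (Xu et al., 1992).
    decision is the predicted class label, or None if the classifier rejects."""
    C = frozenset(classes)
    if decision is None:
        return {C: 1.0}  # rejection: total ignorance
    cj = frozenset({decision})
    return {cj: r_rate,
            C - cj: s_rate,  # mass on the complement of c_j
            C: 1.0 - r_rate - s_rate}

# combined evidence from R classifiers, e.g.:
# m = reduce(combine_dempster,
#            [xu_mass(d, r, s, classes) for d, r, s in outputs])
```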


Rogova’s Combination Method

∎ Rogova used a proximity measure between a reference vector for each class and a classifier's output vector.
∎ The reference vector is the mean vector µij of the output set of each classifier ψi for each class cj.
∎ Then, for any input pattern x, the proximity measures dij = φ(µij, ψi(x)) are transformed into the following mass functions:

    mi({cj}) = dij,   mi(C) = 1 − dij
    m¬i(¬{cj}) = 1 − ∏_{k≠j}(1 − dik),   m¬i(C) = ∏_{k≠j}(1 − dik)

which together constitute the knowledge about cj.


∎ Hence, these mi and m¬i are combined to define the evidence from classifier ψi on classifying x as mi ⊕ m¬i.
∎ Finally, the evidence from all classifiers is combined using Dempster's rule to obtain an overall mass function for making the final classification decision.

✓ Reference: Rogova, Combining the results of several neural network classifiers. Neural Networks 7 (1994).
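A sketch of the per-classifier evidence in Rogova's scheme, assuming the proximities d_ik are already computed and lie in [0, 1] (the choice of φ and of the reference vectors is left to the caller):

```python
def rogova_evidence(proximities, j, classes):
    """Evidence about class c_j from one classifier (Rogova, 1994):
    combine the pro-c_j and contra-c_j mass functions by Dempster's rule."""
    C = frozenset(classes)
    cj = frozenset({classes[j]})
    prod = 1.0
    for k, d in enumerate(proximities):
        if k != j:
            prod *= 1.0 - d
    m_pro = {cj: proximities[j], C: 1.0 - proximities[j]}
    m_con = {C - cj: 1.0 - prod, C: prod}
    return combine_dempster(m_pro, m_con)
```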


Al-Ani & Deriche’s Combination Method

∎ The distance between the output classification vector provided by each single classifier and a reference vector is used to estimate the mass functions.
∎ These mass functions are then combined using Dempster's rule to obtain a new output vector that represents the combined confidence in each class label.
∎ However, instead of defining the reference vector as the mean vector of a classifier's output set for a class, as in Rogova's work, it is estimated so that the mean square error (MSE) between the new output vector obtained after combination and the target vector of a training data set is minimized.
⇒ This interestingly makes their combination algorithm trainable.



∎ Given an input x, the mass function mi derived from classifier ψi is defined as follows:

    mi({cj}) = dji / (∑_{k=1}^{M} dki + gi)
    mi(C) = gi / (∑_{k=1}^{M} dki + gi)

  where dji = exp(−∥vji − ψi(x)∥²), vji is a reference vector and gi is a coefficient.
∎ Both vji and gi are estimated via the MSE-minimizing learning process.

✓ Reference: Al-Ani & Deriche, A new technique for combining multiple classifiers using the Dempster-Shafer theory of evidence, Journal of Artificial Intelligence Research 17 (2002).
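A sketch of this mass assignment, assuming refs[j] holds the trained reference vector for class j and g the trained coefficient (the MSE training loop itself is omitted):

```python
import math

def alani_mass(output, refs, g, classes):
    """Mass function from one classifier's output vector
    (Al-Ani & Deriche, 2002), given trained reference vectors."""
    C = frozenset(classes)
    d = [math.exp(-sum((r - o) ** 2 for r, o in zip(refs[j], output)))
         for j in range(len(classes))]
    z = sum(d) + g
    m = {frozenset({c}): d[j] / z for j, c in enumerate(classes)}
    m[C] = g / z
    return m
```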

Bell’s Combination Method

∎ A new method and technique for representing and combining the outputs of different classifiers for text categorization.
∎ Unlike all the methods mentioned above, Bell et al. (2005) directly used the outputs of individual classifiers to define so-called 2-points focused mass functions.
∎ These 2-points focused mass functions are then combined using Dempster's rule to obtain an overall mass function for making the final classification decision.

✓ Reference: Bell, Guan & Bi, On combining classifier mass functions for text categorization, IEEE Trans. KDE 17 (2005).



∎ Given an input x, the output ψi(x) from classifier ψi is normalized:

    pi(cj) = sij / ∑_{k=1}^{M} sik,   for j = 1, . . . , M

∎ Then the collection {pi(cj)}_{j=1}^{M} is arranged so that

    pi(ci1) ≥ pi(ci2) ≥ . . . ≥ pi(ciM)

∎ The mass function mi induced from ψi on the classification of x:

    mi({ci1}) = pi(ci1)
    mi({ci2}) = pi(ci2)
    mi(C) = 1 − mi({ci1}) − mi({ci2})

∎ This mass function is called the 2-points focused mass function, and the set {{ci1}, {ci2}, C} is referred to as a triplet.
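A sketch of the 2-points focused mass function (names illustrative):

```python
def bell_triplet_mass(scores, classes):
    """Bell et al.'s triplet mass function: the two top-scoring classes
    become singleton focal elements; the rest of the mass goes to C."""
    total = sum(scores)
    p = [s / total for s in scores]
    order = sorted(range(len(classes)), key=lambda j: p[j], reverse=True)
    j1, j2 = order[0], order[1]
    return {frozenset({classes[j1]}): p[j1],
            frozenset({classes[j2]}): p[j2],
            frozenset(classes): 1.0 - p[j1] - p[j2]}
```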

Le’s Combination Method

∎ Le et al. built Naive Bayes classifiers corresponding to distinct representations of the input.
∎ They then weighted each classifier by its accuracy, obtained by testing on a test sample set, where the weighting is modeled by the discounting operator.
∎ Finally, the discounted mass functions are combined to obtain the final mass function, which is used for making the classification decision.

✓ Reference: Le, Huynh, Shimazu & Nakamori, Combining classifiers for word sense disambiguation based on Dempster-Shafer theory and OWA operators, Data & Knowledge Engineering 63 (2007).


∎ Let fi be the i-th representation of an input x, and let the classifier ψi built on fi produce a posterior probability distribution P(⋅∣fi) on C.
∎ Assume that αi is the weight of ψi, defined by its accuracy.
∎ Then the piece of evidence represented by P(⋅∣fi) is discounted at a discount rate of (1 − αi), resulting in a mass function mi defined by

    mi({cj}) = αi × P(cj∣fi), for j = 1, . . . , M
    mi(C) = 1 − αi

∎ These discounted mass functions are then combined using either Dempster's rule or the averaging operator.
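A sketch of the accuracy-weighted evidence, assuming posterior maps each class label to P(c ∣ fi):

```python
def le_mass(posterior, alpha, classes):
    """Le et al.'s (2007) evidence: discount the posterior P(. | f_i)
    at rate 1 - alpha, sending the removed mass to the whole frame C."""
    m = {frozenset({c}): alpha * p for c, p in posterior.items()}
    m[frozenset(classes)] = 1.0 - alpha
    return m
```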


Remarks on Le’s Method


∎ This method of weighting clearly focuses on only the strength of
individual classifiers, which is defined by testing them on the designed
sample data set.
∎ Therefore it does not be influenced by an input pattern under
classification.
∎ However, the information quality of soft decisions or outputs provided
by individual classifiers might vary from pattern to pattern.
ê The general discounting and combination strategy for solving conflict
discussed above has been applied to classifier combination.

3 Reference: Huynh, Nguyen & Le, Adaptively entropy-based weighting


classifiers in combination using Dempster-Shafer theory for word sense
disambiguation, Computer Speech and Language 24 (2010).


Lattice Intervals

∎ Let (L, ≤) be a lattice.
∎ A (lattice) interval of L is defined as

    [a, b] = {x ∈ L ∣ a ≤ x ≤ b}

  for a, b ∈ L with a ≤ b.
∎ Let IL be the set of intervals of L, including the empty set.
∎ (IL, ⊆) is a lattice with
● meet (⊓) = intersection (∩)
● join (⊔) defined by [a, b] ⊔ [c, d] = [a ∧ c, b ∨ d]
● least element = ∅; greatest element = L


Lattice of Intervals - Example

[Figure: a 4-element lattice L = {x, y, z, t} and its lattice of intervals (IL, ⊆)]


Application of Belief Structures on (IL , ⊆)

∎ Application to multi-label classification:
+ Denoeux, Younes & Abdallah, Representing uncertainty on set-valued variables using belief functions, Artif. Intell. 174 (2010).
∎ Application to ensemble clustering:
+ Masson & Denoeux, Belief functions and cluster ensembles. ECSQARU 2009 (Springer).


Partitions of a Finite Set

∎ In clustering, the frame of discernment is the set of all partitions of a finite data set D, denoted P(D).
∎ This set can be partially ordered using the following relation:
● a partition p is said to be finer than a partition p′ (or, equivalently, p′ is coarser than p), denoted p ⪯ p′, if the clusters of p can be obtained by splitting those of p′.
∎ The poset (P(D), ⪯) is a lattice.


Ensemble Clustering

∎ Ensemble clustering aims at combining the outputs of several clustering algorithms into a single clustering structure.
∎ This problem can be addressed using D-S theory by assuming that:
● There exists a "true" partition p∗.
● Each clusterer provides evidence about p∗.
● The evidence from multiple clusterers can be combined to draw plausible conclusions about p∗.
∎ To implement this scheme, we need to manipulate mass functions whose focal elements are sets of partitions.
∎ This is feasible if we restrict ourselves to intervals of the lattice (P(D), ⪯).

An Illustrative Application

Word Sense Disambiguation

Polysemous Words
A polysemous word has more than one possible meaning (sense). The intended sense is determined by the context in which the word appears.

Example: "interest"
∎ Context 1: "My guess would be that interest rates will decline moderately into the spring of 1961."
∎ Context 2: "A few of his examples are of very great interest, and the whole discussion of some importance for theory."

WSD
∎ WSD involves associating a given word in a text or discourse with a particular sense among the numerous potential senses of that word.
∎ It is an "intermediate task" necessary for accomplishing most natural language processing tasks.

WSD as a classification problem
Given an ambiguous word w:
∎ c1, c2, . . . , cm – the possible senses (classes) of w;
∎ N contexts of w are given, in each of which w is tagged with the right sense (the training data);
∎ for a new occurrence of w in a context C, WSD aims at identifying the most appropriate sense of w given C.

How to Use Context in WSD?

Generally, a given context C can be used in two ways.

The bag-of-words approach
The context is considered as the words in some window surrounding the target word w.

The relational information based approach
The context is considered in terms of some relation to the target, such as
1. distance from the target,
2. syntactic relations,
3. phrasal collocation, etc.


Example (Target word: interest)
"My[PRP] guess[NN] would[MD] be[VB] that[IN] interest[NN] rates[NNS] will[MD] decline[VBP] moderately[RB] into[IN] the[DT] spring[NN] of[IN] 1961[CD]"

Bag of words:                     my, guess, rates, decline, . . .
Collocations of words:            interest rates, interest rates will, that interest, . . .
Collocations of part-of-speech:   interest [NNS], interest [NNS] [MD], [IN] interest, . . .
(word, position):                 (be, -2), (that, -1), (rates, 1), . . .
(part-of-speech tag, position):   (VB, -2), (IN, -1), (NNS, 1), . . .

Classifier Combination – First Scenario

Individual Classifiers in Combination
∎ Naive Bayes (NB),
∎ Maximum Entropy Model (MEM),
∎ Support Vector Machines (SVM).
The selection of these learning methods is basically guided by the direct use of their output results for defining mass functions.

Combination Algorithms
1. Discounting-and-Dempster's combination algorithm (DCA1)
2. Discounting-and-averaging combination algorithm (DCA2)


Classifier Combination – Second Scenario

Individual Classifiers in Combination
∎ The same NB learning algorithm is used for all individual classifiers; however, each classifier is built using a distinct set of features corresponding to a distinct representation of the polysemous word to be disambiguated.
Note that NB is commonly accepted as one of the learning methods representing state-of-the-art accuracy in supervised WSD (Escudero, 2000).

Combination Algorithms
1. Discounting-and-Dempster's combination algorithm (DCA1)
2. Discounting-and-averaging combination algorithm (DCA2)


The Experimental Results

Table: Experimental results for the first scenario of combination

             Individual classifiers     Combined classifiers
%            NB     MEM    SVM          DCA1    DCA2
Senseval-2   65.6   65.5   63.5         66.3    66.5
Senseval-3   72.9   72.0   72.5         73.3    73.3

Remark:
The results yielded by the discounting-and-averaging combination algorithm are comparable to, or even better than, those given by the discounting-and-orthogonal-sum combination algorithm, while the former is computationally simpler than the latter.


Table: Experimental results for the second scenario of combination

             Individual classifiers                       Combined classifiers
%            C1     C2     C3     C4     C5     C6        DCA1    DCA2
Senseval-2   56.7   54.6   54.7   56.8   56.8   52.5      64.4    65.0
Senseval-3   62.4   62.3   64.1   61.9   63.9   59.5      71.0    72.3


Table: A comparison with the best systems in the Senseval-2 and Senseval-3 contests

             Best systems   Accuracy-based weighting   Adaptively weighting
%                           DS1 [Le et al. (2007)]     DCA2
Senseval-2   64.2           64.7                       66.3
Senseval-3   72.9           72.4                       73.3

Remark:
Both combination algorithms derived from the discounting and combination scheme yield an improvement in overall accuracy compared to previous work on WSD.


THANK YOU FOR LISTENING!!!
