Tutorial Part 2
Huỳnh Văn Nam (JAIST), Evidence Theory and Applications, HCMUT, Feb. 2011
Part 2 – Combination and Applications
Evidence Combination and Conflict
  Combination Rules: A Review
  Conflict Revisited
  Difference Between two BoEs
  Discounting and Combination Solution
Applications to Ensemble Learning
  Application to Ensemble Classification
  Application to Ensemble Clustering
An Illustrative Application
  Word Sense Disambiguation
  Multi-Representation of Context
  Discounting-and-Combination Method for WSD
  Experimental Results
Combination of Evidence in D-S Theory
Evidence Combination and Conflict – Combination Rules: A Review
Dempster's rule of combination:

    (m1 ⊕ m2)(A) = (1 / (1 − κ)) ∑_{B∩C=A} m1(B) × m2(C)   for all ∅ ≠ A ⊆ Θ

where

    κ = ∑_{B∩C=∅} m1(B) × m2(C)

∎ κ can be interpreted as the combined mass assigned to the empty set before normalization. So, it is also denoted by m⊕(∅) and conventionally considered as the degree of conflict.
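A minimal sketch of this computation in Python, representing a mass function as a dict from frozensets (focal elements) to masses; the function name and data layout are illustrative, not taken from the slides:

    from itertools import product

    def dempster_combine(m1, m2):
        """Dempster's rule for two mass functions; returns (combined mass, conflict kappa)."""
        unnormalized, kappa = {}, 0.0
        for (B, mB), (C, mC) in product(m1.items(), m2.items()):
            A = B & C
            if A:
                unnormalized[A] = unnormalized.get(A, 0.0) + mB * mC
            else:
                kappa += mB * mC          # mass that would fall on the empty set
        if kappa >= 1.0:
            raise ValueError("total conflict: combination undefined")
        return {A: v / (1.0 - kappa) for A, v in unnormalized.items()}, kappa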
∎ This result implies complete support for the diagnosis of a brain tumor, which both doctors believed very unlikely.
⇒ Many alternative rules of combination have been developed.
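Zadeh's example can be replayed with the sketch above, using the numbers usually quoted for it (they are an assumption here, not stated on this slide): doctor 1 assigns 0.99 to meningitis and 0.01 to brain tumor, doctor 2 assigns 0.99 to concussion and 0.01 to brain tumor:

    M, T, Co = frozenset({"meningitis"}), frozenset({"tumor"}), frozenset({"concussion"})
    m1 = {M: 0.99, T: 0.01}                    # doctor 1
    m2 = {Co: 0.99, T: 0.01}                   # doctor 2
    combined, kappa = dempster_combine(m1, m2)
    print(kappa)                               # ≈ 0.9999: almost total conflict
    print(combined)                            # ≈ {frozenset({'tumor'}): 1.0}: full support for brain tumor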
The transferable belief model [Smets & Kennes, Artif. Intell. 66 (1994)]:
∎ Justifies the use of belief functions to model subjective, personal beliefs.
∎ In general, in the definition of a mass function, the condition m(∅) = 0
is not required.
∎ The conflict is stored in the mass given to the empty set ⇒ the open-world assumption: the “actual world” (i.e., the true value of the variable X) might not be in Θ.
Zadeh’s example revisited (m∩ denotes the unnormalized, conjunctive combination of the two doctors’ mass functions):
∎ m∩(brain tumor) = 0.0001
∎ m∩(meningitis) = 0
∎ m∩(concussion) = 0
∎ m∩(∅) = 0.9999
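A sketch of the unnormalized (conjunctive) combination under the open-world assumption, reproducing the numbers above; the representation follows the earlier sketch, and Zadeh's input numbers (0.99/0.01 for each doctor) are again assumed rather than shown on this slide:

    def conjunctive_combine(m1, m2):
        """TBM conjunctive rule: no normalization, conflict stays on the empty set."""
        out = {}
        for B, mB in m1.items():
            for C, mC in m2.items():
                A = B & C                      # may be the empty frozenset
                out[A] = out.get(A, 0.0) + mB * mC
        return out

    M, T, Co = frozenset({"meningitis"}), frozenset({"tumor"}), frozenset({"concussion"})
    m = conjunctive_combine({M: 0.99, T: 0.01}, {Co: 0.99, T: 0.01})
    print(m[frozenset()])                      # ≈ 0.9999
    print(m[T])                                # ≈ 0.0001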
where

    m⊕(∅) = ∑_{B∩C=∅} m1(B) × m2(C)
⇒ a solution more flexible than Yager’s for the transfer of the conflicting masses.
Remarks
Observation
∎ Most of these works began by analyzing some counterintuitive examples produced by Dempster’s rule under highly conflicting evidence.
Evidence Combination and Conflict – Conflict Revisited
m⊕(∅) as Conflict?
Liu [Artif. Intell. 170 (2006)] argued that the value m⊕(∅) cannot be used as a measure of conflict between two bodies of evidence; it only represents the mass of uncommitted belief resulting from the combination.
Example – Two identical mass functions
Let us consider two identical mass functions m1 = m2 on Θ = {θ1, . . . , θ5}:
∎ m1(θi) = m2(θi) = 0.2 for i = 1, . . . , 5
∎ Then m⊕(∅) = 0.8, which is quite high, although there appears to be a total absence of conflict since the two mass functions are identical.
Remark:
More generally, we always get m⊕(∅) > 0 for two identical mass functions whenever their focal elements define a partition of the frame!
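The example can be checked directly with a small snippet (the frame and masses are exactly those stated above):

    theta = [frozenset({i}) for i in range(1, 6)]      # five singleton focal elements
    m1 = {A: 0.2 for A in theta}
    m2 = dict(m1)                                      # identical mass function
    m_empty = sum(mB * mC
                  for B, mB in m1.items()
                  for C, mC in m2.items()
                  if not (B & C))
    print(m_empty)                                     # ≈ 0.8, even though m1 == m2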
With BetP denoting the pignistic probability, Liu’s second criterion is

    difBetP(m1, m2) = max_{A⊆Θ} ∣BetP_{m1}(A) − BetP_{m2}(A)∣

and is called the distance between betting commitments of the two mass functions.
∎ A comprehensive analysis of combination rules and conflict
management:
[P. Smets, Analyzing the combination of conflicting belief functions,
Information Fusion 8 (2007)].
Then m⊕(∅) = 0, i.e., these mass functions are not in conflict at all.
However, using the second criterion we easily get:
difBetP(m1, m2) = 0.75
Note that m1 and m2 assign, by definition, the total mass exactly to {θ1, θ2, θ3, θ4} and {θ4, θ5, θ6, θ7}, respectively, and to none of their proper subsets. So intuitively these two mass functions are partly in conflict. Such a partial conflict cannot be detected by means of m⊕(∅), but it is captured by difBetP(m1, m2), as shown above.
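A sketch of this computation, assuming BetP is the standard pignistic transformation (the mass of each focal element is split equally over its elements), with singletons represented here as the integers 1 to 7:

    from itertools import chain, combinations

    def betp(m, theta):
        """Pignistic probability of each singleton (assumes m(∅) = 0 and masses sum to 1)."""
        p = {x: 0.0 for x in theta}
        for B, v in m.items():
            for x in B:
                p[x] += v / len(B)
        return p

    def dif_betp(m1, m2, theta):
        """Maximum over all non-empty A ⊆ Θ of |BetP1(A) − BetP2(A)|."""
        p1, p2 = betp(m1, theta), betp(m2, theta)
        subsets = chain.from_iterable(combinations(theta, r) for r in range(1, len(theta) + 1))
        return max(abs(sum(p1[x] - p2[x] for x in A)) for A in subsets)

    theta = list(range(1, 8))
    m1 = {frozenset({1, 2, 3, 4}): 1.0}
    m2 = {frozenset({4, 5, 6, 7}): 1.0}
    print(dif_betp(m1, m2, theta))             # 0.75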
Evidence Combination and Conflict – Difference Between two BoEs
Quantifying Conflict
∎ Clearly, m⊕^comb(∅) is a part of m⊕(∅) and intuitively represents the mass of uncommitted belief resulting from the combination rather than a conflict.
∎ Therefore, the conflict is properly represented by the remainder of m⊕(∅), i.e.

    m⊕(∅) − m⊕^comb(∅) ≜ m⊕^conf(∅)
Quantifying Conflict
Remark
With this formulation of conflict, the objection used to question the validity of Dempster’s rule, namely that two identical probability measures are always conflicting, no longer applies!
Example
Consider again two identical mass functions on Θ = {θi ∣ i = 1, . . . , 5}:
m1(θi) = m2(θi) = 0.2 for i = 1, . . . , 5. Then we get m⊕^comb(∅) = 0.8 and m⊕^conf(∅) = 0, and hence no conflict appears between the two at all.
Quantifying Conflict
Critical remark
With such a high conflict, still assuming that both sources are fully reliable and directly applying Dempster’s rule to them (only to obtain unsatisfactory results) seems irrational!
Evidence Combination and Conflict – Discounting and Combination Solution
Main Idea
∎ According to Smets’ two-level view of evidence (Smets, 1994), to make
    δ(mi) = H(mi) / log2(∣Θ∣)

∎ That is, the higher the uncertainty (in its decision) of a source of evidence, the higher the discount rate applied to it.
General Discounting and Combination Rule:

    m⊕ = m1^(1−δ(m1)) ⊕ m2^(1−δ(m2))

where ⊕ is a combination operator in general and mi^(1−δ(mi)) is the discounted mass function obtained from mi.
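A sketch of the scheme in Python, reusing dempster_combine from the earlier sketch; taking H as the Shannon entropy of the pignistic probability is an assumption made here for illustration, since the slides' exact definition of H(mi) is not reproduced above:

    import math

    def discount(m, alpha, theta):
        """Classical discounting: move a fraction alpha of every mass to the whole frame Θ."""
        theta_set = frozenset(theta)
        out = {A: (1.0 - alpha) * v for A, v in m.items() if A != theta_set}
        out[theta_set] = alpha + (1.0 - alpha) * m.get(theta_set, 0.0)
        return out

    def delta(m, theta):
        """Discount rate δ(m) = H(m) / log2(|Θ|), with H taken here as the entropy of BetP."""
        p = {x: 0.0 for x in theta}
        for B, v in m.items():
            for x in B:
                p[x] += v / len(B)
        H = -sum(v * math.log2(v) for v in p.values() if v > 0)
        return H / math.log2(len(theta))

    # m_combined, _ = dempster_combine(discount(m1, delta(m1, theta), theta),
    #                                  discount(m2, delta(m2, theta), theta))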
Applications to Ensemble Learning
∎ D-S theory has been theoretically well studied and widely applied in areas such as
● Classification, Identification, Recognition
● Decision Making, Expert Systems
● Fault Detection and Failure Diagnosis
● Image Processing, Medical Applications
● Risk and Reliability
● Robotics, Multiple Sensors
● Signal Processing
● Etc.
Applications to Ensemble Learning – Application to Ensemble Classification
Classifier Combination
Observation
As observed in studies of machine learning systems:
∎ the set of patterns misclassified by different classification systems would
not necessarily overlap.
∎ different classifiers potentially offer complementary information about
patterns to be classified.
∎ Let the recognition rate and substitution rate of ψi be εr and εs, respectively (usually εr + εs < 1, due to the rejection action).
∎ The mass function mi from ψi(x) is defined by
1. If ψi rejects x, i.e. ψi(x) = [0, . . . , 0], then mi has only one focal element, C, with mi(C) = 1.
2. If ψi(x) = [0, . . . , 0, sij = 1, 0, . . . , 0], then mi({cj}) = εr and mi(¬{cj}) = εs, where ¬{cj} = C ∖ {cj}, and mi(C) = 1 − εr − εs.
∎ In a similar way one can obtain all mi (i = 1, . . . , R) from R classifiers
ψi (i = 1, . . . , R).
∎ Then Dempster’s rule is applied to combine these mi ’s to obtain a
combined m = m1 ⊕ . . . ⊕ mR , which is used to make the final decision
on the classification of x.
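A sketch of this construction; εr and εs are the classifier's recognition and substitution rates as above, and signalling the rejection case with None is an illustrative convention, not something stated on the slide:

    def classifier_mass(decision, classes, eps_r, eps_s):
        """Mass function induced by one classifier's decision on a sample x."""
        C = frozenset(classes)
        if decision is None:                   # ψi rejected x: total ignorance
            return {C: 1.0}
        cj = frozenset({decision})
        return {cj: eps_r,                     # support for the predicted class
                C - cj: eps_s,                 # support for "some other class"
                C: 1.0 - eps_r - eps_s}        # uncommitted remainder

    # the R mass functions obtained this way are then fused with Dempster's rule,
    # e.g. with the dempster_combine sketch given earlier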
∎ Hence, these mi and m¬i are combined to define the evidence from
classifier ψi on classifying x as mi ⊕ m¬i .
∎ Finally, the evidence from all classifiers is combined using Dempster’s rule to obtain an overall mass function for making the final decision on the classification.
    mi({cj}) = di^j / (∑_{k=1}^M di^k + gi)

    mi(C) = gi / (∑_{k=1}^M di^k + gi)
∎ Reference: Bell, Guan & Bi, On combining classifier mass functions for text categorization, IEEE Trans. KDE 17 (2005).
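Read as displayed, the construction can be sketched as follows, assuming di^k are the (nonnegative) output scores of classifier ψi for the M classes and gi is a given uncertainty term; both readings are assumptions, since their definitions are not reproduced above:

    def mass_from_scores(scores, g, classes):
        """m({cj}) = dj / (sum of all scores + g);  m(C) = g / (sum of all scores + g)."""
        total = sum(scores.values()) + g
        m = {frozenset({c}): d / total for c, d in scores.items()}
        m[frozenset(classes)] = g / total
        return m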
mi ({ci1 }) = pi ({ci1 })
mi ({ci2 }) = pi ({ci2 })
mi (C) = 1 − mi ({ci1 }) − mi ({ci2 })
∎ This mass function is called the 2-points focused mass function and the
set {{ci1 }, {ci2 }, C} is referred to as a triplet.
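A sketch of the 2-points focused construction, assuming ci1 and ci2 are the two classes receiving the largest probabilities pi from classifier ψi (the natural reading of the triplet):

    def two_points_focused(p, classes):
        """Triplet mass function: the two most probable classes plus the whole frame C."""
        c1, c2 = sorted(p, key=p.get, reverse=True)[:2]
        return {frozenset({c1}): p[c1],
                frozenset({c2}): p[c2],
                frozenset(classes): 1.0 - p[c1] - p[c2]}

    print(two_points_focused({"a": 0.6, "b": 0.3, "c": 0.1}, ["a", "b", "c"]))
    # ≈ {frozenset({'a'}): 0.6, frozenset({'b'}): 0.3, frozenset({'a', 'b', 'c'}): 0.1}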
Applications to Ensemble Learning – Application to Ensemble Clustering
Lattice Intervals
    [a, b] = {x ∈ L ∣ a ≤ x ≤ b}

for a, b ∈ L and a ≤ b.
∎ Let IL be the set of intervals of L, including the empty set.
∎ (IL, ⊆) is a lattice with
● meet (⊓) = intersection (∩)
● join (⊔) defined by [a, b] ⊔ [c, d] = [a ∧ c, b ∨ d]
● least element = ∅; greatest element = L
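A concrete sketch of these interval operations, taking L to be the lattice of subsets of a small set (so ∧ is ∩ and ∨ is ∪); intervals are stored as pairs (a, b), and None stands for the empty interval:

    def interval_meet(I, J):
        """[a,b] ⊓ [c,d] = [a ∨ c, b ∧ d] if non-empty; this is just set intersection of the intervals."""
        (a, b), (c, d) = I, J
        lo, hi = a | c, b & d
        return (lo, hi) if lo <= hi else None

    def interval_join(I, J):
        """[a,b] ⊔ [c,d] = [a ∧ c, b ∨ d], the smallest interval containing both."""
        (a, b), (c, d) = I, J
        return (a & c, b | d)

    I = (frozenset({1}), frozenset({1, 2, 3}))
    J = (frozenset(), frozenset({1, 2}))
    print(interval_meet(I, J))                 # (frozenset({1}), frozenset({1, 2}))
    print(interval_join(I, J))                 # (frozenset(), frozenset({1, 2, 3}))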
An Illustrative Application – Word Sense Disambiguation
Polysemous Words
A polysemous word has more than one possible meaning (sense). Which sense is intended is determined by the context in which the word appears.
Example
“Interest”
∎ Context 1: “My guess would be that interest rates will decline
moderately into the spring of 1961”.
∎ Context 2: “A few of his examples are of very great interest, and the
whole discussion of some importance for theory.”
WSD
∎ Involves the association of a given word in a text or discourse with a definition or meaning which is distinguishable from other meanings potentially attributable to that word.
An Illustrative Application – Multi-Representation of Context
An Illustrative Application – Discounting-and-Combination Method for WSD
Combination Algorithms
1. Discounting-and-Dempster’s combination algorithm (DCA1 )
2. Discounting-and-averaging combination algorithm (DCA2 )
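After the common discounting step, the two algorithms differ only in how the discounted mass functions m1, . . . , mR are fused; a sketch, assuming “averaging” means the arithmetic mean of the discounted mass functions and reusing dempster_combine from the earlier sketch:

    from functools import reduce

    def dca1(discounted):
        """Discounting-and-Dempster's combination: fuse all discounted masses with Dempster's rule."""
        return reduce(lambda acc, m: dempster_combine(acc, m)[0], discounted)

    def dca2(discounted):
        """Discounting-and-averaging: average the discounted masses focal element by focal element."""
        n = len(discounted)
        avg = {}
        for m in discounted:
            for A, v in m.items():
                avg[A] = avg.get(A, 0.0) + v / n
        return avg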
An Illustrative Application – Experimental Results
Remark:
The results yielded by the discounting-and-averaging combination algorithm are comparable to, or even better than, those given by the discounting-and-orthogonal-sum combination algorithm, while the former is computationally simpler than the latter.
Table: A comparison with the best system in the contests of Senseval-2 and Senseval-3
Remark:
Both combination algorithms derived from the discounting-and-combination scheme yield an improvement in overall accuracy compared to previous work on WSD.