Lesson 12 - Trees - 2
Understand the connection between choices and their possible outcomes through
the lens of ‘Decision Trees’
Draw a Decision Tree for a given real-world application problem
Understand the advantages and disadvantages of Decision
Trees
Comprehend the concepts of preorder traversal, postorder traversal and ‘spanning
trees’, along with their applications, especially in networking.
A decision tree is a map of the possible outcomes of a series of related choices. It allows an
individual or organization to weigh possible actions against one another based on their costs,
probabilities, and benefits.
It starts with a single node that branches into possible outcomes, and each outcome leads to
additional nodes, which branch off into other possibilities. This gives it a tree-like shape. An
example is given in Figure 1.
Start with your decision and represent this on the left side of a sheet of paper with a small
square. Then, for each possible option, draw one line out from the square towards the right.
Leave plenty of space between these lines. Write each option on its line.
Then take the lines one at a time. At the end of the line, do you get a particular result, or is it
uncertain or is there another decision to be made? If it is another decision, draw a square. If
uncertain, a circle, and if a result, draw nothing (sometimes triangles are used for results).
Review each square and circle. For the squares (decisions), draw lines for the choices,
marking them in as you go. For the circles (uncertainties) draw further lines for the possible
outcomes. Keep going until you have filled out the possibilities leading from your original
decision.
Evaluation: Now it's time for decision tree analysis to work out which option is most valuable
to you. First, estimate how much each option would be worth to you. (See Figure 5).
Then review each circle/point of uncertainty. Here you determine the probability of each
outcome. Make sure percentages add up to 100, or fractions amount to a total of 1. Your
decision tree diagrams will now look something like this.
Figure 5: A Decision Tree Diagram with Estimates
Calculate: Start with the end values on the right-hand side and work back towards the left.
At any circle, multiply each end value by the probability of it occurring. So in our example,
the probability of promotion is 0.7. Multiply this by 80,000 to get 56,000.
Each of the end values is recalculated in this way, and the weighted values at a circle are
added to give the expected value of that circle. It's also useful to factor in any costs that will
be incurred along the way. This gives a more accurate picture of the net value.
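The calculation at a circle can be sketched in a few lines of Python. This is a minimal sketch, not part of the original lesson: the function name `expected_value` is an assumption, and only the promotion branch (probability 0.7, value 80,000) comes from the example. The second branch (probability 0.3, value 50,000) and the cost of 5,000 are hypothetical figures added so that the probabilities sum to 1 and a net value can be shown.

```python
import math


def expected_value(outcomes):
    """Return the expected value of a chance node (a circle).

    `outcomes` is a list of (probability, end_value) pairs.
    """
    total_probability = sum(p for p, _ in outcomes)
    # The probabilities leaving a circle must add up to 1 (or 100%).
    if not math.isclose(total_probability, 1.0):
        raise ValueError("probabilities at a circle must sum to 1")
    # Weight each end value by its probability and add the results.
    return sum(p * value for p, value in outcomes)


# Promotion branch from the example: 0.7 * 80,000 = 56,000.
promotion_contribution = 0.7 * 80_000

# Hypothetical second branch (NOT from the lesson): no promotion, worth 50,000.
node_value = expected_value([(0.7, 80_000), (0.3, 50_000)])

# Costs incurred along the way are subtracted to give the net value.
hypothetical_cost = 5_000  # assumed figure, for illustration only
net_value = node_value - hypothetical_cost
```

The check on the probabilities mirrors the rule that the percentages at each point of uncertainty must add up to 100.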
Advantages
Disadvantages
Decision trees are prone to errors in classification problems with many classes and a
relatively small number of training examples.
Decision trees can be computationally expensive to train, since growing the tree is
itself a costly process.
Let us look at another example
Example 13.1
Let's assume we want to play badminton on a particular day, say Saturday. How will you
decide whether to play or not? Let's say you go out and check whether it's hot or cold, check the
speed of the wind and the humidity, and see what the weather is like, i.e. whether it is sunny,
cloudy, or rainy. You take all these factors into account to decide if you want to play or not.
So, you calculate all these factors for the last ten days and form a lookup table like the one
below.
Now, you may use this table to decide whether to play or not. But what if the weather pattern
on Saturday does not match any of the rows in the table? This may be a problem. A decision
tree would be a great way to represent data like this because it takes into account all the
possible paths that can lead to the final decision by following a tree-like structure.
Figure 1: A learned decision tree. Each node represents an attribute or feature, and each
branch from a node represents an outcome of that node. Finally, it is at the leaves of the
tree that the final decision is made.
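To see how such a tree handles a day that is not in the lookup table, here is a minimal Python sketch. The tree below is hypothetical (it is not the learned tree of Figure 1), and the attribute names `outlook`, `humidity` and `wind`, their values, and the function `decide` are assumptions made for illustration only.

```python
# A hypothetical decision tree, NOT the one learned in Figure 1.
# Internal nodes are dicts naming an attribute and its branches;
# leaves are strings holding the final decision.
TREE = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "humidity",
            "branches": {"high": "don't play", "normal": "play"},
        },
        "cloudy": "play",
        "rainy": {
            "attribute": "wind",
            "branches": {"strong": "don't play", "weak": "play"},
        },
    },
}


def decide(node, day):
    """Follow the branches that match `day` until a leaf is reached."""
    while isinstance(node, dict):
        value = day[node["attribute"]]
        node = node["branches"][value]
    return node


saturday = {"outlook": "sunny", "humidity": "normal", "wind": "strong"}
print(decide(TREE, saturday))  # prints: play
```

Because the tree only inspects the attributes along one path, it returns a decision for any combination of attribute values, including combinations that never appeared in the table.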
2. A company is deciding whether to develop and launch a new product. Research and
development costs are expected to be $400,000 and there is a 70% chance that the
product launch will be successful, and a 30% chance that it will fail. If it is successful,
the levels of expected profits and the probability of each occurring have been
estimated as follows, depending on whether the product’s popularity is high, medium
or low:
Popularity   Probability   Profits
High         0.2           $500,000 per annum for two years
Medium       0.5           $400,000 per annum for two years
Low          0.3           $300,000 per annum for two years
If it is a failure, there is a 0.6 probability that the research and development work can be sold
for $50,000 and a 0.4 probability that it will be worth nothing at all. Calculate the expected
values and provide the recommendation that can be made to management.
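One way to check an answer to this exercise is to roll the tree back in code. The following Python sketch uses only the figures stated in the exercise; the variable names are assumptions, and it assumes that "per annum for two years" means the profit is earned in each of two years with no discounting for the time value of money. It also assumes that the alternative to developing the product is to do nothing, at no cost and no profit.

```python
# Outcomes if the launch succeeds: (probability, annual profit).
success_outcomes = [(0.2, 500_000), (0.5, 400_000), (0.3, 300_000)]
YEARS = 2  # profits are earned per annum for two years (no discounting assumed)

# Outcomes if the launch fails: (probability, sale value of the R&D work).
failure_outcomes = [(0.6, 50_000), (0.4, 0)]

# Expected value at each of the two right-most circles.
ev_success = sum(p * annual * YEARS for p, annual in success_outcomes)
ev_failure = sum(p * value for p, value in failure_outcomes)

# Roll back to the launch circle: 70% success, 30% failure.
ev_launch = 0.7 * ev_success + 0.3 * ev_failure

# Subtract the research and development cost to get the net expected value.
RD_COST = 400_000
net_ev_develop = ev_launch - RD_COST
net_ev_do_nothing = 0  # assumed alternative, not stated in the exercise

print(f"EV if successful:     {ev_success:,.0f}")
print(f"EV if failed:         {ev_failure:,.0f}")
print(f"Net EV of developing: {net_ev_develop:,.0f}")
if net_ev_develop > net_ev_do_nothing:
    print("Recommendation: develop the product")
else:
    print("Recommendation: do not develop the product")
```

If the net expected value is positive, developing the product is the better option under these assumptions.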
13.2 Tree Traversal
Ordered rooted trees are often used to store information. We need procedures for visiting
each vertex of an ordered rooted tree to access data.
Ordered rooted trees can also be used to represent various types of expressions, such as
arithmetic expressions involving numbers, variables, and operations.
The different listings of the vertices of ordered rooted trees used to represent expressions are
useful in the evaluation of these expressions.
Traversal Algorithms
Procedures for systematically visiting every vertex of an ordered rooted tree are called
traversal algorithms. We will describe two of the most commonly used such algorithms,
preorder traversal and postorder traversal. Each of these algorithms can be defined
recursively.
Preorder traversal.
Definition 1:
Let T be an ordered rooted tree with root r. If T consists only of r, then r is the preorder
traversal of T. Otherwise, suppose that T1, T2, ..., Tn are the subtrees at r from left to right in
T. The preorder traversal begins by visiting r. It continues by traversing T1 in preorder, then
T2 in preorder, and so on, until Tn is traversed in preorder. Figure 1 displays preorder
traversal.
Example 13.2.1
In which order does a preorder traversal visit the vertices in the ordered rooted tree T shown
in Figure 2?
Solution:
The steps of the preorder traversal of T are shown in Figure 3. We traverse T in preorder by
first listing the root a, followed by the preorder list of the subtree with root b, the preorder list
of the subtree with root c (which is just c) and the preorder list of the subtree with root d.
The preorder list of the subtree with root b begins by listing b, then the vertices of the subtree
with root e in preorder, and then the subtree with root f in preorder (which is just f). The
preorder list of the subtree with root d begins by listing d, followed by the preorder list of the
subtree with root g, followed by the subtree with root h (which is just h), followed by the
subtree with root i (which is just i).
The preorder list of the subtree with root e begins by listing e, followed by the preorder listing
of the subtree with root j (which is just j), followed by the preorder listing of the subtree with
root k. The preorder listing of the subtree with root g is g followed by l, followed by m. The
preorder listing of the subtree with root k is k, n, o, p. Consequently, the preorder traversal
of T is a, b, e, j, k, n, o, p, f, c, d, g, l, m, h, i.
Figure 3: The Preorder Traversal of T
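Definition 1 translates directly into a recursive procedure. The Python sketch below is one possible rendering; the dictionary `CHILDREN` (mapping each vertex to its children from left to right) and the function name `preorder` are assumptions, and the tree is the one described in the solution to Example 13.2.1.

```python
# Children of each vertex, listed from left to right, as described in
# the solution to Example 13.2.1. Vertices without an entry are leaves.
CHILDREN = {
    "a": ["b", "c", "d"],
    "b": ["e", "f"],
    "d": ["g", "h", "i"],
    "e": ["j", "k"],
    "g": ["l", "m"],
    "k": ["n", "o", "p"],
}


def preorder(root, children=CHILDREN):
    """Visit the root first, then each subtree from left to right."""
    order = [root]
    for child in children.get(root, []):
        order.extend(preorder(child, children))
    return order


print(", ".join(preorder("a")))
# a, b, e, j, k, n, o, p, f, c, d, g, l, m, h, i
```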
Postorder traversal.
Definition 2:
Let T be an ordered rooted tree with root r. If T consists only of r, then r is the
postorder traversal of T. Otherwise, suppose that T1, T2, ..., Tn are the subtrees at r from left to
right. The postorder traversal begins by traversing T1 in postorder, then T2 in postorder, ...,
then Tn in postorder, and ends by visiting r.
Figure 4: Postorder Traversal
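Definition 2 can be sketched in the same recursive style. In the Python sketch below, the dictionary `CHILDREN` and the function name `postorder` are assumptions; the tree is again the ordered rooted tree T described in the solution to Example 13.2.1, reused here purely for illustration.

```python
# Children of each vertex, listed from left to right, for the tree T
# described in Example 13.2.1. Vertices without an entry are leaves.
CHILDREN = {
    "a": ["b", "c", "d"],
    "b": ["e", "f"],
    "d": ["g", "h", "i"],
    "e": ["j", "k"],
    "g": ["l", "m"],
    "k": ["n", "o", "p"],
}


def postorder(root, children=CHILDREN):
    """Traverse each subtree from left to right, then visit the root last."""
    order = []
    for child in children.get(root, []):
        order.extend(postorder(child, children))
    order.append(root)
    return order


print(", ".join(postorder("a")))
# j, n, o, p, k, e, f, b, c, l, m, g, h, i, d, a
```

The only difference from a preorder procedure is the position of the root: it is appended after all of its subtrees have been traversed rather than before.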
Exercise 1 Exercise 2
13.3 Spanning Trees
Definition 1:
Let G be a simple graph. A spanning tree of G is a subgraph of G that is a tree containing
every vertex of G.
Example 13.3.1
Find a spanning tree of the simple graph G shown in Figure 1.
Solution:
The graph G is connected, but it is not a tree because it contains simple circuits. Remove the
edge {a, e}. This eliminates one simple circuit, and the resulting subgraph is still connected
and still contains every vertex of G. Next remove the edge {e, f} to eliminate a second simple
circuit. Finally, remove edge {c, g} to produce a simple graph with no simple circuits. This
subgraph is a spanning tree, because it is a tree that contains every vertex of G.
The sequence of edge removals used to produce the spanning tree is illustrated below.
THEOREM 1:
A simple graph is connected if and only if it has a spanning tree.
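The edge-removal procedure of Example 13.3.1, together with Theorem 1, can be sketched in Python as follows. This is an illustrative sketch rather than an efficient algorithm; the function names `is_connected` and `spanning_tree_by_removal` are assumptions, and the small graph at the end is hypothetical (it is not the graph G of Example 13.3.1). The sketch relies on the fact that an edge lies on a simple circuit exactly when the graph stays connected after that edge is removed.

```python
def is_connected(vertices, edges):
    """Return True if every vertex can be reached from the first one."""
    vertices = list(vertices)
    if not vertices:
        return True
    neighbours = {v: set() for v in vertices}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    seen = {vertices[0]}
    stack = [vertices[0]]
    while stack:
        current = stack.pop()
        for nxt in neighbours[current]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(vertices)


def spanning_tree_by_removal(vertices, edges):
    """Remove edges that lie on simple circuits until none remain."""
    # By Theorem 1, only a connected graph has a spanning tree.
    if not is_connected(vertices, edges):
        raise ValueError("graph is not connected, so it has no spanning tree")
    remaining = list(edges)
    for edge in list(edges):
        trial = [e for e in remaining if e != edge]
        # The edge is on a simple circuit iff the graph stays connected without it.
        if is_connected(vertices, trial):
            remaining = trial
    return remaining


# Hypothetical graph (NOT the graph G of the example): two triangles sharing c.
vertices = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("d", "e"), ("e", "c")]
tree = spanning_tree_by_removal(vertices, edges)
print(tree)  # 4 edges, one fewer than the number of vertices
```

Which edges survive depends on the order in which they are examined, so a graph can have several different spanning trees.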
Spanning trees are important in data networking, as the following example shows.
Example 13.3.2
Spanning trees play an important role in multicasting over Internet Protocol (IP) networks. To
send data from a source computer to multiple receiving computers, each of which is a
subnetwork, data could be sent separately to each computer.
This type of networking, called unicasting, is inefficient, because many copies of the same
data are transmitted over the network. To make the transmission of data to multiple receiving
computers more efficient, IP multicasting is used.
For data to reach receiving computers as quickly as possible there should be no loops (which
in graph theory terminology are circuits or cycles) in the path that data take through the
network.
That is, once data have reached a particular router, data should never return to this router. To
avoid loops, the multicast routers use network algorithms to construct a spanning tree in the
graph that has the multicast source, the routers, and the subnetworks containing receiving
computers as vertices, with edges representing the links between computers and/or routers.
The root of this spanning tree is the multicast source. The subnetworks containing receiving
computers are leaves of the tree.
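The loop-free structure described in this example can be illustrated with a breadth-first search from the source. The sketch below does not claim to be how multicast routers actually build their trees (the text only says that they use network algorithms). It is a simple illustration on a hypothetical network: the node names (`source`, `r1`, `subnet_a`, and so on), the links, and the function names are all invented for this example.

```python
from collections import deque

# Hypothetical network: links between the source, routers and subnetworks.
LINKS = [
    ("source", "r1"), ("source", "r2"), ("r1", "r2"),
    ("r1", "r3"), ("r2", "r3"),
    ("r3", "subnet_a"), ("r2", "subnet_b"),
]


def bfs_spanning_tree(root, links):
    """Return {node: parent} describing a spanning tree rooted at `root`."""
    neighbours = {}
    for u, v in links:
        neighbours.setdefault(u, []).append(v)
        neighbours.setdefault(v, []).append(u)
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, []):
            if nxt not in parent:  # first visit: keep this link in the tree
                parent[nxt] = node
                queue.append(nxt)
    return parent


def path_from_root(parent, node):
    """Return the unique tree path from the root to `node`."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


tree = bfs_spanning_tree("source", LINKS)
print(path_from_root(tree, "subnet_a"))
# ['source', 'r1', 'r3', 'subnet_a']
```

Because every node has exactly one parent in the tree, data forwarded along tree links reaches each subnetwork once and never returns to a router it has already passed.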
2. In Exercises 1-3, find a spanning tree for the graph shown by removing edges in simple
circuits.
Exercise 1
Exercise 2
Exercise 3
Suggested Readings:
Chapter 9: Sections 9.3 & 9.4, Kenneth Rosen (2011), Discrete Mathematics and Its
Applications, 7th Edition, McGraw-Hill Education