ANSWERS TO 15-381 Final, Spring 2004: Friday May 7, 2004
1. Place your name and your andrew email address on the front page.
2. You may use any and all notes, as well as the class textbook. Keep in mind, however, that this final was designed
in full awareness of such. You may NOT use the Internet, but you can use a calculator.
3. We only require that you provide the answers. We don’t need to see your work.
4. The maximum possible score on this exam is 100. You have 180 minutes.
5. Good luck!
Question Score
1 20
2 6
3 6
4 9
5 14
6 9
7 9
8 9
9 9
10 9
Total 100
1 Short Questions
(a) When you run the Waltz algorithm on the following drawing, which of the following statements is true? Circle the correct answer.
[Line drawing of a 3-D object]
(i) The algorithm will label all edges uniquely.
(ii) The algorithm will report that some edges are ambiguous.
(iii) The algorithm will report that the image cannot be labeled consistently.
ANSWER: (ii)
(b) How many degrees of freedom does a rigid 3-d object have if it moves in a 3-d space?
ANSWER: 6
(c) How does randomized hill-climbing choose the next move each time? Circle the correct answer.
(i) It generates a random move from the moveset, and accepts this move.
(ii) It generates a random move from the whole state space, and accepts this move.
(iii) It generates a random move from the moveset, and accepts this move only if this move improves the
evaluation function.
(iv) It generates a random move from the whole state space, and accepts this move only if this move improves
the evaluation function.
ANSWER: (iii)
(d) Suppose you are using a genetic algorithm. Show the children of the following two strings if single point
crossover is performed with a cross-point between the 4th and the 5th digits:
1 4 6 2 5 7 2 3 and 8 5 3 4 6 7 6 1
ANSWER: 1 4 6 2 6 7 6 1 and 8 5 3 4 5 7 2 3
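For concreteness, here is a minimal Python sketch of single-point crossover; the cut index and parent strings are from the question, while the function name is ours:

    def single_point_crossover(parent1, parent2, cut):
        """Swap the tails of two equal-length sequences after position cut."""
        child1 = parent1[:cut] + parent2[cut:]
        child2 = parent2[:cut] + parent1[cut:]
        return child1, child2

    p1 = [1, 4, 6, 2, 5, 7, 2, 3]
    p2 = [8, 5, 3, 4, 6, 7, 6, 1]
    c1, c2 = single_point_crossover(p1, p2, 4)
    print(c1)  # [1, 4, 6, 2, 6, 7, 6, 1]
    print(c2)  # [8, 5, 3, 4, 5, 7, 2, 3]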
(f) Which of the following is the main reason for pruning a decision tree? Circle the correct answer.
(i) to save computational cost
(ii) to avoid over-fitting
(iii) to make the training error smaller
ANSWER: (ii)
(g) Which of the following does the Naive Bayes classifier assume? Circle the correct answer.
(i) All the attributes are independent.
(ii) All the attributes are conditionally independent given the output label.
(iii) All the attributes are jointly dependent on each other.
ANSWER: (ii)
(h) By which of the following networks can the XOR function be learned? Circle the correct answer.
(i) linear perceptron
(ii) single layer Neural Network
(iii) 1-hidden layer Neural Network
(iv) none of the above
ANSWER: (iii)
(i) If we use K-means on a finite set of samples, which of the following statements is true? Circle the correct answer.
(i) K-means is not guaranteed to terminate.
(ii) K-means is guaranteed to terminate, but is not guaranteed to find the optimal clustering.
(iii) K-means is guaranteed to terminate and find the optimal clustering.
ANSWER: (ii)
(j) In the worst case, what is the number of nodes that will be visited by Breadth-First Search in a (non-looping)
tree with depth d and branching factor b?
ANSWER: O(b^d)
(k) True or False : If a search tree has cycles, A* Search with an inadmissible heuristic might never converge when
run on that tree.
ANSWER: False
(l) Circle the Nash Equilibria in the following matrix-form game:
ANSWER:
                 Player 2
             D       E       F
         A  0, 1    3, 5    2, 1
Player 1 B  6, 3    1, 3    5, 2
         C  4, 2    3, 4    7, 7
The pure-strategy Nash equilibria are (A, E), (B, D), and (C, F).
(m) Assume the following zero-sum game, where player 1 is the maximizer:
             Player 2
             C       D
Player 1 A   2       0
         B   0       1
If Player 1 chooses strategy A with probability p, and if Player 2 always plays strategy C, what is the expected value of the game?
ANSWER: 2p + 0(1 - p) = 2p
(n) In the mixed strategy Nash equilibrium for the above game, with what probability does Player 1 use strategy A?
Player 2 is indifferent between C and D when:
2p = 1(1 - p)
3p = 1
ANSWER: p = 1/3
(o) True or False : In a second-price, sealed bid auction, it is optimal to bid your true value. There is no advantage
to bluffing.
ANSWER: True
(p) How many values does it take to represent the joint distribution of 4 boolean variables?
ANSWER: 2^4 = 16
(q) If P(A) = 0.3, P(B) = 0.4, and P(A|B) = 0.6
(a) What is P(A ∧ B)?
ANSWER: P(A ∧ B) = P(A|B) P(B) = 0.6 × 0.4 = 0.24
(b) What is P(B|A)?
ANSWER: P(B|A) = P(A|B) P(B) / P(A) = 0.24 / 0.3 = 0.8
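A quick Python check of the arithmetic (variable names are ours):

    p_a, p_b, p_a_given_b = 0.3, 0.4, 0.6
    p_a_and_b = p_a_given_b * p_b   # product rule: 0.24
    p_b_given_a = p_a_and_b / p_a   # Bayes rule: 0.8
    print(p_a_and_b, p_b_given_a)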
(r) For the following questions, use the diagram below. If you do not have enough information to answer a question,
answer False.
[Diagram: a Bayes net over variables A and B]
2 Hill Climbing, Simulated Annealing and Genetic Algorithm
The N-queens problem requires you to place N queens on an N-by-N chessboard such that no queen attacks another
queen. (A queen attacks any piece in the same row, column or diagonal.) Here are some important facts:
We define the states to be any configuration where the N queens are on the board, one per column.
The moveset includes all possible states generated by moving a single queen to another square in the same
column. The function to obtain these states is called the successor function.
The evaluation function Eval(state) is the number of non-attacking pairs of queens in this state. (Please note it is the number of NON-attacking pairs.)
In the following questions, we deal with the 6-queens problem (N=6).
1. How many possible states are there in total?
ANSWER: 6^6
2. For each state, how many successor states are there in the moveset?
ANSWER: 30 (each of the 6 queens can be moved to any of the 5 other squares in its column)
3. What value will the evaluation function Eval() return for the current state shown below?
[Board diagram: the current 6-queens state]
ANSWER: 9
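The board diagram is not reproduced here, but a short Python sketch shows how Eval counts non-attacking pairs for any state (the example state below is hypothetical, not the one from the exam):

    from itertools import combinations

    def eval_state(rows):
        """rows[c] = row of the queen in column c; count NON-attacking pairs."""
        ok = 0
        for (c1, r1), (c2, r2) in combinations(enumerate(rows), 2):
            attacking = (r1 == r2) or (abs(r1 - r2) == abs(c1 - c2))
            if not attacking:
                ok += 1
        return ok

    print(eval_state([2, 4, 1, 5, 3, 0]))  # a hypothetical 6-queens state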
4. If you use Simulated Annealing (currently T=3), and the current state and the random next state are shown below, will you accept this random next state immediately, or accept it with some probability? If it is the latter case, what is the probability?
[Board diagrams: current state and random next state]
ANSWER: For the current state, E1 = 9. For the next state, E2 = 6. So E2 < E1: the next state is worse (fewer non-attacking pairs), and we accept it only with probability
P = exp(-(E1 - E2)/T) = exp(-(9 - 6)/3) = 1/e
We will accept the next state with probability 1/e.
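The acceptance rule as a short Python sketch (nothing here beyond the formula and the numbers in the question; the function name is ours):

    import math, random

    def accept(e_current, e_next, temperature):
        """Metropolis-style acceptance for a maximization problem."""
        if e_next >= e_current:
            return True  # improving (or equal) moves are always taken
        p = math.exp(-(e_current - e_next) / temperature)
        return random.random() < p

    # E1 = 9, E2 = 6, T = 3  ->  acceptance probability exp(-1), about 0.368
    print(math.exp(-(9 - 6) / 3))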
5. Suppose you use a Genetic Algorithm. The current generation includes four states, S 1 through S 4. The evalu-
ation values for each of the four states are: Eval(S 1) = 9, Eval(S 2) = 12, Eval(S 3) = 11, Eval(S 4) = 8.
Calculate the probability that each of them would be chosen in the "selection" step (also called the "reproduction" step).
ANSWER: Selection is fitness-proportionate, so each state is chosen with probability Eval(S_i) divided by the total 9 + 12 + 11 + 8 = 40:
P(S1) = 9/40 = 0.225, P(S2) = 12/40 = 0.300, P(S3) = 11/40 = 0.275, P(S4) = 8/40 = 0.200
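The same computation in Python (a one-line fitness-proportionate normalization):

    evals = {"S1": 9, "S2": 12, "S3": 11, "S4": 8}
    total = sum(evals.values())                        # 40
    probs = {s: v / total for s, v in evals.items()}
    print(probs)  # {'S1': 0.225, 'S2': 0.3, 'S3': 0.275, 'S4': 0.2}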
6. In a Genetic Algorithm, each state of 6-queens can be represented as 6 digits, each indicating the position of the queen in that column. Which action in a genetic algorithm (among {selection, cross-over, mutation}) is most similar to the successor function described above?
ANSWER: Mutation
3 Cross Validation
Suppose you are running a majority classifier on the following training set. The training set is shown below. It consists
of 10 data points. Each data point has a class label of either 0 or 1. A majority classifier is defined to output the class
label that is in the majority in the training set, regardless of the input. If there is a tie in the training set, then always
output class label 1.
[Diagram: the training set of 10 labeled data points]
3. What is the two-fold Cross-Validation error? Assume the left 5 points belong to one partition while the right 5
points belong to the other partition. (report the error as a ratio)
ANSWER: 8/10
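Since the exam's training-set figure is not reproduced, here is a generic Python sketch of two-fold cross-validation with a majority classifier; the label list at the bottom is a made-up placeholder, not the exam's data:

    def majority_label(labels):
        """Majority class in the training fold; ties go to class 1."""
        ones = sum(labels)
        return 1 if 2 * ones >= len(labels) else 0

    def two_fold_cv_error(labels):
        left, right = labels[:5], labels[5:]
        errors = sum(y != majority_label(right) for y in left)   # train on right, test on left
        errors += sum(y != majority_label(left) for y in right)  # train on left, test on right
        return errors / len(labels)

    print(two_fold_cv_error([1, 0, 0, 0, 0, 0, 1, 1, 1, 1]))  # hypothetical labels -> 0.8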
4 Probabilistic Reasoning/Bayes Nets
1. If A and B are independent, then ¬A is independent of ¬B. True or False?
Show the work supporting your answer. You might find the following statements useful:
ANSWER: True.
P(¬A ∧ ¬B) = P(¬(A ∨ B)) = 1 - P(A ∨ B) = 1 - P(A) - P(B) + P(A ∧ B)
= P(¬A) - P(B) + P(A)P(B) = P(¬A) - P(B)(1 - P(A))
= P(¬A)(1 - P(B)) = P(¬A)P(¬B)
Suppose student A comes to class with probability P(A) = 0.8 and student B independently comes to class with probability P(B) = 0.6.
(a) What is the probability that neither shows up to class on any given day?
ANSWER: P(¬A ∧ ¬B) = P(¬A)P(¬B) = 0.2 × 0.4 = 0.08
(b) What is the probability that at least one of them is in class on any given day?
P(A ∨ B) = P(¬(¬A ∧ ¬B)) = 1 - P(¬A ∧ ¬B) = 1 - 0.08 = 0.92
Also, P(A ∨ B) = P(A) + P(B) - P(A ∧ B) = 0.8 + 0.6 - 0.8 × 0.6 = 0.92
ANSWER: 0.92
Suppose there is also a student C who always comes to class if and only if student A or student B (or both)
show up.
ANSWER:
P(A) = 0.8    P(B) = 0.6
[Bayes net: A → C ← B]
P(C | A ∧ B) = 1
P(C | ¬A ∧ B) = 1
P(C | A ∧ ¬B) = 1
P(C | ¬A ∧ ¬B) = 0
(e) Is A conditionally independent of B given C? (yes/no)
ANSWER: No, A and B are dependent given C since C unblocks the path between A and B.
(f) Suppose you know that C came to class. What is the probability of A coming if you know that B showed up too?
ANSWER: 0.8
Since B coming to class fully explains the appearance of C, P(A|B ∧ C) = P(A) = 0.8. The result can also be obtained from the probabilities:
P(A|B ∧ C) = P(A ∧ B ∧ C) / P(B ∧ C) = P(A)P(B)P(C|A ∧ B) / [P(A ∧ B ∧ C) + P(¬A ∧ B ∧ C)]
= P(C|A ∧ B)P(A)P(B) / [P(C|A ∧ B)P(A)P(B) + P(C|¬A ∧ B)P(¬A)P(B)]
= P(A) / [P(A) + P(¬A)]
= P(A) = 0.8
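A brute-force check by enumerating the joint distribution (a small Python sketch; the probabilities are the ones given above):

    from itertools import product

    p_a, p_b = 0.8, 0.6

    def p_c_given(a, b):
        return 1.0 if (a or b) else 0.0  # C comes iff A or B comes

    num = den = 0.0  # accumulate P(A and B and C) and P(B and C)
    for a, b, c in product([True, False], repeat=3):
        p = (p_a if a else 1 - p_a) * (p_b if b else 1 - p_b)
        p *= p_c_given(a, b) if c else 1 - p_c_given(a, b)
        if b and c:
            den += p
            if a:
                num += p
    print(num / den)  # P(A | B and C) = 0.8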
5 Neural Networks
1. Draw a linear perceptron network and calculate corresponding weights to correctly classify the 4 points below. The output node returns 1 if the weighted sum is greater than or equal to the threshold (0.5). If it looks too complicated you are probably wrong. You are allowed to make use of a "constant 1" unit input.
x y out
0 0 1
0 1 1
1 0 0
1 1 1
ANSWER:
[Network diagram: inputs x (weight w1) and y (weight w2) and a constant-1 input (weight w3), all feeding a single threshold output unit]
The dataset is linearly separable. Any set of weights such that w1 < 0, w3 ∈ [0.5, 0.5 - w1) and w2 ≥ max{0, 0.5 - w1 - w3} would have been a correct solution to the problem. A common solution was w1 = -1, w2 = 1, w3 = 1.
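A quick Python check that the common solution classifies all four points:

    def perceptron(x, y, w1=-1.0, w2=1.0, w3=1.0, threshold=0.5):
        return 1 if w1 * x + w2 * y + w3 * 1 >= threshold else 0

    for x, y, out in [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 1)]:
        assert perceptron(x, y) == out  # all four points classified correctly
    print("all four points correct")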
2. Is it possible to modify the network so that it will classify both the dataset above and the one below with 100% accuracy?
x y out
0 0 0
0 1 1
1 0 1
1 1 1
ANSWER: No. The two datasets assign different labels to the same inputs (for example, (1,0) is labeled 0 in the first dataset but 1 in the second), so no single network can classify both with 100% accuracy.
6 Naive Bayes
Assume we have a data set with three binary input attributes, A, B, C, and one binary outcome attribute Y. The three
input attributes, A, B, C take values in the set {0,1} while the Y attribute takes values in the set {True, False}.
A B C Y
0 1 1 True
1 1 0 True
1 0 1 False
1 1 1 False
0 1 1 True
0 0 0 True
0 1 1 False
1 0 1 False
0 1 0 True
1 1 1 True
If we are using a Naive Bayes Classifier with one binary valued output variable Y, the following theorem is true:
Theorem: A non-impossible set of input values, S, (i.e. a set of input values with P(S) > 0) will have an unambiguous predicted classification of Y = True ⇔ P(Y = True ∧ S) > P(Y = False ∧ S)
1. How would a Naive Bayes classifier classify the record (A=1,B=1,C=0)? (True/False)
ANSWER: True
2. [Question text lost in extraction]
ANSWER: FALSE
3. How would a Naive Bayes classifier classify the record (A=0,B=0,C=0)? (True/False)
ANSWER: TRUE
The estimate of P(A = 0|Y = False)P(B = 0|Y = False)P(C = 0|Y = False)P(Y = False) = (1/4)(2/4)(0/4)(4/10) = 0, while the corresponding estimate for Y = True is (4/6)(1/6)(3/6)(6/10) = 1/30, and 1/30 > 0.
4. Would it be possible to add just one record to the data set that would result in a Naive Bayes classifier changing
its classification of the record (A=1,B=0,C=1)? (Yes/No)
ANSWER: No
With the current data set the record (A=1,B=0,C=1) classifies to False since:
P(A = 1|Y = True)P(B = 0|Y = True)P(C = 1|Y = True)P(Y = True) = (2/6)(1/6)(3/6)(6/10) = 0.0166
< P(A = 1|Y = False)P(B = 0|Y = False)P(C = 1|Y = False)P(Y = False) = (3/4)(2/4)(4/4)(4/10) = 0.15
If the record we add has Y = True then the estimate of
P(A = 1|Y = True)P(B = 0|Y = True)P(C = 1|Y = True)P(Y = True) would increase the most if the added record also has A=1, B=0, and C=1, in which case the estimate would become (3/7)(2/7)(4/7)(7/11) = 0.0445.
However this value is still less than 0.15, the unchanged estimate of
P(A = 1|Y = False)P(B = 0|Y = False)P(C = 1|Y = False)P(Y = False).
If the record we add has Y = False then the estimate of
P(A = 1|Y = False)P(B = 0|Y = False)P(C = 1|Y = False)P(Y = False) would decrease the most if the added record has A=0, B=1, and C=0, in which case the estimate would become (3/5)(2/5)(4/5)(5/11) = 0.0873,
which is still greater than 0.0166, the unchanged estimate of P(A = 1|Y = True)P(B = 0|Y = True)P(C = 1|Y = True)P(Y = True).
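The two joint estimates can be recomputed from the table; a Python sketch (helper names ours):

    data = [  # (A, B, C, Y) from the table above
        (0, 1, 1, True), (1, 1, 0, True), (1, 0, 1, False), (1, 1, 1, False),
        (0, 1, 1, True), (0, 0, 0, True), (0, 1, 1, False), (1, 0, 1, False),
        (0, 1, 0, True), (1, 1, 1, True),
    ]

    def nb_score(a, b, c, y):
        rows = [r for r in data if r[3] == y]
        n = len(rows)
        p = n / len(data)                      # P(Y = y)
        p *= sum(r[0] == a for r in rows) / n  # P(A = a | Y = y)
        p *= sum(r[1] == b for r in rows) / n  # P(B = b | Y = y)
        p *= sum(r[2] == c for r in rows) / n  # P(C = c | Y = y)
        return p

    print(nb_score(1, 0, 1, True))   # ~0.0167
    print(nb_score(1, 0, 1, False))  # 0.15  -> classify False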
7 Decision Tree
For this problem we will use the same data set below as in the Naive Bayes question. Again assume we have three
binary input attributes, A, B, C, and one binary outcome attribute Y. The three input attributes, A, B, C take values in the set {0,1} while the Y attribute takes values in the set {True, False}.
A B C Y
0 1 1 True
1 1 0 True
1 0 1 False
1 1 1 False
0 1 1 True
0 0 0 True
0 1 1 False
1 0 1 False
0 1 0 True
1 1 1 True

Specific Conditional Entropies
H(Y|A=0)=0.72    H(Y|A=0,B=0)=0.00    H(Y|A=1,C=0)=0.00
H(Y|A=1)=0.97    H(Y|A=0,B=1)=0.81    H(Y|A=1,C=1)=0.81
H(Y|B=0)=0.92    H(Y|A=1,B=0)=0.00    H(Y|B=0,C=0)=0.00
H(Y|B=1)=0.86    H(Y|A=1,B=1)=0.92    H(Y|B=0,C=1)=0.00
H(Y|C=0)=0.00    H(Y|A=0,C=0)=0.00    H(Y|B=1,C=0)=0.00
H(Y|C=1)=0.99    H(Y|A=0,C=1)=0.92    H(Y|B=1,C=1)=0.97
ANSWER:
[Decision tree diagram: the root splits on C; C=0 is a True leaf; the C=1 branch splits on B (B=0 is a False leaf), and B=1 then splits on A]
Note the first attribute split on in the case C=1 is B instead of A, since
H(Y|A; C=1) = P(A=0|C=1)H(Y|A=0,C=1) + P(A=1|C=1)H(Y|A=1,C=1) = (3/7)(0.92) + (4/7)(0.81) = 0.857
> H(Y|B; C=1) = P(B=0|C=1)H(Y|B=0,C=1) + P(B=1|C=1)H(Y|B=1,C=1) = (2/7)(0.00) + (5/7)(0.97) = 0.693
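The comparison above can be reproduced directly from the data table; a Python sketch (helper names ours):

    import math

    data = [  # (A, B, C, Y) from the table above
        (0, 1, 1, True), (1, 1, 0, True), (1, 0, 1, False), (1, 1, 1, False),
        (0, 1, 1, True), (0, 0, 0, True), (0, 1, 1, False), (1, 0, 1, False),
        (0, 1, 0, True), (1, 1, 1, True),
    ]

    def entropy(labels):
        h = 0.0
        for cls in set(labels):
            p = labels.count(cls) / len(labels)
            h -= p * math.log2(p)
        return h

    def cond_entropy(attr, rows):
        """Weighted H(Y | attr) over the given rows (attr: 0=A, 1=B, 2=C)."""
        h = 0.0
        for v in (0, 1):
            subset = [r[3] for r in rows if r[attr] == v]
            if subset:
                h += len(subset) / len(rows) * entropy(subset)
        return h

    c1 = [r for r in data if r[2] == 1]  # the C=1 branch
    print(cond_entropy(0, c1))  # H(Y|A; C=1) ~ 0.857
    print(cond_entropy(1, c1))  # H(Y|B; C=1) ~ 0.693 -> split on B first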
3. How would your decision tree classify the record (A=0,B=0,C=1)? (True/False)
ANSWER: FALSE
4. How would your decision tree classify the record (A=1,B=0,C=0)? (True/False)
ANSWER: TRUE
5. If you pruned all nodes from your decision tree except the root node, now how would your decision tree classify
the record (A=0,B=0,C=1)? (True/False) Again, assume any ties are broken by choosing True. (NOTE: This
was clarified during the exam to mean that the tree would split on one attribute and then classify)
ANSWER: FALSE
6. If you pruned all nodes from your decision tree except the root node, now how would your decision tree classify
the record (A=1,B=0,C=0)? (True/False) Again, assume any ties are broken by choosing True.
ANSWER: TRUE
8 K-Means
The circles in the numbered boxes below represent the data points. In the first numbered box there are three squares,
representing the initial location of cluster centers of the k-means algorithm. Trace through the first nine iterations of
the k-means algorithm or until convergence is reached, whichever comes first. For each iteration draw three squares
corresponding to the location of the cluster centers during that iteration. (NOTE: It is not necessary to draw the exact location of the squares, but it should be clear from your placement of the squares that you understand how k-means performs qualitatively.)
ANSWER:
[Diagram: the traced cluster-center positions for each iteration until convergence]
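Since the point diagram is not reproduced, here is a minimal k-means loop in Python; the data points and initial centers below are made-up placeholders:

    def kmeans(points, centers, max_iter=9):
        for _ in range(max_iter):
            # Assignment step: attach each point to its nearest center.
            clusters = [[] for _ in centers]
            for p in points:
                i = min(range(len(centers)),
                        key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
                clusters[i].append(p)
            # Update step: move each center to the mean of its cluster.
            new_centers = [
                tuple(sum(xs) / len(xs) for xs in zip(*c)) if c else centers[i]
                for i, c in enumerate(clusters)
            ]
            if new_centers == centers:  # converged: centers stopped moving
                break
            centers = new_centers
        return centers

    pts = [(0, 0), (0, 1), (5, 5), (5, 6), (9, 0)]  # hypothetical points
    print(kmeans(pts, [(0, 0), (1, 0), (2, 0)]))    # hypothetical initial centers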
9 Reinforcement Learning
9.1 Q-Learning
Perform Q-learning for a system with two states and two actions, given the following training examples. The discount factor is γ = 0.5 and the learning rate is α = 0.5. Assume that your Q-table is initialized to 0.0 for all values.
[Table of training examples and the resulting Q-table are not reproduced here]
ANSWER: π(1) = a1, π(2) = a1
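The exam's table of training examples is not reproduced above, so the sketch below only illustrates the update rule with γ = α = 0.5 on hypothetical experience tuples:

    gamma, alpha = 0.5, 0.5
    Q = {(s, a): 0.0 for s in (1, 2) for a in ("a1", "a2")}

    def q_update(s, a, r, s_next):
        """One Q-learning backup: Q(s,a) += alpha * (target - Q(s,a))."""
        target = r + gamma * max(Q[(s_next, b)] for b in ("a1", "a2"))
        Q[(s, a)] += alpha * (target - Q[(s, a)])

    # Hypothetical experiences (state, action, reward, next state):
    for s, a, r, s2 in [(1, "a1", 10, 2), (2, "a1", -10, 1), (1, "a2", 0, 1)]:
        q_update(s, a, r, s2)
    print(Q)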
9.2 Certainty Equivalent Learning
In the diagram below, draw the state transitions and label them according to the values that would be discovered by Certainty Equivalent learning, given the following training examples.
(Start = S1 , Action = a1 , Reward = 10, End = S2 )
(Start = S2 , Action = a2 , Reward = -10, End = S1 )
(Start = S1 , Action = a2 , Reward = 10, End = S1 )
(Start = S1 , Action = a1 , Reward = 10, End = S1 )
(Start = S1 , Action = a2 , Reward = 10, End = S1 )
(Start = S1 , Action = a1 , Reward = 10, End = S2 )
(Start = S2 , Action = a1 , Reward = -10, End = S2 )
(Start = S2 , Action = a2 , Reward = -10, End = S2 )
(Start = S2 , Action = a2 , Reward = -10, End = S1 )
ANSWER:
[Diagram: the estimated model] R(S1) = +10, R(S2) = -10
From S1: a1 goes to S2 with probability 2/3 and stays in S1 with probability 1/3; a2 stays in S1 with probability 1.0.
From S2: a1 stays in S2 with probability 1.0; a2 goes to S1 with probability 2/3 and stays in S2 with probability 1/3.
π(S1) = a2, π(S2) = a2
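Certainty Equivalent learning just estimates the transition model from counts; a Python sketch over the nine experiences above:

    from collections import Counter, defaultdict

    experiences = [  # (start, action, reward, end), copied from the list above
        ("S1", "a1", 10, "S2"), ("S2", "a2", -10, "S1"), ("S1", "a2", 10, "S1"),
        ("S1", "a1", 10, "S1"), ("S1", "a2", 10, "S1"), ("S1", "a1", 10, "S2"),
        ("S2", "a1", -10, "S2"), ("S2", "a2", -10, "S2"), ("S2", "a2", -10, "S1"),
    ]

    counts = defaultdict(Counter)
    for s, a, r, s2 in experiences:
        counts[(s, a)][s2] += 1

    for (s, a), c in sorted(counts.items()):
        total = sum(c.values())
        print(s, a, {s2: n / total for s2, n in c.items()})
    # (S1, a1): {S2: 2/3, S1: 1/3}   (S1, a2): {S1: 1.0}
    # (S2, a1): {S2: 1.0}            (S2, a2): {S1: 2/3, S2: 1/3}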
10 Markov Decision Processes
You are a wildly implausible robot who wanders among the four areas depicted below. You hate rain and get a reward of -30 on any move that starts in the Deck and -40 on any move that starts in the Garden. You like parties, and you are indifferent to kitchens.
[Diagram: four rooms in a cycle, each with actions S (Stay), CL (Clockwise), CC (Counter-Clockwise)]
Deck (R = -30)      Party (R = +20)
Kitchen (R = 0)     Garden (R = -40)
Moving clockwise: Deck -> Party -> Garden -> Kitchen -> Deck
Actions: All states have three actions: Clockwise (CL), Counter-Clockwise (CC), Stay (S). Clockwise and Counter-Clockwise move you through a door into another room, and Stay keeps you in the same location. All transitions are deterministic (probability 1.0).
2. Let J*(Room) = expected discounted sum of future rewards assuming you start in "Room" and subsequently act optimally. Assuming a discount factor γ = 0.5, give the J* values for each room.
ANSWER: By eyeballing the problem and quickly checking the CL and S options in the Kitchen, you can quickly determine that the optimal policy is: Stay in the Party room, go Clockwise from the Deck, go Counter-Clockwise from the Garden, and Stay in the Kitchen. This gives:
J*(Deck) = -30 + 0.5 × 40 = -10
J*(Party) = 20 / (1 - 0.5) = 40
J*(Kitchen) = 0
J*(Garden) = -40 + 0.5 × 40 = -20
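A value-iteration check of these numbers (Python sketch; the room adjacency is encoded from the diagram):

    gamma = 0.5
    rewards = {"Deck": -30, "Party": 20, "Kitchen": 0, "Garden": -40}
    cl = {"Deck": "Party", "Party": "Garden", "Garden": "Kitchen", "Kitchen": "Deck"}
    cc = {v: k for k, v in cl.items()}

    J = {room: 0.0 for room in rewards}
    for _ in range(100):  # value iteration; converges well before 100 sweeps
        J = {r: rewards[r] + gamma * max(J[r], J[cl[r]], J[cc[r]]) for r in rewards}
    print(J)  # Deck: -10, Party: 40, Kitchen: 0, Garden: -20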
3. The optimal policy when the discount factor, γ, is small but non-zero (e.g. γ = 0.1) is different from the optimal policy when γ is large (e.g. γ = 0.9). If we began with γ = 0.1, and then gradually increased γ, what would be the threshold value of γ above which the optimal policy would change?
ANSWER: In the Kitchen, Stay gives
J_S(Kitchen) = 0 + γ J_S(Kitchen) = 0
while Clockwise gives
J_CL(Kitchen) = 0 + γ J*(Deck)
We already know that the optimal policy in the Deck is CL, regardless of the discount factor:
J*(Deck) = -30 + γ J*(Party) = -30 + γ (20 / (1 - γ))
The Kitchen policy flips from Stay to CL when J*(Deck) crosses 0:
-30 + γ (20 / (1 - γ)) = 0
20γ = 30(1 - γ)
γ = 3/5 = 0.6
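A numeric check of the threshold (Python sketch of the same equation):

    def j_deck(gamma):
        # Deck under CL: one -30 step, then stay in the Party room forever.
        return -30 + gamma * 20 / (1 - gamma)

    g = 0.0
    while j_deck(g) <= 0:  # Stay (value 0) beats CL in the Kitchen while J*(Deck) <= 0
        g += 0.001
    print(round(g, 3))     # first gamma where the Kitchen policy flips: ~0.6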