5 - Online - Algorithms in Algorithms
Slides credit: Marius Minea, Dan Sheldon, Akshay Krishnamurthy, Andrew McGregor
Module 4: Online Algorithms and Competitive
Analysis
• PART 5.1: Types of algorithms
• PART 5.2: Buy vs. rent
• PART 5.3: Lost Cow Problem
• PART 5.4: Secretary Problem
• PART 5.5: MTF list Problem
Offline Algorithms
• The competitive ratio of an algorithm is defined as the worst-case ratio of its cost to the
optimal cost, over all possible inputs.
• The competitive ratio of an online problem is the best competitive ratio achieved by an
online algorithm.
• The competitive ratio of an online algorithm ALG is the worst case (i.e., maximum) over
possible futures σ of the ratio
ALG(σ)/OPT(σ),
where ALG(σ) is the cost of ALG on σ and OPT(σ) is the least possible cost on σ.
Buy vs Rent Problem
Renter’s Dilemma
Ski-Rental Problem
• The Ski-Rental Problem
• If you knew in advance how many times t you would ski in your life, then the
choice of whether to rent or buy would be simple.
• Measure the purchase price y in units of one rental, so that each rental costs 1.
• If you will ski more than y times then buy before you start; otherwise always rent.
• This type of strategy, with perfect knowledge of the future, is known as an offline
strategy.
• In practice, you don't know how many times you will ski. What should you do?
• An online strategy is a number k such that after renting k−1 times you buy skis
(just before your kth visit).
• Claim: Setting k = y guarantees that you never pay more than twice the cost of the offline
strategy.
• Example: Assume buying costs y = $7 and each rental costs $1. After 6 rentals you buy, so your
total payment is 6 + 7 = $13, which is less than 2y = $14.
• Competitive Ratio
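• Proof (with rentals costing 1 each and k = y): if you ski t < y times, you only rent and pay t,
exactly the offline cost. If t ≥ y, you rent y − 1 times and then buy, paying (y − 1) + y = 2y − 1
in total even if you quit right after buying, while the offline strategy pays y. The ratio is
(2y − 1)/y < 2.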
• What is the competitive ratio of the algorithm that says “buy right away”?
• The worst case is that we go skiing only once. With a purchase price of 500 and a rental price
of 50, the ratio is 500/50 = 10.
• Here’s a nicer strategy: rent until you realize you should have bought, then buy. (In our case:
rent 9 times, then buy.)
• Let’s call this algorithm better-late-than-never. Formally, if the rental cost is r and the
purchase cost is p, then the algorithm is to rent ⌈p/r⌉ − 1 times and then buy.
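The better-late-than-never rule can be checked numerically. The following is a minimal sketch, assuming integer prices p and r; the function names are illustrative and not part of the slides.

```python
def offline_cost(trips, p, r):
    """Cost with perfect knowledge: rent every time or buy up front."""
    return min(trips * r, p)


def better_late_than_never_cost(trips, p, r):
    """Rent ceil(p/r) - 1 times, then buy on the next trip."""
    rentals_before_buying = -(-p // r) - 1  # integer ceiling of p/r, minus 1
    if trips <= rentals_before_buying:
        return trips * r  # never reached the buying point
    return rentals_before_buying * r + p


def worst_ratio(p, r, max_trips):
    """Largest online/offline cost ratio over 1..max_trips trips."""
    return max(
        better_late_than_never_cost(t, p, r) / offline_cost(t, p, r)
        for t in range(1, max_trips + 1)
    )


if __name__ == "__main__":
    print(worst_ratio(500, 50, 100))
```

With p = 500 and r = 50 the worst case is ten or more trips: the online cost is 9·50 + 500 = 950 against an offline cost of 500, a ratio of 1.9, which stays below 2.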
The Ski-Rental Problem
•When balancing small incremental costs against a big one-time cost, you
want to delay spending the big cost until you have accumulated roughly
the same amount in small costs.
Exercise
Old McDonald lost his favorite cow. It was last seen marching towards a
junction leading to two infinite roads. None of the witnesses can say if the cow
picked the left or the right route.
The Lost Cow Problem
Secretary Problem
• The Secretary Problem, also known as the marriage problem, the sultan’s dowry problem, and the
best choice problem, is an example of an optimal stopping problem.
• Problem Statement: Imagine an administrator who wants to hire the best secretary out of n rankable
applicants for a position.
• The applicants are interviewed one by one in random order. A decision about each particular applicant
is to be made immediately after the interview.
• During the interview, the administrator can rank the applicant among all applicants interviewed so far,
but is unaware of the quality of yet unseen applicants.
• The question is about the optimal strategy (stopping rule) to maximize
the probability of selecting the best applicant.
Optimal Stopping
• In mathematics, the theory of optimal stopping or early stopping is concerned with the
problem of choosing a time to take a particular action, in order to maximize an expected reward
or minimize an expected cost.
• If the decision to hire could be taken after interviewing all n candidates, a simple solution
would be the maximum-selection algorithm: track the running maximum (and who achieved it) and
select the overall maximum at the end.
• The difficult part in this problem is that the decision must be made immediately after
interviewing a candidate.
1/e law of Optimal Strategy
• Under the optimal strategy, the win probability is always at least 1/e.
• The optimal stopping rule always rejects the first n/e applicants interviewed (where e ≈ 2.71828
is the base of the natural logarithm),
• then stops at the first applicant who is better than every applicant interviewed so far (or
continues to the last applicant if this never occurs).
• This strategy is called the 1/e stopping rule because the probability of selecting the best
candidate is about 1/e; in other words, it selects the best candidate about 37% of the time.
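The stopping rule above can be tried out empirically. This is a minimal simulation sketch, assuming applicants are represented by distinct integer scores (higher is better) and the sample size is n/e rounded down; the function names are illustrative.

```python
import math
import random


def one_over_e_rule(scores):
    """Return the index of the applicant hired by the 1/e stopping rule."""
    n = len(scores)
    sample_size = int(n / math.e)  # applicants rejected outright
    benchmark = max(scores[:sample_size], default=float("-inf"))
    for i in range(sample_size, n):
        if scores[i] > benchmark:  # first applicant better than all seen so far
            return i
    return n - 1  # nobody beat the benchmark: take the last applicant


def estimate_success(n, trials, seed=0):
    """Fraction of random interview orders in which the best applicant is hired."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        scores = list(range(n))
        rng.shuffle(scores)
        if scores[one_over_e_rule(scores)] == n - 1:
            wins += 1
    return wins / trials


if __name__ == "__main__":
    print(estimate_success(50, 20000))
```

The estimate should land near 0.37 for moderately large n, in line with the 37% figure quoted above.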
Sample Space and Selection Space
• It might seem obvious that one cannot sensibly select the first candidate, because there is no one
to compare the first candidate with.
• A better strategy is to choose a few candidates as a sample that sets the benchmark for the remaining candidates.
• The sample is rejected and is used only for setting the benchmark.
• If the sample is too small, we don’t get enough information to set the benchmark for the remaining
candidates.
• If the sample is too large, we get plenty of information but have also rejected too many of the
potential candidates. This leaves very few candidates to choose from, which makes the strategy a
poor one.
• The best strategy is to choose the perfect or optimal sample size (ideal sample size) which can
be done using 1/e law that is rejecting n/e candidates (this n/e is the sample size).
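The optimal sample size and probability of success
• If the first k of the n applicants are rejected as the sample, the probability of success for
large n is approximately P(x) = −x ln x, where x = k / n.
• P(x) is maximized at x = 1/e, which gives the sample size k = n/e.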
• With the optimal sample size, the probability of success is approximately 0.368.
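The optimal sample size can also be found by exact computation for a finite n. The sketch below assumes the standard counting argument: if the first k applicants are rejected and the best applicant sits at position i > k, the rule succeeds exactly when the best of the first i − 1 applicants is in the sample, which happens with probability k/(i − 1). The function names are illustrative.

```python
import math


def success_probability(n, k):
    """Exact probability of hiring the best of n applicants after rejecting the first k."""
    if k == 0:
        return 1 / n  # the first applicant is hired unconditionally
    # (k/n) * sum over i = k+1..n of 1/(i-1), written with j = i - 1
    return (k / n) * sum(1 / j for j in range(k, n))


def best_sample_size(n):
    """Sample size k in 0..n-1 that maximizes the success probability."""
    return max(range(n), key=lambda k: success_probability(n, k))


if __name__ == "__main__":
    n = 100
    k = best_sample_size(n)
    print(k, success_probability(n, k), 1 / math.e)
```

For n = 100 the best k/n and the resulting probability should both be close to 1/e ≈ 0.368.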
Paging Algorithm
[Figure: CPU and RAM]
Paging Algorithms
• Data brought from slow memory into small fast
memory (cache) of size k
• Sequence of requests: equal size pages
• Hit: page in cache
• Fault: page not in cache
Minimizing Paging Faults
• On a fault evict a page from cache
• Paging algorithm ≡ Eviction policy
p1 p2 p3 p4 p5 p1 p2 p3 p4 p5 p1 p2 p3 p4 … 8 faults
Paging- Cache Replacement Policies
Problem Statement:
•There are two levels of memory:
– fast memory M1 consisting of k pages (cache)
– slow memory M2 consisting of n pages (k < n).
• Pages in M1 are a strict subset of the pages in M2.
• Pages are accessible only through M1 .
• Accessing a page contained in M1 has cost 0.
• When accessing a page not in M1, it must first be brought in from M2 at a cost of 1 before it can
be accessed. This event is called a page fault.
How to choose a page to evict each time a page fault occurs in a way that minimizes the
total number of page faults over time?
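To make the cost model concrete, here is a minimal sketch that counts page faults for one eviction policy. The policy used, least recently used (LRU), is an assumption chosen for illustration and is not named in the text above; the function name is also illustrative.

```python
from collections import OrderedDict


def count_faults_lru(requests, k):
    """Count page faults for a cache of k pages, evicting the least recently used page."""
    cache = OrderedDict()  # keys are cached pages, ordered from least to most recently used
    faults = 0
    for page in requests:
        if page in cache:
            cache.move_to_end(page)  # hit: cost 0, mark as most recently used
        else:
            faults += 1  # fault: cost 1 to bring the page in from slow memory
            if len(cache) >= k:
                cache.popitem(last=False)  # evict the least recently used page
            cache[page] = True
    return faults


if __name__ == "__main__":
    pages = ["p1", "p2", "p3", "p4", "p5"]
    requests = [pages[i % 5] for i in range(14)]
    print(count_faults_lru(requests, k=4))
```

On a cyclic request sequence over five pages with k = 4, this particular policy faults on every request, which shows how much the choice of eviction policy matters.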
Paging- An Optimal Offline Algorithm
Online Paging Algorithms
Paging- a bound for any deterministic online algorithm
MTF List Problem
Typically when we solve a problem we assume that we know all the data a priori.
However, in many situations the input is only presented to us as we proceed.
Definition:
The competitive ratio of algorithm A is C_A if, for any n > N_0 and for any request sequence R_n,
(cost of A on R_n) ≤ C_A · (cost of OPT on R_n) + c,
where c is independent of n.
Definition 1:
An online algorithm A_on is α-competitive if for all input sequences σ,
C_Aon(σ) ≤ α · C_OPT(σ).
In order to evaluate the online strategy we will compare its performance with that of
the best offline algorithm.
This is also called competitive analysis.
Definition 2:
An online algorithm A_on is α-competitive if there is a constant c such that for all input
sequences σ,
C_Aon(σ) ≤ α · C_OPT(σ) + c.
Definition
Input: a linked list x_1, x_2, …, x_l and
a sequence I of requested accesses, I = σ_1, σ_2, …, σ_n,
where for every i, σ_i ∈ {x_1, x_2, …, x_l}.
The cost of accessing σ_i is the position of the item in the list, counted from
the front.
Given I (online), our objective is to minimize the total cost of accessing the items in
the list.
The List Accessing Problem
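• While processing the accesses we can modify the list in two ways:
• free exchanges: immediately after an access, the requested item can be moved to any position
closer to the front at no extra cost;
• paid transpositions: at any time we can swap two adjacent list items
at a cost of 1.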
Deterministic Online Algorithms
Move-To-Front (MTF)
Move the requested item to the front of the list.
Transpose (TRANS)
Exchange the requested item with the immediately preceding item in the list.
Frequency-Count (FC)
Maintain a frequency count for each item in the list. Items are stored in non-increasing
order of frequency count. After an item is accessed, its frequency counter is
incremented and the item is moved forward (if necessary) to maintain the list order.
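The first two rules can be sketched in a few lines. This is a minimal illustration, assuming a Python list stands in for the linked list and the cost of an access is the 1-based position of the item; the function names are illustrative.

```python
def serve_mtf(items, requests):
    """Serve requests with Move-To-Front; return (total access cost, final list)."""
    lst = list(items)
    total = 0
    for x in requests:
        pos = lst.index(x)
        total += pos + 1  # cost is the position counted from the front
        lst.insert(0, lst.pop(pos))  # move the requested item to the front
    return total, lst


def serve_transpose(items, requests):
    """Serve requests with Transpose; return (total access cost, final list)."""
    lst = list(items)
    total = 0
    for x in requests:
        pos = lst.index(x)
        total += pos + 1
        if pos > 0:  # swap with the immediately preceding item
            lst[pos - 1], lst[pos] = lst[pos], lst[pos - 1]
    return total, lst


if __name__ == "__main__":
    print(serve_mtf([1, 2, 3], [3, 3, 1]))
    print(serve_transpose([1, 2, 3], [3, 3, 1]))
```

On the list 1, 2, 3 with requests 3, 3, 1, Move-To-Front pays 3 + 1 + 2 = 6 while Transpose pays 3 + 2 + 2 = 7, because Transpose brings the repeated item forward only one position at a time.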
We will prove the following two facts:
Theorem 1:
The Move-To-Front algorithm is 2-competitive (c = 2).
Theorem 2:
Let A be a deterministic online algorithm for the List Accessing Problem. If A is c-competitive,
then c ≥ 2 − 2/(l+1).
Note that Theorem 2 proves a lower bound on the competitiveness.
Proof 1:
Definitions: the potential function Φ. For any t, 1 ≤ t ≤ n,
Φ(t) = the number of inversions in Move-To-Front’s list with respect to OPT’s list, after
σ_t is served.
An inversion is a pair x, y of items such that x occurs before y in Move-To-Front’s list and
after y in OPT’s list.
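For any 1 ≤ t ≤ n, let C_MTF(t) and C_OPT(t) be the actual costs incurred by MTF and OPT in
serving σ_t.
If C_MTF(t) + Φ(t) − Φ(t−1) ≤ 2·C_OPT(t) − 1 for every t, then summing over t gives
Σ_{t=1}^{n} C_MTF(t) + Φ(n) − Φ(0) ≤ 2·Σ_{t=1}^{n} C_OPT(t) − n.
Since Φ(0) = 0 (both lists start in the same order) and Φ(n) ≥ 0, the total cost of MTF is at
most twice the total cost of OPT.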
Proof 2:
Let OPT be the optimum static offline algorithm. OPT first sorts the
items in the list in order of nonincreasing request frequencies and
then serves I without making any exchanges.
If the list is sorted by request frequencies, the worst case is that all
frequencies are equal to n/l (then we gain nothing from sorting).
Thus the accesses cost at most
Σ_{i=1}^{l} i·f_i ≤ Σ_{i=1}^{l} i·(n/l) = (n/l)·l(l+1)/2 = n(l+1)/2,
where f_i is the request frequency of the item at position i.
We can use the static offline algorithm in place of OPT because we are proving a lower
bound:
C_OPT(I) ≤ C_STATIC(I).
Each request is made to the item stored at the last position in A’s list. There are n requests
and each costs l, so A’s total cost is C_A(I) = nl.
If the frequencies are not equal, the static cost will be lower, because the more
frequent items are placed closer to the front, giving more cheap accesses and fewer expensive
ones.
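Rearranging the list costs at most l(l−1)/2. Then the requests in I can be served at a
cost of at most n(l+1)/2.
Thus
C_OPT(I) ≤ n(l+1)/2 + l(l−1)/2 = ((l+1)/(2l))·nl + (l² − l)/2.
Multiplying by 2l/(l+1) and using C_A(I) = nl:
(2l/(l+1))·C_OPT(I) ≤ nl + (l³ − l²)/(l+1) = C_A(I) + (l³ − l²)/(l+1).
The additive term does not depend on n, so
c ≥ 2l/(l+1) = 2 − 2/(l+1).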
The theorem follows because the competitive ratio must hold for all list lengths.
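The adversary argument can be replayed numerically. The sketch below assumes Move-To-Front as the online algorithm A (any deterministic update rule could be plugged in) and charges the static offline algorithm the full l(l−1)/2 rearranging bound; the function names are illustrative.

```python
def move_to_front(lst, pos):
    """Update rule for A: move the item at index pos to the front."""
    lst.insert(0, lst.pop(pos))


def adversary_cost(l, n, update):
    """Always request the last item in A's list; return (A's cost, request frequencies)."""
    lst = list(range(l))
    freq = [0] * l
    total = 0
    for _ in range(n):
        item = lst[-1]  # the adversary picks the most expensive item
        freq[item] += 1
        total += l  # accessing the last position costs l
        update(lst, l - 1)
    return total, freq


def static_offline_cost(freq):
    """Upper bound on the static offline cost: rearrange once, then serve with no exchanges."""
    l = len(freq)
    rearranging = l * (l - 1) // 2  # worst-case cost of sorting by frequency
    by_frequency = sorted(freq, reverse=True)  # most frequent item first
    accesses = sum(position * f for position, f in enumerate(by_frequency, start=1))
    return rearranging + accesses


if __name__ == "__main__":
    l, n = 5, 1000
    cost_a, freq = adversary_cost(l, n, move_to_front)
    cost_static = static_offline_cost(freq)
    print(cost_a, cost_static, cost_a / cost_static, 2 - 2 / (l + 1))
```

For l = 5 and n = 1000, A pays nl = 5000 while the static bound is 3000 + 10 = 3010, so the ratio is just under 2 − 2/(l+1) and approaches it as n grows.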
Reference