Lecture1 Intro Streaming

The document discusses a lecture on algorithms and complexity. It covers the topics of streaming algorithms, sublinear time algorithms, and distributed algorithms. It also describes a testing algorithm for list sortedness, a streaming algorithm for counting distinct elements, and mentions an algorithm for all-pairs shortest paths.

Lecture 1: Introduction

Logistics
• Prerequisites:
Algorithms + Complexity
or
Probability + Computational Models with grade
Logistics
• Grade:
• 70% exam
• 30% HW assignments (5-6)
• 5 bonus points for participating in Mentimeter quiz during class
• Participate in at least 11 (out of 13) quizzes

• Office hours: email me ([email protected])


Logistics

IF YOU DON’T FEEL WELL, STAY HOME


What Is This Course About?
• Traditional models of computing:
[Figure: an algorithm with its input data and its workspace]
This Course
• Part I: Streaming Algorithms
• Part II: Sublinear-Time Algorithms
• Part III: Distributed Algorithms
Streaming Algorithms

[Figure: the algorithm (with its workspace) and the data]

Goal: compute some function of the data
… approximately, w.h.p.
Streaming Algorithms
• Useful when:
• Data really is a stream
• Many cases where it’s not
Sublinear-Time Algorithms
[Figure: the algorithm querying a few positions (marked "?") of an input x ∈ {0,1}^n]

Goal: compute some function of x
… approximately, w.h.p.
One Current Example….
Distributed Algorithms
[Figure: several nodes, each holding one part of the data: data1, data2, data3, data4, data5]

Goal: compute some function of the data
… approximately, w.h.p.
Course Goals
• See some cool algorithms and lower bounds
• Get a “feel” for randomized algorithms and probability
Today: a Tasting Menu
• One sublinear-time algorithm
• One streaming algorithm
• One distributed algorithm
Testing List Sortedness in Sublinear Time
[Ergün, Kannan, Kumar, Rubinfeld, Viswanathan ’00]
List Sortedness
• Input: a list x = (x_1, …, x_n) of integers
• Output: is x sorted?
For every i < j: x_i ≤ x_j

• Can’t answer without reading the entire list


• What can we do?
Property Testing

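[Figure: the universe of objects. Objects with property P: YES. Objects far from P: NO. Objects close to P but not in it: ???]
• “ε-close to P”: need to change at most an ε-fraction of the object to get an object in P
• “ε-far from P”: otherwise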
Property Testing (Formally)
Given an object x and a property P, distinguish between:
• x ∈ P,
• x is ε-far from P:
for all y ∈ P we have d(x, y) > ε·|x|, where d = “edit distance”

Back to Sortedness
• “ε-close to sorted”?
• Need to change at most εn values to get a sorted list
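To make "number of values to change" concrete, here is a small Python sketch that computes it exactly. It relies on a standard fact rather than anything stated on the slides: the values that stay unchanged must form a non-decreasing subsequence, so the minimum number of changes is the list length minus the length of the longest non-decreasing subsequence. The function name `distance_to_sorted` is illustrative, and "sorted" is taken to mean non-decreasing.

```python
from bisect import bisect_right

def distance_to_sorted(xs):
    """Minimum number of values to change so that xs becomes non-decreasing."""
    # tails[k] = smallest possible last value of a non-decreasing
    # subsequence of length k + 1 found so far (patience sorting).
    tails = []
    for v in xs:
        pos = bisect_right(tails, v)
        if pos == len(tails):
            tails.append(v)   # v extends the longest subsequence
        else:
            tails[pos] = v    # v gives a smaller tail for length pos + 1
    return len(xs) - len(tails)

# The list used on the Example slide later in the deck.
xs = [3, 2, 1, 6, 5, 4, 9, 8, 7, 12, 11, 10, 15, 14, 13, 16]
changes = distance_to_sorted(xs)
print(changes, changes / len(xs))  # 10 0.625
```

This exact computation reads the whole list, so it only serves to check small examples; the point of the tester is to avoid doing this.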
Naïve Attempt
• Sample uniformly random indices i and verify x_i ≤ x_{i+1}
• How many samples are needed?
• Bad example:

• How far from sorted?


• How many samples?

• What about checking pairs x_i ≤ x_j for random i < j?


Actual Algorithm
Repeat O(1/ε) times:
• Sample a uniform index i
• Perform binary search for the value x_i
• If the binary search ends at a position different from i: reject
Finally: accept
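The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the lecture's own code: the list values are assumed distinct (so a binary search for a value can succeed in only one place), positions are 0-indexed, and the repetition count is taken to be ⌈2/ε⌉, where the constant 2 is an arbitrary illustrative choice. The names `search_position` and `sortedness_tester` are hypothetical.

```python
import math
import random

def search_position(xs, target):
    """Plain binary search; returns the index where target is found, or -1."""
    lo, hi = 0, len(xs) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if xs[mid] == target:
            return mid
        if xs[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

def sortedness_tester(xs, eps, rng=None):
    """Return True (accept) or False (reject) after about 2/eps binary searches."""
    if not xs:
        return True
    if rng is None:
        rng = random.Random()
    for _ in range(math.ceil(2 / eps)):   # assumed constant: 2
        i = rng.randrange(len(xs))        # uniform index
        if search_position(xs, xs[i]) != i:
            return False                  # binary search did not end at i
    return True

print(sortedness_tester(list(range(100)), eps=0.1))  # True: a sorted list is never rejected
```

Each repetition costs O(log n) probes, so the total number of list positions read is O(log n / ε), independent of how the list is stored beyond random access.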
Example

3 2 1 6 5 4 9 8 7 12 11 10 15 14 13 16
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Correctness
• Need to show:
• If x is sorted, we accept w.h.p.
• If x is ε-far from sorted, we reject w.h.p.

• Say i is a good index if binary search for x_i ends up in position i


• Claim: the elements at good indices are sorted!
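As an illustration of the claim, the following sketch lists the good indices of the list from the Example slide. It assumes 0-indexed positions (as in the second row of that slide) and a binary search that probes the midpoint `(lo + hi) // 2`; a different rounding rule could give a different set of good indices. The helper name `search_position` is hypothetical.

```python
def search_position(xs, target):
    """Plain binary search; returns the index where target is found, or -1."""
    lo, hi = 0, len(xs) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if xs[mid] == target:
            return mid
        if xs[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

xs = [3, 2, 1, 6, 5, 4, 9, 8, 7, 12, 11, 10, 15, 14, 13, 16]

# An index i is good if the search for xs[i] ends at position i.
good = [i for i in range(len(xs)) if search_position(xs, xs[i]) == i]
good_values = [xs[i] for i in good]

print(good)         # [1, 3, 7, 11, 13, 15]
print(good_values)  # [2, 6, 8, 10, 14, 16]
```

The values at the good indices come out in increasing order, as the claim predicts, even though the list as a whole is far from sorted.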
Proof of Claim
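• Let i < j be good indices
• Let k = last common index in the binary search for x_i and for x_j
• Then: x_i ≤ x_k ≤ x_j
(the search for x_i stops at k or continues to the left; the search for x_j stops at k or continues to the right)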
Using the Claim
• Suppose x is ε-far from sorted
⇒ at most (1 − ε)n good indices in x
(otherwise: replace just the bad indices)
⇒ at least εn bad indices in x, so:
Pr[a uniformly random index is bad] ≥ ε
• How many samples needed to find a bad index w.h.p.?
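One way to answer the question: if each sample hits a bad index with probability at least ε, then k independent samples all miss with probability at most (1 − ε)^k ≤ e^(−εk), which is at most δ once k ≥ ln(1/δ)/ε. The sketch below just evaluates this bound; the failure probability δ (`delta`) and the function name `samples_needed` are names introduced here for illustration.

```python
import math

def samples_needed(eps, delta):
    """Smallest k guaranteed by the bound (1 - eps)**k <= exp(-eps * k) <= delta."""
    return math.ceil(math.log(1 / delta) / eps)

eps, delta = 0.1, 0.01
k = samples_needed(eps, delta)
miss_probability = (1 - eps) ** k   # chance that all k samples are good indices

print(k)                          # 47
print(miss_probability <= delta)  # True
```

The count depends only on ε and δ, not on n, which is why the tester runs in sublinear time.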


Streaming Algorithm for Distinct Elements
[Flajolet, Martin ’84]
Distinct Elements
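• Input: a stream a_1, …, a_m, where each a_i ∈ {1, …, n}; output: the number of distinct elements in the stream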
• Naïve solution?

• Claim: can’t do it deterministically with o(n) bits


• Another claim: can’t do it exactly with o(n) bits. But…
• Can get a constant-factor multiplicative approximation in O(log n) bits
Lower Bound of Ω(n) for Exact, Deterministic Algorithms
Flajolet-Martin Algorithm
• Choose a random hash function h
• Define a sequence of events over the hash values h(a_i), of increasing probability
[Figure: four events, ordered by probability]
• Use the smallest event that occurred to estimate the number of elements
Flajolet-Martin Algorithm
• Sequence of events E_0, E_1, E_2, …
• Event E_j: the binary encoding of the number h(a_i) ends with j zeroes
Mentimeter Experiment
Flajolet-Martin Algorithm
• Let z = 0
• Let h be a random hash function*
• To process a_i:
• Let z_i = number of trailing zeroes in the binary representation of h(a_i)
• Set z = max(z, z_i)
• Output: 2^z
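The processing loop can be sketched in Python as follows. This is a minimal sketch under several assumptions that are not taken from the slides: the state is the largest number of trailing zeroes seen so far, the output is 2 raised to that number (the usual textbook form), stream elements are integers in [0, P), and the hash is h(x) = (a·x + b) mod P for the Mersenne prime P = 2^31 − 1. Because P is not a power of two, the trailing-zero counts are only approximately geometric. The names `FlajoletMartin`, `trailing_zeroes`, and `P` are hypothetical.

```python
import random

P = 2_147_483_647  # the prime 2**31 - 1; an assumed choice of modulus

def trailing_zeroes(v, width=31):
    """Number of trailing zero bits of v; by convention `width` when v == 0."""
    if v == 0:
        return width
    return (v & -v).bit_length() - 1   # v & -v isolates the lowest set bit

class FlajoletMartin:
    def __init__(self, rng=None):
        if rng is None:
            rng = random.Random()
        # The hash function is described by two numbers below P.
        self.a = rng.randrange(P)
        self.b = rng.randrange(P)
        self.z = 0   # largest trailing-zero count seen so far

    def process(self, item):
        h = (self.a * item + self.b) % P
        self.z = max(self.z, trailing_zeroes(h))

    def estimate(self):
        return 2 ** self.z

stream = [7, 3, 7, 7, 12, 3, 99, 12]   # 4 distinct elements
fm = FlajoletMartin(random.Random(1))
for item in stream:
    fm.process(item)
print(fm.estimate())   # always a power of two; a rough estimate only
```

Repeated elements hash to the same value, so they never change the state; this is what makes the estimate depend only on the set of distinct elements.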
Analysis of Flajolet-Martin
Analysis of Flajolet-Martin
Space Complexity
The Hash Function
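• Pairwise-independence: a family H of hash functions with range R is pairwise-independent if for every x ≠ y and every u, v ∈ R,
Pr[h(x) = u and h(y) = v] = 1/|R|², where h is uniformly random in H
• Example: for every prime p, the family
H = { h_{a,b}(x) = (a·x + b) mod p : a, b ∈ {0, …, p−1} }, with inputs and outputs in {0, …, p−1},
is pairwise-independent.
• Representing h?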
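For a small modulus the definition can be checked exhaustively. The sketch below enumerates every function (a·x + b) mod p and counts, for each pair of distinct inputs, how many functions produce each pair of outputs; pairwise independence means every output pair is produced by exactly one function, i.e. with probability 1/p². The function name `is_pairwise_independent` is illustrative.

```python
from collections import Counter
from itertools import product

def is_pairwise_independent(p):
    """Brute-force check of the family {x -> (a*x + b) mod p : 0 <= a, b < p}."""
    functions = list(product(range(p), repeat=2))   # one (a, b) per function
    for x, y in product(range(p), repeat=2):
        if x == y:
            continue
        counts = Counter(((a * x + b) % p, (a * y + b) % p)
                         for a, b in functions)
        # All p*p output pairs must appear, each for exactly one (a, b).
        if len(counts) != p * p or set(counts.values()) != {1}:
            return False
    return True

print(is_pairwise_independent(7))  # True
print(is_pairwise_independent(6))  # False: 6 is not prime
```

The check also suggests how a function from this family can be stored: it is determined by the pair (a, b), that is, two numbers below p.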
Improving the Accuracy
• Result must be of the form 2^k for an integer k
• High variance
• How to improve?
Distributed Algorithm for All-Pairs Shortest Paths
