2019 EDAPS19 Tutorial Bayesian Optimization Final Update v1
IDLAB
-3- BO EXAMPLE
DEPARTMENT OF INFORMATION TECHNOLOGY
GAUSSIAN PROCESS (GP)
• Definition
a Gaussian Process (GP) is a collection of random variables,
any finite number of which have a joint Gaussian distribution.
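As a minimal sketch of this definition (not taken from the slides): evaluating a GP on a finite grid of time points gives a jointly Gaussian vector, so a sample curve is one draw from a multivariate normal. The squared-exponential kernel, its hyperparameters, and the prior mean of 70 heartbeats/min are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(a, b, variance=25.0, lengthscale=4.0):
    """Squared-exponential covariance between two 1D input arrays (assumed kernel)."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)

rng = np.random.default_rng(0)
hours = np.linspace(0.0, 24.0, 49)            # finite set of time points, 0.5 h apart
mean = np.full(hours.shape, 70.0)             # hypothetical prior mean (heartbeats/min)
cov = rbf_kernel(hours, hours) + 1e-6 * np.eye(hours.size)  # jitter for stability

# Any finite set of GP values is jointly Gaussian, so two sample curves are
# two draws from N(mean, cov), generated here through a Cholesky factor.
chol = np.linalg.cholesky(cov)
samples = mean + (chol @ rng.standard_normal((hours.size, 2))).T

# The marginal at a single time (index 24 is 12:00) is a 1D Gaussian
# with mean 70 and variance cov[24, 24].
print(samples.shape, hours[24], cov[24, 24])
```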
GAUSSIAN PROCESS - HEARTBEAT
[Figure: heart rate (heartbeats/min, 60 to 80) versus time (0 to 24 h), with the Gaussian distribution of the heart rate at 12:00]
[Figure: draw 2 samples from the Gaussian prior; heart rate (heartbeats/min) versus time (0 to 24 h), and the joint distribution of the heart rate at 06:00 versus 18:00]
3 observations (evidence)
Gaussian Process: mean + variance function
[Figure: heart rate (heartbeats/min) versus time (0 to 24 h), showing the 3 observations and the resulting Gaussian Process mean and variance]
a Gaussian Process knows what it doesn’t know
[Figure: six panels of heart rate (heartbeats/min) versus time (0 to 24 h)]
slide: Michael Pearce, University of Warwick & Juergen Branke, Warwick Business School
GAUSSIAN PROCESS
• joint distribution and marginal distribution: all Gaussians
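A short worked statement of why the joint and marginal distributions stay Gaussian, using standard multivariate normal identities. The block notation ($y_a$, $y_b$, $\mu$, $\Sigma$) is introduced here for illustration and does not appear on the slides; the conditional is included because it is the identity behind GP inference.

```latex
\begin{bmatrix} y_a \\ y_b \end{bmatrix}
\sim \mathcal{N}\!\left(
\begin{bmatrix} \mu_a \\ \mu_b \end{bmatrix},
\begin{bmatrix} \Sigma_{aa} & \Sigma_{ab} \\ \Sigma_{ba} & \Sigma_{bb} \end{bmatrix}
\right)
\quad\Longrightarrow\quad
y_a \sim \mathcal{N}(\mu_a, \Sigma_{aa}),
\qquad
y_a \mid y_b \sim \mathcal{N}\!\left(
\mu_a + \Sigma_{ab}\Sigma_{bb}^{-1}(y_b - \mu_b),\;
\Sigma_{aa} - \Sigma_{ab}\Sigma_{bb}^{-1}\Sigma_{ba}
\right)
```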
GAUSSIAN PROCESS
• Non-parametric: data-driven
• Automatic hyperparameter tuning: learnt from data
• Interpretable
GAUSSIAN PROCESS
• Classification example
[Figure: classification example comparing a Gaussian Process (GP), a neural network (NN) and a random forest (RF)]
• Achilles’ heel of ML
• NN & RF: over-confidence far from training data ➜ untrustworthy
• GP: key benefit
PROBLEM SETTING
• Modern engineering problems
• Large-scale problems
• Many parameters
• Multiple design requirements
• …
[Chinea]
• Difficult to design and optimize
• Expensive simulations / measurements
• no global analytical cost function
PROBLEM SETTING
• Global optimization problem
• Given $f: \mathcal{X} \to \mathbb{R}$ where $\mathcal{X} \subset \mathbb{R}^d$
$$x^* = \arg\min_{x \in \mathcal{X}} f(x) \quad \text{s.t.} \quad g(x) < 0$$
BAYESIAN OPTIMIZATION
• 2 key elements
• Surrogate-based approach
= Mathematical model
• Sequential sampling
BAYESIAN OPTIMIZATION
• Flow chart
[Figure: flow chart. Initial samples $X_n = \{x_i\}_{i=1}^{n}$ → Function evaluation $f(X_n)$ (expensive) → Model update (mathematical model, Key 1) → Acquisition function (Key 2) → Function evaluation $f(x_{n+1})$ → Model update → ... → Final Results]
BAYESIAN OPTIMIZATION (BO) Key 1
• 2 key elements
• Key 1: Surrogate-based approach
• Probabilistic surrogate model: Gaussian Process (GP)
• Key 2: Sequential sampling
BO: MODEL BUILDING Key 1
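• Gaussian distribution with mean $\mu$ and variance $\sigma^2$
$$\mathcal{N}_{\mu,\sigma}(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}\right)$$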
BO: MODEL BUILDING - KERNELS Key 1
[Adams]
BO: MODEL BUILDING Key 1
• Bayes’s rule:
the posterior of a model M given data D
is proportional to the likelihood of D given M
multiplied by the prior probability of M
$$\underbrace{P(M \mid D)}_{\text{posterior}} = \frac{\overbrace{P(D \mid M)}^{\text{likelihood}}\;\overbrace{P(M)}^{\text{prior}}}{\underbrace{P(D)}_{\text{marginal}}}$$
BO: MODEL BUILDING - INFERENCE Key 1
• Posterior variance
$$\sigma^2(x) = k(x, x) - k(x, x_{1:i})^{T} \left(K + \sigma_n^2 I\right)^{-1} k(x, x_{1:i})$$
computational complexity
• training = O(N³)
• prediction mean = O(N)
• prediction variance = O(N²)
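The posterior variance expression and the complexity figures above can be illustrated with a small sketch. It is not the tutorial's implementation: the squared-exponential kernel, its hyperparameters, the constant prior mean, and the three observation values are assumptions chosen to resemble the heartbeat example.

```python
import numpy as np
from scipy.linalg import solve_triangular

def rbf_kernel(a, b, variance=25.0, lengthscale=4.0):
    """Squared-exponential covariance between 1D input arrays (assumed kernel)."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise_var=1e-4, prior_mean=70.0):
    """Posterior mean and variance of a GP at x_test."""
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(x_train.size)
    L = np.linalg.cholesky(K)                        # training: O(N^3), done once
    resid = y_train - prior_mean
    alpha = solve_triangular(L.T, solve_triangular(L, resid, lower=True), lower=False)
    k_star = rbf_kernel(x_train, x_test)             # shape (N, M)
    mean = prior_mean + k_star.T @ alpha             # O(N) per test point
    v = solve_triangular(L, k_star, lower=True)      # O(N^2) per test point
    # sigma^2(x) = k(x, x) - k^T (K + sigma_n^2 I)^{-1} k
    var = rbf_kernel(x_test, x_test).diagonal() - np.sum(v * v, axis=0)
    return mean, np.maximum(var, 0.0)

# Three hypothetical observations (hour of day, heartbeats/min).
x_obs = np.array([6.0, 12.0, 18.0])
y_obs = np.array([62.0, 75.0, 70.0])
x_new = np.array([12.0, 23.0])                       # at an observation, and far from all

mu, var = gp_posterior(x_obs, y_obs, x_new)
print(mu, var)
```

The variance is close to zero at the observed time and much larger at 23:00, which is the sense in which a Gaussian Process knows what it doesn't know.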
BO: MODEL BUILDING - EXAMPLE Key 1
[Figure: sampled curves shown in turn as 1 curve, 3 curves, 100 curves, many curves, and infinite curves (the ’ensemble’)] [Gonzàlez]
BAYESIAN OPTIMIZATION (BO) Key 2
• 2 key elements
• Surrogate-based approach
• Gaussian Process (GP)
• Sequential sampling (Key 2)
• Acquisition Function
BO: ACQUISITION FUNCTION Key 2
• Sequential sampling
• Find a balance between:
• Exploitation
• Sample where the model prediction is high (when maximizing) or low (when minimizing)
• Exploration
• Sample where the model uncertainty is high
BO: ACQUISITION FUNCTION - EXAMPLE Key 2
[Figure: example marking the exploitation and exploration regions]
BO: ACQUISITION FUNCTION - POI Key 2
• Probability of Improvement (PoI)
$$a(x) = P\big(f(x) > f(x_{\text{best}})\big) = \Phi\!\left(\frac{\mu(x) - f(x_{\text{best}})}{\sigma(x)}\right)$$
• $\mu(x)$, $\sigma(x)$: posterior mean and posterior standard deviation
• $\Phi$: standard normal cumulative distribution function (CDF)
[Figure: PoI]
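A minimal sketch of how the PoI expression could be evaluated from a posterior mean and standard deviation. The function name, the input values, and the small floor on the standard deviation are illustrative assumptions. The formula is written for maximization, as on the slide.

```python
import numpy as np
from scipy.stats import norm

def probability_of_improvement(mu, sigma, f_best):
    """PoI for maximization: Phi((mu - f_best) / sigma)."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.maximum(np.asarray(sigma, dtype=float), 1e-12)  # avoid division by zero
    return norm.cdf((mu - f_best) / sigma)

# Three hypothetical candidate points with posterior mean and standard deviation.
poi = probability_of_improvement(mu=[1.0, 2.0, 3.0], sigma=[1.0, 1.0, 0.5], f_best=2.0)
print(poi)
```

A candidate whose posterior mean equals the current best has a PoI of exactly 0.5, regardless of its uncertainty.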
BO: ACQUISITION FUNCTION - EI Key 2
• Expected Improvement (EI)
• Uses the standard normal probability density function (PDF)
• Sum of an exploitation term and an exploration term
BAYESIAN OPTIMIZATION IN A NUTSHELL
• Bayesian Optimization = strategy to transform
$$x_M = \arg\max_{x \in C} f(x) \quad \text{(unsolvable)}$$
WRAP-UP: BAYESIAN OPTIMIZATION
[Figure: flow chart. Initial samples → Function evaluation → Model update (Key 1: Gaussian Process) → Acquisition function (Key 2) → Final Results]
BAYESIAN OPTIMIZATION - EXAMPLE
• Problem: Find minimum of 1D function
[Figure: successive iterations with 2, 3, 4, 7 and 11 samples]
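The loop illustrated by this example can be sketched in a few lines. This is a toy version under stated assumptions, not the tutorial's code: the objective is a hypothetical stand-in for the expensive function, the Matérn 3/2 hyperparameters are fixed instead of being tuned, Expected Improvement is written for minimization, and the acquisition function is maximized on a dense grid.

```python
import numpy as np
from scipy.stats import norm

VARIANCE, LENGTHSCALE, NOISE_VAR = 1.0, 0.5, 1e-6    # assumed, fixed hyperparameters

def objective(x):
    """Hypothetical stand-in for the expensive 1D function."""
    return np.sin(3.0 * x) + x**2 - 0.7 * x

def matern32(a, b):
    """Matern 3/2 covariance between 1D input arrays."""
    r = np.abs(a[:, None] - b[None, :]) / LENGTHSCALE
    return VARIANCE * (1.0 + np.sqrt(3.0) * r) * np.exp(-np.sqrt(3.0) * r)

def gp_posterior(x_train, y_train, x_test):
    """Zero-mean GP posterior mean and standard deviation at x_test."""
    K = matern32(x_train, x_train) + NOISE_VAR * np.eye(x_train.size)
    k_star = matern32(x_train, x_test)
    mean = k_star.T @ np.linalg.solve(K, y_train)
    var = VARIANCE - np.sum(k_star * np.linalg.solve(K, k_star), axis=0)
    return mean, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement_min(mu, sigma, y_best):
    """EI for minimization: the improvement is y_best - f(x)."""
    z = (y_best - mu) / sigma
    return (y_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

candidates = np.linspace(-1.0, 2.0, 601)             # dense grid over the domain
x_samples = np.array([-1.0, 2.0])                    # 2 initial samples
y_samples = objective(x_samples)

while x_samples.size < 11:                           # continue until 11 samples
    mu, sigma = gp_posterior(x_samples, y_samples, candidates)    # model update
    ei = expected_improvement_min(mu, sigma, y_samples.min())     # acquisition
    x_next = candidates[np.argmax(ei)]                            # next sample
    x_samples = np.append(x_samples, x_next)
    y_samples = np.append(y_samples, objective(x_next))           # evaluation

best = np.argmin(y_samples)
print(f"best x = {x_samples[best]:.3f}, f(x) = {y_samples[best]:.4f}")
```

In a real setting the grid search would be replaced by a proper optimizer of the acquisition function and the hyperparameters would be re-estimated after each model update.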
BENDED INTERCONNECTION
• Described in [Gazda, 2010]
• Simulated in ADS Momentum
• Objective function:
[Figure: Bend]
BENDED INTERCONNECTION – INIT
• Design parameters
• Width ∈ [0.5, 2.1] mm
• Length ∈ [45, 55] mm
• Goal: Minimize differential-to-common-mode conversion
[Equation: objective function TDMCM, an integral over frequency of S-parameter terms; expensive to evaluate]
[Figure: Bend]
• Key 1: Gaussian Process
BENDED INTERCONNECTION – GP Key 1
• GP Model Building
• Covariance Function: Matérn 3/2
= able to model a wide class of functions (including rough ones that are only once differentiable)
$$k(x, x') = \sigma^2 \left(1 + \frac{\sqrt{3}\,\lVert x - x' \rVert}{\ell}\right) \exp\!\left(-\frac{\sqrt{3}\,\lVert x - x' \rVert}{\ell}\right)$$
• Hyperparameter tuning
• 3 initial samples chosen from a Latin Hypercube Design (LHD)
• tuning via Maximum Likelihood Estimation (MLE) [Williams, 2006]
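A minimal sketch of this setup: three initial designs from a Latin hypercube over the width and length ranges, and a Matérn 3/2 covariance matrix between them. It uses SciPy's `scipy.stats.qmc` module, not the tutorial's own tooling. Evaluating the kernel on inputs scaled to the unit square, and the unit variance and length scale, are assumptions; in the tutorial the hyperparameters are tuned by MLE.

```python
import numpy as np
from scipy.stats import qmc

# Design-space bounds from the slide: width and length in mm.
lower = np.array([0.5, 45.0])
upper = np.array([2.1, 55.0])

# Latin hypercube: with n=3 samples, each third of every axis holds one sample.
sampler = qmc.LatinHypercube(d=2)
unit_samples = sampler.random(n=3)                   # points in [0, 1)^2
designs = qmc.scale(unit_samples, lower, upper)      # (width, length) in mm

def matern32(X1, X2, variance=1.0, lengthscale=1.0):
    """Matern 3/2 covariance between two sets of d-dimensional points."""
    diff = X1[:, None, :] - X2[None, :, :]
    r = np.sqrt(np.sum(diff**2, axis=-1)) / lengthscale
    return variance * (1.0 + np.sqrt(3.0) * r) * np.exp(-np.sqrt(3.0) * r)

# Covariance between the three initial designs (on the unit-scaled inputs).
K = matern32(unit_samples, unit_samples)
print(designs)
print(K)
```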
BENDED INTERCONNECTION – AQ Key 2
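• Acquisition function
• Expected Improvement
$$a(x) = \big(\mu(x) - f(x_{\text{best}})\big)\,\Phi\!\left(\frac{\mu(x) - f(x_{\text{best}})}{\sigma(x)}\right) + \sigma(x)\,\phi\!\left(\frac{\mu(x) - f(x_{\text{best}})}{\sigma(x)}\right)$$
[Figure: Bend]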
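A minimal sketch of the Expected Improvement expression, evaluated from a posterior mean and standard deviation. The function name and input values are illustrative assumptions. The expression is in its maximization form, so for a minimization goal such as this one a common choice is to apply it to the negated objective (pass `-mu` and `-f_best`).

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, f_best):
    """EI (maximization form): (mu - f_best) * Phi(z) + sigma * phi(z)."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.maximum(np.asarray(sigma, dtype=float), 1e-12)  # avoid division by zero
    z = (mu - f_best) / sigma
    exploitation = (mu - f_best) * norm.cdf(z)   # large where the prediction is good
    exploration = sigma * norm.pdf(z)            # large where the model is uncertain
    return exploitation + exploration

# Two hypothetical candidates: one at the current best with unit uncertainty,
# one clearly worse than the current best with almost no uncertainty.
ei = expected_improvement(mu=[2.0, 0.0], sigma=[1.0, 1e-9], f_best=2.0)
print(ei)
```

The first candidate gets an EI of about 0.399 purely from the exploration term, while the second gets essentially zero.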
BENDED INTERCONNECTION – BO
• Bayesian Optimization software
• Several BO software packages
• GPflowOpt, GPyOpt, Spearmint, BayesOpt, RoBO
• GPflowOpt: open-source Python package, based on TensorFlow
BENDED INTERCONNECTION
• Optimization Results
Design parameters              Min objective function   Number of iterations
w = 2.094 mm, L = 54.398 mm    0.6739                   50

• Computational Time
S-parameters simulations (ADS)   86 min 35.82 s (103.92 s per sample)
Bayesian Optimization            58.58 s
Total                            87 min 34.4 s
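The timing figures can be cross-checked with a few lines of arithmetic. The only assumption is that the 50 iterations correspond to 50 simulated samples, which is consistent with the quoted per-sample time.

```python
n_samples = 50                                   # assumed: one simulation per iteration
sim_total_s = 86 * 60 + 35.82                    # S-parameter simulations, in seconds
bo_overhead_s = 58.58                            # Bayesian Optimization itself

per_sample_s = sim_total_s / n_samples           # average simulation time per sample
total_s = sim_total_s + bo_overhead_s
minutes, seconds = divmod(total_s, 60)
overhead_share = bo_overhead_s / total_s         # fraction of time spent in BO

print(f"{per_sample_s:.2f} s per sample")
print(f"total: {int(minutes)} min {seconds:.1f} s")
print(f"BO overhead: {100 * overhead_share:.1f} % of the total")
```

The optimizer accounts for roughly 1 % of the total run time; the simulations dominate, which is the setting in which Bayesian Optimization is meant to be used.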
-4- WRAP-UP
WRAP-UP: BAYESIAN OPTIMIZATION
[Figure: flow chart. Initial samples → Function evaluation → Model update (Key 1: Gaussian Process) → Acquisition function (Key 2) → Final Results]
Questions?
REFERENCES
• [Gonzàlez]: J. Gonzàlez, “Introduction to Bayesian Optimization”, Masterclass, Lancaster University, Lancaster, UK, 2017
• [Adams]: R. P. Adams, “A Tutorial on Bayesian Optimization for Machine Learning”, Harvard University, Cambridge, MA, USA
• [Gazda]: C. Gazda, D. Vande Ginste, H. Rogier, R. Wu and D. De Zutter, “A Wideband Common-Mode Suppression Filter for Bend Discontinuities in Differential Signaling Using Tightly Coupled Microstrips”, IEEE Transactions on Advanced Packaging, vol. 33, no. 4, pp. 969-978, Nov. 2010
• [Williams]: C. E. Rasmussen and C. K. Williams, “Gaussian Processes for Machine Learning”, vol. 1, MIT Press, Cambridge, 2006
• [BO-GU]: Available at https://github.com/GPflow/GPflowOpt. Related publication: N. Knudde, J. van der Herten, T. Dhaene and I. Couckuyt, “GPflowOpt: A Bayesian Optimization Library using TensorFlow”, arXiv preprint arXiv:1711.03845, 2017. Available: https://arxiv.org/abs/1711.03845
• [Knudde]: N. Knudde, I. Couckuyt, D. Spina, K. Łukasik, P. Barmuta, D. Schreurs and T. Dhaene, “Data-Efficient Bayesian Optimization with Constraints for Power Amplifier Design”, Proceedings of IEEE MTT-S International Conference on Numerical Electromagnetic and Multiphysics Modeling and Optimization (NEMO), Reykjavik, Iceland, 2018
• [Torun]: H. M. Torun, M. Larbi and M. Swaminathan, “A Bayesian Framework for Optimizing Interconnects in High-Speed Channels”, Proceedings of IEEE MTT-S International Conference on Numerical Electromagnetic and Multiphysics Modeling and Optimization (NEMO), Reykjavik, Iceland, 2018
GENERAL REFERENCES ON BO
• C. E. Rasmussen and C. K. Williams, “Gaussian Processes for Machine Learning”, vol. 1, MIT Press, Cambridge, 2006
• B. Shahriari, K. Swersky, Z. Wang, R. P. Adams and N. de Freitas, “Taking the Human Out of the Loop: A Review of Bayesian Optimization”, Proceedings of the IEEE, vol. 104, no. 1, pp. 148-175, Jan. 2016
• E. Brochu, V. M. Cora and N. de Freitas, “A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning”, arXiv preprint arXiv:1012.2599, 2010. Available: http://arxiv.org/abs/1012.2599
• P. I. Frazier, “A Tutorial on Bayesian Optimization”, arXiv preprint arXiv:1807.02811, 2018. Available: https://arxiv.org/abs/1807.02811
Dr. Ivo Couckuyt
Dr. Domenico Spina
Prof. Tom Dhaene
Email: [email protected]
www.ugent.be
www.ugent.be/ea/idlab/en
www.imec.be