Comparison of Optimization Algorithms: SGD, RMSprop, and Adam

This document compares three optimization algorithms: stochastic gradient descent (SGD), RMSprop, and Adam. SGD updates model parameters iteratively using mini-batches. RMSprop adapts the learning rate of each parameter individually. Adam combines the advantages of RMSprop and momentum, maintaining per-parameter learning rates and running averages of past gradients. The document also includes plots comparing the algorithms on convergence speed, robustness, and ease of use.

Stochastic Gradient Descent (SGD):

Characteristics:

- Iteratively updates the model parameters using the gradients of the loss function with respect to the parameters.
- Randomly selects a subset of the training data (mini-batch) for each iteration.

Advantages:

- Simplicity and ease of implementation.
- Can perform well on large-scale datasets.

Drawbacks:

- Prone to getting stuck in local minima.
- Can have slow convergence, especially in the presence of noisy gradients.
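
As a minimal sketch of the update rule described above, the following NumPy snippet runs mini-batch SGD on a toy linear-regression problem (the synthetic data, learning rate, and batch size are illustrative assumptions, not values from any real experiment):

import numpy as np

# Toy linear-regression data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
true_w = np.array([1.0, -2.0, 0.5, 3.0, 0.0])
y = X @ true_w + rng.normal(scale=0.1, size=1000)

w = np.zeros(5)            # model parameters
lr, batch_size = 0.05, 32  # assumed hyperparameters

for epoch in range(20):
    idx = rng.permutation(len(X))              # shuffle, then take mini-batches
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]  # randomly selected subset
        Xb, yb = X[batch], y[batch]
        grad = 2 * Xb.T @ (Xb @ w - yb) / len(batch)  # gradient of the MSE loss
        w -= lr * grad                                # SGD step: w <- w - lr * grad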

RMSprop (Root Mean Square Propagation):

Characteristics:

- Adapts the learning rate of each parameter individually.
- Divides the learning rate by the root mean square of an exponentially weighted moving average of squared gradients.

Advantages:

- Effective in dealing with sparse data and non-stationary objectives.
- Helps overcome some of the issues with a constant learning rate in SGD.

Drawbacks:

- May suffer from vanishing or exploding learning rates.
- Requires tuning of additional hyperparameters.
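
A minimal sketch of the RMSprop update, assuming its standard form with decay rate rho and a small epsilon for numerical stability (the toy objective and hyperparameter values are illustrative):

import numpy as np

def rmsprop_step(w, grad, sq_avg, lr=0.01, rho=0.9, eps=1e-8):
    # Exponentially weighted moving average of squared gradients
    sq_avg = rho * sq_avg + (1 - rho) * grad**2
    # Divide the learning rate by the root mean square of that average
    w = w - lr * grad / (np.sqrt(sq_avg) + eps)
    return w, sq_avg

# Toy usage on f(w) = sum(w**2), whose gradient is 2*w (hypothetical example)
w = np.array([5.0, -3.0])
sq_avg = np.zeros_like(w)
for _ in range(200):
    w, sq_avg = rmsprop_step(w, 2 * w, sq_avg, lr=0.05)
print(w)  # both components move toward the minimum at 0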

Adam (Adaptive Moment Estimation):

Characteristics:

- Combines the advantages of both RMSprop and momentum.
- Maintains a separate learning rate for each parameter via exponentially decaying averages of past gradients and past squared gradients.

Advantages:

- Fast convergence and robustness to noisy gradients.
- Automatic adjustment of the learning rate for each parameter.

Drawbacks:

- May exhibit erratic behavior on some non-convex optimization problems.
- Introduces additional hyperparameters that need tuning.
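
A minimal sketch of the Adam update, assuming the usual defaults beta1=0.9 and beta2=0.999 with bias correction of both moment estimates (the toy objective and step count are illustrative):

import numpy as np

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad       # decaying average of past gradients (momentum)
    v = beta2 * v + (1 - beta2) * grad**2    # decaying average of past squared gradients (RMSprop)
    m_hat = m / (1 - beta1**t)               # bias-corrected first moment
    v_hat = v / (1 - beta2**t)               # bias-corrected second moment
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter adaptive step
    return w, m, v

# Toy usage on f(w) = sum(w**2), whose gradient is 2*w (hypothetical example)
w = np.array([5.0, -3.0])
m, v = np.zeros_like(w), np.zeros_like(w)
for t in range(1, 201):
    w, m, v = adam_step(w, 2 * w, m, v, t, lr=0.1)
print(w)  # both components move toward the minimum at 0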

import matplotlib.pyplot as plt

optimizers = ['SGD', 'RMSprop', 'Adam']

convergence_speed = [3, 4, 5]  # Hypothetical scores (higher is better)
robustness = [3, 4, 5]         # Hypothetical scores (higher is better)
ease_of_use = [4, 3, 3]        # Hypothetical scores (higher is better)

fig, ax = plt.subplots()
bar_width = 0.25
index = range(len(optimizers))

bar1 = ax.bar(index, convergence_speed, bar_width, label='Convergence Speed')
bar2 = ax.bar([i + bar_width for i in index], robustness, bar_width, label='Robustness')
bar3 = ax.bar([i + 2 * bar_width for i in index], ease_of_use, bar_width, label='Ease of Use')

ax.set_xlabel('Optimizers')
ax.set_ylabel('Score')
ax.set_title('Comparison of Optimization Algorithms')
ax.set_xticks([i + bar_width for i in index])
ax.set_xticklabels(optimizers)
ax.legend()

plt.show()

import pandas as pd
import plotly.express as px

# Data for the line plot
data = {
    'Optimizer': ['SGD', 'RMSprop', 'Adam'],
    'Convergence Speed': [3, 4, 5],
    'Robustness': [3, 4, 5],
    'Ease of Use': [4, 3, 3],
}

# Create a DataFrame from the data
df = pd.DataFrame(data)

# Melt the DataFrame so each row holds one (optimizer, metric, score) triple
df_melted = pd.melt(df, id_vars='Optimizer', var_name='Metric', value_name='Score')

# Create a line plot using Plotly Express
fig = px.line(df_melted, x='Metric', y='Score', color='Optimizer',
              title='Comparison of Optimization Algorithms')

# Update layout for better visualization
fig.update_layout(
    xaxis_title='Metric',
    yaxis_title='Score',
    height=500,
    width=800
)

# Show the plot
fig.show()
