PyTorch Tutorial: NTU Machine Learning Course
Package Description
torch: a Tensor library like NumPy, with strong GPU support (the focus of this tutorial)
torch.legacy (.nn/.optim): legacy code ported over from (Lua) Torch for backward compatibility reasons
Outline
• Neural Network in Brief
• Concepts of PyTorch
• Multi-GPU Processing
• RNN
• Transfer Learning
• Comparison with TensorFlow
Neural Network in Brief
• Supervised Learning
– Learning a function f such that f(x) = y
[Diagram: the training loop. The dataset is split into batches (Batch 1 … Batch N); each batch is fed forward through the network, gradients are propagated backward, and the optimizer updates the weights.]
Neural Network in Brief
• Inside the Neural Network
[Diagram: the forward pass carries the data through the weight layers (W, W, …, W) to a predicted label (Label'); the backward pass carries the gradient through the same layers in reverse.]
Neural Network
• What PyTorch provides for each piece:
  – NN Modules
  – Optimizer
  – Loss Function
  – Multi-Processing
Concepts of PyTorch
• Modules of PyTorch
  – Data: Tensor, Variable (for gradients)
  – Function: NN Modules, Optimizer, Loss Function, Multi-Processing
• Operations
  – z = x + y
  – torch.add(x, y, out=z)
  – y.add_(x)  # in-place
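A minimal runnable sketch of these operations (assuming PyTorch >= 0.4, where plain Tensors are used directly):

    import torch

    x = torch.ones(3)
    y = torch.ones(3)

    z = x + y                  # operator form
    torch.add(x, y, out=z)     # functional form, writes into z
    y.add_(x)                  # in-place: the trailing underscore mutates y
    print(z, y)                # tensor([2., 2., 2.]) tensor([2., 2., 2.])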
Concepts of PyTorch
• NumPy Bridge
  • To NumPy
    – a = torch.ones(5)
    – b = a.numpy()
  • To Tensor
    – a = numpy.ones(5)
    – b = torch.from_numpy(a)
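One detail worth keeping in mind: the bridge shares memory instead of copying, so mutating one side changes the other. A small sketch:

    import numpy as np
    import torch

    a = torch.ones(5)
    b = a.numpy()            # b shares memory with a
    a.add_(1)
    print(b)                 # [2. 2. 2. 2. 2.] -- the NumPy array changed too

    c = np.ones(5)
    d = torch.from_numpy(c)  # d also shares memory with c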
Concepts of PyTorch
• CUDA Tensors
  • Move to GPU
    – x = x.cuda()
    – y = y.cuda()
    – x + y  # computed on the GPU
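A minimal sketch, guarded so it only runs where a GPU is present:

    import torch

    x = torch.ones(3)
    y = torch.ones(3)
    if torch.cuda.is_available():    # guard: only move to GPU if one exists
        x = x.cuda()
        y = y.cuda()
        z = x + y                    # the addition runs on the GPU
        print(z)                     # tensor([2., 2., 2.], device='cuda:0')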
Concepts of PyTorch
• Modules of PyTorch
[Diagram: in the forward pass, a Variable wraps the Tensor data flowing through the neural network; the gradient needed for the current backward pass is handled by PyTorch automatically.]
Concepts of PyTorch
• Variable
  • x = Variable(torch.ones(2, 2), requires_grad=True)
  • print(x)
  • y = x + 2
  • z = y * y * 3
  • out = z.mean()
  • out.backward()
  • print(x.grad)
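The same example as a runnable script. Since out = mean(3(x+2)^2) and x = 1, each entry of x.grad is 6(x+2)/4 = 4.5:

    import torch
    from torch.autograd import Variable

    x = Variable(torch.ones(2, 2), requires_grad=True)
    y = x + 2
    z = y * y * 3
    out = z.mean()     # out = mean(3 * (x + 2)^2) = 27 at x = 1
    out.backward()     # d(out)/dx = 6 * (x + 2) / 4 = 4.5
    print(x.grad)      # a 2x2 tensor filled with 4.5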
• http://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html#define-the-network
Define modules (must have)
Build network (must have)
[Diagram: the tutorial's network traced step by step. Tensor layout is [Batch N, Channel, H, W]; shapes below are [Channel, H, W].]
x: 1x32x32
conv1 + relu: 1x32x32 -> 6x28x28
pooling: 6x28x28 -> 6x14x14
conv2 + relu: 6x14x14 -> 16x10x10
pooling: 16x10x10 -> 16x5x5
Flatten the Tensor: 16x5x5 -> 400
fc1 + relu
fc2 + relu
fc3
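A sketch of this network following the linked tutorial; the convolution and pooling sizes reproduce the shape trace above, and the fully connected sizes (120, 84, 10) follow the tutorial:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class Net(nn.Module):
        def __init__(self):
            super(Net, self).__init__()
            # Define modules (must have)
            self.conv1 = nn.Conv2d(1, 6, 5)    # 1x32x32 -> 6x28x28
            self.conv2 = nn.Conv2d(6, 16, 5)   # 6x14x14 -> 16x10x10
            self.fc1 = nn.Linear(16 * 5 * 5, 120)
            self.fc2 = nn.Linear(120, 84)
            self.fc3 = nn.Linear(84, 10)

        def forward(self, x):
            # Build network (must have)
            x = F.max_pool2d(F.relu(self.conv1(x)), 2)  # -> 6x14x14
            x = F.max_pool2d(F.relu(self.conv2(x)), 2)  # -> 16x5x5
            x = x.view(x.size(0), -1)                   # flatten -> [N, 400]
            x = F.relu(self.fc1(x))
            x = F.relu(self.fc2(x))
            return self.fc3(x)

    net = Net()
    out = net(torch.randn(1, 1, 32, 32))   # out.shape == [1, 10]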
Concepts of PyTorch
• NN Modules (torch.nn)
  – Modules built on Variable
  – Gradients handled by PyTorch
• Common Modules
  – Convolution layers
  – Linear layers
  – Pooling layers
  – Dropout layers
  – Etc.
NN Modules
• Convolution Layer
  – Batch size (N), channels (C)
  – torch.nn.Conv1d: input [N, C, W]        # kernel moves in 1D
  – torch.nn.Conv2d: input [N, C, H, W]     # kernel moves in 2D
  – torch.nn.Conv3d: input [N, C, D, H, W]  # kernel moves in 3D
  – Example:
  – torch.nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
[Diagram: the convolution (*) slides a kernel over an input of size Cin x Hin x Win, shown with padding p=1, kernel size k=3, and stride s=1; the original slides also mark where a dropout layer could be inserted after the convolution.]
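The output spatial size follows the standard relation Hout = floor((Hin + 2p - k) / s) + 1. A quick check against the example layer above (the input shape is an assumption for illustration):

    import torch
    import torch.nn as nn

    conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
    x = torch.randn(8, 3, 32, 32)   # [N, C, H, W]
    y = conv(x)
    print(y.shape)                  # torch.Size([8, 16, 32, 32]): (32 + 2*1 - 3)//1 + 1 = 32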
NN Modules
• Pooling Layer (k=2)
  – torch.nn.AvgPool2d(kernel_size=2, stride=2, padding=0)
  – torch.nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
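A quick sketch tying this to the earlier shape trace (the input shape is an assumption for illustration):

    import torch
    import torch.nn as nn

    pool = nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
    x = torch.randn(1, 6, 28, 28)
    print(pool(x).shape)   # torch.Size([1, 6, 14, 14]): k=2, s=2 halves H and W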
Concepts of PyTorch
• Loss Functions (torch.nn)
  – L1Loss
  – MSELoss
  – CrossEntropyLoss
  – …
  – 18 loss functions (PyTorch 0.2)
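A minimal sketch of using one of these losses (assumes PyTorch >= 0.4 for torch.tensor and .item(); shapes are illustrative):

    import torch
    import torch.nn as nn

    criterion = nn.CrossEntropyLoss()
    logits = torch.randn(4, 10)           # [batch, classes], raw scores (no softmax needed)
    target = torch.tensor([3, 0, 9, 1])   # class indices
    loss = criterion(logits, target)
    print(loss.item())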
What We Build?
Define modules (must have)
Build network (must have)
[Diagram: a two-layer fully connected network, D_in=1000 -> H=100 -> D_out=100, producing y_pred.]
Training step: Reset Gradient -> Backward -> Update Step
http://pytorch.org/tutorials/beginner/pytorch_with_examples.html#pytorch-optim
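A sketch of this model and its training step, following the linked pytorch-optim example (the batch size N=64, MSE loss, Adam, and lr=1e-4 are that example's choices; the layer sizes come from the diagram):

    import torch

    N, D_in, H, D_out = 64, 1000, 100, 100   # D_out=100 as in the slide's diagram

    x = torch.randn(N, D_in)
    y = torch.randn(N, D_out)

    # Define modules (must have)
    model = torch.nn.Sequential(
        torch.nn.Linear(D_in, H),
        torch.nn.ReLU(),
        torch.nn.Linear(H, D_out),
    )
    loss_fn = torch.nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

    for t in range(500):
        y_pred = model(x)           # Build network: forward pass
        loss = loss_fn(y_pred, y)
        optimizer.zero_grad()       # Reset Gradient
        loss.backward()             # Backward
        optimizer.step()            # Update Step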
Concepts of PyTorch
• Multi-Processing
  • Basic Method
    – torch.nn.DataParallel
    – Recommended by PyTorch
  • Advanced Methods
    – torch.multiprocessing
    – Hogwild (async)
Multi-GPU Processing
• torch.nn.DataParallel
  – gpu_id = '6,7'
  – os.environ['CUDA_VISIBLE_DEVICES'] = gpu_id
  – net = torch.nn.DataParallel(model, device_ids=[0, 1])
  – output = net(input_var)
• Important Notes:
  – device_ids must start from 0 (they index the visible devices, so physical GPUs 6 and 7 become 0 and 1)
  – batch_size must be divisible by the number of GPUs
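A runnable sketch (assumes a machine where GPUs 6 and 7 exist; CUDA_VISIBLE_DEVICES must be set before any CUDA call):

    import os
    os.environ['CUDA_VISIBLE_DEVICES'] = '6,7'   # must happen before CUDA is initialized

    import torch
    import torch.nn as nn

    model = nn.Linear(10, 10).cuda()             # lands on the first visible GPU (physical 6)
    net = nn.DataParallel(model, device_ids=[0, 1])

    input_var = torch.randn(32, 10).cuda()       # batch of 32 splits evenly across 2 GPUs
    output = net(input_var)                      # results gathered back on device 0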
Saving Models
• First Approach (recommended by PyTorch)
  • # save only the model parameters
  • torch.save(the_model.state_dict(), PATH)
http://pytorch.org/docs/master/notes/serialization.html#recommended-approach-for-saving-a-model
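The matching load step from the same serialization notes (TheModelClass, *args, and PATH are placeholders; the model must be constructed before its parameters are restored):

    # Hypothetical model class and arguments -- placeholders for your own
    the_model = TheModelClass(*args, **kwargs)
    the_model.load_state_dict(torch.load(PATH))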
Recurrent Neural Network (RNN)
[Diagram: at every step the input (size 50) is concatenated with the previous hidden state (size 20), so self.i2h sees input_size = 50 + 20 = 70 and produces the new hidden state, which in turn feeds the output.]
http://pytorch.org/tutorials/beginner/former_torchies/nn_tutorial.html#example-2-recurrent-net
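A sketch of this recurrent cell following the linked tutorial; data_size=50 and hidden_size=20 match the slide's input_size = 50 + 20 = 70 (output_size=10 is an assumption for illustration):

    import torch
    import torch.nn as nn

    class RNN(nn.Module):
        def __init__(self, data_size, hidden_size, output_size):
            super(RNN, self).__init__()
            self.hidden_size = hidden_size
            self.i2h = nn.Linear(data_size + hidden_size, hidden_size)  # input-to-hidden
            self.h2o = nn.Linear(hidden_size, output_size)              # hidden-to-output

        def forward(self, data, last_hidden):
            combined = torch.cat((data, last_hidden), 1)  # [N, 50 + 20] = [N, 70]
            hidden = self.i2h(combined)
            output = self.h2o(hidden)
            return hidden, output

    rnn = RNN(data_size=50, hidden_size=20, output_size=10)
    batch = torch.randn(3, 50)
    hidden = torch.zeros(3, 20)          # initial hidden state
    hidden, out = rnn(batch, hidden)     # reuse hidden across time steps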
Recurrent Neural Network (RNN)
• Excluding subgraphs from backward (e.g. freezing part of a model):
http://pytorch.org/docs/master/notes/autograd.html#excluding-subgraphs-from-backward
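Per the linked autograd notes, requires_grad gates whether history is recorded; a minimal sketch (PyTorch >= 0.4 style, where requires_grad lives on Tensors):

    import torch

    x = torch.randn(5, 5)                    # requires_grad defaults to False
    y = torch.randn(5, 5)
    z = torch.randn(5, 5, requires_grad=True)

    a = x + y     # no input requires grad -> no history tracked
    b = a + z     # one input requires grad -> gradients flow to z
    print(a.requires_grad, b.requires_grad)  # False True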
Comparison with TensorFlow
Properties                   | TensorFlow                          | PyTorch
Graph                        | Static; dynamic via TensorFlow Fold | Dynamic
Ramp-up Time                 | -                                   | Win
Graph Creation and Debugging | -                                   | Win
Feature Coverage             | Win                                 | Catching up quickly
Documentation                | Tie                                 | Tie
Serialization                | Win (supports other languages)      | -
Deployment                   | Win (Cloud & Mobile)                | -
Data Loading                 | -                                   | Win
Device Management            | Win                                 | Needs .cuda()
Custom Extensions            | -                                   | Win
Thank You~!