# Lab 3 - PyTorch

##### Keywords: gradient descent, logistic regression, pytorch, sgd, minibatch sgd
##### Data: data/iris_dataset.pickle

## Contents
{:.no_toc}
* 
{: toc}

## Learning Aims

- Introduction to PyTorch
- Linear regression
- Logistic regression
- Automatic differentiation
- Gradient descent

## Lab Trajectory

- PyTorch Installation
- Why PyTorch?
- Working with PyTorch Basics

## Installing PyTorch

### Installation

#### OS X/Linux 
We shall be using PyTorch in this class.  The PyTorch website (http://pytorch.org) has a nicely designed interface that generates installation instructions for your OS (Linux/OS X), your package management system (pip/conda) and your CUDA install (8/9/none).  Your installation instructions will look something like:

- conda install pytorch torchvision -c pytorch 

or

- pip3 install http://download.pytorch.org/whl/cu80/torch-0.3.0.post4-cp36-cp36m-linux_x86_64.whl 
- pip3 install torchvision

#### Windows
PyTorch doesn't have official Windows support as yet, but there are Windows binaries available thanks to GitHub user @peterjc123.  Please see his PyTorch for Windows repo https://github.com/peterjc123/pytorch-scripts for installation instructions for different versions of Windows and CUDA.  In all likelihood your installation instructions will be:

- conda install -c peterjc123 pytorch
- pip install torchvision


#### Testing Installation

If the code cell below shows an error, then your PyTorch installation is not working and you should contact one of the teaching staff.

In [6]:
### Code Cell to Test PyTorch

import torch
print(torch.__version__)
import torchvision
import torchvision.transforms as transforms
print(torchvision.__version__)

x = torch.rand(5, 3)
print(x)

transforms.RandomRotation(0.7)
transforms.RandomRotation([0.9, 0.2])

t = transforms.RandomRotation(10)
angle = t.get_params(t.degrees)

print(angle)


0.3.0.post4
0.2.0

 0.2901  0.8863  0.4383
 0.9738  0.9825  0.1046
 0.8069  0.1135  0.3565
 0.4906  0.4698  0.9623
 0.1116  0.4729  0.3536
[torch.FloatTensor of size 5x3]

-8.447796634522254


## Why PyTorch?

*All the quotes will come from the PyTorch About Page http://pytorch.org/about/ from which I'll plagiarize shamelessly.  After all, who better to tout the virtues of PyTorch than the creators?*


### What is PyTorch?

According to the PyTorch about page, "PyTorch is a python package that provides two high-level features:

- Tensor computation (like numpy) with strong GPU acceleration
- Deep Neural Networks built on a tape-based autograd system"

### Why is it getting so popular?

#### It's quite fast

"PyTorch has minimal framework overhead. We integrate acceleration libraries such as Intel MKL and NVIDIA (CuDNN, NCCL) to maximize speed. At the core, it’s CPU and GPU Tensor and Neural Network backends (TH, THC, THNN, THCUNN) are written as independent libraries with a C99 API.
They are mature and have been tested for years.

Hence, PyTorch is quite fast – whether you run small or large neural networks."

#### Imperative programming experience

"PyTorch is designed to be intuitive, linear in thought and easy to use. When you execute a line of code, it gets executed. There isn’t an asynchronous view of the world. When you drop into a debugger, or receive error messages and stack traces, understanding them is straight-forward. The stack-trace points to exactly where your code was defined. We hope you never spend hours debugging your code because of bad stack traces or asynchronous and opaque execution engines."

"PyTorch is not a Python binding into a monolithic C++ framework. It is built to be deeply integrated into Python. You can use it naturally like you would use numpy / scipy / scikit-learn etc. You can write your new neural network layers in Python itself, using your favorite libraries and use packages such as Cython and Numba. Our goal is to not reinvent the wheel where appropriate."

#### Takes advantage of GPUs easily

"PyTorch provides Tensors that can live either on the CPU or the GPU, and accelerate compute by a huge amount.

We provide a wide variety of tensor routines to accelerate and fit your scientific computation needs such as slicing, indexing, math operations, linear algebra, reductions. And they are fast!"


#### Dynamic Graphs!!!

"Most frameworks such as TensorFlow, Theano, Caffe and CNTK have a static view of the world. One has to build a neural network, and reuse the same structure again and again. Changing the way the network behaves means that one has to start from scratch.

With PyTorch, we use a technique called Reverse-mode auto-differentiation, which allows you to change the way your network behaves arbitrarily with zero lag or overhead. Our inspiration comes from several research papers on this topic, as well as current and past work such as torch-autograd, autograd, Chainer, etc.

While this technique is not unique to PyTorch, it’s one of the fastest implementations of it to date. You get the best of speed and flexibility for your crazy research."



## Working with PyTorch Basics

Enough of the sales pitch!  Let's start to understand the PyTorch basics.

The basic unit of PyTorch is a tensor (basically a multi-dimensional array like a np.ndarray).

![](https://cdn-images-1.medium.com/max/2000/1*_D5ZvufDS38WkhK9rK32hQ.jpeg)

(image borrowed from https://hackernoon.com/learning-ai-if-you-suck-at-math-p4-tensors-illustrated-with-cats-27f0002c9b32 )

We can create PyTorch tensors directly.

In [7]:

## You can create torch.Tensor objects by giving them data directly

#  1D vector
vector_input = [1., 2., 3., 4., 5., 6.]
vector = torch.Tensor(vector_input)

# Matrix
matrix_input = [[1., 2., 3.], [4., 5., 6]]
matrix = torch.Tensor(matrix_input)

# Create a 3D tensor of size 2x2x2.
tensor_input = [[[1., 2.], [3., 4.]],
          [[5., 6.], [7., 8.]]]
tensor3d = torch.Tensor(tensor_input)


print(vector)
print(matrix)
print(tensor3d)


 1
 2
 3
 4
 5
 6
[torch.FloatTensor of size 6]


 1  2  3
 4  5  6
[torch.FloatTensor of size 2x3]


(0 ,.,.) = 
  1  2
  3  4

(1 ,.,.) = 
  5  6
  7  8
[torch.FloatTensor of size 2x2x2]



They can also be created without any initialization, or initialized with random data from a uniform (rand()) or normal (randn()) distribution.

In [9]:
# Tensors with no initialization
x_1 = torch.Tensor(2, 5)
y_1 = torch.Tensor(3, 5)
print(x_1)
print(y_1)

# Tensors initialized from uniform
x_2 = torch.rand(5, 3)
y_2 = torch.rand(5, 5)

print(x_2)
print(y_2)

# Tensors initialized from normal
x_3 = torch.randn(5, 3)
y_3 = torch.randn(5, 5)

print(x_3)
print(y_3)


 0.0000e+00  1.0842e-19  6.3095e+27  1.0845e-19  1.8217e-44
 0.0000e+00  0.0000e+00  0.0000e+00  0.0000e+00  0.0000e+00
[torch.FloatTensor of size 2x5]


 0.0000e+00  1.0842e-19  0.0000e+00  1.0842e-19  5.6052e-45
 0.0000e+00  0.0000e+00  0.0000e+00  0.0000e+00  1.0842e-19
 6.3059e+27 -1.5849e+29  2.8026e-45  1.0842e-19  4.9077e+27
[torch.FloatTensor of size 3x5]


 0.5759  0.0052  0.5583
 0.6048  0.9838  0.5592
 0.9206  0.8251  0.5032
 0.5607  0.0485  0.4050
 0.6590  0.7941  0.6106
[torch.FloatTensor of size 5x3]


 0.0325  0.7753  0.2360  0.6659  0.7960
 0.1888  0.4185  0.9106  0.8155  0.1502
 0.6387  0.9303  0.7255  0.1813  0.5066
 0.9799  0.9844  0.2526  0.0286  0.1560
 0.8586  0.2915  0.5509  0.5185  0.5027
[torch.FloatTensor of size 5x5]


-1.7841 -0.1001 -0.6045
 0.1409  0.6862  0.8469
 0.8223  2.1229 -0.2956
 0.6558 -1.1188 -0.2326
 1.2631  0.2665 -0.0208
[torch.FloatTensor of size 5x3]


-0.0194 -2.0925  0.9395 -0.0195  1.3913
 1.9729  0.5524 -1.0353 -0.0404 -0.4854
-0.3671 -
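
The values in an uninitialized tensor are whatever happened to be in memory, which is why the first two printouts above look like garbage. A minimal sketch, assuming you want predictable contents instead: torch.zeros() gives an explicitly initialized tensor, and torch.manual_seed() makes the random draws repeatable. The seed value 0 is an arbitrary choice made for this illustration.

```python
import torch

# Explicit initialization instead of reading whatever is in memory
zeros = torch.zeros(2, 5)
print(zeros)

# Resetting the seed makes rand()/randn() produce the same draws again
torch.manual_seed(0)   # 0 is an arbitrary seed chosen for this sketch
a = torch.rand(5, 3)
torch.manual_seed(0)
b = torch.rand(5, 3)
print(torch.equal(a, b))  # the two draws match because the seed was reset
```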

The expected operations (arithmetic operations, addressing, etc) are all in place.

In [12]:
# Expect (2,5)
x_1.size()

print(x_1)


# Addition
print(x_2)
print(x_3)

print(x_2+ x_3)

# Addressing
print(x_3[:, 2])


 0.0000e+00  1.0842e-19  6.3095e+27  1.0845e-19  1.8217e-44
 0.0000e+00  0.0000e+00  0.0000e+00  0.0000e+00  0.0000e+00
[torch.FloatTensor of size 2x5]


 0.5759  0.0052  0.5583
 0.6048  0.9838  0.5592
 0.9206  0.8251  0.5032
 0.5607  0.0485  0.4050
 0.6590  0.7941  0.6106
[torch.FloatTensor of size 5x3]


-1.7841 -0.1001 -0.6045
 0.1409  0.6862  0.8469
 0.8223  2.1229 -0.2956
 0.6558 -1.1188 -0.2326
 1.2631  0.2665 -0.0208
[torch.FloatTensor of size 5x3]


-1.2081 -0.0949 -0.0462
 0.7457  1.6700  1.4062
 1.7429  2.9480  0.2075
 1.2165 -1.0703  0.1724
 1.9221  1.0606  0.5898
[torch.FloatTensor of size 5x3]


-0.6045
 0.8469
-0.2956
-0.2326
-0.0208
[torch.FloatTensor of size 5]
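
One distinction worth making explicit: * multiplies element by element, while torch.mm() performs matrix multiplication. Here is a small sketch with values you can check by hand (the tensors a and b are made up for this illustration):

```python
import torch

a = torch.Tensor([[1., 2.], [3., 4.]])
b = torch.Tensor([[10., 20.], [30., 40.]])

elementwise = a * b        # multiplies matching entries
matmul = torch.mm(a, b)    # rows of a times columns of b

print(elementwise)
print(matmul)

# Addressing follows NumPy conventions
first_row = a[0, :]
second_col = a[:, 1]
print(first_row, second_col)
```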



It's easy to move between the PyTorch and NumPy worlds with numpy() and torch.from_numpy().

In [15]:
# PyTorch --> Numpy
print(x_1)
print(x_1.numpy())

print(type(x_1))
print(type(x_1.numpy()))

numpy_x_1 = x_1.numpy()
pytorch_x_1 = torch.from_numpy(numpy_x_1)

print(type(numpy_x_1))
print(type(pytorch_x_1))


 0.0000e+00  1.0842e-19  6.3095e+27  1.0845e-19  1.8217e-44
 0.0000e+00  0.0000e+00  0.0000e+00  0.0000e+00  0.0000e+00
[torch.FloatTensor of size 2x5]

[[  0.00000000e+00   1.08420217e-19   6.30950545e+27   1.08446661e-19
    1.82168800e-44]
 [  0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00]]
<class 'torch.FloatTensor'>
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
<class 'torch.FloatTensor'>
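
One thing to keep in mind: for CPU tensors, numpy() and torch.from_numpy() do not copy the data, so the array and the tensor share the same memory. A short sketch of what that means in practice (the names arr, t and independent are made up for this illustration):

```python
import numpy as np
import torch

arr = np.zeros(3, dtype=np.float32)
t = torch.from_numpy(arr)   # no copy: t uses the same buffer as arr

arr[0] = 5.0                # a change made through NumPy...
print(t)                    # ...is visible through the tensor

t.add_(1)                   # an in-place change made through PyTorch...
print(arr)                  # ...is visible through the array

# Copy first if the two should be independent
independent = torch.from_numpy(arr.copy())
arr[1] = -1.0
print(independent)          # unaffected by the last assignment
```

If you need the two objects to evolve separately, copying on the NumPy side as above is one simple option.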


Finally, PyTorch provides some convenience mechanisms: torch.cat() concatenates tensors and .view() reshapes them.

In [15]:
## Concatenating

# By default, it concatenates along the first axis (concatenates rows)
x_1 = torch.randn(2, 5)
y_1 = torch.randn(3, 5)
z_1 = torch.cat([x_1, y_1])
print(z_1)

# Concatenate columns:
x_2 = torch.randn(2, 3)
y_2 = torch.randn(2, 5)
# second arg specifies which axis to concat along
z_2 = torch.cat([x_2, y_2], 1)
print(z_2)

## Reshaping
x = torch.randn(2, 3, 4)
print(x)
print(x.view(2, 12))  # Reshape to 2 rows, 12 columns
# Same as above.  If one of the dimensions is -1, its size can be inferred
print(x.view(2, -1))


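
As a quick check on torch.cat() and .view(), the sketch below prints only the resulting sizes, which are the part that matters here. It uses zeros and ones rather than random data so the result is the same on every run; all variable names are made up for this illustration.

```python
import torch

x = torch.zeros(2, 5)
y = torch.ones(3, 5)
rows = torch.cat([x, y])       # along axis 0: 2 + 3 rows, 5 columns
print(rows.size())

p = torch.zeros(2, 3)
q = torch.ones(2, 5)
cols = torch.cat([p, q], 1)    # along axis 1: 2 rows, 3 + 5 columns
print(cols.size())

t = torch.zeros(2, 3, 4)
flat = t.view(2, -1)           # -1 is inferred as 3 * 4 = 12
print(flat.size())

flat[0, 0] = 7.0               # view() shares storage with the original
print(t[0, 0, 0])
```

Note that .view() does not copy: writing through the reshaped tensor also changes the original.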


Ok -- in order to understand variables in PyTorch, let's take a break and learn about Artificial Neural Networks.

## PyTorch Variables and the Computational Graph

Ok -- back to PyTorch.

The other fundamental PyTorch construct besides the Tensor is the Variable.  Variables are very similar to tensors, but they also keep track of the computational graph that produced them, including the gradients needed for automatic differentiation.  They are defined in the autograd module of torch.

In [21]:
from torch.autograd import Variable
import torch.nn as nn
import torch.nn.functional as F

# Let's create a variable by initializing it with a tensor
first_tensor = torch.Tensor([23.3])

first_variable = Variable(first_tensor, requires_grad=True)

print("first variables gradient: ", first_variable.grad)
print("first variables data: ", first_variable.data)



first variables gradient:  None
first variables data:  
 23.3000
[torch.FloatTensor of size 1]



Now let's create some new variables. We can do so implicitly just by creating other variables with functional relationships to our variable.

In [30]:
x = first_variable
y = (x ** x) * (x - 2) # y is a variable
z = F.tanh(y) # z has a functional relationship to y, it's a variable
z.backward()

print("y.data: ", y.data)
print("y.grad: ", y.grad)

print("z.data: ", z.data)
print("z.grad: ", z.grad)

print("x.grad:", x.grad)



y.data:  
 1.5409e+33
[torch.FloatTensor of size 1]

y.grad:  None
z.data:  
 1
[torch.FloatTensor of size 1]

z.grad:  None
x.grad: Variable containing:
 0
[torch.FloatTensor of size 1]



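Variables come with a .backward() method that performs automatic differentiation by propagating gradients backwards through the graph.  Gradients accumulate in the .grad attribute of leaf variables such as x; y and z are intermediate results, which is why their .grad is None in the output above.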
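
The zero in x.grad above is worth a second look. It is not a bug: y is so large that tanh has saturated, so its slope is zero and nothing flows back to x. The sketch below redoes the backward pass by hand with the chain rule in plain Python. It uses double precision rather than the float32 that PyTorch used above, so the last digits of y may differ slightly.

```python
import math

x = 23.3

# Forward pass, matching the cell above
y = (x ** x) * (x - 2)
z = math.tanh(y)

# Backward pass by hand (chain rule)
# d/dx x**x = x**x * (ln(x) + 1), combined with (x - 2) via the product rule
dy_dx = (x ** x) * (math.log(x) + 1) * (x - 2) + (x ** x)
dz_dy = 1 - math.tanh(y) ** 2    # derivative of tanh
dz_dx = dz_dy * dy_dx

print(y)       # roughly 1.54e33
print(z)       # tanh has saturated at 1.0
print(dz_dx)   # 0.0, so no gradient reaches x
```

Trying a much smaller input (for example x = 0.5) gives a non-zero gradient, which makes for a more informative experiment with .backward().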

## Constructing a model with PyTorch

Constructing a model with PyTorch follows a fairly repeatable three-step design pattern:

- Design your model (including relationships between your variables)
    - Generally done by defining a subclass of torch.nn.Module
- Construct your loss and optimizer
- Train your model using your optimizer and forwards and backwards steps in your model

In [49]:
from sklearn.datasets import make_regression
import numpy as np
np.random.seed(99)
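# 30 samples, 10 features, all 10 informative; coef holds the true weights
x1_data, y1_data, coef = make_regression(n_samples=30, n_features=10, n_informative=10, bias=1, noise=2, coef=True)

# Wrap each sample in its own length-1 slice so every target becomes a 1-element row
x1_data = [x1_data[i:i+1] for i in range(len(x1_data))]
y1_data = [y1_data[i:i+1] for i in range(len(y1_data))]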

In [50]:
import torch
from torch.autograd import Variable

x_data = Variable(torch.Tensor(x1_data))
y_data = Variable(torch.Tensor(y1_data))


class Model(torch.nn.Module):

    def __init__(self):
        """
        In the constructor we instantiate a single nn.Linear module
        """
        super(Model, self).__init__()
        self.linear = torch.nn.Linear(10, 1)  # 10 input features, one output

    def forward(self, x):
        """
        In the forward function we accept a Variable of input data and we must return
        a Variable of output data. We can use Modules defined in the constructor as
        well as arbitrary operators on Variables.
        """
        y_pred = self.linear(x)
        return y_pred

# our model
model = Model()


# Construct our loss function and an Optimizer. The call to model.parameters()
# in the SGD constructor will contain the learnable parameters of the
# nn.Linear module which is a member of the model.
# size_average=False sums the squared errors instead of averaging them.
criterion = torch.nn.MSELoss(size_average=False)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Training loop
for epoch in range(500):
    
    # Forward pass: Compute predicted y by passing x to the model
    y_pred = model(x_data)

    # Compute and print loss
    loss = criterion(y_pred, y_data)
    print(epoch, loss.data[0])

    # Zero gradients, perform a backward pass, and update the weights.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()


# After training: predictions on the training inputs
ytrain_pred = model(x_data)


0 726254.125
1 124822.0
2 54632.37890625
3 28299.19921875
4 15898.6708984375
5 9425.998046875
6 5830.21484375
7 3738.6923828125
8 2474.47607421875
9 1683.9019775390625
10 1174.31640625
11 837.0162353515625
12 608.6522827148438
13 451.1288757324219
14 340.83050537109375
15 262.689208984375
16 206.82933044433594
17 166.6243896484375
18 137.53953552246094
19 116.42092895507812
20 101.04288482666016
21 89.822998046875
22 81.62397003173828
23 75.62706756591797
24 71.23649597167969
25 68.02033996582031
26 65.66402435302734
27 63.936065673828125
28 62.66962432861328
29 61.74079132080078
30 61.05937957763672
31 60.55976867675781
32 60.1931037902832
33 59.92445373535156
34 59.7275505065918
35 59.5826301574707
36 59.47638702392578
37 59.39829635620117
38 59.341041564941406
39 59.29940414428711
40 59.26884078979492
41 59.24623107910156
42 59.229835510253906
43 59.21748733520508
44 59.208683013916016
45 59.20254898071289
46 59.197391510009766
47 59.19389724731445
48 59.19124984741211
49 59.1893043

381 59.18421936035156
382 59.18425750732422
383 59.184226989746094
384 59.184242248535156
385 59.18424987792969
386 59.184234619140625
387 59.18421936035156
388 59.18425750732422
389 59.184226989746094
390 59.184242248535156
391 59.18424987792969
392 59.184234619140625
393 59.18421936035156
394 59.18425750732422
395 59.184226989746094
396 59.184242248535156
397 59.18424987792969
398 59.184234619140625
399 59.18421936035156
400 59.18425750732422
401 59.184226989746094
402 59.184242248535156
403 59.18424987792969
404 59.184234619140625
405 59.18421936035156
406 59.18425750732422
407 59.184226989746094
408 59.184242248535156
409 59.18424987792969
410 59.184234619140625
411 59.18421936035156
412 59.18425750732422
413 59.184226989746094
414 59.184242248535156
415 59.18424987792969
416 59.184234619140625
417 59.18421936035156
418 59.18425750732422
419 59.184226989746094
420 59.184242248535156
421 59.18424987792969
422 59.184234619140625
423 59.18421936035156
424 59.18425750732422
425 59.1842

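
To see what the training loop is doing without the PyTorch machinery, here is a minimal sketch of the same three steps (forward pass, gradients, update) written in plain Python for a one-feature linear model. The toy data and the names w, b, xs and ys are made up for this illustration; the gradients are derived by hand instead of being computed by autograd, and the summed squared error mirrors the size_average=False loss used above.

```python
# Toy data generated from y = 2x + 1 (made up for this sketch)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

w, b = 0.0, 0.0   # parameters, playing the role of the nn.Linear weight and bias
lr = 0.01         # same learning rate as the SGD optimizer above

losses = []
for epoch in range(1000):
    # Forward pass: predictions and summed squared error
    preds = [w * x + b for x in xs]
    loss = sum((p - y) ** 2 for p, y in zip(preds, ys))
    losses.append(loss)

    # Backward pass: gradients of the summed loss, derived by hand
    grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs))
    grad_b = sum(2 * (p - y) for p, y in zip(preds, ys))

    # Update step: the rule plain SGD applies in optimizer.step()
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)   # approaches 2 and 1
```

Like the PyTorch loop above, this sketch uses all of the data in every step, so despite the SGD name it is really full-batch gradient descent.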