NumPy Basics
Numpy
Numpy is the core library for scientific computing in Python.
Numpy provides a high-performance multidimensional array object, and tools for working with these
arrays.
Numpy makes programming mathematical functions more akin to writing the mathematical
functions!
import numpy as np
Arrays
A NumPy array is analogous to a Python list, but all elements of the array must be of the same type.
a = np.array([1, 2, 3])
a
array([1, 2, 3])
type(a)
numpy.ndarray
Every NumPy array has a rank and a shape: the rank is the number of dimensions (axes), and the shape
is a tuple giving the size of the array along each dimension.
Rank of a: 1
Shape of a: (3,)
Total number of elements in the array: 3
Data type of the elements of a: int64
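The cell that printed the values above did not survive in this copy. A minimal sketch that would produce them, assuming the standard ndarray attributes ndim, shape, size and dtype (the exact integer dtype depends on the platform):

```python
import numpy as np

a = np.array([1, 2, 3])

print("Rank of a:", a.ndim)      # number of dimensions
print("Shape of a:", a.shape)    # size along each dimension
print("Total number of elements in the array:", a.size)
print("Data type of the elements of a:", a.dtype)  # int64 on most 64-bit platforms
```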
print(a[0], a[1], a[2]) # Prints "1 2 3"
a[0] = 5 # Change an element of the array
print(a)
1 2 3
[5 2 3]
Rank of b: 2
Shape of b: (2, 3)
Total number of elements in b: 6
Data type of the elements of b: float64
Array b:

        +------+------+------+
    0   |  1.  |  2.  |  3.  |
        +------+------+------+
    1   |  4.  |  5.  |  6.  |
        +------+------+------+
    ^
    2 rows (axis=0)    shape=(2, 3)

b[0] -> b[0, :] -> [1., 2., 3.]
b[1] -> b[1, :] -> [4., 5., 6.]

b[:, 0] -> [1., 4.]
b[:, 1] -> [2., 5.]
b[:, 2] -> [3., 6.]
# accessing elements
print(b[0, 0], b[0, 1], b[1, 0]) # Prints "1.0 2.0 4.0"
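The definition of b is missing from this copy. A sketch consistent with the rank, shape and dtype reported above (the literal values are taken from the diagram):

```python
import numpy as np

b = np.array([[1., 2., 3.],
              [4., 5., 6.]])

print(b.ndim, b.shape, b.size, b.dtype)  # 2 (2, 3) 6 float64
print(b[0])      # first row, same as b[0, :]
print(b[:, 1])   # second column
```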
Array Creation
Numpy provides lots of ways to create a numpy array.
[[ 0. 0. 0.]
[ 0. 0. 0.]
[ 0. 0. 0.]]
[[ 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1.]]
[[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]]
[[ 1. 1. 1.]
[ 1. 1. 1.]]
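The four outputs above match the constant-filled constructors. A sketch, assuming the shapes shown in the outputs; which calls the original cells used (for example np.zeros_like versus np.zeros) is a guess:

```python
import numpy as np

z = np.zeros((3, 3))                   # 3x3 array of zeros
o = np.ones((2, 5))                    # 2x5 array of ones
zl = np.zeros_like(o)                  # zeros with the shape and dtype of o
ol = np.ones_like(np.zeros((2, 3)))    # ones with the shape of a 2x3 array

for arr in (z, o, zl, ol):
    print(arr)
```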
numpy full
[[10 10]
[10 10]]
numpy random
[[ 0.36212922 0.24199098]
[ 0.77907491 0.75820274]]
numpy arange
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
numpy linspace
np.linspace(0, 10, 5)
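Several of the cells under the headings above are missing. A sketch of the four constructors they name, assuming shapes and arguments that match the surviving outputs (the random values will differ on every run):

```python
import numpy as np

f = np.full((2, 2), 10)         # constant array filled with 10
r = np.random.random((2, 2))    # uniform samples in [0, 1)
ar = np.arange(10)              # integers 0..9
ls = np.linspace(0, 10, 5)      # 5 evenly spaced points, both endpoints included

print(f)
print(r)
print(ar)
print(ls)                       # values 0, 2.5, 5, 7.5, 10
```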
%%file mydata.dat
1 11
2 92
3 81
4 52
5 14
6 23
7 22
8 11
9 0
10 1
Writing mydata.dat
np.genfromtxt("mydata.dat")
array([[ 1., 11.],
[ 2., 92.],
[ 3., 81.],
[ 4., 52.],
[ 5., 14.],
[ 6., 23.],
[ 7., 22.],
[ 8., 11.],
[ 9., 0.],
[ 10., 1.]])
?np.genfromtxt
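np.genfromtxt also accepts a file-like object, which makes it easy to try out without writing a file. A small sketch using io.StringIO with the first rows of the data above; delimiter and usecols are standard genfromtxt parameters, and the comma-separated text is made up for illustration:

```python
import io
import numpy as np

text = io.StringIO("1 11\n2 92\n3 81\n")
data = np.genfromtxt(text)            # whitespace-delimited by default
print(data.shape)                     # (3, 2)

csv_text = io.StringIO("1,11\n2,92\n3,81\n")
second_col = np.genfromtxt(csv_text, delimiter=",", usecols=(1,))
print(second_col)                     # only the second column is read
```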
Array Indexing
Numpy offers several ways to index into arrays.
Slicing
One-dimensional arrays can be indexed, sliced and iterated over, much like lists and other Python
sequences.
a = np.linspace(0, 500, 6)
print(a)
# Last element
a[-1]
500.0
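A few more list-style operations on the same one-dimensional array, as a short sketch (the particular slices are chosen only for illustration):

```python
import numpy as np

a = np.linspace(0, 500, 6)   # 0, 100, 200, 300, 400, 500

print(a[1:4])     # elements at positions 1, 2 and 3
print(a[::2])     # every second element
print(a[::-1])    # elements in reverse order
for x in a[:3]:   # iteration works as with a list
    print(x)
```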
Multidimensional arrays can have one index per axis. These indices are given as a comma-separated tuple.
To select a single element of a multidimensional array, we give one index (or slice) for each dimension.
a = np.array([
    np.linspace(1, 3, 3),
    np.linspace(4, 6, 3),
    np.linspace(7, 9, 3)
])
print(a)
[[ 1. 2. 3.]
[ 4. 5. 6.]
[ 7. 8. 9.]]
When fewer indices are provided than the number of axes, the missing indices are treated as complete
slices.
First row:
[ 1. 2. 3.]
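The cell behind the "First row" output is missing. A sketch of how partial and full indexing behave on the 3x3 array defined above (the extra slices are illustrative):

```python
import numpy as np

a = np.array([np.linspace(1, 3, 3),
              np.linspace(4, 6, 3),
              np.linspace(7, 9, 3)])

print("First row:", a[0])          # same as a[0, :]
print("Element:", a[1, 2])         # row 1, column 2
print("Sub-block:\n", a[:2, 1:])   # rows 0-1, columns 1-2
```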
bool_idx = (a > 2)
print(bool_idx)
[[False False]
[ True True]
[ True True]]
print(a[bool_idx])
[3 4 5 6]
[3 4 5 6]
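The outputs above refer to a 3x2 integer array, so a was presumably redefined in a cell that is missing here. A sketch of boolean indexing under that assumption (the values are inferred from the outputs):

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])

bool_idx = (a > 2)     # boolean array with the same shape as a
print(bool_idx)
print(a[bool_idx])     # 1-D array of the elements where bool_idx is True
print(a[a > 2])        # the same selection in a single expression
```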
print(a, "\n")
m, n = np.where(a > 2)
print("Axis-0: ", m)
print("Axis-1: ", n)
[[1 2]
[3 4]
[5 6]]
Axis-0: [1 1 2 2]
Axis-1: [0 1 0 1]
a[m,n]
array([3, 4, 5, 6])
diag
A = np.array([[n+m*10 for n in range(5)] for m in range(5)])
print("A:\n", A)
np.diag(A)
A:
[[ 0 1 2 3 4]
[10 11 12 13 14]
[20 21 22 23 24]
[30 31 32 33 34]
[40 41 42 43 44]]
reverse diagonal
A[:, ::-1]
array([[ 4, 3, 2, 1, 0],
[14, 13, 12, 11, 10],
[24, 23, 22, 21, 20],
[34, 33, 32, 31, 30],
[44, 43, 42, 41, 40]])
np.diag(A[:, ::-1])
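The outputs of the two np.diag calls are not shown above. A sketch that prints both diagonals; np.fliplr is used here only as an equivalent spelling of A[:, ::-1]:

```python
import numpy as np

A = np.array([[n + m * 10 for n in range(5)] for m in range(5)])

main_diag = np.diag(A)            # elements A[i, i]
anti_diag = np.diag(A[:, ::-1])   # diagonal of the column-reversed array
print(main_diag)                  # [ 0 11 22 33 44]
print(anti_diag)                  # [ 4 13 22 31 40]
print(np.diag(np.fliplr(A)))      # same as anti_diag
```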
take
v = np.arange(-3,3)
v
row_indices = [1, 3, 5]
v[row_indices]
array([-2, 0, 2])
A plain Python list cannot be indexed with a list of positions; the expression below raises a TypeError:
[-3, -2, -1, 0, 1, 2][row_indices]
array([-2, 0, 2])
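The heading suggests that the missing cells used take. A sketch, assuming np.take and the ndarray.take method, which select elements by position and, unlike list indexing, also accept a plain list as input:

```python
import numpy as np

v = np.arange(-3, 3)            # [-3 -2 -1  0  1  2]
row_indices = [1, 3, 5]

by_index = v[row_indices]                                   # fancy indexing
by_method = v.take(row_indices)                             # method form
from_list = np.take([-3, -2, -1, 0, 1, 2], row_indices)     # works on a list too
print(by_index, by_method, from_list)
```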
Linear Algebra
Elementwise-array operations
Arithmetic operators on arrays apply elementwise. A new array is created and filled with the result.
x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)
print(x + y, "\n")
print(np.add(x, y))
[[ 6. 8.]
[ 10. 12.]]
[[ 6. 8.]
[ 10. 12.]]
print(x - y, "\n")
print(np.subtract(x, y))
[[-4. -4.]
[-4. -4.]]
[[-4. -4.]
[-4. -4.]]
Elementwise product; both produce the array
print(x * y, "\n")
print(np.multiply(x, y))
[[ 5. 12.]
[ 21. 32.]]
[[ 5. 12.]
[ 21. 32.]]
print(x / y, "\n")
print(np.divide(x, y))
[[ 0.2 0.33333333]
[ 0.42857143 0.5 ]]
[[ 0.2 0.33333333]
[ 0.42857143 0.5 ]]
a^2
print("Squaring...\n")
print("a: \n", a)
print("\na**2: \n", a**2)
print("\nnp.square(a): \n", np.square(a))
Squaring...
a:
[[1 2]
[3 4]
[5 6]]
a**2:
[[ 1 4]
[ 9 16]
[25 36]]
np.square(a):
[[ 1 4]
[ 9 16]
[25 36]]
The same operation is not defined for a plain Python list; the second line below raises a TypeError:
list_a = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
list_a**2
e^a
print("exp(a):\n", np.exp(a))
exp(a):
[[ 2.71828183 7.3890561 ]
[ 20.08553692 54.59815003]
[ 148.4131591 403.42879349]]
a^1.2
a**1.2:
[[ 1. 2.29739671]
[ 3.73719282 5.27803164]
[ 6.89864831 8.58581449]]
Base10 logarithm:
[[ 0. 0.30103 ]
[ 0.47712125 0.60205999]
[ 0.69897 0.77815125]]
Base2 logarithm:
[[ 0. 1. ]
[ 1.5849625 2. ]
[ 2.32192809 2.5849625 ]]
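The cells that produced the last three outputs are missing. A sketch that would print them, assuming the same 3x2 integer array a used for squaring:

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])

print("a**1.2:\n", a ** 1.2)                 # same as np.power(a, 1.2)
print("Base10 logarithm:\n", np.log10(a))
print("Base2 logarithm:\n", np.log2(a))
```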
Vector Operations
We can use the usual arithmetic operators to multiply, add, subtract, and divide vectors with scalar
numbers.
v1 = np.arange(0, 5)
v1
array([0, 1, 2, 3, 4])
print(v1 * 2)
print(v1 / 2)
print(v1 ** 2)
print(v1 * v1)
[0 2 4 6 8]
[ 0. 0.5 1. 1.5 2. ]
[ 0 1 4 9 16]
[ 0 1 4 9 16]
Inner Product
v2 = np.arange(5, 10)
v2
array([5, 6, 7, 8, 9])
v1 = [0, 1, 2, 3, 4]
v2 = [5, 6, 7, 8, 9]
v1 · v2 = 0*5 + 1*6 + 2*7 + 3*8 + 4*9 = 80
np.dot(v1, v2)
80
# inner product of v1 with itself, written as an explicit loop
total = 0
for element in v1:
    total += element * element
print(total)
30
print(v1 @ v1)
30
Matrix Algebra
array([[ 0, 1, 2, 3, 4],
[10, 11, 12, 13, 14],
[20, 21, 22, 23, 24],
[30, 31, 32, 33, 34],
[40, 41, 42, 43, 44]])
Transpose
A.T
Matrix-Vector Multiplication
v1
array([0, 1, 2, 3, 4])
A * v1
array([[ 0, 1, 4, 9, 16],
[ 0, 11, 24, 39, 56],
[ 0, 21, 44, 69, 96],
[ 0, 31, 64, 99, 136],
[ 0, 41, 84, 129, 176]])
A * A
array([[ 0, 1, 4, 9, 16],
[ 100, 121, 144, 169, 196],
[ 400, 441, 484, 529, 576],
[ 900, 961, 1024, 1089, 1156],
[1600, 1681, 1764, 1849, 1936]])
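Note that * between an array and a vector (or between two arrays) is an elementwise product with broadcasting, not a matrix product. A short sketch contrasting the two, using the A and v1 defined earlier:

```python
import numpy as np

A = np.array([[n + m * 10 for n in range(5)] for m in range(5)])
v1 = np.arange(0, 5)

elementwise = A * v1     # v1 is broadcast across the rows; result is 5x5
matvec = A @ v1          # matrix-vector product; result has shape (5,)

print(elementwise.shape)  # (5, 5)
print(matvec)             # [ 30 130 230 330 430]
```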
Matrix Multiplication
A.dot(A)
A @ A
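Alternatively, we can cast the array to the type matrix, which makes the standard arithmetic operators
perform matrix algebra. Note that the NumPy documentation no longer recommends np.matrix; regular arrays
with the @ operator are preferred.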
A_mat = np.matrix(A)
v = np.matrix(v1).T # make it a column vector
Vector v:
[[0]
[1]
[2]
[3]
[4]]
type(A_mat)
numpy.matrixlib.defmatrix.matrix
A_mat * A_mat
v.T * A_mat
A_mat * v
matrix([[ 30],
[130],
[230],
[330],
[430]])
If we try to add, subtract or multiply objects with incompatible shapes, we get an error:
v = np.matrix([1,2,3,4]).T
A_mat.shape, v.shape
---------------------------------------------------------------------------
<ipython-input-71-c5c8686b9891> in <module>()
----> 1 A_mat * v
/Users/vikramkalabi/anaconda3/envs/carnd-term1/lib/python3.5/site-packages/numpy/matrixlib/defmatrix.py in __mul__(self, other)
307 if isinstance(other, (N.ndarray, list, tuple)) :
308 # This promotes 1-D vectors to row vectors
--> 309 return N.dot(self, asmatrix(other))
310 if isscalar(other) or not hasattr(other, '__rmul__') :
311 return N.dot(self, other)
array([[ 0, 1, 2, 3, 4],
[10, 11, 12, 13, 14],
[20, 21, 22, 23, 24],
[30, 31, 32, 33, 34],
[40, 41, 42, 43, 44]])
A.sum()
550
A.sum(axis=0)
A.sum(axis=1)
array([ 10, 60, 110, 160, 210])
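The output of A.sum(axis=0) is not shown above. A sketch of the three reductions, with the axis argument selecting which dimension is summed over:

```python
import numpy as np

A = np.array([[n + m * 10 for n in range(5)] for m in range(5)])

total = A.sum()            # sum of all elements
col_sums = A.sum(axis=0)   # sum down the rows, one value per column
row_sums = A.sum(axis=1)   # sum across the columns, one value per row

print(total)       # 550
print(col_sums)    # [100 105 110 115 120]
print(row_sums)    # [ 10  60 110 160 210]
```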
Statistics
Mean
Mean of A:
22.0
Column-wise mean of A:
[ 20. 21. 22. 23. 24.]
Row-wise mean of A:
[ 2. 12. 22. 32. 42.]
Variance
Variance of A:
202.0
Column-wise variance of A:
[ 200. 200. 200. 200. 200.]
Row-wise variance of A:
[ 2. 2. 2. 2. 2.]
Standard deviation
Standard Deviation of A:
14.2126704036
Column-wise Standard Deviation of A:
[ 14.14213562 14.14213562 14.14213562 14.14213562 14.14213562]
Row-wise Standard Deviation of A:
[ 1.41421356 1.41421356 1.41421356 1.41421356 1.41421356]
Maximum of A:
44
Column-wise Maximum of A:
[40 41 42 43 44]
Row-wise Maximum of A:
[ 4 14 24 34 44]
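The cells that computed these statistics are missing. A sketch that would reproduce a representative subset, assuming np.mean, np.var, np.std and np.max with the axis argument (axis=0 for column-wise, axis=1 for row-wise):

```python
import numpy as np

A = np.array([[n + m * 10 for n in range(5)] for m in range(5)])

print("Mean of A:", np.mean(A))
print("Column-wise mean of A:", np.mean(A, axis=0))
print("Row-wise mean of A:", np.mean(A, axis=1))

print("Variance of A:", np.var(A))              # population variance (ddof=0)
print("Standard Deviation of A:", np.std(A))
print("Row-wise Maximum of A:", np.max(A, axis=1))
```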
Broadcasting
source: Justin Johnson (http://cs.stanford.edu/people/jcjohns/)
Broadcasting is a powerful mechanism that allows numpy to work with arrays of different shapes when
performing arithmetic operations.
For example, suppose that we want to add a constant vector to each row of a matrix. We could do it like
this:
X = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
v = np.array([1, 2])
print("X: \n", X)
print("\nv:\n", v)
X:
[[1 2]
[3 4]
[5 6]
[7 8]]
v:
[1 2]
We can add the vector v to each row of the matrix X, storing the result in the matrix Y.
Y = np.zeros_like(X)
# Add the vector v to each row of the matrix X with an explicit loop
for i in range(4):
    Y[i, :] = X[i, :] + v
print(Y)
[[ 2 4]
[ 4 6]
[ 6 8]
[ 8 10]]
Adding v to every row of the matrix X is equivalent to forming a matrix vv by stacking multiple copies of v
vertically, then performing an elementwise summation of X and vv.
Stacked vectors:
[[1 2]
[1 2]
[1 2]
[1 2]]
Result:
[[ 2 4]
[ 4 6]
[ 6 8]
[ 8 10]]
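The cell that built the stacked matrix is missing. A sketch of the stacking approach, assuming np.tile was used to repeat v (np.tile(v, (4, 1)) repeats v four times along the first axis):

```python
import numpy as np

X = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
v = np.array([1, 2])

vv = np.tile(v, (4, 1))      # stack 4 copies of v on top of each other
Y = X + vv                   # elementwise sum of two (4, 2) arrays
print("Stacked vectors:\n", vv)
print("Result:\n", Y)
```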
Numpy broadcasting allows us to perform this computation without actually creating multiple copies of v .
Subject to certain constraints, the smaller array is broadcast across the larger array so that they have
compatible shapes.
[[ 2 4]
[ 4 6]
[ 6 8]
[ 8 10]]
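The output above comes from adding the two arrays directly. A minimal sketch of that broadcasting version:

```python
import numpy as np

X = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
v = np.array([1, 2])

Y = X + v    # v has shape (2,) and is broadcast across the 4 rows of X
print(Y)
```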
Broadcasting two arrays together follows these rules:
1. If the arrays do not have the same rank, prepend the shape of the lower rank array with 1s until both
shapes have the same length.
2. The two arrays are said to be compatible in a dimension if they have the same size in the dimension,
or if one of the arrays has size 1 in that dimension.
3. The arrays can be broadcast together if they are compatible in all dimensions.
4. After broadcasting, each array behaves as if it had shape equal to the elementwise maximum of
shapes of the two input arrays.
5. In any dimension where one array had size 1 and the other array had size greater than 1, the first
array behaves as if it were copied along that dimension.
v = np.array([1])
print("Rank of v: ", v.ndim)
X + v
Rank of v: 1
array([[2, 3],
[4, 5],
[6, 7],
[8, 9]])
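The rules can be checked without doing any arithmetic. A sketch using np.broadcast, which reports the shape two arrays would broadcast to and raises a ValueError when they are incompatible; the example shapes are arbitrary:

```python
import numpy as np

X = np.ones((4, 2))

# (4, 2) with (2,): the shorter shape is treated as (1, 2), then stretched to (4, 2)
shape_a = np.broadcast(X, np.ones(2)).shape
print(shape_a)

# (4, 2) with (4, 1): the size-1 dimension is stretched to 2
shape_b = np.broadcast(X, np.ones((4, 1))).shape
print(shape_b)

# (4, 2) with (3,): trailing sizes 2 and 3 differ and neither is 1
try:
    np.broadcast(X, np.ones(3))
    compatible = True
except ValueError:
    compatible = False
print(compatible)
```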
To compute an outer product, we first reshape v to be a column vector of shape (3, 1). We can then
broadcast it against w to yield an output of shape (3, 2), which is the outer product of v and w:
[[ 4 5]
[ 8 10]
[12 15]]
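The cell that produced this output is missing. A sketch consistent with it, assuming v = [1, 2, 3] and w = [4, 5] (values inferred from the result):

```python
import numpy as np

v = np.array([1, 2, 3])   # shape (3,)
w = np.array([4, 5])      # shape (2,)

outer = np.reshape(v, (3, 1)) * w   # (3, 1) broadcast against (2,) gives (3, 2)
print(outer)
```

np.outer(v, w) gives the same result directly.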
A
array([[ 0, 1, 2, 3, 4],
[10, 11, 12, 13, 14],
[20, 21, 22, 23, 24],
[30, 31, 32, 33, 34],
[40, 41, 42, 43, 44]])
m, n = A.shape
B = A.reshape((1, m*n))
print(B.shape)
print(B)
(1, 25)
[[ 0 1 2 3 4 10 11 12 13 14 20 21 22 23 24 30 31 32 33 34 40 41 42 43
44]]
B[0, 0:5] = -1
B = A.flatten()
print(B.shape)
print(B)
(25,)
[-1 -1 -1 -1 -1 10 11 12 13 14 20 21 22 23 24 30 31 32 33 34 40 41 42 43 44]
B[0:5] = 10
B
array([10, 10, 10, 10, 10, 10, 11, 12, 13, 14, 20, 21, 22, 23, 24, 30, 31,
32, 33, 34, 40, 41, 42, 43, 44])
A
array([[-1, -1, -1, -1, -1],
[10, 11, 12, 13, 14],
[20, 21, 22, 23, 24],
[30, 31, 32, 33, 34],
[40, 41, 42, 43, 44]])
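The cells above show that reshape returned a view (writing to B changed A), while flatten returned a copy (writing to B left A untouched). A sketch that makes the difference explicit with np.shares_memory; note that reshape returns a view only when the memory layout allows it, which is the case for this contiguous array:

```python
import numpy as np

A = np.array([[n + m * 10 for n in range(5)] for m in range(5)])

B = A.reshape((1, 25))
print(np.shares_memory(A, B))   # True: B is a view of A's data
B[0, 0:5] = -1                  # also changes the first row of A
print(A[0])

C = A.flatten()
print(np.shares_memory(A, C))   # False: flatten always returns a copy
C[0:5] = 10                     # A is not affected
print(A[0])
```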
array([[1, 2],
[3, 4],
[5, 6]])
array([[1, 2, 5],
[3, 4, 6]])
np.vstack((a,b))
array([[1, 2],
[3, 4],
[5, 6]])
np.hstack((a,b.T))
array([[1, 2, 5],
[3, 4, 6]])
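The definitions of a and b and the first two calls are missing from this copy. A sketch consistent with the outputs, assuming a is 2x2 and b is 1x2, and assuming the first two results came from np.concatenate:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])

rows = np.concatenate((a, b), axis=0)     # append b as a new row
cols = np.concatenate((a, b.T), axis=1)   # append b as a new column
print(rows)
print(cols)
print(np.vstack((a, b)))                  # same as the axis=0 concatenation
print(np.hstack((a, b.T)))                # same as the axis=1 concatenation
```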
Copy
To achieve high performance, assignments in Python usually do not copy the underlying objects. This is
important, for example, when objects are passed between functions, to avoid an excessive amount of
memory copying when it is not necessary (technical term: pass by reference).
array([[1, 2],
[3, 4]])
# now B is referring to the same array data as A
B = A
# changing B affects A
B[0,0] = 10
array([[10, 2],
[ 3, 4]])
array([[10, 2],
[ 3, 4]])
If we want to avoid this behavior and get a new, completely independent object B copied
from A, then we need to do a so-called “deep copy” using the function copy:
B = A.copy()
array([[-5, 2],
[ 3, 4]])
array([[10, 2],
[ 3, 4]])
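Some of the cells in this section are missing. A sketch of the whole sequence, assuming A started as [[1, 2], [3, 4]] and that -5 was assigned to the copy (both inferred from the outputs); the copy is named C here only so that both cases fit in one block:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])

B = A               # B is another name for the same array
B[0, 0] = 10
print(A[0, 0])      # 10: the change is visible through A
print(B is A)       # True

C = A.copy()        # independent copy of the data
C[0, 0] = -5
print(C[0, 0], A[0, 0])   # -5 10: A is unchanged
```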
Further Reading
http://numpy.scipy.org
http://scipy.org/Tentative_NumPy_Tutorial
http://scipy.org/NumPy_for_Matlab_Users - A NumPy guide for MATLAB users.