CSE488_Lab3_Numpy
Numpy
NumPy is the fundamental package for scientific computing with Python, and especially for
data analysis. It is the basis of a large number of mathematical and scientific Python
packages, among them the pandas library, which is specialized for data analysis and is
built on the concepts introduced by NumPy. The built-in tools provided by the standard
Python library are often too simple or inadequate for the calculations needed in data
analysis. Knowledge of the NumPy library is therefore important for using the scientific
Python packages in general, and for using and understanding the pandas library in
particular.
Ndarray
The NumPy library is based on one main object: ndarray (which stands for N-dimensional
array). This object is a multidimensional homogeneous array with a predetermined
number of items: homogeneous because all the items in it are of the same type and
the same size. In fact, the data type is specified by another NumPy object called dtype
(data-type); each ndarray is associated with only one type of dtype.
Mechatronics Engineering and Automation Program
CSE488: Computational Intelligence
Lab #03: Introduction to Numpy
Moreover, another peculiarity of NumPy arrays is that their size is fixed, that is, once you
define their size at the time of creation, it remains unchanged. This behavior is different
from Python lists, which can grow or shrink in size.
import numpy as np
a = np.array([1.5, 2, 3])
type(a)
numpy.ndarray
a.dtype
dtype('float64')
a.ndim
1
a.size
3
a.shape
(3,)
b = np.array([[1.3, 2.4],[0.3, 4.1]])
b
array([[1.3, 2.4],
[0.3, 4.1]])
b.dtype
dtype('float64')
b.ndim
2
b.size
4
b.shape
(2, 2)
Another important attribute of ndarray objects is itemsize, which gives the size in bytes
of each item in the array. The data attribute is the buffer containing the actual
elements of the array.
b.itemsize
8
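As a small illustration of itemsize, the sketch below compares the bytes per element of two arrays with different dtypes. The variable names are chosen here for illustration only, and nbytes (the total size of the buffer) is an attribute not covered in the text above.

```python
import numpy as np

# Variable names are illustrative only.
ints32 = np.array([1, 2, 3], dtype=np.int32)
floats64 = np.array([1.0, 2.0, 3.0], dtype=np.float64)

print(ints32.itemsize)    # bytes per element for int32
print(floats64.itemsize)  # bytes per element for float64

# nbytes is the total buffer size, i.e. size * itemsize
print(floats64.nbytes == floats64.size * floats64.itemsize)
```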
Datatypes
The array() function accepts more than a single argument. You have seen that each ndarray
object is associated with a dtype object that uniquely defines the type of data that will
occupy each item in the array. By default, the array() function chooses the most
suitable type according to the values contained in the sequence of lists or tuples. You can
also explicitly define the dtype by passing the dtype option as an argument of the function.
f = np.array([[1, 2, 3],[4, 5, 6]], dtype=complex)
f
array([[1.+0.j, 2.+0.j, 3.+0.j],
[4.+0.j, 5.+0.j, 6.+0.j]])
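The zeros() function creates an array full of zeros with the dimensions given by its shape
argument, while the ones() function creates an array full of ones in a very similar way.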
np.ones((3, 3))
array([[1., 1., 1.],
[1., 1., 1.],
[1., 1., 1.]])
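The following sketch shows zeros() next to ones(), and how the default float64 type can be overridden. The variable names and the choice of dtype=int are assumptions made for illustration.

```python
import numpy as np

zeros_arr = np.zeros((2, 3))            # 2x3 array of zeros, float64 by default
ones_int = np.ones((2, 3), dtype=int)   # same shape, integer type requested

print(zeros_arr)
print(zeros_arr.dtype)
print(ones_int)
print(ones_int.dtype)
```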
By default, zeros() and ones() create arrays with the float64 data type. A feature that will
be particularly useful is arange(). This function generates NumPy arrays with numerical
sequences that follow particular rules depending on the passed arguments. For
example, to generate a sequence of values from 0 up to, but not including, 10, you can pass
only one argument to the function, namely the value at which you want the sequence to end;
the example here also passes the starting value 0 explicitly.
np.arange(0, 10)
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
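A quick check, with illustrative variable names, that the one-argument and two-argument forms of arange() produce the same sequence, and that the end value itself is excluded:

```python
import numpy as np

short_form = np.arange(10)      # start defaults to 0
long_form = np.arange(0, 10)    # start given explicitly

print(np.array_equal(short_form, long_form))
print(short_form[-1])           # last element; the end value 10 is not included
```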
If the third argument of the arange() function is specified, this will represent the gap
between one value and the next one in the sequence of values. This third argument can also
be a float.
np.arange(1, 6, 0.6)
array([1. , 1.6, 2.2, 2.8, 3.4, 4. , 4.6, 5.2, 5.8])
So far you have only created one-dimensional arrays. To generate two-dimensional arrays
you can still use the arange() function, combined with the reshape()
function. This function divides a linear array into different parts in the manner specified by
the shape argument.
b = np.arange(0, 12)
b.reshape(3, 4)
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
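Note that reshape() returns a new array object and leaves the shape of the original unchanged. The sketch below, using names chosen for illustration, shows this, together with the -1 placeholder that asks NumPy to infer one dimension from the total number of elements.

```python
import numpy as np

flat = np.arange(0, 12)
grid = flat.reshape(3, 4)     # new 3x4 array object; flat keeps its own shape
auto = flat.reshape(2, -1)    # -1 lets NumPy infer the missing dimension

print(flat.shape)
print(grid.shape)
print(auto.shape)
```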
Another function very similar to arange() is linspace(). This function still takes as its first
two arguments the initial and end values of the sequence, but the third argument, instead
of specifying the distance between one element and the next, defines the number of
elements into which we want the interval to be split.
np.linspace(0,10,5)
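To make the contrast with arange() concrete, the sketch below (variable names are illustrative) builds the same interval both ways. linspace() includes the end value by default, whereas arange() stops before it.

```python
import numpy as np

points = np.linspace(0, 10, 5)   # 5 evenly spaced values, end value included
steps = np.arange(0, 10, 2.5)    # step of 2.5, end value excluded

print(points)
print(steps)
```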
Finally, another method to obtain arrays already containing values is to fill them with
random values. This is possible using the random() function of the numpy.random module.
This function generates an array with as many elements as specified in the argument.
np.random.random((3,3))
Note that the numbers obtained will vary with every run.
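If repeatable numbers are needed, for example when comparing results with a classmate, one option is to fix the seed before drawing. This is a minimal sketch; the seed value 0 is arbitrary and np.random.seed() is not part of the lab text above.

```python
import numpy as np

np.random.seed(0)                  # arbitrary seed, chosen for illustration
first = np.random.random((3, 3))

np.random.seed(0)                  # resetting the seed repeats the sequence
second = np.random.random((3, 3))

print(np.array_equal(first, second))
```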
Basic operations
So far you have seen how to create a new NumPy array and how items are defined in it.
Now it is the time to see how to apply various operations to them.
Arithmetic operators
The first operations that you will perform on arrays are the arithmetic operators. The most
obvious are adding a scalar to an array and multiplying an array by a scalar.
a = np.arange(4)
print(a+4)
print(a*2)
[4 5 6 7]
[0 2 4 6]
These operators can also be used between two arrays. In NumPy, these operations are
element-wise, that is, the operators are applied only between corresponding elements.
These are objects that occupy the same position, so that the end result will be a new array
containing the results in the same location of the operands.
b = np.arange(4,8)
print(a + b)
print(a * b)
[ 4 6 8 10]
[ 0 5 12 21]
Moreover, these operators also work with functions, provided that the value returned
is a NumPy array. For example, you can multiply the array a by the sine of
the elements of array b.
a * np.sin(b)
array([-0. , -0.95892427, -0.558831 , 1.9709598 ])
In the multidimensional case, the arithmetic operators continue to operate element-wise.
A = np.arange(0, 9).reshape(3, 3)
B = np.ones((3, 3))
A * B
array([[0., 1., 2.],
[3., 4., 5.],
[6., 7., 8.]])
Matrix Product
The choice of operating element-wise is a peculiar aspect of the NumPy library. In fact, in
many other tools for data analysis, the * operator is understood as a matrix product when
it is applied to two matrices. Using NumPy, this kind of product is instead indicated by the
dot() function. This operation is not element-wise.
np.dot(B,A)
The result at each position is the sum of the products of each element of the corresponding
row of the first matrix with the corresponding element of the corresponding column of the
second matrix.
An alternative way to write the matrix product is to call dot() as a method of one of
the two matrices.
A.dot(B)
array([[ 3., 3., 3.],
[12., 12., 12.],
[21., 21., 21.]])
Note that since the matrix product is not a commutative operation, the order of the
operands is important. Indeed, A.dot(B) is not equal to B.dot(A).
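The effect of operand order can be checked directly. The sketch below rebuilds the same A and B used in this section; the names AB and BA are illustrative, and the @ operator shown at the end is an equivalent spelling of the matrix product that the text above does not cover.

```python
import numpy as np

A = np.arange(0, 9).reshape(3, 3)
B = np.ones((3, 3))

AB = np.dot(A, B)   # each entry is a row sum of A: rows of 3s, 12s and 21s
BA = np.dot(B, A)   # each entry is a column sum of A: every row is [9, 12, 15]

print(AB)
print(BA)
print(np.array_equal(AB, BA))      # the two products differ
print(np.array_equal(A @ B, AB))   # @ gives the same result as dot()
```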
Checkpoint: to check that you understand the difference between the element-wise product
and the matrix product, what will be the output of each of the following code snippets?
X = np.ones((2,3))
Y = np.ones((2,3))*5
X*Y
array([[5., 5., 5.],
[5., 5., 5.]])
X.dot(Y)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-31-604dbdfdc745> in <module>
----> 1 X.dot(Y)
ValueError: shapes (2,3) and (2,3) not aligned: 3 (dim 1) != 2 (dim 0)
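The error arises because the number of columns of the first operand (3) must match the number of rows of the second (2). One possible way to make the shapes compatible, shown here only as an illustration and not as part of the checkpoint, is to transpose the second operand with the T attribute.

```python
import numpy as np

X = np.ones((2, 3))
Y = np.ones((2, 3)) * 5

Z = X.dot(Y.T)   # (2, 3) times (3, 2) gives a (2, 2) result

print(Z.shape)
print(Z)         # every entry is 1*5 + 1*5 + 1*5
```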
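a = np.arange(4)
a += 1   # in-place addition: a is now array([1, 2, 3, 4])
a = np.arange(4)
a *= 2   # in-place multiplication: a is now array([0, 2, 4, 6])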
Aggregate functions perform an operation on a set of values, an array for example, and
produce a single result. The sum of all the elements in an array is one example of an
aggregate function. Many functions of this kind are implemented as methods of the ndarray
class.
a = np.array([3.3, 4.5, 1.2, 5.7, 0.3])
print(a.sum())
print(a.min())
print(a.max())
print(a.mean())
print(a.std())
15.0
0.3
5.7
3.0
2.0079840636817816
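To access a single element, use square brackets with its index; negative indexes count
back from the end of the array.
a = np.arange(10, 16)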
print(a[2])
print(a[-3])
12
13
Moving on to the two-dimensional case, matrices are represented as
rectangular arrays consisting of rows and columns, defined by two axes, where axis 0
runs along the rows and axis 1 runs along the columns. Indexing in this
case uses a pair of values: the first value is the index of the row and the second
is the index of the column. Therefore, if you want to access the values or select elements in
the matrix, you will still use square brackets, but this time with two values: [row index,
column index].
A = np.arange(10, 19).reshape((3, 3))
A[1, 2]
15
Slicing
Slicing allows you to extract portions of an array to generate new arrays. When you
slice Python lists, the resulting lists are copies, but in NumPy, the resulting arrays
are views of the same underlying buffer.
a = np.arange(10, 16)
a_= a[1:5]
a_[0] = 20
a
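The difference between a copy and a view can be seen side by side. This is a minimal sketch with illustrative variable names; the copy() method used at the end is one way to obtain an independent array and is not mentioned in the text above.

```python
import numpy as np

py_list = [10, 11, 12, 13, 14, 15]
list_slice = py_list[1:5]
list_slice[0] = 20            # py_list is unchanged: list slices are copies

arr = np.arange(10, 16)
view = arr[1:5]
view[0] = 20                  # arr[1] becomes 20: the slice is a view

independent = arr[1:5].copy()
independent[1] = 99           # arr is unchanged: copy() owns its own data

print(py_list)
print(arr)
```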
To better understand the slice syntax [start:stop:step], you should also look at cases where
you do not use explicit numerical values. If you omit the first number, NumPy implicitly
interprets it as 0 (i.e., the initial element of the array). If you omit the second number,
the slice runs to the end of the array; and if you omit the last number, the step is
interpreted as 1.
a[::2]
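The defaults for omitted slice values can be tried on a fresh array. The array s below is defined only for this illustration.

```python
import numpy as np

s = np.arange(10, 16)    # values 10, 11, 12, 13, 14, 15

print(s[:3])     # start omitted: begins at the first element
print(s[3:])     # stop omitted: runs to the end of the array
print(s[::2])    # start and stop omitted, step 2: every other element
print(s[1:5])    # step omitted: step of 1
```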
In the case of a two-dimensional array, the slicing syntax still applies, but it is separately
defined for the rows and columns. For example, if you want to extract only the first row:
A = np.arange(10, 19).reshape((3, 3))
A[0,:]
array([10, 11, 12])
Instead, if you want to extract a smaller matrix, you need to explicitly define all intervals
with indexes that define them.
A[0:2, 0:2]
array([[10, 11],
[13, 14]])
A
array([[10, 11, 12],
[13, 14, 15],
[16, 17, 18]])
If the indexes of the rows or columns to be extracted are not contiguous (not in sequence),
you can specify an array of indexes.
A[[0,2], 0:2]
array([[10, 11],
[16, 17]])
Iteration
To iterate over the elements of an array, you can use the same for ... in ... construct
that is used for Python lists.
for i in a:
    print(i)
10
20
12
13
14
15
Moving to the two-dimensional case, you could think of applying the solution of two nested
loops with the for construct. The first loop will scan the rows of the array, and the second
loop will scan the columns. Actually, if you apply the for loop to a matrix, it will always
perform a scan according to the first axis.
for row in A:
    print(row)
[10 11 12]
[13 14 15]
[16 17 18]
If you want to make an iteration element by element, you can use the following construct,
using the for loop on A.flat.
for item in A.flat:
    print(item)
10
11
12
13
14
15
16
17
18
NumPy offers an alternative and more elegant solution than the for loop. Generally, you
need to apply an iteration to apply a function on the rows or on the columns or on an
individual item. If you want to launch an aggregate function that returns a value calculated
for every single column or on every single row, there is an optimal way that leaves it to
NumPy to manage the iteration: the apply_along_axis() function.
This function takes three arguments: the aggregate function, the axis on which to apply the
iteration, and the array. If the option axis equals 0, then the iteration evaluates the
elements column by column, whereas if axis equals 1 then the iteration evaluates the
elements row by row. For example, you can calculate the average values first by column
and then by row.
np.apply_along_axis(np.mean, axis=0, arr=A)
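Both directions described above can be sketched as follows, using the same 3x3 matrix as in the slicing section. The names col_means and row_means are illustrative; the final line compares the result with the axis argument of the mean() method, an equivalent shortcut that the text above does not cover.

```python
import numpy as np

A = np.arange(10, 19).reshape((3, 3))

col_means = np.apply_along_axis(np.mean, axis=0, arr=A)   # one mean per column
row_means = np.apply_along_axis(np.mean, axis=1, arr=A)   # one mean per row

print(col_means)
print(row_means)
print(np.allclose(col_means, A.mean(axis=0)))
```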
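A comparison applied to an array returns a Boolean array of the same shape. The example
below assumes A is a 4x4 array of random values, so the output will vary with every run.
A = np.random.random((4, 4))
A < 0.5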
array([[False, True, False, False],
[ True, True, True, False],
[ True, True, True, True],
[False, False, True, False]])
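A Boolean array like this one can also be placed inside the square brackets to select only the elements that satisfy the condition. The sketch below uses its own array M and an arbitrary seed so that it is self-contained; these names and the seed are assumptions for illustration.

```python
import numpy as np

np.random.seed(0)                 # arbitrary seed so the run is repeatable
M = np.random.random((4, 4))

mask = M < 0.5                    # Boolean array with the same shape as M
selected = M[mask]                # 1-D array of the values where mask is True

print(mask.shape)
print(selected.size == mask.sum())
```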
Exercise 1
Evaluate the following equation:
𝒚 = 𝑾∗𝒙+ 𝒃
where 𝑾 is a 3x3 matrix of random values and 𝒙 & 𝒃 are column vectors of values [2,1,1]
and [1,2,1] respectively.
x = np.array([2,1,1])
b = np.array([1,2,1])
W = np.random.random((3,3))
y = W.dot(x)+b
y
array([3.53037336, 3.36998405, 3.54586506])
Exercise 2
Define a function that represents the following equation:
$y = x^4 - x^3 + x^2 - x + 1$
Get the output of the function corresponding to the input range from -2 to 2 with step of
0.1.
x = np.arange(-2,2,0.1)
def y(x):
    return x**4 - x**3 + x**2 - x + 1
y(x)
array([31. , 26.4011, 22.3696, 18.8551, 15.8096, 13.1875, 10.9456,
9.0431, 7.4416, 6.1051, 5. , 4.0951, 3.3616, 2.7731,
2.3056, 1.9375, 1.6496, 1.4251, 1.2496, 1.1111, 1. ,
0.9091, 0.8336, 0.7711, 0.7216, 0.6875, 0.6736, 0.6871,
0.7376, 0.8371, 1. , 1.2431, 1.5856, 2.0491, 2.6576,
3.4375, 4.4176, 5.6291, 7.1056, 8.8831])
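Because arange() excludes its end value, the range above stops just short of x = 2. If the endpoint is wanted, one possible alternative (not required by the exercise) is linspace() with 41 points, as sketched below with illustrative variable names.

```python
import numpy as np

def y(x):
    return x**4 - x**3 + x**2 - x + 1

x_open = np.arange(-2, 2, 0.1)       # stops before 2
x_closed = np.linspace(-2, 2, 41)    # 41 points with spacing 0.1, includes 2

print(x_open[-1] < 2)
print(x_closed[-1])
print(y(x_closed)[-1])               # value of the polynomial at x = 2
```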