Array: 3.1 Generating Sequential Arrays
3.1.1 linspace
If we want to generate a vector whose elements are uniformly spaced, and we know the lower limit,
the upper limit and the number of elements, then linspace is the preferred choice.
Because linspace lives in the numpy library, we first import the library and give it
an abbreviated name. Then we call linspace with the lower limit, the upper limit and the number of
elements to be generated. In this example, 0 is the lower limit, 2 is the upper limit, and the number of
elements is 9. Both limits are included in the output. Let us generate one more vector to understand
more about this function; this time we take the lower limit as 0, the upper limit as 2π, and the
number of elements as 100.
By default the number of elements is 50, so if we do not specify the number of elements, we get
50 elements with equal spacing. We can use the len function to get the length of any array.
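The calls described above can be sketched as follows. The variable names foo, bar and baz are chosen here only for illustration.

```python
import numpy as np

# nine evenly spaced values from 0 to 2; both end points are included
foo = np.linspace(0, 2, 9)
print(foo)         # spacing is (2 - 0) / (9 - 1) = 0.25

# one hundred values between 0 and 2*pi
bar = np.linspace(0, 2 * np.pi, 100)
print(len(bar))    # 100

# without the third argument, linspace returns 50 elements
baz = np.linspace(0, 2)
print(len(baz))    # 50
```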
3.1.2 arange
Suppose again we want to generate a vector whose elements are uniformly spaced, but this time we
do not know the number of elements; we just know the increment between elements. In such a
situation arange is used. arange also requires lower and upper bounds. In the following example we
generate a vector with a lower bound of 10, an upper bound of 30 and an increment
of 5. From our knowledge of linspace, we would call arange with 10, 30 and 5.
Oh! What happened? Why did Python not print 30? Because the arange function does not include
its second argument in the elements. So if we want to print up to 30, we give a second argument
slightly larger than 30.
This time we get the required output. arange can also take a float increment. Let us generate
a vector with a lower bound of 0, an upper bound of 2 and an increment of 0.3.
In the case of a float increment, too, the maximum value of the generated elements is less than the
second argument given to arange.
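A minimal sketch of the arange behaviour discussed above. The increment of 5 and the stop value of 35 are assumptions made for this illustration.

```python
import numpy as np

a = np.arange(10, 30, 5)     # the stop value 30 is excluded
print(a)                     # [10 15 20 25]

b = np.arange(10, 35, 5)     # a stop value above 30 brings 30 in
print(b)                     # [10 15 20 25 30]

c = np.arange(0, 2, 0.3)     # float increment; the largest element stays below 2
print(c)
```

With a float increment the number of elements can be sensitive to rounding, so linspace is usually the safer choice when the element count matters.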
3.1.3 zeros
zeros is used when we want every element of the vector to be 0.
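For instance, a vector of five zeros (the length 5 is an arbitrary choice for this sketch) can be created like this:

```python
import numpy as np

foo = np.zeros(5)    # one-dimensional array of five zeros
print(foo)           # [0. 0. 0. 0. 0.]
```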
3.1.4 ones
ones is used when all the required elements of the vector are 1. Let us say we want to generate a
variable foo whose elements are all equal to one, and which has the dimensions 3 × 2.
Remember that if the number of dimensions is more than one, the dimensions are given as a tuple,
e.g. (2,5).
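The 3 × 2 case described above might look like this; note the shape passed as a tuple.

```python
import numpy as np

foo = np.ones((3, 2))    # shape is given as a tuple: 3 rows, 2 columns
print(foo.shape)         # (3, 2)
print(foo)
```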
3.1.5 empty
empty is useful for initializing variables. It allocates the array without setting its elements, so
they hold garbage values which are to be overwritten later.
Additionally, in zeros, ones and empty, the data type (e.g. int, float etc.) can also be defined.
You can see that all the elements of foo are now integers, even though the values are useless.
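A small sketch of empty together with the dtype argument. The shape (2, 5) and the second array bar are assumptions made for this illustration; the values printed for foo are arbitrary and differ from run to run.

```python
import numpy as np

# uninitialized integer array; its contents are whatever was in memory
foo = np.empty((2, 5), dtype=int)
print(foo.dtype)     # an integer type, e.g. int64 on most platforms
print(foo)           # garbage values, to be overwritten later

# the same dtype argument works for zeros and ones
bar = np.zeros(3, dtype=int)
print(bar)           # [0 0 0]
```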
3.1.6 rand
rand is used to generate uniformly distributed random numbers over the range 0 to 1.
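As a sketch, rand is available as np.random.rand and takes the dimensions as separate arguments (the 3 × 2 shape here is arbitrary):

```python
import numpy as np

foo = np.random.rand(3, 2)    # 3 x 2 array of uniform random numbers in [0, 1)
print(foo)
```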
3.1.7 randn
randn is used to generate random numbers having a normal distribution with mean equal to zero and
variance equal to one.
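The following sketch draws a large sample with np.random.randn and checks that its mean and variance are close to 0 and 1. The sample size and the seed are assumptions made only to keep the illustration repeatable.

```python
import numpy as np

np.random.seed(0)                # hypothetical seed, for repeatability only
sample = np.random.randn(10000)  # standard normal random numbers

print(sample.mean())             # close to 0
print(sample.var())              # close to 1
```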
3.2 Useful attributes and methods
I am using normally distributed random numbers to demonstrate, but these attributes can be used with
any numpy array. We generate a two-dimensional array of size 5 × 100.
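An array like the one used in this section could be generated as sketched below. The values are random, so the numbers shown in the text will not be reproduced exactly.

```python
import numpy as np

foo = np.random.randn(5, 100)   # 5 rows, 100 columns of standard normal numbers
print(foo.ndim)                 # 2
print(foo.shape)                # (5, 100)
```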
Let us check the number of dimensions (not the size or shape of the array). The number of dimensions
means how many dimensions are associated with the array. For example, in mathematical terminology
a vector has one dimension and a matrix has two dimensions.
>>> foo.ndim
2
>>> foo.shape
(5, 100)
The size attribute provides the total number of elements in the array. This is simply the product
of all the elements given by the shape attribute.
>>> foo.size
500
The data type (i.e. float, integer etc.) is extracted using the attribute dtype.
>>> foo.dtype
dtype('float64')
This tells us that the variable foo holds floating point numbers of 64 bits. The average or mean of the
array is computed by using the mean method.
>>> foo.mean()
-0.11128938014455608
This provides the mean of the entire array (i.e. 500 elements in this case). Suppose we want the
mean along one dimension, say the second dimension (axis 1); in this case we need to provide
an additional parameter to mean, i.e. axis.
>>> foo.mean(axis=1)
array([-0.07311407, 0.0705939 , -0.09218394, 0.0775191 , 0.01026461])
The minimum, maximum, standard deviation and variance of the array are computed using the min, max,
std, and var methods.
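These four methods can be tried on a freshly generated array, as in the sketch below (the results differ from run to run because the data are random).

```python
import numpy as np

foo = np.random.randn(5, 100)   # same shape as the array used in the text

# smallest and largest values in the whole array
print(foo.min(), foo.max())

# standard deviation and variance of all 500 elements
print(foo.std(), foo.var())
```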
Remember that a line starting with # is a comment. Comments make it easier to read
and understand the code. So add a comment whenever you do something that is not easy to
interpret from the code itself.
The trace of a matrix is the sum of its diagonal elements, and is meaningful for a square
matrix. Numpy computes the trace even when the matrix is not square, using the trace method.
>>> foo.trace()
1.081773080044246
There are a number of attributes associated with each class. The dir function is a useful tool for exploring
the attributes and methods associated with any variable, class, library etc. Let us see what methods
and attributes our variable foo has.
>>> # to get the list of all the attributes associated with foo variable
>>> dir(foo)
['T', '__abs__', ............. 'flat', 'view']
The output of dir(foo) is very long, and is abbreviated here. The attributes and methods starting with
_ are meant to be private and are often not needed.
3.3 Indexing
In this section, we will discuss how to refer to particular elements of a numpy array. Remember that in
Python the first index is 0. We shall generate an array whose elements are the cubes
of the sequence [0,1, ..., 9].
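One way to build such an array, consistent with the outputs shown in this section, is to raise np.arange(10) to the power 3:

```python
import numpy as np

foo = np.arange(10) ** 3    # cubes of 0, 1, ..., 9
print(foo)                  # [  0   1   8  27  64 125 216 343 512 729]
```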
Let us print the third item in the array. The third item has index 2.
>>> foo[2]
8
Suppose we would like to print a sequence of elements, say those at indices 2, 3 and 4.
>>> foo[2:5]
array([ 8, 27, 64])
We used 2:5 to get the values at indices 2, 3 and 4. This is the same as writing
foo[np.arange(2,5,1)]. When we do not specify the third value in the index, it is
taken as 1 by default. Suppose we want the values at indices 2 to 8 with an interval of 3. Because the
interval is not 1, we need to specify it.
>>> foo[2:10:3]
array([ 8, 125, 512])
If we leave the first entry in the index blank, the slice starts from the beginning of the array. To get
elements from the beginning with an interval of 2 and up to 6, we issue the following command:
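A sketch of this slice, assuming foo holds the cubes of 0 to 9 as earlier in this section:

```python
import numpy as np

foo = np.arange(10) ** 3

# start left blank, stop 6 (excluded), step 2: indices 0, 2 and 4
subset = foo[:6:2]
print(subset)       # elements 0, 8 and 64
```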
We get elements only up to index 4 because a slice, like arange, stops before its second value. We
can also use indices to modify existing elements in the array, in the same way as we accessed
them. Let us replace the existing values of the elements at indices 0, 2 and 4 by -1000.
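The replacement can be sketched with the same slice on the left-hand side of an assignment (foo is again assumed to hold the cubes of 0 to 9):

```python
import numpy as np

foo = np.arange(10) ** 3

foo[:6:2] = -1000    # overwrite the elements at indices 0, 2 and 4
print(foo)
```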
We get the last element of an array with the index -1. We can also reverse the array by
giving an increment of -1.
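Both forms are shown in this short sketch, using the cubes of 0 to 9 as the example array:

```python
import numpy as np

foo = np.arange(10) ** 3

print(foo[-1])       # 729, the last element
print(foo[::-1])     # the whole array in reverse order
```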
We can perform a calculation on an entire numpy array at once. Suppose we are interested in
computing the square root of every element of a numpy array; we can use the sqrt function of the numpy library.
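A sketch of the element-wise square root. The array below is an assumed example that contains negative values, so that the nan results and the warning can be observed.

```python
import numpy as np

foo = np.array([-1000, 1, -1000, 27, -1000, 125])

# sqrt is applied element by element; negative inputs give nan
# and numpy emits a RuntimeWarning about invalid values
roots = np.sqrt(foo)
print(roots)
print(np.isnan(roots))    # True where the input was negative
```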
nan indicates that the element is ‘Not A Number’. So when the value of an element is negative, the
output of sqrt becomes nan. The warning issued by Python tells us that there were some invalid values
in the input for which sqrt cannot produce any sensible output; it is a warning, not an error.
In reality, the square root of a negative number is a complex number, but because we did not define the
variable as complex, numpy does not perform complex arithmetic on it. To obtain complex results, the
array has to be defined as, or converted to, a complex data type.
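As a hedged illustration of the complex case, one option is to convert the array to a complex data type before calling sqrt. The input values below are assumptions chosen so that the roots are easy to check by hand.

```python
import numpy as np

foo = np.array([-4.0, 9.0])

# with a complex dtype, sqrt returns complex roots instead of nan
roots = np.sqrt(foo.astype(complex))
print(roots)    # 2j for -4 and 3 for 9
```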
indices, and then assigning a new value to it. First, we generate normally distributed random numbers
to create an array of size (2 × 3), which we would like to manipulate.
>>> foo.T
array([[ 1.02063865, -0.82198131],
[ 1.52885147, 0.20995583],
[ 0.45588211, 0.31997462]])
We can access elements of the array and, if we want, assign new values to them.
In this example, we shall first access the element at index (0,1), and then replace it by 5.
Finally we will print the variable to check that it was modified.
>>> foo[0,1]
-0.82198131397870833
>>> foo[0,1]=5
>>> foo
array([[ 1.02063865, 5. ],
[ 1.52885147, 0.20995583],
[ 0.45588211, 0.31997462]])
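The same steps can be sketched on a small array with known values, which makes the effect easier to follow. The array built from np.arange is an assumption used only for this illustration.

```python
import numpy as np

foo = np.arange(6.0).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

print(foo.T.shape)    # (3, 2): rows and columns are swapped
print(foo[0, 1])      # 1.0, first row and second column

foo[0, 1] = 5         # assign a new value through the same index
print(foo)
```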
The shape of any array is changed by using the reshape method. During the reshape operation, a
change in the number of elements is not allowed. In the following example, we shall first create an
array of size (3 × 6), and then we shall change its shape to (2 × 9).
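A sketch of this reshape. The last lines show, as an addition for illustration, that a shape with a different number of elements is rejected.

```python
import numpy as np

foo = np.random.randn(3, 6)     # 18 elements
bar = foo.reshape(2, 9)         # still 18 elements
print(bar.shape)                # (2, 9)

# a shape with a different number of elements raises ValueError
try:
    foo.reshape(4, 5)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)           # True
```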
Just as we can access any element of the array and change it, we can access any
attribute and modify it. However, the modification is only allowed if the attribute is writeable
and the new value makes sense for the variable. We can use this behaviour to change the
shape of a variable through its shape attribute.
>>> foo.shape
(4, 3)
>>> foo
array([[-1.47446507, -0.46316836, 0.44047531],
[-0.21275495, -1.16089705, -1.14349478],
[-0.83299338, 0.20336677, 0.13460515],
[-1.73323076, -0.66500491, 1.13514327]])
>>> foo.shape = 2,6
>>> foo.shape
(2, 6)
>>> foo
array([[-1.47446507, -0.46316836, 0.44047531, -0.21275495, -1.16089705,
-1.14349478],
[-0.83299338, 0.20336677, 0.13460515, -1.73323076, -0.66500491,
1.13514327]])
In the above example, an array is first defined with a size of (4 × 3) and then its shape is assigned the
value (2,6), which makes the array of size (2 × 6). As we cannot change the number of elements,
once we define one dimension of the new variable the second dimension can be computed with ease.
Numpy allows us to give -1 for the dimension to be computed in this case. We can make the desired change
in the shape of the variable by using this default dimension.
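A minimal sketch of the -1 placeholder, shown here with the reshape method on an assumed (4 × 3) array:

```python
import numpy as np

foo = np.random.randn(4, 3)     # 12 elements

bar = foo.reshape(2, -1)        # numpy works out the second dimension: 12 / 2 = 6
print(bar.shape)                # (2, 6)

baz = foo.reshape(-1, 4)        # or the first dimension: 12 / 4 = 3
print(baz.shape)                # (3, 4)
```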
We can flatten the array (make the array one dimensional) by using the ravel method, which is
explained in the following example:
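A sketch of ravel on a small array with known values (the array itself is an assumption for this illustration):

```python
import numpy as np

foo = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
flat = foo.ravel()                 # one-dimensional, row by row
print(flat)                        # [0 1 2 3 4 5]
```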
4.1 Introduction
This chapter provides applications of Python in hydrology. Most of the problems given in
this chapter are taken from the book titled “Applied Hydrology” by Chow et al.; for a detailed
description of them, you should refer to the book. These examples include the equations commonly
encountered in hydrology. I have chosen these problems to teach Python by example,
and in every example we will also learn new things about Python.
>>> import numpy as np
>>> T = 50
>>> es = 611*np.exp(17.27*T/(237.3+T))
>>> print(es)
12340.799081
Let us plot the variation of es versus T over the range −100 ≤ T ≤ 100. plt.plot(x,y)
makes a line plot of y versus x, with a default color of blue. plt.xlabel() and plt.ylabel() are
used to write labels on the x and y axes respectively. The input to xlabel and ylabel must be a string,
or a variable which contains a string. plt.show() displays the graph on the computer screen.
The resulting plot is shown in Fig. 4.1. This example demonstrates how to graphically visualize the
variation of one variable with respect to another variable, where the former is an explicit function of
the latter.
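The plot described above can be sketched as follows. The number of temperature values (50) and the explicit Agg backend are assumptions of this sketch; the backend line only keeps the script from needing a display and should be dropped to see the window opened by plt.show().

```python
import matplotlib
matplotlib.use("Agg")   # assumption: non-interactive backend; remove for on-screen display
import matplotlib.pyplot as plt
import numpy as np

# temperature range and the saturation vapor pressure formula used in the text
T = np.linspace(-100, 100, 50)
es = 611 * np.exp(17.27 * T / (237.3 + T))

plt.plot(T, es)                     # line plot of es versus T
plt.xlabel('T (degree Celsius)')
plt.ylabel('es (Pa)')
plt.show()                          # displays the figure with an interactive backend
```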
Figure 4.1: The variation of saturation vapor pressure (es) versus temperature (T). Axes: T (degree Celsius) on the x axis, es (Pa) on the y axis.
4.3 Precipitation
where g is the acceleration due to gravity, D is the diameter of the falling raindrop, ρw is the density
of water, ρa is the density of air, and Cd is the drag coefficient. Stokes' law can be used to
calculate the drag coefficient (Cd = 24/Re), which is valid for raindrops having a diameter of less than 0.1
mm. Re is the Reynolds number, which can be calculated as ρa V D/µa, where µa is the dynamic viscosity of
air. Let us assume that Re is given as 5.0 and the raindrop has a diameter of 0.05 mm, and we want to
estimate Vt (ρw = 998 kg/m³, ρa = 1.2 kg/m³).
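A hedged sketch of this calculation. The expression used for the terminal velocity, Vt = sqrt(4 g D / (3 Cd) (ρw/ρa − 1)), is the form commonly given for a falling raindrop and is an assumption here, as is g = 9.8 m/s²; the variable names are chosen for illustration.

```python
import numpy as np

# several assignments written on one line, separated by ';'
Re = 5.0; rho_w = 998.0; rho_a = 1.2; g = 9.8; D = 0.05e-3   # D in metres

Cd = 24 / Re    # drag coefficient from Stokes' law

# assumed terminal velocity expression (see the note above)
Vt = np.sqrt((4 * g * D) / (3 * Cd) * (rho_w / rho_a - 1))
print(Vt)       # roughly 0.34 m/s
```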
In this example we see that ‘;’ allows us to write several statements on one line.
4.4 Rainfall
Often, we are given rainfall recorded by a rain gauge, which provides the rainfall depths recorded
for successive intervals in time, and we want to compute the cumulative rainfall. In this example we
shall first create the rainfall using random numbers, and we shall also create a time variable having
the values [0,5,10, ...., 100].
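The two variables could be created as sketched below. Using np.random.rand for the rainfall depths is an assumption, since the text only says that random numbers are used.

```python
import numpy as np

time = np.linspace(0, 100, 21)    # 0, 5, 10, ..., 100
rainfall = np.random.rand(21)     # one hypothetical rainfall depth per time step

print(time)
print(rainfall.shape)             # (21,)
```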
Now we make a bar plot of the rainfall using plt.bar(), which depicts the temporal behaviour of
the rainfall.
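A sketch of the bar plot. The axis labels, the Agg backend and the output file in the temporary directory are assumptions of this illustration, not the settings used for the figure in the text.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")   # assumption: write files without needing a display
import matplotlib.pyplot as plt
import numpy as np

time = np.linspace(0, 100, 21)
rainfall = np.random.rand(21)      # hypothetical rainfall depths

plt.clf()
plt.bar(time, rainfall)            # one bar per time step
plt.xlabel('Time')
plt.ylabel('Incremental rainfall')

# hypothetical output location; any writable path works
out_file = os.path.join(tempfile.gettempdir(), 'rain.png')
plt.savefig(out_file)
```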
The resulting bar plot of rainfall is shown in Fig. 4.2. You might have noticed that in section 4.2
we used plt.show(), while in the above example we used plt.savefig(). plt.show()
shows the graph on the computer screen, from where it can be saved later, while plt.savefig() saves the
graph to a file, which can be viewed by opening the file. Which one you use is a matter of
taste, and both can be used on the same graph. I prefer to save the figures to disk and
then look at them.
The cumulative sum is calculated by using the cumsum function of the numpy library.
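A sketch of the cumulative sum, with the rainfall again generated from random numbers as an assumed stand-in for recorded data:

```python
import numpy as np

time = np.linspace(0, 100, 21)
rainfall = np.random.rand(21)        # hypothetical rainfall depths

cum_rainfall = np.cumsum(rainfall)   # running total of the rainfall
print(cum_rainfall[-1])              # equals the total rainfall
```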
Now we plot the cumulative rainfall. The resulting cumulative rainfall is shown in Fig. 4.3.
plt.clf() clears the current figure, and is quite useful when making multiple plots while there is
an existing plot in the Python memory. Try leaving out clf here, and see the difference.
>>> plt.clf()
>>> plt.plot(time,cum_rainfall)
>>> plt.xlabel('Time')
>>> plt.ylabel('Cumulative rainfall')
>>> plt.savefig('/home/tomer/articles/python/tex/images/cum_rain.png')
Usually, we are given the rainfall at some rain gauges, and we want to make an isohyet (contour)
plot of the rainfall. To demonstrate this situation, we shall first generate locations (x,y) and rainfall
for ten stations using random numbers. The generated locations of the rain gauges are shown in Fig.
4.4.
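The station data could be generated as in this sketch. The names x, y and rain match the later griddata call in the text; the unit square for the locations and the factor of 10 for the rainfall are assumptions.

```python
import numpy as np

# hypothetical locations of ten rain gauges inside the unit square
x = np.random.rand(10)
y = np.random.rand(10)

# hypothetical rainfall recorded at each gauge
rain = 10 * np.random.rand(10)

print(x.shape, y.shape, rain.shape)
```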
I prefer to add blank lines after a section of code, and to put a comment at the top of each section saying what it does.
This increases the readability of the code. plt.scatter() makes a scatter plot, i.e. dots
are plotted instead of lines. A scatter plot is used when the order of the data in the
array carries no meaning with respect to their position. In this case, for example, two stations which are close
to each other might be placed far apart in the array.
The flow chart for preparing a contour map is given in Fig. 4.5. First, we need to generate a
grid with regular spacing having the same extent as the locations of the rain gauges. Then,
from the given location and rainfall data, we need to compute data on the regular grid using some
interpolation scheme. After this, contour maps can be obtained. The griddata function of the
scipy.interpolate library is useful for obtaining the gridded data (data on a regular grid). When
we need only one or a few functions from a library, it is better to import them explicitly, e.g.
from scipy.interpolate import griddata, as in the following example. We use the meshgrid
function of the numpy library to create the mesh from the given x and y vectors.
>>> from scipy.interpolate import griddata
>>> #generate the desired grid, where rainfall is to be interpolated
>>> X,Y = np.meshgrid(np.linspace(0,1,1000), np.linspace(0,1,1000))
>>>
>>> #perform the gridding
>>> grid_rain = griddata((x,y), rain, (X, Y))
Now we can make the contour plot of the gridded data with the plt.contourf() function.
contourf() makes filled contours, while contour() draws contour lines only. Try using
contour instead of contourf, and you will see the difference. We begin by clearing the current figure
with plt.clf(), as there might be an existing figure in memory, especially if you
are following all the examples in the same session. We also overlay the locations of the rain
gauges using plt.scatter(). The s and c arguments define the size and color of the markers
respectively. plt.xlim() and plt.ylim() limit the extent of the x and y axes respectively.
>>> plt.clf()
>>> plt.contourf(X,Y,grid_rain)
>>> plt.colorbar()
>>> plt.xlabel('X')
>>> plt.ylabel('Y')