PostgreSQL Database Handbook PDF
Contents
1.1 What's PostgreSQL?
1.2 Installing PostgreSQL
1.3
1.4
1.5
2.1 PostgreSQL commands
2.1.1 Getting help
2.1.2
2.2
2.3 Enumerated types
2.4 Summary
3.1
3.2 Introducing VACUUM
3.3 Summary
4.1 Introducing indexes
4.2 Examples
4.3 Unique indexes
4.4 Multicolumn indexes
4.5 Summary
5.1
5.2
5.3 More queries
5.4 Summary
6.1
6.2
6.3 Summary
7.1
7.2
7.3
7.4
7.5
7.6 Troubleshooting
8.1
8.2 Installing Barman
8.2.1
8.2.2
8.2.3
8.2.4
8.2.5
8.2.6
8.2.7
8.2.8
8.3 Automating backups
9.1
9.2
9.3
9.4
9.5 Summary
Preface
PostgreSQL, often simply Postgres, is an object-relational database management system (ORDBMS) with an emphasis on extensibility and standards-compliance. As a database server, its primary function is to store data securely, and to allow for retrieval
at the request of other software applications. It can handle workloads ranging from small single-machine applications to large
Internet-facing applications with many concurrent users.
PostgreSQL is developed by the PostgreSQL Global Development Group, a diverse group of many companies and individual
contributors. It is free and open-source software, released under the terms of the PostgreSQL License, a permissive free-software
license. (https://en.wikipedia.org/wiki/PostgreSQL)
In this ebook, we provide a compilation of PostgreSQL tutorials that will help you set up and run your own database management
system. We cover a wide range of topics, from installation and configuration, to custom commands and datatypes. With our
straightforward tutorials, you will be able to get your own projects up and running in minimum time.
Chapter 1
1.1
What's PostgreSQL?
PostgreSQL, also known by its alias Postgres, is a cross-platform object-relational database management system (ORDBMS
for short). Its development started at the University of California at Berkeley in the mid '80s with a project named simply
POSTGRES, which did not feature SQL as its query language at first. In the mid '90s, two students added SQL to the code inherited
from the university, and PostgreSQL was born as an open-source project. Today, PostgreSQL has a
strong reputation for handling significant workloads with a large number of concurrent users. In addition, it
provides bindings for many programming languages, making it an ideal solution for a client-server environment.
1.2
Installing PostgreSQL
In this article we will explain how to install a PostgreSQL server in Ubuntu Server 16.04 (IP address 192.168.0.54), how to load
a sample database, and how to install a client application (which will serve as an administrative tool) for Linux and Windows.
Step 1 - Launch a terminal and install the server and the web-based administration tool:
sudo aptitude install postgresql phppgadmin
Step 2 - Verify that the database service is running and listening on port 5432:
systemctl is-active postgresql
sudo netstat -npltu | grep postgres
The first command should print active, and the second command should show that the service is
listening on the right port, as shown in Fig. 1.1:
Figure 1.1: Verifying that PostgreSQL is running and listening on port 5432
Step 3 - Switch to the postgres Linux account and create a new role for queries:
The installation process created a new Linux account named postgres. Initially, this is the only account with permissions to
access the database prompt.
To switch to the postgres account, do
sudo -i -u postgres
Then run the following command to create a new database role named gacanepa (enter the password twice when you're prompted
to do so):
createuser gacanepa --no-createdb --no-superuser --no-createrole --pwprompt
Although the options in the above command are self-explanatory, let's just say that this particular role will not be allowed to
create databases or roles, and will not have superuser privileges. Other options for the createuser command are available in its
man page (which you can access from the Linux command prompt as man createuser).
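For reference, createuser is a wrapper around the SQL command CREATE ROLE, so the same role could also be created from the psql prompt. The following is a minimal sketch, not part of the original steps; the password change_me is a placeholder chosen for illustration.

```sql
-- Sketch: SQL counterpart of the createuser call above.
-- 'change_me' is a placeholder password (assumption); pick your own.
CREATE ROLE gacanepa
    LOGIN              -- createuser grants login by default
    NOCREATEDB         -- --no-createdb
    NOSUPERUSER        -- --no-superuser
    NOCREATEROLE       -- --no-createrole
    PASSWORD 'change_me';
```

Use either createuser or CREATE ROLE, not both: the second attempt would fail because the role already exists.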
Step 4 - Create a new database
While you're still logged on as postgres, create a database:
createdb World_db
1.3
Once we have created the database, it's time to populate it with actual data we can later query:
Step 5 - Download a sample database
The wiki links to several sample databases that we can download and use. For this example, we will download and install the
world database, which contains countries, cities, and spoken languages, among other data.
wget https://pgfoundry.org/frs/download.php/527/world-1.0.tar.gz
tar xzf world-1.0.tar.gz
As the tables are created and populated with data, the output should be similar to Fig. 1.3:
Figure 1.3: Restoring the database contents from the dump file
After completing the above 6 steps, we now have a fully set up PostgreSQL database.
1.4
In order to allow remote (LAN) access to the web-based administration tool, follow these steps:
Step 7 - Integrate phppgadmin with Apache
Open /etc/apache2/conf-enabled/phppgadmin.conf, and comment out the following line:
Require local
then add
Require all granted
Finally, grant SELECT permissions to role gacanepa, and exit (\q) the database prompt:
GRANT SELECT ON ALL TABLES IN SCHEMA public TO gacanepa;
\q
Figure 1.7: Our first query to the PostgreSQL database through phppgadmin
Click Execute at the bottom. The results should be as shown in Fig. 1.8:
1.5
If you are using Microsoft Windows, in addition to phppgadmin (which you can access through a web browser), you can also
install a client application named pgAdmin in order to connect to the database server. You can download it from the pgAdmin
PostgreSQL tools page at https://www.pgadmin.org/download/windows.php. The installation will only take a few clicks.
Although it is better known in Windows environments, pgAdmin is also available for Mac OS X.
When you're done with the installation, make sure the following lines are present in the configuration files. Otherwise, you will
NOT be able to connect to the database server from a machine other than the one where it is installed and running.
In /etc/postgresql/9.5/main/postgresql.conf:
listen_addresses = '*'
will ensure the database server is listening on all interfaces, and because of the following line in /etc/postgresql/9.5/main/pg_hba.conf:
host    all    all    192.168.0.0/24    md5
you can now connect to the database server from any machine in the 192.168.0.0/24 network.
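To confirm which settings the running server has actually picked up, the values can be read back from the psql prompt. This is a small sketch to be run as the postgres superuser; keep in mind that a change to listen_addresses only takes effect after the database service is restarted.

```sql
-- Run from psql on the database server, as the postgres superuser.
SHOW listen_addresses;   -- should show * once the change is active
SHOW port;               -- 5432 unless it was changed
SHOW hba_file;           -- path of the pg_hba.conf file actually in use
```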
Once you have added the above lines, open pgAdmin from Start > All programs > pgAdmin III. Then click on File > Add server
and fill in the connection details (see Fig. 1.9). If you fill in the password box as shown below, the credentials will be saved in plain
text in your user profile. On a shared computer that is probably not a good idea, so you may want to leave that field
blank in that case:
Chapter 2
2.1
PostgreSQL commands
In this tutorial we will introduce you to other useful PostgreSQL-specific (psql for short from now on) commands. To do so, let's
open the psql prompt by switching to the postgres Linux account and typing psql in the command line.
2.1.1
Getting help
Once in the psql prompt, type help and press Enter. The output should be similar to Fig. 2.1:
you need help with, use q to return to the psql prompt and then type \h followed by the SQL command you have chosen. For
example, let's say we chose ALTER USER. To see the help for that command, do
\h ALTER USER
2.1.2
If you find yourself examining a database server you haven't previously worked with, or if you are not familiar with the structure
of a given database, you may want to start off by listing the databases and their respective tables.
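Besides the psql meta-commands \l and \dt, the same information can be read from the system catalogs with ordinary SQL. The sketch below assumes you are only interested in non-template databases and in tables of the public schema.

```sql
-- List the databases in the cluster (similar to what \l shows,
-- minus the template databases).
SELECT datname FROM pg_database WHERE NOT datistemplate ORDER BY datname;

-- List the tables in the public schema of the current database
-- (similar to what \dt shows).
SELECT tablename FROM pg_tables WHERE schemaname = 'public' ORDER BY tablename;
```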
2.2
As preparation for creating our own databases and tables from scratch (which we will cover in an upcoming tutorial), we need
to know which built-in, general-purpose data types are allowed for table fields. The PostgreSQL 9.5 documentation lists
the following data types and more:
a) Numeric types (with corresponding storage sizes and ranges) are listed in Fig. 2.5:
2.3
Enumerated types
Besides the general-purpose data types, PostgreSQL allows us to create our own data types in the form of a static, ordered set of
values (for example, the months of the year, or the days of the week), similar to the enum type supported in several programming
languages. We will see the benefit of enumerated types when we create our first database and start inserting data into it.
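As a quick illustration of the idea, here is a minimal sketch of an enumerated type. The type name weekday, the table shifts, and its rows are made up for this example and do not belong to any database used in this book.

```sql
-- Hypothetical enumerated type listing the days of the week in order.
CREATE TYPE weekday AS ENUM ('Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun');

-- Temporary table (assumption) with a column of the new type.
CREATE TEMP TABLE shifts (worker text, shift_day weekday);
INSERT INTO shifts VALUES ('Ann', 'Wed'), ('Bob', 'Mon'), ('Cy', 'Fri');

-- Values sort by their position in the type definition, not alphabetically.
SELECT worker, shift_day FROM shifts ORDER BY shift_day;
```

Trying to insert a value that is not in the list (for instance 'Funday') is rejected, which is the main benefit over a free-form text column.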
2.4
Summary
Now that you have learned how to use basic psql commands and have reviewed the most used data types, we are better prepared
to dive deeper into PostgreSQL database administration. Stay tuned for the next tutorial!
Chapter 3
3.1
Using the World_db database, let's update by 7% the population of all cities in the city table. Before we do that, let's take a look
at the impact this operation would have on the current data by using a basic SELECT statement.
Before a mass update or removal, using SELECT to print the records that will be impacted by that operation is a wise thing to
do. Among other things, this can help you prevent undesired results (and the associated later regret), especially if you forget to
add a WHERE clause to the operation.
We will print the city name, its current population, and the population after our proposed update. To round the population increase
to the nearest integer, we will use the ROUND function as shown in Fig. 3.1:
Figure 3.1: Displaying the results of a preliminary SQL query before updating
SELECT name AS "Name", population AS "Current population", ROUND(population * 1.07) AS "New population" FROM city ORDER BY name;
With the AS keyword you can create an alias for the associated field so that the results of the query will use it as header. As you
can see in Fig. 3.1, we renamed name and population to Name and Current population, respectively. In addition, we named
the results of the mathematical operation as New population.
Now let's do the actual update. In this case we will not use a WHERE clause as we actually want to update all cities. This will
result in the population update of all 4079 cities currently present in the city table, as we can see in Fig. 3.2:
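An update of the kind described above can be sketched as follows. To keep the example self-contained it runs against a temporary stand-in table, city_demo, with three made-up rows rather than the real city table; wrapping the statement in a transaction is an extra precaution added here, not a step from the original text.

```sql
-- Stand-in for the city table with three made-up rows (assumption).
CREATE TEMP TABLE city_demo (id int PRIMARY KEY, name text, population int);
INSERT INTO city_demo VALUES (1, 'A', 1000), (2, 'B', 2500), (3, 'C', 150);

BEGIN;
-- No WHERE clause: every row is updated.
UPDATE city_demo SET population = ROUND(population * 1.07);
-- Inspect the result before making it permanent.
SELECT name, population FROM city_demo ORDER BY name;
COMMIT;  -- or ROLLBACK; to discard the change
```

Until COMMIT is issued, a ROLLBACK returns the table to its previous state, which gives a second safety net on top of the preliminary SELECT.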
Now let's delete all Australian cities where the Id is greater than 135 (this will exclude Canberra, the capital, which is referenced
in the country table). As before, use a SELECT first to examine the records that will be deleted:
SELECT name FROM city WHERE countrycode='AUS' AND id > 135;
3.2
Introducing VACUUM
To formally introduce VACUUM, let's use what we learned in PostgreSQL commands and data types to display the help about
this command (see Fig. 3.4):
All of the below commands can be applied to the entire database (no arguments) or a single table (name the table at the end of
the command).
To collect the garbage present in the database, just do
VACUUM;
However, that will not return the space used by the old records to the operating system - it will only clean up the old records
and make the space available for reuse by the same table. On the other hand,
VACUUM FULL;
will ensure that whatever space is freed up will be returned to the operating system.
Additionally,
VACUUM FULL VERBOSE;
will also print a detailed report of the work done on each table.
As you can see above, VACUUM located and removed the space left behind by the deletion of the 8 records from the city table
earlier. On large scale updates and removals, this will translate into considerable disk space savings.
As good as the VACUUM command is, having to run it manually could become a tedious task. Thus, by default, there's an
AUTOVACUUM daemon that is enabled and does the job for you automatically while the database server is running. You can
find more details about its operations in the AUTOVACUUM PARAMETERS section of the main configuration file
/etc/postgresql/9.5/main/postgresql.conf.
You can verify that the AUTOVACUUM process is running with:
ps aux | grep autovacuum | grep -v grep
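The same check can be made from inside psql, where the statistics views also show how much garbage each table currently holds. This is a minimal sketch; the counters come from the statistics collector and may lag slightly behind the real state of the tables.

```sql
-- Is autovacuum enabled? (on by default)
SHOW autovacuum;

-- Per-table live/dead row counts and the time of the last manual and
-- automatic vacuum runs.
SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
```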
3.3
Summary
Freeing up space in tables that are constantly updated or where records are often deleted not only will help you save space, but
also improve the performance of queries performed on the table. Following the instructions shared in this article you will be
contributing to the health of your database and saving valuable storage space.
Chapter 4
4.1
Introducing indexes
The best way to introduce the concept and the use of indexes in a database is using a book analogy. If you buy a new book for
a college class, you will most likely start by looking at the index at the end of the book for a particular topic. There is no doubt
that this would be a much faster way to find the information that you need than thumbing through the book from the beginning.
Likewise, in the context of databases, an index is an actual structure that references the information found in a given table.
Particularly in PostgreSQL, an index consists of a copy of the indexed data along with the corresponding reference to its location.
Because every index has to be kept up to date when the table changes, insert and update queries are expected to become slower on columns with indexes. That said, the first rule of thumb
is: avoid, to the extent possible, creating indexes on columns with frequent bulk inserts or updates. Use indexes on columns
that are mostly read-only or where the volume of insert / update operations is low. Keep in mind, however, that indexes can also improve the
performance of update operations that use WHERE clauses.
4.2
Examples
Let's return to the book analogy for a moment and use the World_db database to illustrate the need for indexes. Let's slightly modify
the query that we used as an introductory example in the first article of this series:
SELECT A.Id, A.name "City", A.district "District", B.name "Country", C.language "Language", CASE WHEN C.isofficial=TRUE THEN 'Yes' WHEN C.isofficial=FALSE THEN 'No' END "Official language?" FROM city A JOIN country B ON A.countrycode=B.code JOIN countrylanguage C ON A.countrycode=C.countrycode WHERE A.Id=72;
The above query will return all records where the Id column in the city table is 72. Since we are performing a JOIN operation
with other tables it is to be expected that we will get more than one result. In this case, we got 3 different records based on the
different languages associated with this city, as you can see in Fig. 4.1:
Then repeat the EXPLAIN ANALYZE plus the query. Results are shown in Fig. 4.3:
Figure 4.3: Running EXPLAIN ANALYZE against the SQL query AFTER creating an index
We can see that the use of the newly-created index was able to reduce the execution time by ~17% (0.159 ms compared to 0.191
ms).
On top of that, please refer to the figures in Fig. 4.4 that correspond to each query:
Figure 4.4: Estimated and actual startup and completion times before and after using an index
While the number of rows returned by each query was the same, the numbers inside parentheses show a performance increase.
In the actual time figures reported by EXPLAIN ANALYZE, the first number (0.085 in the first case and 0.053 in the second) is the start-up time of the associated query
step, that is, the time in milliseconds before its first row is returned, whereas the second number (0.130 and 0.098) is the total time taken to return all of its rows.
The PostgreSQL documentation specifically states that learning how to use and interpret the EXPLAIN command is an art, and
as such, it takes time to understand and master. We used it here to analyze our query and demonstrate the benefits of having an
index in a table, but there is much more to EXPLAIN than that.
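The before-and-after comparison described in this section can be reproduced on any server with a sketch like the one below. It uses a temporary table, city_demo_idx, filled with generated rows instead of the World_db data, and the index name city_demo_idx_id is made up for the example. Actual plans and timings will differ from machine to machine, so no expected output is shown.

```sql
-- Stand-in table with generated rows (assumption), large enough for the
-- planner to have a reason to prefer an index.
CREATE TEMP TABLE city_demo_idx AS
SELECT g AS id, 'City ' || g AS name
FROM generate_series(1, 100000) AS g;

-- Plan and timing without an index (expect a sequential scan).
EXPLAIN ANALYZE SELECT * FROM city_demo_idx WHERE id = 72;

-- Hypothetical index name; ANALYZE refreshes the planner statistics.
CREATE INDEX city_demo_idx_id ON city_demo_idx (id);
ANALYZE city_demo_idx;

-- Same query again; the plan will typically switch to an index scan.
EXPLAIN ANALYZE SELECT * FROM city_demo_idx WHERE id = 72;
```

On very small tables the planner may keep using a sequential scan even when an index exists, because reading the whole table is already cheap.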
4.3
Unique indexes
There is a special type of index called unique. When it is used, it guarantees that the associated table will not have more than one
row with the same value in the indexed column, and thus helps us maintain data integrity and improve performance. Instead of a regular index, we
could have created a unique index on the city.Id column above as follows:
CREATE UNIQUE INDEX cityId_idx ON city(Id);
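To see the guarantee in action without touching the real city table, the following sketch uses a temporary table, city_demo_uq, with a made-up row. The DO block is only there so that the expected error is caught and the script keeps running.

```sql
CREATE TEMP TABLE city_demo_uq (id int, name text);
CREATE UNIQUE INDEX city_demo_uq_id ON city_demo_uq (id);
INSERT INTO city_demo_uq VALUES (1, 'First');

-- A second row with id = 1 violates the unique index; the DO block
-- catches the error so the script keeps running.
DO $$
BEGIN
    INSERT INTO city_demo_uq VALUES (1, 'Duplicate');
EXCEPTION WHEN unique_violation THEN
    RAISE NOTICE 'duplicate id rejected';
END
$$;

-- Only the first row made it into the table.
SELECT count(*) FROM city_demo_uq;
```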
4.4
Multicolumn indexes
If you are likely to use more than one column in a SELECT query with a WHERE clause frequently, you may consider using
a multicolumn index on them. The syntax is similar to the case of a single-column index:
CREATE INDEX index_name ON table_name (column1, column2);
where column1 and column2 are the columns where the index will be created. Feel free to add more columns if needed.
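A concrete instance of that syntax could look like the sketch below. The temporary table city_demo_mc and the index name are assumptions made for illustration; the column names mirror those of the city table used earlier.

```sql
CREATE TEMP TABLE city_demo_mc (id int, countrycode char(3), district text);

-- Hypothetical index covering both filter columns, leading column first.
CREATE INDEX city_demo_mc_cc_district ON city_demo_mc (countrycode, district);

-- A query that filters on both indexed columns.
SELECT id FROM city_demo_mc
WHERE countrycode = 'AUS' AND district = 'Queensland';
```

Column order matters: a multicolumn B-tree index is most effective when the query constrains its leading column, so put the column you filter on most often first.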
4.5
Summary
In this article we have discussed the need for indexes to improve performance on SELECT queries that use WHERE clauses. If
you keep in mind the book analogy presented at the beginning, you will remember the fundamental concept behind using indexes.
Chapter 5
5.1
To begin, let's switch to the postgres Linux account and enter the psql prompt.
sudo -i -u postgres
psql
As we have explained previously, at this point we are not connected to any database.
Inside the database we are about to create, we will add two tables where we will store the actual information in an organized
manner. Our database will be called BookstoreDB and the two tables will be AuthorsTBL and BooksTBL with the following
fields in them (if you need to refresh your memory about data types, feel free to refer to PostgreSQL commands and
datatypes):
Now the next step consists of populating the database with actual data.
5.2
Since BooksTBL contains a foreign key that points to AuthorID in AuthorsTBL, we will need to create a few records in that
table first using the INSERT statement as follows. Note that each value must match the right field and data type:
INSERT INTO AuthorsTBL (AuthorName, LastPublishedDate) VALUES ('J. K. Rowling', '2011-07-11');
INSERT INTO AuthorsTBL (AuthorName, LastPublishedDate) VALUES ('John Doe', '2015-08-29');
Afterwards, we can use the SELECT statement to query the AuthorsTBL. Note how the AuthorID field was populated automatically since its data type was set to serial and primary key (see Fig. 5.4):
Figure 5.5: Inserting data with a non-existent foreign key causes an error
As you can see, an insert with a foreign key referencing a non-existent primary key in AuthorsTBL fails.
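The behaviour can be reproduced in isolation with the sketch below. The temporary tables authors_demo and books_demo use assumed, minimal column definitions and made-up titles; the real AuthorsTBL and BooksTBL may contain more fields.

```sql
-- Assumed minimal definitions; the real tables may have more columns.
CREATE TEMP TABLE authors_demo (
    AuthorID serial PRIMARY KEY,
    AuthorName text NOT NULL
);
CREATE TEMP TABLE books_demo (
    BookID serial PRIMARY KEY,
    BookName text NOT NULL,
    AuthorID int NOT NULL REFERENCES authors_demo (AuthorID)
);

-- The first author gets AuthorID 1, so the first book is accepted.
INSERT INTO authors_demo (AuthorName) VALUES ('J. K. Rowling');
INSERT INTO books_demo (BookName, AuthorID) VALUES ('Sample title', 1);

-- AuthorID 99 does not exist, so this insert raises foreign_key_violation.
DO $$
BEGIN
    INSERT INTO books_demo (BookName, AuthorID) VALUES ('Orphan title', 99);
EXCEPTION WHEN foreign_key_violation THEN
    RAISE NOTICE 'insert rejected: no such author';
END
$$;
```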
5.3
More queries
The classic SELECT statement as used earlier will return all the fields in a given table (that is what the star sign * stands for).
We can also restrict the number of fields by listing them after the SELECT. For example, we can do
SELECT AuthorName FROM AuthorsTBL;
to retrieve only the AuthorName. Of course, that's going to be of little use, but it is worth mentioning.
We can also choose to combine records from both tables using a JOIN. This operation allows us to return a set of records from
two or more tables as if they were stored in a single one. To illustrate, we will list all book titles along with the author name and
perform the JOIN on the field that both tables have in common (AuthorID):
SELECT BooksTBL.BookName, AuthorsTBL.AuthorName FROM BooksTBL JOIN AuthorsTBL ON BooksTBL.AuthorID=AuthorsTBL.AuthorID;
If we only want to return those books where J. K. Rowling is the author, we can add a WHERE clause and either use AuthorID=1
or AuthorName='J. K. Rowling' in the filter. Filtering on the integer primary key is usually preferred over comparing strings, so we will go with
SELECT BooksTBL.BookName, AuthorsTBL.AuthorName FROM BooksTBL JOIN AuthorsTBL ON BooksTBL.AuthorID=AuthorsTBL.AuthorID WHERE AuthorsTBL.AuthorID=1;
You can view the result of the above queries in Fig. 5.6:
5.4
Summary
In this article we have explained how to create a database role and make it the owner of a database during creation. In addition,
we showed how to create tables -taking into consideration the available datatypes- and how to populate and query them. By using
JOINs and WHERE clauses you will be able to retrieve the necessary information as if it was all in the same table.
Chapter 6
6.1
Formally speaking, a CTE is a temporary result set that is created through the use of a WITH clause and is valid only during
the execution of a given query. Another distinguishing feature of a CTE is that it can either reference itself (recursive CTE) or
not (non-recursive CTE), providing flexibility that ordinary queries do not. A recursive CTE is typically used to walk hierarchical
data, such as a tree of parent and child records, whereas a non-recursive one is usually utilized for a regular query.
Additionally, its definition -meaning the fields it returns- is not stored as a separate database object.
Although Common Table Expressions can be used in SELECT, INSERT, UPDATE, or DELETE operations, we will only use the
first type as it is the easiest to understand. Once you feel comfortable with using CTEs that involve SELECTs only, refer to the
official PostgreSQL 9.5 documentation to learn how to use them with the other operation types.
All of these new concepts will better sink in as we illustrate them through examples, so let's begin.
6.2
As usual, we will use the World_db database we installed in the first article of this series. To begin, let's consider the following
query:
SELECT A.name "City", A.district "District",
B.name "Country", C.language "Language"
FROM city A JOIN country B ON A.countrycode=B.code
JOIN countrylanguage C ON A.countrycode=C.countrycode
WHERE A.name='Rosario' AND C.isofficial=TRUE;
As you can probably guess by now, it will return the city name, the district, the country, and the official language where the city
name is Rosario. If you look carefully, this query uses 2 JOINs - not a bad thing in itself, but the readability certainly could use
some improvements.
Our first example of a Common Table Expression will be rather basic but does the job of introducing the concept:
WITH t AS (
SELECT A.name City, A.district District,
A.countrycode CountryCode, B.name Country
FROM city A JOIN country B ON A.countrycode=B.code)
SELECT t.City, t.District, t.Country, C.language
FROM t JOIN countrylanguage C on t.CountryCode = C.countrycode
WHERE t.City='Rosario' AND C.isofficial=TRUE;
Before we go into PostgreSQL and run the above query, let's split it into two parts to explain what is happening.
Step 1 - Define the CTE using the WITH clause. For simplicity, we will name the CTE t, but you can use another name if you
want.
WITH t AS (
SELECT A.name City, A.district District,
A.countrycode CountryCode, B.name Country
FROM city A JOIN country B ON A.countrycode=B.code)
If we were to do a SELECT * FROM t; at this point, we would get all the cities with their corresponding district and country.
You may well be saying to yourself, "Then I don't see the point of using CTEs" - but wait, Step 2 will shed some light on
the why.
Step 2 - Select the fields from the CTE and perform a JOIN with another table. As the CTE can be considered a temporary
result set, we can perform JOINs on other tables. However, in this case we can use the more descriptive names given by the CTE
instead of the original table names (are you seeing the readability improvements already?). Since both the city and country tables
contain a field called name, the CTE allows us to refer to the city and country names as City and Country instead.
SELECT t.City, t.District, t.Country, C.language
FROM t JOIN countrylanguage C on t.CountryCode = C.countrycode
WHERE t.City='Rosario' AND C.isofficial=TRUE;
As you can see in Fig. 6.1, the result is identical to the original query:
('Programming 2', 2),
('Advanced Geometry', 3),
('Control systems', 3),
('English as a Second Language 1', 3),
('Literature', 3),
('Physics 2', 4),
('Calculus 2', 4),
('Graphs and Math', 7),
('English as a Second Language 2', 7),
('Basic algorithms', 8),
('Advanced algorithms', 8),
('Programming with C', 8);
In this case we're interested in retrieving a list of classes and their children down to a given level. For example, we will start with
Algebra 1 (ClassID=2) and descend down to the last class that depends on it:
WITH RECURSIVE classes AS (
SELECT
ClassID,
ClassParentID,
ClassDescription
FROM
CollegeClasses
WHERE
ClassID = 2
UNION
SELECT
e.ClassID,
e.ClassParentID,
e.ClassDescription
FROM
CollegeClasses e
INNER JOIN classes s ON s.ClassID = e.ClassParentID
) SELECT * FROM classes;
This query, as in the previous section, deserves a detailed explanation. Let's begin by saying a recursive CTE consists of 4
components:
#1 - A non-recursive query. In this case, it is a query to retrieve the CollegeClass information where ClassID=2:
SELECT ClassID, ClassParentID, ClassDescription
FROM CollegeClasses WHERE ClassID = 2
#2 - The UNION or UNION ALL operator. Either of these operators allows us to combine two or more result sets into a single
one. The choice of one over the other will depend on whether you want to remove duplicates (if any) or return them, respectively.
#3 - The recursive term. Note that the classes temporary table references itself in this part of the CTE:
SELECT e.ClassID, e.ClassParentID, e.ClassDescription
FROM CollegeClasses e INNER JOIN classes s ON s.ClassID = e.ClassParentID
#4 - The final statement, which is executed once the iterations in Part 3 have finished. In this case,
SELECT * FROM classes;
That said, let's take a look at the result of the query (see Fig. 6.2) and examine it to see if it meets our expectations:
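To experiment with the four components on a smaller scale, the sketch below builds a tiny, made-up hierarchy in a temporary table called classes_demo and adds a depth counter, which is not part of the query above, to show how many iterations were needed to reach each row.

```sql
-- Made-up hierarchy (assumption): each class points to its prerequisite.
CREATE TEMP TABLE classes_demo (
    ClassID int PRIMARY KEY,
    ClassParentID int,
    ClassDescription text
);
INSERT INTO classes_demo VALUES
    (1, NULL, 'Root'),
    (2, 1, 'Algebra 1'),
    (3, 2, 'Algebra 2'),
    (4, 3, 'Calculus'),
    (5, 1, 'Unrelated');

WITH RECURSIVE tree AS (
    -- Non-recursive term: the starting class.
    SELECT ClassID, ClassDescription, 0 AS depth
    FROM classes_demo
    WHERE ClassID = 2
    UNION ALL
    -- Recursive term: classes whose parent is already in the result.
    SELECT c.ClassID, c.ClassDescription, t.depth + 1
    FROM classes_demo c
    JOIN tree t ON t.ClassID = c.ClassParentID
)
SELECT * FROM tree ORDER BY depth;
```

The query returns classes 2, 3 and 4 at depths 0, 1 and 2; class 5 is left out because it does not descend from class 2. The recursion stops on its own once an iteration adds no new rows.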
6.3
Summary
In this article we have explained how to create recursive and non-recursive Common Table Expressions in PostgreSQL. As you
pursue the study of this topic, keep in mind that using CTEs is not a matter of improving performance, but readability and
maintainability.
Chapter 7
7.1
Then edit /etc/network/interfaces and make sure the configuration for enp0s3 (the main NIC) looks as follows:
iface enp0s3 inet static
address 192.168.0.55
netmask 255.255.255.0
gateway 192.168.0.1
dns-nameservers 8.8.8.8 8.8.4.4
127.0.0.1       ubuntu-slave
192.168.0.54    ubuntu-master
7.2
To begin, we will create a dedicated user (repuser in this case) and we will limit the number of simultaneous connections to 1.
Enter the psql command prompt and do:
CREATE USER repuser REPLICATION LOGIN CONNECTION LIMIT 1 ENCRYPTED PASSWORD 'rep4scg';
In /etc/postgresql/9.5/main/postgresql.conf, make sure the following settings and values are included:
listen_addresses = 'localhost,192.168.0.54'
wal_level = hot_standby
max_wal_senders = 1
hot_standby = on
and in /etc/postgresql/9.5/main/pg_hba.conf:
hostssl    replication    repuser    192.168.0.55/32    md5
Next, switch to user postgres, generate a public key and copy it to the slave. This will allow the master to replicate automatically
to the slave:
ssh-keygen -t rsa
ssh-copy-id 192.168.0.55
When prompted to enter the password for user postgres in the slave machine, do so before proceeding.
Now restart the database service:
sudo systemctl restart postgresql
7.3
Make sure the database service is stopped before proceeding. Otherwise, you're in for a nasty database corruption in a few
moments.
systemctl stop postgresql
Then edit /etc/postgresql/9.5/main/postgresql.conf and make sure the following settings / values are included:
listen_addresses = 'localhost,192.168.0.55'
wal_level = hot_standby
max_wal_senders = 1
hot_standby = on
and in /etc/postgresql/9.5/main/pg_hba.conf:
hostssl    replication    repuser    192.168.0.54/32    md5
7.4
3b- In the slave, create a .conf file with the connection info to the master server. We will name it recovery.conf and save it in
/var/lib/postgresql/9.5/main:
standby_mode = on
primary_conninfo = 'host=192.168.0.54 port=5432 user=repuser password=rep4scg'
where user and password need to match the credentials created at the beginning of Step 1.
Now we can proceed to start the database server in the slave:
sudo systemctl start postgresql
7.5
In the master, we will switch to user postgres and execute a simple query to SELECT and then update a record from the city
table in the World_db database. At the same time, we will query that same record in the slave before and after performing the
UPDATE in the master. Refer to Fig. 7.1 for more details:
sudo -i -u postgres
psql
\c World_db;
then
SELECT name, countrycode, population FROM city WHERE name='Brisbane';
UPDATE city SET population=1402568 WHERE name='Brisbane';
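In addition to comparing the record on both machines, the replication state itself can be queried. The following sketch relies only on built-in functions and views; run the first statement on each server and the second one on the master.

```sql
-- On either server: false on the master, true on a hot standby.
SELECT pg_is_in_recovery();

-- On the master: one row per connected standby.
SELECT client_addr, state, sync_state FROM pg_stat_replication;
```

If the second query returns no rows on the master, the standby is not connected, and the resources in the Troubleshooting section are the next place to look.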
7.6
Troubleshooting
If the database service refuses to start successfully, you will not be able to run psql in the Linux command line. In that case, you
will have to troubleshoot using the following resources:
systemctl -l status postgresql@9.5-main
journalctl -xe
tail -f /var/log/postgresql/postgresql-9.5-main.log
That's all, folks! You should now have PostgreSQL hot standby replication in place.
Chapter 8
8.1
Traditionally, PostgreSQL database administrators used shell scripts and cron jobs to back up their databases. Although this
approach was considered efficient a decade (or so) ago, today there are tools that make this process hassle-free and easier to
maintain. Among these tools, Barman (Backup and Recovery Manager), a Python-based open source solution developed and
maintained by 2ndQuadrant (a firm that specializes in PostgreSQL services) stands out.
8.2
Installing Barman
More accurately, Barman is a backup, restore, and disaster recovery tool for PostgreSQL. We will install it on the virtual machine
that we called newserver (192.168.0.54) to migrate the databases from oldserver (192.168.0.55).
That said, lets install Barman:
sudo aptitude update && sudo aptitude install barman
Once the installation has completed successfully, proceed with the following steps.
8.2.1
In order for barman (which has been installed on newserver) to communicate with the PostgreSQL instance running on oldserver,
we need to create a dedicated database user. To do so, run the following command as postgres on oldserver and enter the desired
password for the new database user. Also, when you're prompted to confirm whether the account should have superuser privileges,
enter y and press Enter:
createuser --interactive -P barman
Then test the connection from newserver. We will check the connection against the postgres database, but you can use another
database (in that case, you'll have to modify the SQL query inside single quotes):
psql -c 'SELECT version()' -U barman -h 192.168.0.55 postgres
Figure 8.1: Creating a dedicated user account and testing the connection
Throughout this article, we will use the word Barman to refer to the program itself, whereas the all-lowercase barman will
represent either the command associated with the program or an account.
8.2.2
As part of the installation of Barman on newserver, a Linux account called barman was created. To set its password, do
sudo passwd barman
The actual format for the .pgpass file is hostname:port:database:username:password. If an asterisk is used in any of the first four
fields, it will match everything. Please note that username here represents the PostgreSQL user we created in Step 1, not the
Linux account we just referred to. The official documentation for this file can be found here.
This file can contain passwords to be used if a connection requires one (in this case, barman will use it to talk to the PostgreSQL
instance on oldserver).
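As a minimal sketch of the format described above, the following shell function writes a single entry to a password file and restricts its permissions (on Unix systems the file is ignored if it is accessible by group or others). The function name write_pgpass and the password value are placeholders made up for illustration; the host and database user are the ones used in this chapter.

```shell
#!/bin/sh
# write_pgpass FILE HOST PORT DATABASE USERNAME PASSWORD
# Hypothetical helper: writes one hostname:port:database:username:password
# entry to FILE and makes the file accessible by its owner only.
# Assumes no field contains ':' or '\' (those would need escaping).
write_pgpass() {
    file=$1; host=$2; port=$3; db=$4; user=$5; pass=$6
    # The subshell keeps the restrictive umask from leaking into the caller.
    ( umask 077
      printf '%s:%s:%s:%s:%s\n' "$host" "$port" "$db" "$user" "$pass" > "$file" )
    chmod 600 "$file"
}

# Example, run as the Linux user barman on newserver
# (the password shown is a placeholder):
# write_pgpass "$HOME/.pgpass" 192.168.0.55 5432 '*' barman 'MyPassword'
```

The asterisk in the database field makes the entry match any database on that host, which is convenient here because Barman connects to World_db while the earlier connection test used postgres.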
8.2.3
In order to perform backups without user intervention we will need to set up and copy SSH keys for passwordless authentication.
Barman will make use of this method to copy data through rsync.
On newserver, switch to user barman and generate the keys
ssh-keygen -t rsa
(choose the default destination file for the public key and an empty passphrase).
Next, copy the public key to the authorized keys of user postgres on oldserver:
ssh-copy-id postgres@192.168.0.55
This will allow barman on newserver to connect to oldserver as user postgres. To test if the connection can be made without
password, as expected, you can run the following command (on success, it will not return anything):
ssh postgres@192.168.0.55 -C true
You'll also need to allow barman to SSH into localhost as the local user postgres:
ssh-copy-id postgres@localhost
ssh postgres@localhost -C true
and copy the resulting key to the list of authorized keys for user barman on newserver:
ssh-copy-id barman@192.168.0.54
8.2.4
On newserver, open the Barman main configuration file (/etc/barman.conf) and uncomment this line by removing the leading
semicolon:
;configuration_files_directory = /etc/barman.d
should read
configuration_files_directory = /etc/barman.d
(if /etc/barman.d does not exist, you'll have to create it with mkdir /etc/barman.d)
And create a file named oldserver.conf with the following contents (the word inside square brackets represents the name that
barman will use to identify the connection details):
[oldserver]
description = "Our old PostgreSQL server"
conninfo = host=192.168.0.55 user=barman dbname=World_db
ssh_command = ssh postgres@192.168.0.55
retention_policy = RECOVERY WINDOW OF 2 WEEKS
where most variables are self-explanatory with the exception of retention_policy. This variable is used to determine for how
long backups should be kept (2 weeks in this case). This should be modified based on the expected activity and growth of the
database, and the available space on the filesystem where backups will be kept.
8.2.5
On oldserver:
Add this line to /etc/postgresql/9.5/main/pg_hba.conf:
host    all    all    192.168.0.54/24    trust
Then make sure the following variables on /etc/postgresql/9.5/main/postgresql.conf have the indicated values:
wal_level = archive
archive_mode = on
archive_command = 'rsync -a %p barman@192.168.0.54:/var/lib/barman/oldserver/incoming/%f'
As you will probably guess, the directory in the rsync connection string represents the directory where the backup files for
oldserver will be kept on newserver.
On newserver, make sure the following variable on /etc/postgresql/9.5/main/postgresql.conf has the indicated value:
data_directory = /var/lib/postgresql/9.5/data
If the directory called data does not exist under /var/lib/postgresql/9.5, create it before proceeding (that is where the data files
will be stored on newserver)
Then restart the postgresql service to activate the latest changes:
sudo systemctl restart postgresql
8.2.6
Once PostgreSQL has been configured on oldserver to allow connections from newserver, we are ready to test the configuration.
To do so, switch to user barman on newserver and do
barman check oldserver
barman list-server
The first command will check the SSH and PostgreSQL connections, whereas the second one will show the list of configured
PostgreSQL servers we wish to back up.
The output should be as follows (see Fig. 8.2):
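When the check output is long, it can help to filter it down to the items that did not pass. The following is a minimal sketch under the assumption that barman check prints one item per line in the form name: OK or name: FAILED; failed_checks is a hypothetical helper written for this example, not part of Barman.

```shell
#!/bin/sh
# failed_checks: reads the output of `barman check SERVER` on standard input
# and prints only the items reported as FAILED.
# Returns 0 when nothing failed and 1 otherwise.
# Hypothetical helper; assumes each item appears on its own line as
# "name: OK" or "name: FAILED ...".
failed_checks() {
    # grep exits with 1 when nothing matches, which is the good case here.
    failures=$(grep 'FAILED' || true)
    if [ -n "$failures" ]; then
        printf '%s\n' "$failures"
        return 1
    fi
    return 0
}

# Example, run as user barman on newserver:
# barman check oldserver | failed_checks && barman backup oldserver
```

With this arrangement the backup in the example only starts when no item is reported as FAILED.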
8.2.7
Once all of the items in the output of barman check oldserver return OK, we are ready to perform our first backup with
the following command (see Fig. 8.3):
barman backup oldserver
To list all the backups we have performed for oldserver, use
barman list-backup oldserver
To view details about a specific backup, we'll use
barman show-backup oldserver backup_id
8.2.8
As we can see in Fig. 8.5, the World_db database can't be found on newserver. To migrate a backup, we will first stop the postgresql
service:
sudo systemctl stop postgresql
Note how barman makes use of the SSH keys to connect as user postgres to localhost in order to load the backup with id
20161015T142346 to the data directory. The result is shown in Fig. 8.5:
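The recovery command itself does not appear in the text above. A minimal sketch, assuming Barman's recover subcommand with its --remote-ssh-command option and the data directory configured earlier in this chapter, could look like the following. The helper names run and recover_backup are made up for illustration, and by default (DRY_RUN=1) the command is only printed, not executed.

```shell
#!/bin/sh
# run: print the command when DRY_RUN=1 (the default), otherwise execute it.
# The printed form is for inspection only and does not preserve quoting.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# recover_backup SERVER BACKUP_ID DATA_DIR
# Hypothetical helper: asks Barman to copy the given backup into DATA_DIR,
# connecting over SSH as postgres@localhost.
# Meant to be run as user barman on newserver while postgresql is stopped.
recover_backup() {
    run barman recover --remote-ssh-command "ssh postgres@localhost" \
        "$1" "$2" "$3"
}

# Example (prints the command; set DRY_RUN=0 to actually run it):
# recover_backup oldserver 20161015T142346 /var/lib/postgresql/9.5/data
```

Once the recovery has finished, the postgresql service needs to be started again before the database can be queried.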
Now let's run queries against the database, as shown in Fig. 8.6:
8.3
Automating backups
In order to automate the backup process, switch to user barman and open the crontab file:
sudo -i -u barman
crontab -e
Then add the following line to it in order to execute a backup of oldserver each day at 12:45 pm:
45 12 * * * /usr/bin/barman backup oldserver
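If you prefer not to edit the crontab interactively, the same entry can be appended from the shell. The sketch below shows one way to do it; add_cron_line is a hypothetical helper that reads an existing crontab on standard input and prints it with the new entry appended, unless that exact line is already present.

```shell
#!/bin/sh
# add_cron_line LINE
# Hypothetical helper: copies the crontab read from standard input to
# standard output and appends LINE unless an identical line already exists.
add_cron_line() {
    line=$1
    current=$(cat)
    if [ -n "$current" ]; then
        printf '%s\n' "$current"
    fi
    # -F: fixed string, -x: match the whole line, -q: no output
    if ! printf '%s\n' "$current" | grep -Fxq -- "$line"; then
        printf '%s\n' "$line"
    fi
}

# Example, run as user barman on newserver:
# crontab -l 2>/dev/null | add_cron_line '45 12 * * * /usr/bin/barman backup oldserver' | crontab -
```

Because the helper skips a line that is already present, running the example more than once does not create duplicate entries.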
Please note that this is a basic Barman / PostgreSQL setup, so I strongly suggest consulting the official Barman documentation for further details.
Chapter 9
9.1
As we just mentioned, we will use PHP to connect to the database server and to display the results of a query in a web page.
Before we even start writing the application, we will need to install PHP and some additional packages - including the Apache
web server. To do this in an Ubuntu 16.04 server with IP 192.168.0.54, use the following command:
sudo aptitude update && sudo aptitude install apache2 postgresql-contrib php7.0-pgsql
After the installation is complete, create a php file named info.php under /var/www/html with the following three lines. This will
help us to verify that PHP has been installed along with the PostgreSQL dependencies:
<?php
phpinfo();
?>
Then browse to 192.168.0.54/info.php and look for the section with the PostgreSQL details. You should find that the
PDO driver is enabled and that PHP is supporting our RDBMS, as shown in Fig. 9.1:
9.2
The first thing that we must do is ensure PHP can connect to the database server. Create a file named con.php under /var/www/html
with the following contents:
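<?php
// Connection details
$conn_string = "host=localhost port=5432 dbname=World_db user=scg password=MyPassword options='--client_encoding=UTF8'";
// Open the connection and stop with a message if it cannot be established
$dbconn = pg_connect($conn_string) or die('Connection failed');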
For security purposes, set the appropriate ownership to the Linux account postgres (the user the database service runs as) and add
the www-data account to the postgres group. This will allow Apache to read this file:
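The commands for this step are not shown above; one possible way to carry it out is sketched below. The 640 mode and the Apache restart are assumptions added for illustration (a new group membership only applies to processes started afterwards), and the helper names run and secure_con_file are made up. By default (DRY_RUN=1) each command is only printed.

```shell
#!/bin/sh
# run: print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# secure_con_file FILE
# Hypothetical helper: hands FILE to the postgres account, makes it readable
# by the postgres group only, and adds www-data (the account Apache runs as
# on Ubuntu) to that group.
secure_con_file() {
    file=$1
    run sudo chown postgres:postgres "$file"
    run sudo chmod 640 "$file"            # assumption: owner rw, group r
    run sudo usermod -a -G postgres www-data
    run sudo systemctl restart apache2    # assumption: pick up the new group
}

# Example (prints the commands; set DRY_RUN=0 to actually run them):
# secure_con_file /var/www/html/con.php
```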
Now go to 192.168.0.54/con.php and make sure the connection to the database is successful before proceeding:
9.3
Next, open con.php again and insert the following lines below the connection code. Please note that we will use a very simple query that retrieves the names of the cities
in Argentina and the district each one belongs to (you will later be able to change it to a more complicated query using Common Table Expressions, for
example):
$query = "SELECT name, district FROM city WHERE countrycode='ARG'";
$cities = pg_query($query) or die('Query failed: ' . pg_last_error());
$myarray = array();
while ($row = pg_fetch_assoc($cities)) {
$myarray[] = $row;
}
// Encode response into JSON array
echo json_encode($myarray);
Next, go to 192.168.0.54/con.php. You should see the results of the query in JSON format (see Fig. 9.4):
9.4
Many web developers nowadays use a robust HTML5/CSS/JavaScript framework called Bootstrap to write mobile-friendly applications very easily. Though a full discussion about Bootstrap (and the HTML5-related technologies) is out of the scope of this
article, it is sufficient to say that one of its distinguishing characteristics is that it divides the viewport into a 12-column grid.
It is up to the developer to decide how many columns will be assigned to a particular piece of content for xs (extra small, e.g. cell
phones), sm (small, e.g. tablets), md (medium, e.g. laptops), and lg (large, e.g. high-resolution monitors) devices. In this
tutorial we will assume that we want to show the city and district fields using 6 columns each on medium devices (md) and up. On
smaller screens, city will stack on top of district, as we will see later.
To do this, create a file named index.php in the same location as con.php and insert the following lines into it:
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Mobile friendly page with PostgreSQL and PHP</title>
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css">
</head>
<body>
<div class="row">
<div class="col-md-6" id="city" style="text-align: center"><strong>City</strong></div>
<div class="col-md-6" id="district" style="text-align: center"><strong>District</strong></div>
</div>
<script src="https://code.jquery.com/jquery-3.1.1.min.js"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js"></script>
<script>
$(document).ready(function(){
  $.ajax({
    url: "con.php",
    dataType: "json",
    type: "POST",
    success: function(data){
      // With dataType "json", jQuery has already parsed the response into an array
      for (var i = 0; i < data.length; i++) {
        var item = data[i];
        $("#city").append("<br>" + item.name);
        $("#district").append("<br>" + item.district);
      }
    }
  });
});
</script>
</body>
</html>
As you can see, this simple page uses a well-known JavaScript library called jQuery to make an Ajax call to con.php and retrieve
the results. Again, an adequate discussion about jQuery, Ajax, and JavaScript is out of the scope of this article, but you can find
some very valuable information on W3Schools.
When you browse to 192.168.0.54/index.php, the result should be similar to Fig. 9.5:
Figure 9.5: Displaying the web page with the results of the query
Feel free to resize your browser's window to see how the layout changes as the viewport changes.
9.5
Summary
If you followed this tutorial carefully, congratulations! You have connected to your PostgreSQL server using PHP and
displayed data from your database in a mobile-friendly web page. Hopefully this will give you the foundation to create more
sophisticated applications.