Postgresql Course Material

PostgreSQL Database Administration Training

Date: 21 – 25 Sept, 2015

Shankarnag
Software Engineer & Database Architect
© Copyright EnterpriseDB Corporation, 2015. All rights reserved. 1
Intro Course Agenda



Advanced Course Agenda
• Transactions and Concurrency
• Performance Tuning
• High Availability & Replication
• Hot Standby
• Table Partitioning with Procedural Language
• PGPool-II
• Monitoring
• Add on Utilities – Contrib



1. Introduction



Objectives

• In this module we will cover:


− PostgreSQL
− Features of PostgreSQL
− General Database Limits
− Common Database Object Names



Postgres: A Proven Track Record

• Most mature open source RDBMS technology


• Enterprise-class features (built like Oracle, DB2, SQL Server)
• Strong, independent community driving rapid innovation



Major Features: PostgreSQL
• Portable – written in ANSI C
• Multi platform support (Linux on Power, Windows, OSX)
• ACID Compliant
• Multi-version Concurrency Control
• Table Partitioning
• Tablespaces
• Host Based Access Control
• Streaming Replication
• Online Backups and Point in Time Recovery
• Procedural Languages: PL/pgSQL, PL/Java, more



General Database Limits

Limit                         Value
Maximum Database Size         Unlimited
Maximum Table Size            32 TB
Maximum Row Size              1.6 TB
Maximum Field Size            1 GB
Maximum Rows per Table        Unlimited
Maximum Columns per Table     250-1600 (depending on column types)
Maximum Indexes per Table     Unlimited
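
The 32 TB table limit can be reproduced with simple arithmetic: a relation is addressed in fixed-size blocks using a 32-bit block number. The sketch below assumes the default 8 kB block size; the helper name max_table_size_tb is illustrative and not part of PostgreSQL.

```python
# Illustrative arithmetic only; max_table_size_tb is a hypothetical helper.
BLOCK_SIZE_BYTES = 8 * 1024   # assumed default block size (8 kB)
BLOCK_NUMBER_BITS = 32        # block numbers are 32-bit values


def max_table_size_tb(block_size=BLOCK_SIZE_BYTES, bits=BLOCK_NUMBER_BITS):
    """Return the addressable table size in tebibytes."""
    total_bytes = (2 ** bits) * block_size
    return total_bytes / 2 ** 40


print(max_table_size_tb())  # 32.0
```

A server built with a larger block size would have a proportionally larger limit under the same reasoning.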



Common Database Object Names

Industry Term      PostgreSQL Term
Table or Index     Relation
Row                Tuple
Column             Attribute
Data Block         Page (when block is on disk)
Page               Buffer (when block is in memory)



Summary

• In this module we covered:


− PostgreSQL
− Features of PostgreSQL
− General Database Limits
− Common Database Object Names



2. System Architecture



Objectives

• In this module we will cover:


− PostgreSQL System Architecture:
− Process
− Memory
− Storage
− Connection Request and Response
− Write Ahead Logging



Architectural Summary

• PostgreSQL uses processes, not threads


• Postmaster process acts as supervisor
• Several utility processes perform background work
− postmaster starts them, restarts them if they die
• One backend process per user session
− postmaster listens for new connections



Process and Memory Architecture
[Diagram labels: Postmaster; Shared Memory (Shared Buffers, WAL Buffers, Process Array); utility processes (BGWRITER, STATS COLLECTOR, CHECKPOINTER, ARCHIVER, AUTOVACUUM, LOG WRITER, WAL Writer); storage (Data Files, WAL Segments, Archived WAL)]


Utility Processes

• Background writer
− Writes updated data blocks to disk
• WAL writer
− Flushes write-ahead log to disk
• Checkpointer process
− Automatically performs a checkpoint based on config parameters
• Autovacuum launcher
− Starts Autovacuum workers as needed
• Autovacuum workers
− Recover free space for reuse



More Utility Processes

• Logging collector
− Routes log messages to syslog, eventlog, or log files
• Stats collector
− Collects usage statistics by relation and block
• Archiver
− Archives write-ahead log files



Postmaster as Listener

• Postmaster is the master process, called postgres
• Listens on one and only one TCP port
• Receives client connection requests

[Diagram labels: Client requests a connection; Postmaster; Shared Memory]


Disk Read Buffering

• PostgreSQL buffer cache (shared_buffers) reduces physical reads from storage.
• Read the block once, then examine it many times in cache.

[Diagram labels: postgres backends; Shared (data) Buffers; Stable Database]


Disk Write Buffering

• Blocks are written to disk only when needed:
− To make room for new blocks
− At checkpoint time

[Diagram labels: postgres backends; Shared (data) Buffers; CHECKPOINT; Stable Database]


Background Writer Cleaning Scan

• Background writer scan attempts to ensure an adequate supply of clean buffers.
• bgwriter writes dirty blocks to storage as needed.

[Diagram labels: postgres backends; Shared (data) Buffers; BGWRITER; Stable Database]


Write Ahead Logging

• Backend processes write data to the Write Ahead Logging (WAL) buffers.
• WAL buffers are flushed periodically (WAL writer), on commit, or when the buffers are full.
• Group commit.

[Diagram labels: postgres backends; Shared (data) Buffers; WAL Buffer; Transaction Log; Stable Database]


Transaction Log Archiving

• Archiver spawns a task to copy pg_xlog log files to the archive location when full.

[Diagram labels: postgres backends; Shared (data) Buffers; WAL Buffer; Transaction Log; ARCHIVE COMMAND; Stable Database]


Commit and Checkpoint

• Before Commit
− Uncommitted updates are in memory.
• After Commit
− Committed updates written from shared memory to WAL log file (on disk).
• After Checkpoint
− Modified data pages are written from shared memory to the data files.
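
The three states above can be illustrated with a toy model. This is only a sketch: the ToyCluster class and its fields are invented for illustration and do not correspond to real PostgreSQL structures, which are far richer.

```python
# Toy model only: names and structures here are hypothetical.
class ToyCluster:
    def __init__(self):
        self.shared_buffers = {}   # page id -> value held in memory
        self.dirty = set()         # pages changed since the last checkpoint
        self.wal_buffer = []       # WAL records not yet flushed
        self.wal_on_disk = []      # WAL records flushed to the log file
        self.data_files = {}       # page id -> value stored in data files

    def update(self, page, value):
        """Before commit: the change exists only in memory."""
        self.shared_buffers[page] = value
        self.dirty.add(page)
        self.wal_buffer.append((page, value))

    def commit(self):
        """After commit: WAL records are on disk; data files are untouched."""
        self.wal_on_disk.extend(self.wal_buffer)
        self.wal_buffer.clear()

    def checkpoint(self):
        """After checkpoint: modified pages are written to the data files."""
        for page in self.dirty:
            self.data_files[page] = self.shared_buffers[page]
        self.dirty.clear()


cluster = ToyCluster()
cluster.update("page-1", "new row")
state_before_commit = (list(cluster.wal_on_disk), dict(cluster.data_files))
cluster.commit()
state_after_commit = (list(cluster.wal_on_disk), dict(cluster.data_files))
cluster.checkpoint()
state_after_checkpoint = (list(cluster.wal_on_disk), dict(cluster.data_files))
```

In this model a committed change is durable through the log alone, which is why the data files can lag behind until the next checkpoint.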



Database Cluster

• A Cluster is a collection of databases managed by a single server instance


• Each Cluster has a separate
− Data directory
− TCP port
− Set of processes
• A Cluster can contain multiple databases



Installation Directory Layout

• bin – Binary files
• include – Header files
• lib – Library files
• doc – Help files for different components, contrib modules and extensions
• share – Extensions and sample config files
• scripts – Some useful script files
• stackbuilder – Installation directory for StackBuilder

[Diagram labels: /opt/PostgresPlus/9.4AS containing bin, include, lib, doc, share, scripts, stackbuilderplus, pg_env.sh, uninstall-* files]

Database Cluster Data Directory Layout

• global – Cluster-wide database objects
• base – Contains databases
• pg_tblspc – Symbolic links to tablespaces
• pg_xlog – Write ahead logs
• pg_log – Error logs
• pg_clog, pg_multixact, pg_snapshots, pg_stat, pg_subtrans, pg_notify, pg_serial, pg_replslot, pg_logical, pg_dynshmem – Multiple directories containing different status data
• postgresql.conf, pg_hba.conf, pg_ident.conf – Server configuration files
• postmaster.pid, postmaster.opts – Postmaster information files

[Diagram labels: data directory shown as /data]


Data File Storage Internals

• File-per-table, file-per-index.
• A table-space is a directory.
• Each database that uses that table-space gets a subdirectory.
• Each relation using that tablespace/database combination gets one or
more files, in 1GB chunks.
• Additional files used to hold auxiliary information (free space map, visibility
map).
• Each file name is a number (see pg_class.relfilenode).
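
The file naming described above can be sketched as follows. The sketch assumes a relation in the default tablespace, whose files live under base/<database OID>/ in the data directory, and it ignores the _fsm and _vm files and edge cases at exact segment boundaries. The function name and the OIDs are illustrative.

```python
import math

SEGMENT_BYTES = 1024 ** 3  # relations are split into 1 GB segment files


def relation_files(db_oid, relfilenode, size_bytes):
    """List main data file paths, relative to the data directory (sketch)."""
    segments = max(1, math.ceil(size_bytes / SEGMENT_BYTES))
    paths = []
    for n in range(segments):
        # The first segment has no suffix; later ones are .1, .2, ...
        suffix = "" if n == 0 else f".{n}"
        paths.append(f"base/{db_oid}/{relfilenode}{suffix}")
    return paths


# Hypothetical OIDs chosen for illustration.
print(relation_files(14307, 16700, int(2.5 * SEGMENT_BYTES)))
```

On a live server, the function pg_relation_filepath('tablename') reports the actual path of the first segment.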



Free Space and Visibility Map Files

• Each Relation has a free space map


− Stores information about free space available in the relation
− File named as filenode number plus the suffix _fsm
• Tables also have a visibility map
− Track which pages are known to have no dead tuples
− Stored in a fork with the suffix _vm

Sample: Data Directory Layout

[Diagram: sample tree under /data. The base directory contains subdirectories named by database OID, each holding relation data files named by number. The pg_tblspc directory contains entries named by tablespace OID that point to an external location, shown as /storage1/pgtab.]


Lab Exercise

1. Write a command to display a list of all PostgreSQL related processes on your operating system
2. Write a SQL statement to find the data file name and location for the 'customers' table


Summary

• In this module we covered:


− PostgreSQL System Architecture:
− Process
− Memory
− Storage
− Connection Request and Response
− Write Ahead Logging



3. Installation



Objectives

• In this module we will cover:


− OS User & Permissions
− Supported Platforms & Locales
− Installation Wizard
− Text Mode Installation
− Stackbuilder Plus
− Database Clusters
− Starting and Stopping the Server
− Connecting to a database using psql



OS User and Permissions

• PostgreSQL runs as a daemon (Unix / Linux) or service (Windows).


• Installation requires superuser/admin access.
• All PostgreSQL processes and data files must be owned by a user in the
OS.
− The OS user is unrelated to database user accounts.
− For security reasons, the OS user must not be root or an administrative account.
• SELinux Permissions
− SELinux must be set to permissive mode on systems with SELinux.



Supported Platforms

• PostgreSQL is available for Windows, Linux and Solaris systems


• Windows:
− Windows Server 2008 R1 (32 and 64 bit)
− Windows Server 2008 R2 (64 bit)
− Windows 2012 (64-bit)
• Linux platforms:
− CentOS 6.x and 7.x (64-bit)
− Red Hat Enterprise Linux 6.x and 7.x (64-bit)
− Ubuntu 14.04 LTS
− SLES 11.x



Supported Locales

• PostgreSQL has been tested and certified for the following locales:
− en_US United States English
− zh_HK Traditional Chinese with Hong Kong SCS
− zh_TW Traditional Chinese for Taiwan
− zh_CN Simplified Chinese
− ja_JP Japanese
− ko_KR Korean



Text Mode Installation

• Run the installer in text mode:


# ./postgresql.9.4-linux.run --mode text

• Choose the language using a number [1-5]

Please select the installation language


[1] English - English
[2] Japanese - 日本語
[3] Simplified Chinese - 简体中文
[4] Traditional Chinese - 繁体中文
[5] Korean - 한국어
Please choose an option [1] :



Text Mode Installation Steps

• Installer will prompt for different steps:


− License Agreement
− User Authentication
− Installation Directory
− Components
− Data and WAL Directory
− Configuration Mode
− Database superuser password and Locale



StackBuilder Plus
• Simplifies the process of downloading and installing modules
• StackBuilder requires Internet access
• Run StackBuilder using the StackBuilder menu option from the PostgreSQL 9.4 menu



Setting Environment Variables

• Setting environment variables is very important for trouble-free startup/shutdown of the database server.
− PATH – should point to the correct bin directory.
− PGDATA – should point to the correct data cluster directory.
− PGPORT – should point to the port on which the database cluster is running.
− Edit .bash_profile to set the variables.
− In Windows, set these variables using the My Computer properties page.



Clusters

• Each instance of Advanced Server is referred to as a “cluster”


• A cluster consists of a data directory that contains all data and configuration files
• Referred to in two ways
− Location of the data directory
− Port number
• A single server can have many installations and you can create multiple
clusters using initdb.



Starting and Stopping the Server

• Controlling a Service on Linux


− service service_name action
− /etc/init.d/service_name action
• pg_ctl utility command can also be used for starting and stopping a
database cluster
• pg_ctl will be covered later in this training



Starting and Stopping the Server (Windows)

• Use Services Utility to start/stop PostgreSQL server


• The Services Utility can be accessed through the Windows Control Panel
• Other OS-dependent service start methods can be used as needed.



Connecting to Server using edb-psql

• psql is a command line interface for PostgreSQL


• Syntax:
− psql [OPTIONS]... [DBNAME [USERNAME]]

• The database to connect to may also be specified using the -d DBNAME option
• The user to connect as may be specified with -U
• In a psql session the connection may be changed by using
− \c[onnect] [DBNAME [USERNAME]]

• DBNAME and USERNAME default to the operating system user



Summary

• In this module we covered:


− OS User & Permissions
− Supported Platforms & Locales
− Installation Wizard
− Text Mode Installation
− Stackbuilder Plus
− Database Clusters
− Starting and Stopping the Server
− Connecting to a database using psql



8. Configuration



Objectives

• In this module we will cover:


− Server Configuration File
− Setting Server Parameters
− Connection Settings
− Security and Authentication Settings
− Memory Settings
− Query Planner Settings
− Log Management
− Background Writer Settings
− Statement Behavior
− Autovacuum Settings



Server Configuration File

• There are many configuration parameters that affect the behavior of the database system.
• Server config file postgresql.conf stores all the server parameters.
• All parameter names are case-insensitive.
• Some parameters require a restart.
• Query to list all parameters:
− SELECT name, setting FROM pg_settings;
• Query to list all parameters requiring a server restart:
− SELECT name FROM pg_settings WHERE context = 'postmaster';
• One way to set these parameters is to edit the file postgresql.conf, which is normally kept in the data directory.
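
To make the file format concrete, here is a minimal sketch of how name = value lines in postgresql.conf could be read. It is not the server's own parser: it only handles # comments, an optional equals sign, and simple single-quoted values, and it lower-cases names to reflect that they are case-insensitive. The function name and the sample values are illustrative.

```python
import re

# A setting line: a name, an optional '=', then the rest of the line.
LINE_RE = re.compile(r"^\s*([A-Za-z_][\w.]*)\s*=?\s*(.*)$")


def parse_conf(text):
    """Parse 'name = value' lines into a dict keyed by lower-cased name."""
    settings = {}
    for raw in text.splitlines():
        match = LINE_RE.match(raw)
        if not match:
            continue  # blank line or comment line
        name, rest = match.group(1).lower(), match.group(2)
        if rest.startswith("'"):
            value = rest[1:].split("'", 1)[0]      # text up to the closing quote
        else:
            value = rest.split("#", 1)[0].strip()  # drop a trailing comment
        settings[name] = value
    return settings


sample = """
# sample fragment (values are illustrative)
Max_Connections = 50        # names are case-insensitive
shared_buffers = 256MB
log_line_prefix = '%t [%p] # '
"""
print(parse_conf(sample))
```

Escaped quotes and include directives are deliberately left out to keep the sketch short.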



Setting Server Parameters

• Some parameters can be changed per session using the SET command.
• Some parameters can be changed at the user level using ALTER USER.
• Some parameters can be changed at the database level using ALTER
DATABASE.
• The SHOW command can be used to see settings.
• The pg_settings view lists settings information.



Connection Settings

• The following parameters in the postgresql.conf file control the connection settings:
− listen_addresses (string)
− Specifies the TCP/IP address(es) on which the server is to listen for connections
− port (integer)
− The TCP port the server listens on; 5432 by default in PostgreSQL (5444 in Postgres Plus Advanced Server)
− max_connections (integer)
− Determines the maximum number of concurrent connections to the database server
− superuser_reserved_connections (integer)
− Determines the number of connection “slots” that are reserved for connections by superusers

Memory Settings
• The following parameters in postgresql.conf control the memory settings:
− shared_buffers (integer)
− Sets the amount of memory the database server uses for shared memory buffers
− temp_buffers (integer)
− Sets the maximum number of temporary buffers used by each database session
− work_mem (default 4MB)
− Specifies the amount of memory to be used by internal sort operations and hash tables before
switching to temporary disk files
− autovacuum_work_mem
− Specifies the maximum amount of memory to be used by each autovacuum worker process
− maintenance_work_mem (default 64MB)
− Specifies the maximum amount of memory to be used in maintenance operations, such as
VACUUM, CREATE INDEX, and ALTER TABLE ADD FOREIGN KEY



Query Planner Settings

• random_page_cost (default 4.0): Estimated cost of a random page fetch,
in abstract cost units. May need to be reduced to account for caching
effects.
• seq_page_cost (default 1.0): Estimated cost of a sequential page fetch,
in abstract cost units. May need to be reduced to account for caching
effects. Must always set random_page_cost >= seq_page_cost.
• effective_cache_size (default 4GB): Used to estimate the cost of an
index scan. Rule of thumb is 75% of system memory.
• There are plenty of enable_* parameters which influence the planner in
choosing an optimal plan.
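
The two rules of thumb on this slide can be written down as simple checks. This is only a sketch: the function names are illustrative, and the 75% figure is the slide's rule of thumb rather than a hard rule.

```python
def suggest_effective_cache_size_mb(system_memory_mb, fraction=0.75):
    """Rule of thumb from the slide: about 75% of system memory."""
    return int(system_memory_mb * fraction)


def page_costs_consistent(random_page_cost, seq_page_cost):
    """random_page_cost should never be set below seq_page_cost."""
    return random_page_cost >= seq_page_cost


print(suggest_effective_cache_size_mb(16 * 1024))  # 12288
print(page_costs_consistent(4.0, 1.0))             # True
print(page_costs_consistent(0.5, 1.0))             # False
```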



Write Ahead Log Settings

• wal_level (default: minimal): Determines how much information is written to the
WAL. Other values are archive, hot_standby and logical.
• fsync (default on): Turn this off to make your database much faster – and silently
cause arbitrary corruption in case of a system crash – limited applicability.
• wal_buffers (default: -1, autotune): The amount of memory used in shared
memory for WAL data. The default setting of -1 selects a size equal to 1/32nd
(about 3%) of shared_buffers.
• checkpoint_segments (default 3): Maximum number of 16MB WAL file
segments between checkpoints. Default is too small!
• checkpoint_timeout (default 5 minutes): Maximum time between checkpoints.



Logging – Where to Log

• log_destination: Valid values are combinations of stderr, csvlog, syslog, and eventlog.
• logging_collector: Enables advanced logging features. csvlog requires
logging_collector.
• log_directory: Directory where log files are written. Default pg_log.
• log_filename: Format of log file name (e.g. enterprisedb-%Y-%m-%d_%H%M%S.log).
• log_rotation_age: Automatically rotate logs after this much time.
Requires logging_collector.
• log_rotation_size: Automatically rotate logs when they get this big.
Requires logging_collector.
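
The log_filename pattern uses strftime-style escapes, so its expansion can be previewed with any strftime implementation. The sketch below uses the pattern from the slide with an arbitrary example timestamp.

```python
from datetime import datetime

# Pattern taken from the slide; the timestamp is an arbitrary example.
pattern = "enterprisedb-%Y-%m-%d_%H%M%S.log"
opened_at = datetime(2015, 9, 21, 14, 5, 9)

print(opened_at.strftime(pattern))  # enterprisedb-2015-09-21_140509.log
```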

Logging - When To Log

• client_min_messages (default NOTICE). Messages of this severity level
or above are sent to the client.
• log_min_messages (default WARNING). Messages of this severity level
or above are written to the server log.
• log_min_error_statement (default ERROR). When a message of this
severity or higher is written to the server log, the statement that caused it is
logged along with it.
• log_min_duration_statement (default -1, disabled): When a statement
runs for at least this long, it is written to the server log, with its duration.



Logging - What To Log

• log_connections (default off): Log successful connections to the server log.


• log_disconnections (default off): Log some information each time a session
disconnects, including the duration of the session.
• log_error_verbosity (default “default”): Can also select “terse” or “verbose”.
• log_duration (default off): Log duration of each statement.
• log_line_prefix: Additional details to log with each line.
• log_statement (default none): Legal values are none, ddl, mod (DDL and all
other data-modifying statements), or all.
• log_temp_files (default -1): Log temporary files of this size or larger, in kilobytes.
• log_checkpoints (default off): Causes checkpoints and restart points to be
logged in the server log.



Background Writer Settings

• bgwriter_delay (default 200 ms): Specifies time between activity rounds
for the background writer.
• bgwriter_lru_maxpages (default 100): Maximum number of pages that
the background writer may clean per activity round.
• bgwriter_lru_multiplier (default 2.0): Multiplier on buffers scanned per
round. By default, if system thinks 10 pages will be needed, it cleans 10 *
bgwriter_lru_multiplier of 2.0 = 20.
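
The interaction between the multiplier and the per-round cap can be sketched as below. This is a simplification: the real background writer estimates upcoming demand from recent buffer allocations, while here the estimate is simply passed in. The function name is illustrative.

```python
def pages_to_clean(estimated_need, lru_multiplier=2.0, lru_maxpages=100):
    """Target number of buffers to clean in one round (simplified)."""
    target = int(estimated_need * lru_multiplier)
    return min(target, lru_maxpages)  # never exceed bgwriter_lru_maxpages


print(pages_to_clean(10))  # 20
print(pages_to_clean(80))  # 100, capped by bgwriter_lru_maxpages
```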



Lab Exercise

1. Open psql and write a statement to change work_mem to 10MB. This
change must persist across server restarts
2. Open psql and write a statement to change work_mem to 20MB for the
current session
3. Open psql and write a statement to change work_mem to 1 MB for the
postgres user
4. Write a query to list all parameters requiring a server restart
5. Take a backup of the postgresql.conf file



Lab Exercise

6. Open the configuration file for your Postgres database cluster and make
the following changes
− Maximum allowed connections to 50
− Authentication time to 10 mins
− Shared buffers to 256 MB
− work_mem to 10 MB
− wal_buffers to 8MB
7. Restart the server and verify the changes made in previous step



Summary

• In this module we covered:


− Setting PostgresPlus Advanced Server Parameters
− Access Control
− Connection Settings
− Security and Authentication Settings
− Memory Settings
− Query Planner Settings
− Log Management
− Background Writer Settings
− Statement Behaviour
− Autovacuum Settings



PSQL



Objectives

• In this module we will cover:


− Client Interface edb-psql
− Connecting to a Database
− edb-psql Meta Commands
− Executing SQL Commands
− Advanced Features



PSQL

• psql is the name of PostgreSQL's interactive terminal executable.

• It enables you to type in queries interactively, issue them to PostgreSQL,
and see the query results.
− psql [option...] [dbname [username]]



Connecting to a Database

• In order to connect to a database you need to know the name of your
target database, the host name and port number of the server and what
user name you want to connect as.
• Supply these with the command line options -d, -h, -p, and -U respectively.
• Alternatively, set the environment variables PGDATABASE, PGHOST,
PGPORT and PGUSER to appropriate values.



Conventions

• psql has its own set of commands, all of which start with a backslash (\).
These are in no way related to SQL commands, which operate in the
server. psql commands only affect psql.
• Some commands accept a pattern. This pattern is a modified regex. Key
points:
− * and ? are wildcards
− Double-quotes are used to specify an exact name, ignoring all special characters and
preserving case
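
The pattern rules above can be illustrated by translating a pattern into an ordinary regular expression. This is a simplified sketch, not psql's implementation: it ignores schema-qualified patterns, doubled quotes, and regex metacharacters that psql passes through, and it assumes unquoted text is folded to lower case. The function name and sample names are illustrative.

```python
import re


def pattern_to_regex(pattern):
    """Translate a psql-style name pattern into an anchored regex (sketch)."""
    out = []
    in_quotes = False
    for ch in pattern:
        if ch == '"':
            in_quotes = not in_quotes          # quotes toggle literal mode
        elif in_quotes:
            out.append(re.escape(ch))          # exact character, case preserved
        elif ch == "*":
            out.append(".*")                   # any sequence of characters
        elif ch == "?":
            out.append(".")                    # any single character
        else:
            out.append(re.escape(ch.lower()))  # unquoted text folds to lower case
    return "^(" + "".join(out) + ")$"


names = ["emp", "employees", "Emp", "dept"]
print([n for n in names if re.match(pattern_to_regex("EMP*"), n)])
print([n for n in names if re.match(pattern_to_regex('"Emp"'), n)])
```

The first pattern matches emp and employees; the quoted pattern matches only the mixed-case name Emp.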



On Startup...

• psql will execute commands from $HOME/.psqlrc, unless option -X is
specified.
• -f FILENAME will execute the commands in FILENAME, then exit
• -c COMMAND will execute COMMAND (SQL or internal) and then exit
• --help will display all the startup options, then exit
• --version will display version info and then exit



Entering Commands
• psql uses the command line editing capabilities that are available in the
native OS. Generally, this means:
− Up and Down arrows cycle through command history
− On UNIX, there is tab completion for various things, such as SQL commands and, to a more
limited degree, table and field names
− Line editing and command history are disabled with -n

• \s will show the command history


• \s FILENAME will save the command history
• \e will edit the query buffer and then execute it
• \e FILENAME will edit FILENAME and then execute it
• \w FILENAME will save the query buffer to FILENAME



Controlling Output

• There are numerous ways to control output.


• The most important are:
− -o FILENAME or \o FILENAME will send query output
(excluding STDERR) to FILENAME (which may be a pipe)
− \g FILENAME executes the query buffer,
sending output to FILENAME (may be a pipe)
− -q runs quietly.



Advanced Features: Variables

• psql provides variable substitution


• Variables are simply name/value pairs
• Use \set meta command to set a variable
postgres=# \set city Edmonton
postgres=# \echo :city
Edmonton
• Use \unset to delete a variable
postgres=# \unset city



Information Commands

• \d(i, s, t, v, S)[+] [pattern]
− Lists information about indexes, sequences, tables, views or system objects
− Any combination of letters may be used in any order, e.g.: \dvs
− + displays comments
• \d[+] [pattern]
− For each relation describe/display the relation structure details
− + displays any comments associated with the columns of the table,
and if the table has an OID column
• Without a pattern, \d[+] is equivalent to \dtvs[+]



Information Commands (cont)

• \l[ist][+]
− Lists the names, owners, and character set encodings of all the databases in the
server
− If + is appended to the command name, database descriptions are also displayed
• \dn[+] [pattern]
− Lists schemas (namespaces)
− + adds permissions and description to output
• \df[+] [pattern]
− Lists functions
− + adds owner, language, source code and description to output



Other Common psql Meta Commands

• \conninfo
− Current connection information
• \q or ^d
− Quits the edb-psql program
• \cd [ directory ]
− Change current working directory
− Tip: To print your current working directory, use
\! pwd
• \! [ command ]
− Executes the specified command
− If no command is specified, escapes to a separate Unix shell (CMD.EXE in
Windows)



Help

• \?
− Shows help information about edb-psql commands
• \h [command]
− Shows information about SQL commands
− If a command isn't specified, lists all SQL commands
• psql --help
− Lists command line options for edb-psql



Lab Exercise

1. Open a psql terminal and connect to the edb database
2. Write a command to display the list of:
− Databases
− Users
− Tablespaces
− Tables
3. Create a SQL script with following content:
select * from customers limit 10;
4. Execute the above SQL script using psql and store the results in a file



Summary

• In this module we covered:


− Client Interface edb-psql
− Connecting to a Database
− psql Meta Commands
− Executing SQL Commands



Creating and Managing Databases



Objectives

• In this module we will cover:


− Object Hierarchy
− Creating Databases
− Creating Schemas
− Schema Search Path
− Creating Database Objects



Object Hierarchy

[Diagram: a Cluster contains Users/Groups, Databases, and Tablespaces. Databases contain Schemas. Schemas contain Tables, Views, Sequences, Synonyms, Domains, and Functions.]



Database

• A running PostgreSQL server can manage multiple databases.


• A database is a named collection of SQL objects. It is a collection of
schemas and the schemas contain the tables, functions, etc.
• Databases are created with the CREATE DATABASE command.
• Databases are destroyed with the DROP DATABASE command.
• To determine the set of existing databases:
− SQL: SELECT datname FROM pg_database;
− PSQL META COMMAND: \l (backslash lowercase L)



Creating Databases

• There is a program that you can execute from the shell to create new
databases, createdb.
− createdb dbname
• Create Database command can be used to create a database in a cluster.
− Syntax:
CREATE DATABASE name
[ [ WITH ] [ OWNER [=] dbowner ]
[ TEMPLATE [=] template ]
[ ENCODING [=] encoding ]
[ TABLESPACE [=] tablespace ]
[ CONNECTION LIMIT [=] connlimit ] ]



Example – Creating Databases



Accessing a Database

• The psql tool allows you to interactively enter, edit, and execute SQL
commands.
• PGAdmin-III GUI tool can also be used to access a database.
• Let's use psql to access a database:
− Open a command prompt or terminal.
− If PATH is not set, you can execute the next command from the bin directory of the
postgres installation
− $ psql -U postgres db
− db=#



What is a Schema

[Diagram: a USER owns a SCHEMA, which contains Tables, Views, Sequences, Synonyms, Functions, Procedures, Packages, and Domains.]


Benefits of Schemas

• A database can contain one or more named schemas.
• By default, every database contains a public schema.
• There are several reasons why one might want to use schemas:
− To allow many users to use one database without interfering with each other.
− To organize database objects into logical groups to make them more manageable.
− Third-party applications can be put into separate schemas so they cannot collide with
the names of other objects.



Creating Schemas



Schema Search Path

• Qualified names are tedious to write, so we use table names directly in
queries.
• If no schema name is given, the schema search path determines which
schemas are searched for matching table names.
− E.g.:
SELECT * FROM emp
− This statement will find the first emp table in the schemas listed in the search path.



Schema Search Path

• The first schema named in the search path is called the current schema if
that named schema exists.
• Aside from being the first schema searched, it is also the schema in which
new tables will be created if the CREATE TABLE command does not
specify a schema name.
• To show the current search path, use the following command:
− SHOW search_path;
• To put our new schema in the path, we use:
− SET search_path TO myschema, public;
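
The effect of the search path on unqualified names can be sketched as follows. This assumes the schema myschema already exists; demo_tab and app_user are hypothetical names used only for illustration.

```sql
-- Show the search path in effect for this session.
SHOW search_path;

-- Put myschema first for the rest of this session.
SET search_path TO myschema, public;

-- The first existing schema in the path is the current schema.
SELECT current_schema();

-- An unqualified CREATE TABLE now creates the table in the current schema
-- (demo_tab is a hypothetical table name).
CREATE TABLE demo_tab (id integer);

-- Make the setting the default for new sessions of one role
-- (app_user is a hypothetical role that must already exist).
ALTER ROLE app_user SET search_path = myschema, public;
```

SET affects only the current session, while ALTER ROLE ... SET takes effect the next time that role connects.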



Listing Schemas and search_path

• \dn can be used to list all the schemas in a database.
• Use the SHOW command to view the current search path.



Database Objects

• Database schemas can contain different types of objects, including:
− Tables
− Sequences
− Views
− Synonyms
− Domains
− Packages
− Functions
− Procedures



Lab Exercise

1. Write a SQL query to view name and size of all the databases in your
data cluster
2. In a previous module you learned how to create a database user, now
create a database user named mgr1
3. Create a new database mgrdb with owner mgr1
4. Create a schema mgr1 in database mgrdb. This schema must be owned
by mgr1



Lab Exercise

6. Open psql and connect to mgrdb using mgr1 user and find the result of
the following query
select * from mgrtab;
The above statement should run successfully with 0 rows returned
7. In the lab exercise of a previous module you have added user Irena. Set
the proper search_path for user Irena so that the edbstore schema
objects can be accessed without use of fully qualified table names
8. Connect to the edbstore database using the Irena user and verify if you
can access orders table without a fully qualified table name



Summary

• In this module we covered:
− Object Hierarchy
− Creating Databases
− Creating Schemas
− Schema Search Path



Security



Objectives

• In this module we will cover:
− Authentication and Authorization
− Levels of security
− Implementing Host Based Access Control



Authentication and Authorization

• Secure access is a two-step process:
− Authentication
− Ensures a user is who he/she claims to be.
− Authorization
− Ensures an authenticated user has access to only the data for which he/she
has been granted the appropriate privileges.



Levels of Security

• Server and Application
− Check client IP
− pg_hba.conf
• Database
− User/Password
− Connect privilege
− Schema permissions
• Object
− Table-level privileges
− Grant/Revoke



pg_hba.conf – Access control

• Host-based access control file
• Located in the cluster data directory
• Read at startup; any change requires a reload
• Contains a set of records, one per line
• Each record specifies a connection type, database name, user name, client IP
address, and authentication method
• Records are read top to bottom; the first matching record is used
• Hostnames, IPv6, and IPv4 addresses are supported
• Authentication methods: trust, reject, md5, password, gss, sspi, krb5, ident,
peer, pam, ldap, radius, or cert.
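
As a small sketch, the location of the file and the reload step can both be handled from a psql session, assuming a superuser connection:

```sql
-- Locate the pg_hba.conf file the running server is using.
SHOW hba_file;

-- After editing the file, ask the server to re-read its configuration files.
-- Requires superuser privileges; returns true if the reload signal was sent.
SELECT pg_reload_conf();
```

A reload does not disconnect existing sessions; the new rules apply to connections made afterwards.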



pg_hba.conf Example



User/Password and Privileges

• Passwords are stored in the pg_authid system catalog (exposed through the
pg_shadow view).
• The user-supplied password must match the password stored on the database
side.
• The trust authentication method in pg_hba.conf bypasses the password check and thus is not secure.
• DBAs must have OS privileges to create and delete files.
• Typically other database users should not have any OS level privilege.
• Always revoke CONNECT privilege on a database from public to block
normal user access to the database.
• Use GRANT to grant CONNECT privilege to the users who are allowed to
connect to a database.
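
The last two points can be sketched as follows, using the edbstore database and the irena user from the lab exercises as example names. Superusers and the database owner can still connect after the REVOKE.

```sql
-- Remove the default ability of every role to connect to the database.
REVOKE CONNECT ON DATABASE edbstore FROM PUBLIC;

-- Allow one specific user to connect.
GRANT CONNECT ON DATABASE edbstore TO irena;

-- Check the result for that user.
SELECT has_database_privilege('irena', 'edbstore', 'CONNECT');
```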



Authentication Problems

FATAL: no pg_hba.conf entry for host "192.168.10.23", user "edbuser", database "edbstore"
FATAL: password authentication failed for user "edbuser"
FATAL: user "edbuser" does not exist
FATAL: database "edbstore" does not exist

• A self-explanatory message is displayed
• Verify the database name, user name, and client IP in pg_hba.conf
• Reload the cluster after changing pg_hba.conf
• Check the server log for more information



Object Ownership

[Diagram: ownership hierarchy]
• Cluster
− Users/Groups, Databases, Tablespaces
− Databases contain Schemas
− Schemas contain Tables, Views, Sequences, Synonyms, Domains, Functions

Lab Exercise

1. User Irena is trying to connect to the edbstore database. She emailed
you the error message she is getting while trying to connect:
FATAL: no pg_hba.conf entry for host "192.168.10.23", user
"irena", database "edbstore"

Solve the above issue so that she can connect successfully.



Lab Exercise

2. You have decided to log all the connections on your data cluster.
Configure your postgresql.conf settings so that all the successful as well
as unsuccessful connections are logged in the error log file
3. Connect to edbstore database
4. Verify if the connections are logged or not



Summary

• In this module we covered:
− Authentication & Authorization
− Levels of security
− Implementing Host Based Access Control



Tablespaces



Objectives

• In this module we will cover:
− Tablespaces and Datafiles
− pg_global and pg_default
− Advantages of Tablespaces
− Creating Tablespaces
− Changing Default Tablespace
− Usage Example
− Altering Tablespaces
− Dropping Tablespaces



Tablespaces and Datafiles

• Data is stored logically in tablespaces and physically in data files
• Tablespaces:
− Can belong to only one database cluster
− Consist of multiple data files
− Can be used by multiple databases
• Data Files:
− Can belong to only one tablespace
− Are used to store database objects
− Cannot be shared by multiple tables (one or more per table)



Advantages of Tablespaces

• Control the disk layout for a Database Cluster.
• Heavily used indexes can be placed on fast media using tablespaces.
• Historical tables can be placed on slow and cheaper media.

[Diagram: Fast Media and Slow Media]



Postgres Pre-Configured Tablespaces

• The pg_global tablespace corresponds to the PGDATA/global directory.
• pg_global is used to store cluster-wide tables and shared system
catalogs.
• pg_default tablespace corresponds to PGDATA/base directory.
• pg_default is used to store databases and relations.



Creating Tablespaces

• Each user-defined tablespace has a symbolic link inside the PGDATA/pg_tblspc
directory.
• This symbolic link is named after the tablespace's OID.
• The tablespace directory contains a subdirectory named after the PPAS version used
to build the database cluster.
• Each database has a separate subdirectory.
• Tablespaces can be created using the CREATE TABLESPACE command.
• The tablespace directory must be created and appropriate permissions must be set
prior to running the CREATE TABLESPACE command.

CREATE TABLESPACE tablespace_name [ OWNER user_name ] LOCATION 'directory'
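
A minimal sketch of the steps above. The tablespace name fastspace, the path /mnt/fastdisk/pg_tblspc, and the hot_orders table are hypothetical; the directory is assumed to exist already, to be empty, and to be owned by the operating system user that runs the server.

```sql
-- Register the (already created) directory as a tablespace.
-- '/mnt/fastdisk/pg_tblspc' is a hypothetical path.
CREATE TABLESPACE fastspace LOCATION '/mnt/fastdisk/pg_tblspc';

-- Place a new table and one of its indexes in the tablespace
-- (hot_orders is a hypothetical table).
CREATE TABLE hot_orders (id integer, placed_on date) TABLESPACE fastspace;
CREATE INDEX hot_orders_id_idx ON hot_orders (id) TABLESPACE fastspace;

-- List all tablespaces with their OIDs and on-disk locations.
SELECT oid, spcname, pg_tablespace_location(oid) AS location
FROM pg_tablespace;
```

The OID returned by the last query is the name of the symbolic link under PGDATA/pg_tblspc.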



Default and Temp Tablespace

• The default_tablespace server parameter sets the default tablespace.
• default_tablespace parameter can also be set using SET command at
session level.
• temp_tablespaces parameter determines the placement of temporary
tables and indexes and temporary files.
• temp_tablespaces can be a list of tablespace names.
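
A session-level sketch of both parameters. The tablespaces fastspace, tempspace1, and tempspace2 are hypothetical and assumed to exist already; t_in_fast is an illustrative table name.

```sql
-- New objects created without a TABLESPACE clause now go to fastspace.
SET default_tablespace = fastspace;
CREATE TABLE t_in_fast (id integer);

-- Spread temporary tables, their indexes, and temporary sort files
-- across two tablespaces.
SET temp_tablespaces = tempspace1, tempspace2;

-- Inspect the current values, then return to the server default.
SHOW default_tablespace;
SHOW temp_tablespaces;
RESET default_tablespace;
```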



Altering Tablespaces

• ALTER TABLESPACE is used to change the definition of a tablespace.
• ALTER TABLESPACE can be used to rename a tablespace, change
ownership, move objects to other tablespaces and set custom values for a
configuration parameter.
• Only the owner or a superuser can alter a tablespace.
• seq_page_cost and random_page_cost parameters can be altered
for a tablespace.
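
The operations above might look like this. All names (fastspace, ssdspace, archivespace, app_owner, hot_orders) are hypothetical and assumed to exist, and the cost value is only an example, not a recommendation.

```sql
-- Rename a tablespace and change its owner.
ALTER TABLESPACE fastspace RENAME TO ssdspace;
ALTER TABLESPACE ssdspace OWNER TO app_owner;

-- Tell the planner that random reads are cheap on this tablespace.
ALTER TABLESPACE ssdspace SET (random_page_cost = 1.1);

-- Go back to the server-wide setting.
ALTER TABLESPACE ssdspace RESET (random_page_cost);

-- Move a single table to another tablespace.
-- This rewrites the table and locks it while the move runs.
ALTER TABLE hot_orders SET TABLESPACE archivespace;
```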



Dropping Tablespace

• DROP TABLESPACE removes a tablespace from the system
• Only the owner or a superuser can drop a tablespace
• The tablespace must be empty
• If the tablespace is listed in the temp_tablespaces parameter, make sure
current sessions are not using the tablespace.
• DROP TABLESPACE cannot be executed inside a transaction block



Lab Exercise

1. Write a statement or meta command in psql to view the list of all
tablespaces
2. You need to create a new tablespace labtab for the labapp application.
Start with creation of new directory /home/postgres/labtab and then
create the required tablespace
3. Create a table labtest in the edb database with tablespace labtab
4. Verify the location of the newly created table labtest
5. Move the labtest table to the pg_default tablespace
6. Remove labtab tablespace and the associated directory



Summary

• In this module we covered:
− Tablespaces and Datafiles
− pg_global and pg_default
− Advantages of Tablespaces
− Creating Tablespaces
− Changing Default Tablespace
− Usage Example
− Altering Tablespaces
− Dropping Tablespaces



Proactive Maintenance



Objectives

• In this module we will cover:
- Query Plan
- Table Statistics
- Updating Planner Statistics
- Vacuuming
- Scheduling Auto Vacuum
- Preventing Transaction ID Wraparound Failures
- The Visibility Map
- Routine Reindexing



Query Plan

• The Postgres planner is responsible for generating an execution plan for a query.
• The Postgres optimizer determines the most efficient execution plan for a query.
• Optimization is cost based; cost is the estimated resource usage of a plan.
• EXPLAIN query - displays an execution plan for a SQL statement without actually
executing the statement.
• An execution plan shows the detailed steps necessary to execute a SQL
statement.



Sample Explain Output

• Example
edb=# explain select * from emp;
QUERY PLAN
------------------------------------------------------
Seq Scan on emp (cost=0.00..1.14 rows=14 width=135)
• The numbers that are quoted by EXPLAIN are:
- Estimated start-up cost
- Estimated total cost
- Estimated number of rows output by this plan node
- Estimated average width (in bytes) of rows output by this plan node
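
To compare the estimates with what actually happens, EXPLAIN ANALYZE can be used. This sketch assumes the emp table from the example above; the empno column and its value are assumptions used only for illustration.

```sql
-- Estimates only; the statement is not executed.
EXPLAIN SELECT * FROM emp;

-- Executes the statement and reports actual row counts and timings
-- next to the estimates.
EXPLAIN ANALYZE SELECT * FROM emp;

-- EXPLAIN ANALYZE really runs the statement, so wrap data-modifying
-- statements in a transaction and roll it back.
-- (empno is a hypothetical column of emp.)
BEGIN;
EXPLAIN ANALYZE DELETE FROM emp WHERE empno = 7369;
ROLLBACK;
```

A large gap between estimated and actual row counts usually points to stale table statistics.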



Table Statistics

• The Postgres optimizer and planner use table statistics to generate query plans.
• The chosen query plans are only as good as the table statistics.
• Table statistics:
- Table statistics store the total number of rows in each table and index, as
well as the number of disk blocks occupied by each table and index.
- The pg_class system catalog contains reltuples and relpages, which hold
these statistics for each table and index in a database.



Updating Planner Statistics

• Table Statistics
- Are not updated in real time
- Can be updated using ANALYZE command
- Stored in pg_class and pg_statistic
- You can run the ANALYZE command from edb-psql on specific tables and even
specific columns
- Autovacuum will run ANALYZE as configured
- Syntax for ANALYZE
- ANALYZE [ table_name [ ( column_name [, ...] ) ] ]
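
A short sketch of refreshing and then inspecting the statistics, assuming the emp table used earlier; the ename column is an assumption made for illustration.

```sql
-- Refresh statistics for the whole table.
ANALYZE emp;

-- Refresh statistics for a single column only
-- (ename is a hypothetical column of emp).
ANALYZE emp (ename);

-- Table-level statistics kept in pg_class.
SELECT relname, reltuples, relpages
FROM pg_class
WHERE relname = 'emp';

-- Column-level statistics, shown through the readable pg_stats view.
SELECT attname, null_frac, n_distinct
FROM pg_stats
WHERE tablename = 'emp';
```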



Routine Vacuuming

• Regular maintenance is critical for any running database.
• An update or delete of a row does not immediately remove the old row version
from the disk page.
• Eventually the old row version becomes obsolete, and its space can be removed or reused.
• Vacuuming and auto-vacuuming perform this maintenance task.



Vacuuming

• The VACUUM command, when executed, can:
- Recover or reuse disk space occupied by obsolete rows
- Update data statistics (when run with the ANALYZE option)
- Update the visibility map, which speeds up index-only scans
- Protect against loss of very old data due to transaction ID wraparound

• The VACUUM command has two variants:
- VACUUM
- VACUUM FULL



Vacuuming

• VACUUM
- Removes dead rows and marks the space available for future reuse.
- Generally does not return the space to the operating system.
- Space is returned only if the obsolete rows are at the end of the table.

• VACUUM FULL
- More aggressive algorithm compared to VACUUM
- Compacts tables by writing a complete new version of the table file with no
dead space.
- Takes more time and holds an exclusive lock on the table while it runs.
- Requires extra disk space for the new copy of the table, until the operation
completes.
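
The two variants can be tried as follows, again assuming the emp table as an example target:

```sql
-- Plain VACUUM: marks dead row space as reusable, does not block
-- normal reads and writes.
VACUUM emp;

-- Vacuum, refresh planner statistics, and print a progress report.
VACUUM (VERBOSE, ANALYZE) emp;

-- Rewrite the table into a new compact file. Blocks all access to
-- the table while it runs, so schedule it carefully.
VACUUM FULL emp;

-- See which tables currently carry the most dead rows.
SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
```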



Autovacuum

• Postgres has an optional but highly recommended feature called
autovacuum.
• It automates the execution of VACUUM and ANALYZE commands.
• When enabled, autovacuum checks for tables that have had a large
number of inserted, updated or deleted tuples and executes VACUUM
and/or ANALYZE as needed
• It is controlled by the thresholds and scale factors which are taken from
postgresql.conf
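
As a rough sketch, autovacuum vacuums a table once its dead tuples exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor × (number of rows). The query below estimates that trigger point from the server-wide settings; it deliberately ignores any per-table storage parameter overrides, and the alias vacuum_trigger_point is only an illustrative name.

```sql
-- Is autovacuum enabled at all?
SHOW autovacuum;

-- Compare current dead tuples with the estimated trigger point per table.
SELECT s.relname,
       s.n_dead_tup,
       current_setting('autovacuum_vacuum_threshold')::integer
         + current_setting('autovacuum_vacuum_scale_factor')::float8 * c.reltuples
         AS vacuum_trigger_point
FROM pg_stat_user_tables AS s
JOIN pg_class AS c ON c.oid = s.relid
ORDER BY s.n_dead_tup DESC;
```

With the default settings (threshold 50, scale factor 0.2), a table of 1,000 rows is vacuumed after roughly 250 dead tuples.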



Preventing Transaction ID Wraparound Failures

• MVCC transaction semantics depend on being able to compare
transaction ID (XID) numbers: a row version with an insertion XID greater
than the current transaction's XID is "in the future" and should not be
visible to the current transaction.
• Since transaction IDs have limited size (32 bits at this writing) a cluster that
runs for a long time (more than 4 billion transactions) would suffer
transaction ID wraparound.
• To avoid this, every table in the database must be vacuumed at least once
every two billion transactions.
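
How close a cluster is to the limit can be checked with the age() function. This is a monitoring sketch only; the LIMIT of 10 is an arbitrary choice.

```sql
-- Age, in transactions, of the oldest unfrozen XID in each database.
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;

-- The ten tables in the current database most in need of a freezing vacuum.
SELECT relname, age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind = 'r'
ORDER BY xid_age DESC
LIMIT 10;

-- Autovacuum forces a vacuum on a table once its age passes this value.
SHOW autovacuum_freeze_max_age;
```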



The Visibility Map

• Each heap relation has a visibility map, which keeps track of which pages contain
only tuples that are known to be visible to all active transactions.
• Stored at <relfilenode>_vm.
• Helps VACUUM skip pages that contain no dead rows.
• Can also be used by index-only scans to answer queries.
• The VACUUM command updates the visibility map.
• The visibility map is vastly smaller than the heap, so it can be cached easily.
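
A quick way to see the effect, assuming a server version that has the relallvisible column in pg_class (9.2 or later) and using the emp table as an example:

```sql
-- VACUUM refreshes the visibility map for the table.
VACUUM emp;

-- relallvisible counts the pages marked all-visible; the closer it is
-- to relpages, the more an index-only scan can avoid visiting the heap.
SELECT relname, relpages, relallvisible
FROM pg_class
WHERE relname = 'emp';
```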



Routine Reindexing

• REINDEX rebuilds an index using the data stored in the index's table.
• There are several reasons to use REINDEX:
− An index is corrupted.
− An index has become "bloated", that is, it contains many empty or
nearly empty pages.
− You have altered a storage parameter (such as fillfactor) for an index.
− An index build with the CONCURRENTLY option failed, leaving an "invalid"
index.
• Syntax:
− REINDEX { INDEX | TABLE | DATABASE | SYSTEM } name [ FORCE ]



Lab Exercise

1. Configure a Postgres cluster to log all SQL statements which run for more
than 500ms
2. Write a statement to view the explain plan for any slow running queries
logged from the previous step
3. Write a statement to view total pages and tuples in the orders table



Lab Exercise

4. Write a command to vacuum all tables of the edbstore database
5. Write a command to reindex all the indexes of the orders table
6. Write a command to reindex all the system indexes
7. Write a query to list all the tables which have not been automatically
vacuumed in the last 5 days



Module Summary

• In this module we covered:
- Query Plan
- Table Statistics
- Updating Planner Statistics
- Vacuuming
- Scheduling Auto Vacuum
- Preventing Transaction ID Wraparound Failures
- The Visibility Map
- Routine Reindexing



Backup and Recovery



Objectives

• In this module we will cover:
− Backup Types
− SQL Dump
− Backup using PGADMIN
− Cluster Dump
− Offline Copy Backup
− Continuous Archiving
− Point-In Time Recovery



Backup

• As with any database, Postgres databases should be backed up regularly.
• There are three fundamentally different approaches to backing up
Postgres data:
- SQL dump
- File system level backup
- Continuous Archiving

Let's discuss them in detail.



Backup – SQL Dump

• Generate a text file with SQL commands
• PPAS provides the utility program pg_dump for this purpose.
• pg_dump does not block readers or writers.
• pg_dump does not operate with special permissions.
• Dumps created by pg_dump are internally consistent, that is, the dump
represents a snapshot of the database as of the time pg_dump begins
running.
Syntax:
pg_dump [options] [dbname]



Backup – SQL Dump
pg_dump Options
-a – Data only. Do not dump the data definitions (schema)
-s – Data definitions (schema) only. Do not dump the data
-n <schema> - Dump from the specified schema only
-t <table> - Dump specified table only
-f <path/file name.backup> - Send dump to specified file
-Fp – Dump in plain-text SQL script (default)
-Ft – Dump in tar format
-Fc – Dump in compressed, custom format
-v – Verbose option
-o – Dump object identifiers (OIDs)



Postgres PGAdmin III: Backup

• Open PGADMIN-III
• Connect to the database cluster
• Right click on the database to be backed up
• Click Backup



PGADMIN-III: Backup

• Specify the backup filename
• Choose the backup format
• Choose different Dump Options
• Click the Backup button



SQL Dump – Large Databases

• Some operating systems have maximum file size limits, which cause problems
when creating large pg_dump output files.
• Standard Unix tools can be used to work around this potential problem.
- You can use your favorite compression program, for example gzip:
- pg_dump dbname | gzip > filename.gz
- Also the split command allows you to split the output into smaller files:
- pg_dump dbname | split -b 1m - filename



Restore – SQL Dump

• The text files created by pg_dump are intended to be read in by the psql program.
The general command form to restore a dump is:

psql dbname < infile

• infile is what you used as outfile for the pg_dump command. The database
dbname will not be created by this command, so you must create it yourself.

• pg_restore is used to restore a database backed up with pg_dump that was
saved in an archive format – i.e., a non-text format.
• Files are portable across architectures.
Syntax:
pg_restore [options…] [filename.backup]



Restore – SQL Dump

pg_restore Options
• -d <database name>: Connect to the specified database. Also restores to this
database if the -C option is omitted.
• -C: Create the database named in the dump file & restore directly into it.
• -a: Restore the data only, not the data definitions (schema).
• -s: Restore the data definitions (schema) only, not the data.
• -n <schema>: Restore only objects from specified schema.
• -t <table>: Restore only specified table.
• -v: Verbose option.



PGADMIN-III: Restore

• Right click on any database.
• Click Restore.
• Locate the backup file.
• Select Create Database
option.
• Click Restore button.



PGADMIN-III: Restore



Entire Cluster – SQL Dump

• pg_dumpall is used to dump an entire database cluster in plain-text SQL
format
• Dumps global objects: users, groups, and associated permissions
• Use edb-psql to restore

Syntax:
pg_dumpall [options…] > filename.backup



Entire Cluster – SQL Dump

pg_dumpall Options
-a: Data only. Do not dump schema.
-s: Data definitions (schema) only.
-g: Dump global objects only - not databases.
-r: Dump only roles.
-c: Clean (drop) databases before recreating.
-O: Skip restoration of object ownership.
-x: Do not dump privileges (grant/revoke)
--disable-triggers: Disable triggers during data-only restore.
-v: Verbose option.



PGADMIN-III: Backup Server/Globals



Backup - File system level backup

• An alternative backup strategy is to directly copy the files that PPAS uses
to store the data in the database.
• You can use whatever method you prefer for doing usual file system
backups, for example:
- tar -cf backup.tar /usr/local/pgsql/data
• The database server must be shut down in order to get a usable backup.
• File system backups only work for complete backup and restoration of an
entire database cluster.
• File system snapshots work for live servers.



Backup - Continuous Archiving

• Postgres maintains WAL files for all transactions in the pg_xlog directory.
• Postgres automatically manages WAL segment files once they are full and
switched.
• Continuous archiving can be set up to keep a copy of switched WAL files,
which can later be used for recovery.
• It also enables online file system backup of a database cluster.
• Requirements:
- wal_level must be set to archive (or higher)
- archive_mode must be set to on
- archive_command must be set in postgresql.conf; it archives WAL
files and supports PITR



Backup - Continuous Archiving

• Step 1: Edit the postgresql.conf file and set the archive parameters:
wal_level=archive
archive_mode=on
Unix:
archive_command = 'cp -i %p /mnt/server/archivedir/%f </dev/null'
Windows:
archive_command = 'copy "%p" c:\\mnt\\server\\archivedir\\"%f"'

%p is replaced by the path of the WAL file to archive.
%f is replaced by the file name only, which names the copy in the archive directory.
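
After the server has been restarted with these settings, they can be checked from a psql session. The pg_stat_archiver view is assumed to be available (PostgreSQL 9.4 or later); on older servers only the SHOW commands apply.

```sql
-- Confirm the archive settings the running server is using.
SHOW wal_level;
SHOW archive_mode;
SHOW archive_command;

-- Check whether WAL segments are being archived successfully.
-- A growing failed_count suggests the archive_command is failing.
SELECT archived_count, last_archived_wal, failed_count, last_failed_wal
FROM pg_stat_archiver;
```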



Base Backup using Low Level API

• Step 2: Make a base backup:
− Connect using psql and issue the command:
SELECT pg_start_backup('any useful label');
− Use a standard file system backup utility to back up the /data subdirectory
− Connect using psql and issue the command:
SELECT pg_stop_backup();
− Continuously archive the WAL segment files



Point-in-Time Recovery

• Point-in-time recovery (PITR) is the ability to restore a database cluster up
to the present or to a specified point of time in the past.
• Uses a full database cluster backup and the write-ahead logs found in the
/pg_xlog subdirectory.
• Must be configured before it is needed (write-ahead log archiving must be
enabled).



Performing Point-in-Time Recovery
• Stop the server, if it's running.
• If you have enough space keep a copy of data directory and transaction logs.
• Remove all directories and files from the cluster data directory.
• Restore the database files from your file system backup.
• Verify the ownership of restored backup directories (must not be root)
• Remove any files present in pg_xlog/
• If you have any unarchived WAL segment files recovered from crashed cluster,
copy them into pg_xlog/.
• Create a recovery command file recovery.conf in the cluster data directory.
• Start the server.
• Upon completion of the recovery process, the server will rename recovery.conf to
recovery.done



Point-in-Time Recovery

• Settings in the recovery.conf file:
restore_command (string)
Unix:
restore_command = 'cp /mnt/server/archivedir/%f "%p"'
Windows:
restore_command = 'copy c:\\mnt\\server\\archivedir\\"%f" "%p"'

recovery_target_time (timestamp)



Summary

• In this module we covered:
− Backup Types
− SQL Dump
− Backup using PGADMIN
− Cluster Dump
− PGADMIN Backup Globals
− Offline Copy Backup
− Continuous Archiving
− Point-In Time Recovery



The PostgreSQL Data Dictionary



Objectives

• In this module we will cover:
− The System Catalog Schema
− System Information views/tables
− System Information Functions



The System Catalog Schema



System Information views/tables



System Administration Functions



System Administration Functions



System Administration Functions



Summary

• In this module we covered:
− The System Catalog Schema
− System Information views/tables
− System Information Functions



Performance Tuning



Objectives

• In this module we will cover:
− Hardware Configuration
− OS Configuration
− Configuration (postgresql.conf)
− Timing
− Clustering Rows
− Indexes



Hardware Configuration



Hardware Configuration



OS Configuration



Configuration – Best Practices



Configuration – Best Practices



Clustering Rows



Indexes



Multicolumn Indexes



Summary

• In this module we covered:
− Hardware Configuration
− OS Configuration
− Configuration (postgresql.conf)
− Timing
− Clustering Rows
− Indexes



PGPool



PGPOOL-II

• pgpool-II is middleware that works between PostgreSQL servers and a
PostgreSQL database client.
• pgpool-II talks PostgreSQL's backend and frontend protocol, and relays a
connection between them.
• Features:
− Connection Pooling
− Replication
− Load Balance



Connection Pooling

• pgpool-II saves connections to the PostgreSQL servers and reuses them
whenever a new connection with the same properties (i.e. username,
database, protocol version) comes in. It reduces connection overhead and
improves the system's overall throughput.



Replication

• pgpool-II can manage multiple PostgreSQL servers. Using the replication
function enables creating a realtime backup on 2 or more physical disks,
so that the service can continue without stopping servers in case of a disk
failure.



Load Balance

• If a database is replicated, executing a SELECT query on any server will
return the same result. pgpool-II takes advantage of the replication
feature to reduce the load on each PostgreSQL server by distributing
SELECT queries among multiple servers, improving the system's overall
throughput. At best, performance improves proportionally to the number of
PostgreSQL servers. Load balancing works best in a situation where there
are a lot of users executing many queries at the same time.



Transactions and Concurrency



What is a Transaction?

• The essential point of a transaction is that it bundles multiple steps into a
single ACID event characterized by:
• An all-or-nothing operation: if a failure prevents the transaction from
completing, none of its steps affect the database at all (Atomicity).
• Only valid data is written to the database (Consistency).
• The intermediate states between the steps are not visible to other
concurrent transactions (Isolation).
• Once a transaction is committed, its changes are permanent and survive a
subsequent failure (Durability).
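
A small sketch of the all-or-nothing behaviour, using a hypothetical temporary accounts table with made-up names and balances:

```sql
-- Hypothetical demo table; it disappears when the session ends.
CREATE TEMPORARY TABLE accounts (
    name    text PRIMARY KEY,
    balance numeric NOT NULL CHECK (balance >= 0)
);
INSERT INTO accounts VALUES ('alice', 100), ('bob', 50);

-- Both updates become visible to other sessions together, at COMMIT.
BEGIN;
UPDATE accounts SET balance = balance - 30 WHERE name = 'alice';
UPDATE accounts SET balance = balance + 30 WHERE name = 'bob';
COMMIT;

-- A transaction that is rolled back leaves no trace.
BEGIN;
UPDATE accounts SET balance = 0 WHERE name = 'alice';
ROLLBACK;

-- alice is still at 70 and bob at 80.
SELECT name, balance FROM accounts ORDER BY name;
```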



Concurrency and Transactions

• Concurrency – two or more sessions accessing the same data at the same
time.
• Each transaction sees snapshot of data (database version) as it was some
time ago.
• Transaction isolation – protects a transaction from viewing "inconsistent"
data (currently being updated by another transaction).
• No read locking – readers don't block writers and writers don't block readers.



Table Partitioning



Objective

• Partitioning
• Partition Methods
• Partition Setup
• Partitioning Example
• Partition Table Explain Plan



What is Partitioning?

• PostgreSQL supports basic table partitioning
• Splitting one large table into smaller pieces



Partitioning

• Partitioning refers to splitting what is logically one large table into smaller
physical pieces.
• Query performance can be improved dramatically for certain kinds of
queries.
• Improved Update performance. When an index no longer fits easily in
memory, both read and write operations on the index take progressively
more disk accesses.



Partitioning

• Bulk deletes may be accomplished by simply removing one of the
partitions
• Seldom-used data can be migrated to cheaper and slower storage media.
• PostgreSQL manages partitioning via table inheritance. Each partition
must be created as a child table of a single parent table. The parent table
itself is normally empty; it exists just to represent the entire data set.
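
As a rough illustration of the bulk-delete and storage-migration points above, the sketch below detaches or drops a whole partition instead of deleting rows one by one. The sales tables are hypothetical names chosen for this example, and the tablespace mentioned in the comment is an assumption that would have to exist already.

```sql
-- Hypothetical parent table and one yearly partition, for illustration only.
CREATE TABLE sales (sale_id integer, sold_on date NOT NULL, amount numeric);
CREATE TABLE sales_2014 (
    CHECK (sold_on >= DATE '2014-01-01' AND sold_on < DATE '2015-01-01')
) INHERITS (sales);

-- Option 1: detach the partition but keep its data as a standalone table,
-- for example to archive it first. Queries on "sales" no longer see it.
ALTER TABLE sales_2014 NO INHERIT sales;

-- Seldom-used data could also be moved to slower storage, assuming a
-- tablespace (here called slow_disk, a made-up name) has been created:
--   ALTER TABLE sales_2014 SET TABLESPACE slow_disk;

-- Option 2: remove the partition and all of its rows in one step.
DROP TABLE sales_2014;
```

Either option avoids the row-by-row work and the VACUUM overhead of a large DELETE.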



Partitioning Methods

• Range Partitioning
− Range partitions are defined via key column(s) with no overlap or gaps

• List Partitioning
− Each key value is explicitly listed for the partitioning scheme
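
With inheritance-based partitioning, both methods come down to the CHECK constraint placed on each child table. The following sketch shows one possible form of each; the orders and customers tables, their columns, and the region codes are all hypothetical.

```sql
-- Range partitioning: each child covers a half-open date range, so the
-- ranges neither overlap nor leave gaps between them.
CREATE TABLE orders (order_id integer, order_date date NOT NULL);
CREATE TABLE orders_2015_q1 (
    CHECK (order_date >= DATE '2015-01-01' AND order_date < DATE '2015-04-01')
) INHERITS (orders);
CREATE TABLE orders_2015_q2 (
    CHECK (order_date >= DATE '2015-04-01' AND order_date < DATE '2015-07-01')
) INHERITS (orders);

-- List partitioning: each child explicitly lists the key values it holds.
CREATE TABLE customers (customer_id integer, region text NOT NULL);
CREATE TABLE customers_emea (
    CHECK (region IN ('UK', 'DE', 'FR'))
) INHERITS (customers);
CREATE TABLE customers_apac (
    CHECK (region IN ('IN', 'JP', 'AU'))
) INHERITS (customers);
```

Half-open ranges (>= lower bound, < upper bound) are used here because they make it easy to see that no key value can belong to two partitions.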



Why Use Partitioning?

• Performance
− As tables grow, query performance slows
− Queries often need only a portion of the data, so frequently accessed rows can be kept in smaller partitions
• Manageability
− Allows data to be added and removed more easily
− Maintenance is easier (vacuum, reindex, cluster) and can focus on active data
• Scalability
− Manage larger amounts of data more easily
− Ease hardware constraints, for example by placing partitions in different tablespaces



Creating the Master Table

Step 1: create table master_table (id numeric, name varchar(50) NOT NULL,
state varchar(20));

Step 2: Set the postgresql.conf parameter:

constraint_exclusion = partition (this is the default value)

NOTE: After changing this parameter, you will need to signal the server to reload
the configuration file by using the pg_ctl utility:

• pg_ctl -D <datadir> reload


Creating Partition or Child Tables

Step 3: Create the child tables, a.k.a. partitions:

Partition 1:
create table child1 (check (id between 1 and 100)) inherits (master_table);

Partition 2:
create table child2 (check (id between 101 and 200)) inherits (master_table);



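Creating the Trigger Function

create or replace function trg_cit_ins() returns trigger as
$$
begin
    -- Route each new row to the partition whose CHECK range contains its id.
    if NEW.id >= 1 and NEW.id <= 100 then
        insert into child1 values (NEW.id, NEW.name, NEW.state);
    elsif NEW.id >= 101 and NEW.id <= 200 then
        insert into child2 values (NEW.id, NEW.name, NEW.state);
    else
        -- Reject the row instead of silently discarding it.
        raise exception 'INVALID ID RANGE: %', NEW.id;
    end if;
    -- Returning NULL from a BEFORE trigger skips the insert into master_table itself.
    return null;
end;
$$ language plpgsql;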



Creating Trigger

postgres=# create trigger trg_ins_cit BEFORE INSERT on master_table FOR
EACH ROW EXECUTE PROCEDURE trg_cit_ins();
CREATE TRIGGER
postgres=#
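
Once the trigger is in place, the setup can be checked by inserting through the parent table and looking at where the rows end up, and by inspecting the plan for a query on the partition key. This is a sketch only; the names and states in the sample rows are made up.

```sql
-- Insert through the parent; the BEFORE INSERT trigger redirects each row.
INSERT INTO master_table VALUES (10, 'Alice', 'Karnataka');
INSERT INTO master_table VALUES (150, 'Bob', 'Kerala');

-- tableoid::regclass reports the table that physically stores each row.
SELECT tableoid::regclass AS stored_in, id, name, state
FROM master_table
ORDER BY id;

-- ONLY restricts the query to the parent itself, which should stay empty.
SELECT count(*) FROM ONLY master_table;

-- With constraint_exclusion = partition, the planner can use the CHECK
-- constraints to leave out partitions that cannot contain id = 150.
EXPLAIN SELECT * FROM master_table WHERE id = 150;
```

Because the trigger function returns NULL, psql reports INSERT 0 0 for each insert even though the row has been stored in a child table. In the EXPLAIN output, the plan should list master_table and child2 but not child1.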



Triggers

• Trigger functions are created with the CREATE FUNCTION command; they take no arguments and return type trigger.
• Several special variables are created automatically in the top-level block.
• NEW
− Data type RECORD; variable holding the new database row for INSERT/UPDATE
operations in row-level triggers. This variable is NULL in statement-level triggers.
• OLD
− Data type RECORD; variable holding the old database row for UPDATE/DELETE
operations in row-level triggers. This variable is NULL in statement-level triggers.
• TG_NAME
− Data type name; variable that contains the name of the trigger actually fired.



Triggers (Contd.)

• TG_WHEN
− Data type text; a string of BEFORE, AFTER, or INSTEAD OF, depending on the trigger's definition.
• TG_LEVEL
− Data type text; a string of either ROW or STATEMENT depending on the trigger's
definition.
• TG_OP
− Data type text; a string of INSERT, UPDATE, DELETE, or TRUNCATE telling for which operation the
trigger was fired.
• TG_RELNAME
− Data type name; the name of the table that caused the trigger invocation. This variable is
deprecated; use TG_TABLE_NAME instead.
• TG_NARGS
− Data type integer; the number of arguments given to the trigger procedure in the CREATE
TRIGGER statement.
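
The special variables listed above can be inspected from inside a trigger function. The sketch below simply prints them with RAISE NOTICE; the trg_demo table, the function and trigger names, and the trigger argument are hypothetical and exist only for this illustration.

```sql
-- Hypothetical table used only to demonstrate the TG_* variables.
CREATE TABLE trg_demo (id integer, note text);

CREATE OR REPLACE FUNCTION trg_show_context() RETURNS trigger AS
$$
BEGIN
    -- Report how and why this trigger function was invoked.
    RAISE NOTICE 'trigger % fired % % (level %) on table % with % argument(s)',
        TG_NAME, TG_WHEN, TG_OP, TG_LEVEL, TG_TABLE_NAME, TG_NARGS;

    -- A BEFORE row-level trigger must return the row to proceed with:
    -- OLD for DELETE (there is no new row), NEW for INSERT and UPDATE.
    IF TG_OP = 'DELETE' THEN
        RETURN OLD;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_demo_context
    BEFORE INSERT OR UPDATE OR DELETE ON trg_demo
    FOR EACH ROW EXECUTE PROCEDURE trg_show_context('example-argument');

-- Each statement below should produce one NOTICE describing the invocation.
INSERT INTO trg_demo VALUES (1, 'first');
UPDATE trg_demo SET note = 'changed' WHERE id = 1;
DELETE FROM trg_demo WHERE id = 1;
```

One function can serve several operations this way, with TG_OP selecting the branch to run.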



Triggers (Contd.)

• TRIGGER TYPES: to create a trigger, first create a function that returns
type trigger.

• ROW LEVEL: fires once for each row affected by the statement. For example, if
you execute an UPDATE that affects 1000 rows, the trigger fires
1000 times.

• STATEMENT LEVEL: fires only once per statement, regardless of how many rows are affected.
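
The difference between the two levels can be seen by attaching one trigger of each kind to the same table. This is a minimal sketch with hypothetical names (fire_demo, trg_report_fire), using three rows rather than 1000 to keep the output short.

```sql
-- Hypothetical table with three rows.
CREATE TABLE fire_demo (id integer, val integer);
INSERT INTO fire_demo SELECT g, 0 FROM generate_series(1, 3) AS g;

CREATE OR REPLACE FUNCTION trg_report_fire() RETURNS trigger AS
$$
BEGIN
    RAISE NOTICE '% level trigger % fired', TG_LEVEL, TG_NAME;
    RETURN NULL;  -- the return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

-- Fires once per UPDATE statement.
CREATE TRIGGER fire_demo_stmt
    AFTER UPDATE ON fire_demo
    FOR EACH STATEMENT EXECUTE PROCEDURE trg_report_fire();

-- Fires once per updated row.
CREATE TRIGGER fire_demo_row
    AFTER UPDATE ON fire_demo
    FOR EACH ROW EXECUTE PROCEDURE trg_report_fire();

-- One statement that touches all three rows.
UPDATE fire_demo SET val = val + 1;
```

The single UPDATE should raise three ROW notices and one STATEMENT notice.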



PG_UPGRADE



Pg_upgrade

• Typically pg_dump is required for major version upgrades (e.g. 8.3.X to
8.4.X, or 8.4.X to 9.0.X)
• pg_upgrade is used to migrate PostgreSQL data files to a later
PostgreSQL major version without the data dump/reload
• pg_upgrade does its best to make sure the old and new clusters are
binary-compatible, e.g. by checking for compatible compile time settings,
including 32/64-bit binaries.
• It is important that any external modules are also binary compatible,
though this cannot be checked by pg_upgrade.
• pg_upgrade supports upgrades from 8.3.X and later to the current major
release of PostgreSQL.



Pg_upgrade

You can use the pg_upgrade utility to migrate old cluster data
directories to a new version.

Syntax:
pg_upgrade [OPTIONS]...



Pg_upgrade

Options:
  -b, --old-bindir     old cluster executable directory
  -B, --new-bindir     new cluster executable directory
  -c, --check          check clusters only, don't change any data
  -d, --old-datadir    old cluster data directory
  -D, --new-datadir    new cluster data directory
  -l, --logfile        log session activity to file
  -p, --old-port       old cluster port number (default 5432)
  -P, --new-port       new cluster port number (default 5432)
  -u, --user           clusters superuser (default "postgres")
  -v, --verbose        enable verbose output



Pg_upgrade

• Before running pg_upgrade you must:
− create a new database cluster (using the new version of initdb)
− shut down the postmaster servicing the old cluster
− shut down the postmaster servicing the new cluster
• When you run pg_upgrade, you must provide the following information:
− the data directory for the old cluster (-d OLDDATADIR)
− the data directory for the new cluster (-D NEWDATADIR)
− the 'bin' directory for the old version (-b OLDBINDIR)
− the 'bin' directory for the new version (-B NEWBINDIR)
• For example:
− pg_upgrade -d oldCluster/data -D newCluster/data -b oldCluster/bin -B newCluster/bin



Loading/Unloading



Copy

postgres=# COPY emp (empno,ename,job,sal,comm,hiredate) TO
'/tmp/emp.csv' CSV HEADER;
COPY
postgres=# \! cat /tmp/emp.csv
empno,ename,job,sal,comm,hiredate
7369,SMITH,CLERK,800.00,,17-DEC-80 00:00:00
7499,ALLEN,SALESMAN,1600.00,300.00,20-FEB-81 00:00:00
7521,WARD,SALESMAN,1250.00,500.00,22-FEB-81 00:00:00
7566,JONES,MANAGER,2975.00,,02-APR-81 00:00:00



Copy

postgres=# CREATE TEMP TABLE empcsv (LIKE emp);
CREATE TABLE
postgres=# COPY empcsv (empno, ename, job, sal, comm, hiredate)
postgres-# FROM '/tmp/emp.csv' CSV HEADER;
COPY

postgres=# SELECT * FROM empcsv;
 empno | ename  |    job    | mgr |      hiredate      |   sal   |  comm   | deptno
-------+--------+-----------+-----+--------------------+---------+---------+--------
  7369 | SMITH  | CLERK     |     | 17-DEC-80 00:00:00 |  800.00 |         |
  7499 | ALLEN  | SALESMAN  |     | 20-FEB-81 00:00:00 | 1600.00 |  300.00 |
  7521 | WARD   | SALESMAN  |     | 22-FEB-81 00:00:00 | 1250.00 |  500.00 |
  7566 | JONES  | MANAGER   |     | 02-APR-81 00:00:00 | 2975.00 |         |
  7654 | MARTIN | SALESMAN  |     | 28-SEP-81 00:00:00 | 1250.00 | 1400.00 |
  7698 | BLAKE  | MANAGER   |     | 01-MAY-81 00:00:00 | 2850.00 |         |
  7782 | CLARK  | MANAGER   |     | 09-JUN-81 00:00:00 | 2450.00 |         |
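
A few common variations on the COPY commands shown in this section are sketched below. They reuse the emp and empcsv tables from the slides; the output file names are made up for this example, and the option values are illustrative rather than required.

```sql
-- Export the result of a query rather than a whole table.
COPY (SELECT empno, ename, sal FROM emp WHERE job = 'MANAGER')
    TO '/tmp/managers.csv' WITH (FORMAT csv, HEADER true);

-- Load a CSV file, spelling out the delimiter and how NULLs are written.
COPY empcsv (empno, ename, job, sal, comm, hiredate)
    FROM '/tmp/emp.csv'
    WITH (FORMAT csv, HEADER true, DELIMITER ',', NULL '');

-- psql meta-command (must be written on one line): the file is read or
-- written on the client machine with the client user's permissions.
\copy emp (empno, ename, job, sal, comm, hiredate) to 'emp_client.csv' csv header
```

COPY with a file name runs on the database server and requires elevated privileges (superuser in the releases covered by this course), which is why \copy is often the more convenient choice from a workstation.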





Thank You
