Non-Relational Postgres
BRUCE MOMJIAN
This talk explores the advantages of non-relational storage, and the Postgres support for
such storage.
1 / 71
Relational Storage
2 / 71
What Is Relational Storage?
https://en.wikipedia.org/wiki/Relational_database
3 / 71
What Is Data Normalization? First Normal Form
https://en.wikipedia.org/wiki/First_normal_form
http://www.anaesthetist.com/mnm/sql/normal.htm
4 / 71
Downsides of First Normal Form
• Query performance
• Query complexity
• Storage inflexibility
• Storage overhead
• Indexing limitations
5 / 71
Postgres Non-Relational Storage Options
1. Arrays
2. Range types
3. Geometry
4. XML
5. JSON
6. JSONB
7. Row types
8. Character strings
6 / 71
1. Arrays
CREATE TABLE employee
(name TEXT PRIMARY KEY, certifications TEXT[]);
SELECT name
FROM employee
WHERE certifications @> '{ACSP}';
name
------
Bill
All queries used in this presentation are available at https://momjian.us/main/writings/pgsql/non-relational.sql.
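The slide shows the table definition and the query but not the data. A minimal sketch of a complete session follows, assuming a scratch database; the second employee row is invented for illustration, and only Bill's certifications come from the slides.

```sql
-- Scratch database only: start from a clean table.
DROP TABLE IF EXISTS employee;
CREATE TABLE employee
(name TEXT PRIMARY KEY, certifications TEXT[]);

INSERT INTO employee VALUES
    ('Bill', '{CCNA, ACSP, CISSP}'),
    ('Ann', '{CCNA}');        -- hypothetical row, not in the slides

-- @> tests containment: does the array include every listed element?
SELECT name
FROM employee
WHERE certifications @> '{ACSP}';

-- ANY() is an alternative for a single-element membership test.
SELECT name
FROM employee
WHERE 'ACSP' = ANY (certifications);
```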
7 / 71
Array Access
SELECT certifications[1]
FROM employee;
certifications
----------------
CCNA
SELECT unnest(certifications)
FROM employee;
unnest
--------
CCNA
ACSP
CISSP
8 / 71
Array Unrolling
9 / 71
Array Creation
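One common way to create arrays from normalized rows is the array_agg() aggregate. The sketch below assumes a hypothetical table employee_cert with one row per certification; it is an illustration, not necessarily the query used in the talk.

```sql
-- Hypothetical normalized table: one row per (name, certification).
DROP TABLE IF EXISTS employee_cert;
CREATE TABLE employee_cert (name TEXT, certification TEXT);

INSERT INTO employee_cert VALUES
    ('Bill', 'CCNA'), ('Bill', 'ACSP'), ('Bill', 'CISSP');

-- Collapse the rows into one array per employee.
SELECT name,
       array_agg(certification ORDER BY certification) AS certifications
FROM employee_cert
GROUP BY name;
```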
10 / 71
2. Range Types
CREATE TABLE car_rental
(id SERIAL PRIMARY KEY, time_span TSTZRANGE);
SELECT *
FROM car_rental
WHERE time_span @> '2016-05-09 00:00:00'::timestamptz;
id | time_span
----+-----------------------------------------------------
1 | ["2016-05-03 09:00:00-04","2016-05-11 12:00:00-04")
SELECT *
FROM car_rental
WHERE time_span @> '2018-06-09 00:00:00'::timestamptz;
id | time_span
----+-----------
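The queries above need at least one stored range. A minimal sketch, assuming a scratch database, that inserts the rental shown in the first result and adds an overlap test with && (the overlap period is an invented example):

```sql
-- Scratch database only: start from a clean table.
DROP TABLE IF EXISTS car_rental;
CREATE TABLE car_rental
(id SERIAL PRIMARY KEY, time_span TSTZRANGE);

-- The default bounds '[)' include the start and exclude the end.
INSERT INTO car_rental (time_span)
VALUES (tstzrange('2016-05-03 09:00:00', '2016-05-11 12:00:00'));

-- @> with an element: is this instant inside the range?
SELECT id
FROM car_rental
WHERE time_span @> '2016-05-09 00:00:00'::timestamptz;

-- && with a range: does the rental overlap the given period?
SELECT id
FROM car_rental
WHERE time_span && tstzrange('2016-05-10', '2016-05-12');
```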
11 / 71
Range Type Indexing
INSERT INTO car_rental (time_span)
SELECT tstzrange(y, y + '1 day')
FROM generate_series('2001-09-01 00:00:00'::timestamptz,
'2010-09-01 00:00:00'::timestamptz, '1 day') AS x(y);
SELECT *
FROM car_rental
WHERE time_span @> '2007-08-01 00:00:00'::timestamptz;
id | time_span
------+-----------------------------------------------------
2162 | ["2007-08-01 00:00:00-04","2007-08-02 00:00:00-04")
EXPLAIN SELECT *
FROM car_rental
WHERE time_span @> '2007-08-01 00:00:00'::timestamptz;
QUERY PLAN
-----------------------------------------------------------------------------
Seq Scan on car_rental (cost=0.00..64.69 rows=16 width=36)
  Filter: (time_span @> '2007-08-01 00:00:00-04'::timestamp with time zone)
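The sequential scan above can be replaced by an index scan once the range column has a GiST index. A minimal sketch; the index name car_rental_idx is taken from the plan on the next slide, and the table is created only if it does not already exist:

```sql
CREATE TABLE IF NOT EXISTS car_rental
(id SERIAL PRIMARY KEY, time_span TSTZRANGE);

-- GiST supports the range operators, including @> and &&.
CREATE INDEX IF NOT EXISTS car_rental_idx
ON car_rental USING GIST (time_span);

-- Refresh statistics so the planner can cost the new index.
ANALYZE car_rental;

EXPLAIN SELECT *
FROM car_rental
WHERE time_span @> '2007-08-01 00:00:00'::timestamptz;
```

Whether the planner actually chooses the index depends on table size and statistics; on a very small table a sequential scan may still be cheaper.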
12 / 71
Range Type Indexing
EXPLAIN SELECT *
FROM car_rental
WHERE time_span @> '2007-08-01 00:00:00'::timestamptz;
QUERY PLAN
---------------------------------------------------------------------------------------
Bitmap Heap Scan on car_rental (cost=4.27..28.35 rows=16 width=36)
  Recheck Cond: (time_span @> '2007-08-01 00:00:00-04'::timestamp with time zone)
  -> Bitmap Index Scan on car_rental_idx (cost=0.00..4.27 rows=16 width=0)
       Index Cond: (time_span @> '2007-08-01 00:00:00-04'::timestamp with time zone)
13 / 71
Exclusion Constraints
ALTER TABLE car_rental ADD EXCLUDE USING GIST (time_span WITH &&);
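The constraint rejects any new row whose time_span overlaps (&&) an existing one. A minimal sketch of that behavior, assuming a scratch database and invented rental periods; the second insert is wrapped in a DO block so the expected error is caught and reported as a notice:

```sql
-- Scratch database only: start from a clean table.
DROP TABLE IF EXISTS car_rental;
CREATE TABLE car_rental
(id SERIAL PRIMARY KEY, time_span TSTZRANGE);

ALTER TABLE car_rental ADD EXCLUDE USING GIST (time_span WITH &&);

INSERT INTO car_rental (time_span)
VALUES (tstzrange('2016-05-03 09:00:00', '2016-05-11 12:00:00'));

-- This period overlaps the row above, so the insert should be rejected.
DO $$
BEGIN
    INSERT INTO car_rental (time_span)
    VALUES (tstzrange('2016-05-10 09:00:00', '2016-05-12 12:00:00'));
    RAISE NOTICE 'unexpected: overlapping rental accepted';
EXCEPTION WHEN exclusion_violation THEN
    RAISE NOTICE 'overlapping rental rejected';
END $$;
```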
14 / 71
3. Geometry
SELECT *
FROM dart
LIMIT 5;
dartno | location
--------+-------------------------------------
1 | (60.1593657396734,64.1712633892894)
2 | (22.9252253193408,38.7973457109183)
3 | (54.7123382799327,16.1387695930898)
4 | (60.5669556651264,53.1596980988979)
5 | (22.7800350170583,90.8143546432257)
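A table like this can be filled with random points. The sketch below assumes a 100 x 100 board and 1000 darts, which matches the coordinates and row estimates shown in the slides, but the exact statement used in the talk may differ:

```sql
-- Scratch database only: start from a clean table.
DROP TABLE IF EXISTS dart;
CREATE TABLE dart (dartno SERIAL, location POINT);

-- point(x, y) builds a point from two double precision values.
INSERT INTO dart (location)
SELECT point(random() * 100, random() * 100)
FROM generate_series(1, 1000);

SELECT *
FROM dart
LIMIT 5;
```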
15 / 71
Geometry Restriction
-- find all darts within four units of point (50, 50)
SELECT *
FROM dart
WHERE location <@ '<(50, 50), 4>'::circle;
dartno | location
--------+-------------------------------------
308 | (52.3920683190227,49.3803130928427)
369 | (52.1113255061209,52.9995835851878)
466 | (47.5943599361926,49.0266934968531)
589 | (46.3589935097843,50.3238912206143)
793 | (47.3468563519418,50.0582652166486)
EXPLAIN SELECT *
FROM dart
WHERE location <@ '<(50, 50), 4>'::circle;
QUERY PLAN
------------------------------------------------------
Seq Scan on dart (cost=0.00..19.50 rows=1 width=20)
  Filter: (location <@ '<(50,50),4>'::circle)
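A GiST index on the point column lets both the containment test (<@) and distance ordering (<->) use an index. A minimal sketch; the index name dart_idx is the one that appears in the indexed plans, and the table is created only if it does not already exist:

```sql
CREATE TABLE IF NOT EXISTS dart (dartno SERIAL, location POINT);

-- The default GiST operator class for point supports <@ and <->.
CREATE INDEX IF NOT EXISTS dart_idx
ON dart USING GIST (location);

ANALYZE dart;

EXPLAIN SELECT *
FROM dart
WHERE location <@ '<(50, 50), 4>'::circle;
```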
16 / 71
Indexed Geometry Restriction
EXPLAIN SELECT *
FROM dart
WHERE location <@ '<(50, 50), 4>'::circle;
QUERY PLAN
----------------------------------------------------------------------
Index Scan using dart_idx on dart (cost=0.14..8.16 rows=1 width=20)
  Index Cond: (location <@ '<(50,50),4>'::circle)
17 / 71
Geometry Indexes with LIMIT
-- find the two closest darts to (50, 50)
SELECT *
FROM dart
ORDER BY location <-> '(50, 50)'::point
LIMIT 2;
dartno | location
--------+-------------------------------------
308 | (52.3920683190227,49.3803130928427)
466 | (47.5943599361926,49.0266934968531)
EXPLAIN SELECT *
FROM dart
ORDER BY location <-> '(50, 50)'::point
LIMIT 2;
QUERY PLAN
----------------------------------------------------------------------------------
Limit (cost=0.14..0.30 rows=2 width=28)
  -> Index Scan using dart_idx on dart (cost=0.14..80.14 rows=1000 width=28)
       Order By: (location <-> '(50,50)'::point)
18 / 71
4. XML
$ psql
CREATE TABLE printer (doc XML);
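XML columns are typically queried with xpath(). The sketch below assumes a server built with libxml support and uses a small invented document; the element names are hypothetical and do not come from the printer file loaded in the talk.

```sql
-- Scratch database only: start from a clean table.
DROP TABLE IF EXISTS printer;
CREATE TABLE printer (doc XML);

-- Hypothetical document used only for illustration.
INSERT INTO printer VALUES
    ('<printer><name>laser1</name><option>duplex</option><option>color</option></printer>');

-- xpath() returns an array of XML nodes (xml[]).
SELECT xpath('/printer/option', doc)
FROM printer;

-- text() selects node contents; unnest() yields one row per match.
SELECT unnest(xpath('/printer/option/text()', doc))::text AS opt
FROM printer;
```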
19 / 71
Xpath Query
20 / 71
Remove XML Array
21 / 71
Xpath to XML Text
22 / 71
Xpath to SQL Text
23 / 71
XML Non-Root Query
24 / 71
Unnest XML Arrays
25 / 71
Search XML Text
26 / 71
5. JSON Data Type
27 / 71
Load JSON Data
SELECT *
FROM friend
ORDER BY 1
LIMIT 2;
id | data
----+-----------------------------------------------------------…
1 | {"gender":"Male","first_name":"Eugene","last_name":"Reed",…
2 | {"gender":"Female","first_name":"Amanda","last_name":"Morr…
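A table of this shape needs only a JSON column. A minimal sketch, assuming a scratch database; the inserted document is an abbreviated, hypothetical version of the first row, limited to the keys visible above:

```sql
-- Scratch database only: start from a clean table.
DROP TABLE IF EXISTS friend;
CREATE TABLE friend (id SERIAL, data JSON);

-- Abbreviated document; the real rows carry more keys.
INSERT INTO friend (data) VALUES
    ('{"gender":"Male","first_name":"Eugene","last_name":"Reed"}');

-- -> returns a JSON value, ->> returns the value as text.
SELECT data->'first_name'  AS as_json,
       data->>'first_name' AS as_text
FROM friend;
```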
28 / 71
Pretty Print JSON
29 / 71
Access JSON Values
SELECT data->>'email'
FROM friend
ORDER BY 1
LIMIT 5;
?column?
-----------------------------
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
30 / 71
Concatenate JSON Values
SELECT data->>'first_name' || ' ' ||
(data->>'last_name')
FROM friend
ORDER BY 1
LIMIT 5;
?column?
----------------
Aaron Alvarez
Aaron Murphy
Aaron Rivera
Aaron Scott
Adam Armstrong
31 / 71
JSON Value Restrictions
SELECT data->>'first_name'
FROM friend
WHERE data->>'last_name' = 'Banks'
ORDER BY 1;
?column?
----------
Bruce
Fred
33 / 71
JSON Calculations
SELECT data->>'first_name' || ' ' || (data->>'last_name'),
data->>'ip_address'
FROM friend
WHERE (data->>'ip_address')::inet <<= '172.0.0.0/8'::cidr
ORDER BY 1;
?column? | ?column?
---------------+-----------------
Lisa Holmes | 172.65.223.150
Walter Miller | 172.254.148.168
34 / 71
6. JSONB
35 / 71
JSON vs. JSONB Data Types
36 / 71
JSONB Index
37 / 71
JSONB Index Queries
SELECT data->>’first_name’
FROM friend2
WHERE data @> '{"last_name" : "Banks"}'
ORDER BY 1;
?column?
----------
Bruce
Fred
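The containment query above can be served by a GIN index on the JSONB column. A minimal sketch, assuming a scratch database; the two documents are abbreviated, hypothetical rows, and the index name friend2_idx is an assumption:

```sql
-- Scratch database only: start from a clean table.
DROP TABLE IF EXISTS friend2;
CREATE TABLE friend2 (id SERIAL, data JSONB);

INSERT INTO friend2 (data) VALUES
    ('{"first_name":"Bruce","last_name":"Banks"}'),
    ('{"first_name":"Fred","last_name":"Banks"}');

-- The default jsonb GIN operator class supports @> on any key.
CREATE INDEX friend2_idx ON friend2 USING GIN (data);

SELECT data->>'first_name'
FROM friend2
WHERE data @> '{"last_name" : "Banks"}'
ORDER BY 1;
```

A single GIN index covers containment lookups on every key in the document, so no per-key index is needed for this kind of query.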
41 / 71
7. Row Types
SELECT *
FROM truck_driver;
id | name | license
----+---------------+------------------------
1 | Jimbo Biggins | (PA,175319,2017-03-12)
SELECT license
FROM truck_driver;
license
------------------------
(PA,175319,2017-03-12)
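A column like license is declared with a composite type. The sketch below assumes a scratch database; the type name drivers_license and its field names are assumptions chosen to fit the output above.

```sql
-- Scratch database only: start from clean objects.
DROP TABLE IF EXISTS truck_driver;
DROP TYPE IF EXISTS drivers_license;

-- Hypothetical composite type with three fields.
CREATE TYPE drivers_license AS
(state CHAR(2), id INTEGER, valid_until DATE);

CREATE TABLE truck_driver
(id SERIAL, name TEXT, license drivers_license);

INSERT INTO truck_driver
VALUES (DEFAULT, 'Jimbo Biggins', ('PA', 175319, '2017-03-12'));

-- Parentheses around the column are required to select one field.
SELECT (license).state
FROM truck_driver;
```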
8. Character Strings
$ cd /tmp
$ wget http://web.mit.edu/freebsd/head/games/fortune/datfiles/fortunes
$ psql postgres
43 / 71
8.1. Case Folding and Prefix
44 / 71
Case Folding
45 / 71
Indexed Case Folding
46 / 71
String Prefix
SELECT line
FROM fortune
WHERE line LIKE 'Mop%'
ORDER BY 1;
line
-------------------------
Mophobia, n.:
Moping, melancholy mad:
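In a database that does not use the C locale, an ordinary btree index cannot serve LIKE prefix searches; an index declared with text_pattern_ops can. A minimal sketch, assuming a scratch database, a single-column fortune table, and the hypothetical index name fortune_idx_text:

```sql
-- Scratch database only: start from a clean table.
DROP TABLE IF EXISTS fortune;
CREATE TABLE fortune (line TEXT);

INSERT INTO fortune VALUES
    ('Mophobia, n.:'),
    ('Moping, melancholy mad:'),
    ('Postmen never die, they just lose their zip.');

-- text_pattern_ops compares character by character, which is what
-- LIKE 'prefix%' needs regardless of the database collation.
CREATE INDEX fortune_idx_text ON fortune (line text_pattern_ops);

SELECT line
FROM fortune
WHERE line LIKE 'Mop%'
ORDER BY 1;
```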
47 / 71
String Prefix
48 / 71
Indexed String Prefix
49 / 71
Case Folded String Prefix
50 / 71
Indexed Case Folded String Prefix
51 / 71
8.2. Full Text Search
52 / 71
Tsvector and Tsquery
SHOW default_text_search_config;
default_text_search_config
----------------------------
pg_catalog.english
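Full text search compares a tsvector (the normalized document) with a tsquery (the normalized search terms). A minimal sketch using a sentence from the fortune file and an explicit 'english' configuration, so the result does not depend on default_text_search_config:

```sql
-- to_tsvector() reduces text to lexemes with their positions.
SELECT to_tsvector('english',
    'A giant panda bear is really a member of the raccoon family.');

-- to_tsquery() normalizes the search terms the same way.
SELECT to_tsquery('english', 'pandas');

-- @@ returns true when the document matches the query.
SELECT to_tsvector('english',
    'A giant panda bear is really a member of the raccoon family.')
    @@ to_tsquery('english', 'pandas') AS matches;
```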
53 / 71
Tsvector and Tsquery
54 / 71
Indexing Full Text Search
55 / 71
Full Text Search Queries
SELECT line
FROM fortune
WHERE to_tsvector('english', line) @@ to_tsquery('pandas');
line
----------------------------------------------------------------------
A giant panda bear is really a member of the raccoon family.
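A query of this form can use a GIN expression index, provided the indexed expression matches the one in the WHERE clause. A minimal sketch, assuming a single-column fortune table and the hypothetical index name fortune_idx_ts:

```sql
CREATE TABLE IF NOT EXISTS fortune (line TEXT);

INSERT INTO fortune VALUES
    ('A giant panda bear is really a member of the raccoon family.');

-- The two-argument form of to_tsvector() is immutable and can be
-- indexed; the one-argument form depends on a session setting.
CREATE INDEX IF NOT EXISTS fortune_idx_ts
ON fortune USING GIN (to_tsvector('english', line));

SELECT line
FROM fortune
WHERE to_tsvector('english', line) @@ to_tsquery('english', 'pandas');
```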
56 / 71
Complex Full Text Search Queries
SELECT line
FROM fortune
WHERE to_tsvector('english', line) @@ to_tsquery('cat & sleep');
line
-----------------------------------------------------------------
People who take cat naps don’t usually sleep in a cat’s cradle.
SELECT line
FROM fortune
WHERE to_tsvector('english', line) @@ to_tsquery('cat & (sleep | nap)');
line
-----------------------------------------------------------------
People who take cat naps don’t usually sleep in a cat’s cradle.
Q: What is the sound of one cat napping?
57 / 71
Word Prefix Search
SELECT line
FROM fortune
WHERE to_tsvector('english', line) @@
to_tsquery('english', 'zip:*')
ORDER BY 1;
line
------------------------------------------------------------------------
Bozo is the Brotherhood of Zips and Others. Bozos are people who band
… he’s the one who’s in trouble. One round from an Uzi can zip
far I’ve got two Bics, four Zippos and eighteen books of matches."
Postmen never die, they just lose their zip.
58 / 71
Word Prefix Search
59 / 71
8.3. Adjacent Letter Search
60 / 71
Adjacent Letter Search
61 / 71
Indexed Adjacent Letters
62 / 71
Indexed Adjacent Letters
SELECT line
FROM fortune
WHERE line ILIKE '%verit%'
ORDER BY 1;
line
-------------------------------------------------------------------------
body. There hangs from his belt a veritable arsenal of deadly weapons:
In wine there is truth (In vino veritas).
Passes wind, water, or out depending upon the severity of the
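Unanchored pattern matches such as ILIKE '%verit%' can be indexed with trigrams from the pg_trgm extension. A minimal sketch, assuming the extension is installed and may be created by the current user, a single-column fortune table, and the hypothetical index name fortune_idx_trgm:

```sql
-- pg_trgm ships with the PostgreSQL contrib modules.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE TABLE IF NOT EXISTS fortune (line TEXT);

INSERT INTO fortune VALUES
    ('In wine there is truth (In vino veritas).');

-- gin_trgm_ops indexes every three-letter sequence in the column.
CREATE INDEX IF NOT EXISTS fortune_idx_trgm
ON fortune USING GIN (line gin_trgm_ops);

SELECT line
FROM fortune
WHERE line ILIKE '%verit%'
ORDER BY 1;
```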
63 / 71
Indexed Adjacent Letters
64 / 71
Word Prefix Search
65 / 71
Word Prefix Search
66 / 71
Similarity
SELECT show_limit();
show_limit
------------
0.3
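The limit above is the threshold used by the pg_trgm similarity operator %. A minimal sketch of the related calls, assuming the pg_trgm extension is available; the two words compared are arbitrary examples:

```sql
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- similarity() returns a value from 0 (no shared trigrams) to 1.
SELECT similarity('veritas', 'veritable');

-- % is true when similarity() exceeds the current limit.
SELECT 'veritas' % 'veritable' AS is_similar;

-- show_limit() reports the threshold used by %.
SELECT show_limit();
```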
67 / 71
Similarity
68 / 71
Indexes Created in this Section
\dt+ fortune
List of relations
Schema | Name | Type | Owner | Size | Description
--------+---------+-------+----------+---------+-------------
public | fortune | table | postgres | 4024 kB |
69 / 71
Use of the Contains Operator @> in this Presentation
\do @>
List of operators
Schema | Name | Left arg type | Right arg type | Result type | Description
------------+------+---------------+----------------+-------------+-------------
pg_catalog | @> | aclitem[] | aclitem | boolean | contains
pg_catalog | @> | anyarray | anyarray | boolean | contains
pg_catalog | @> | anyrange | anyelement | boolean | contains
pg_catalog | @> | anyrange | anyrange | boolean | contains
pg_catalog | @> | box | box | boolean | contains
pg_catalog | @> | box | point | boolean | contains
pg_catalog | @> | circle | circle | boolean | contains
pg_catalog | @> | circle | point | boolean | contains
pg_catalog | @> | jsonb | jsonb | boolean | contains
pg_catalog | @> | path | point | boolean | contains
pg_catalog | @> | polygon | point | boolean | contains
pg_catalog | @> | polygon | polygon | boolean | contains
pg_catalog | @> | tsquery | tsquery | boolean | contains
70 / 71
Conclusion
https://momjian.us/presentations https://www.flickr.com/photos/patgaines/
71 / 71