JSON Extension - DuckDB
JSON Extension
The json extension is a loadable extension that implements SQL functions that are useful for reading values from existing JSON, and creating new JSON data.
Installing and Loading
The json extension is shipped by default in DuckDB builds; otherwise, it will be transparently autoloaded on first use. If you would like to install and load it manually, run:
INSTALL json;
LOAD json;
SELECT *
FROM read_json('todos.json',
               format = 'array',
               columns = {userId: '
                          id: 'UBIG
                          title: 'V
                          completed
Insert JSON data into the table:
INSERT INTO example VALUES
    ('{ "family": "anatidae", "spec
"anatidae"
Extract the family key's value with a JSONPath expression:
SELECT j->'$.family' FROM example;
"anatidae"
SELECT j->>'$.family' FROM example;
anatidae
JSON Type
The json extension makes use of the JSON logical type. The JSON logical type is interpreted as JSON, i.e., parsed, in JSON functions rather than interpreted as VARCHAR, i.e., a regular string (modulo the equality-comparison caveat at the bottom of this page). All JSON creation functions return values of this type.
SELECT '{"duck": 42}'::JSON::STRUCT
{"duck":42}
SELECT '2023-05-12'::DATE::JSON;
"2023-05-12"
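As a minimal sketch of how the JSON logical type interacts with casts, the statements below cast a string literal to JSON, inspect the resulting type with typeof, and cast JSON back to a STRUCT. The field type INTEGER in the STRUCT cast is an assumption chosen for illustration.

```sql
-- Cast a VARCHAR literal to the JSON logical type and report the type name.
SELECT typeof('{"duck": 42}'::JSON) AS type_name;

-- Cast JSON to a STRUCT; the field type INTEGER is an assumption for this sketch.
SELECT '{"duck": 42}'::JSON::STRUCT(duck INTEGER) AS as_struct;

-- Cast a non-string SQL value (a DATE) to JSON, as in the example above.
SELECT '2023-05-12'::DATE::JSON AS as_json;
```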
Function | Description
read_ndjson_objects(filename) | Alias for read_json_objects with … set to 'newline_delimited'.
read_json_objects_auto(filename) | Alias for read_json_objects with … set to …
Name | Description
filename | Whether or not an … filename column should be included in the result.
format | Can be one of ['…, 'unstructured', 'newline_delimited', 'array'].
hive_partitioning | Whether or not to interpret the path … Hive partitioned p…
ignore_errors | Whether to ignore … errors (only possible when format is 'newline_delimited').
The format parameter specifies how to read the JSON from a file. With 'unstructured', the top-level JSON is read, e.g.:
{
    "duck": 42
}
{
    "goose": [1, 2, 3]
}
will result in two objects being read.
will also result in two objects being read.
[
    {
        "duck": 42
    },
    {
        "goose": [1, 2, 3]
    }
]
{"duck":42,"goose":[1,2,3]}
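The effect of the format parameter can be sketched with two small files. The file names below are hypothetical, and the files are written with COPY first so that the sketch is self-contained; this is an illustration under those assumptions rather than a reference usage.

```sql
-- Write two small files (hypothetical names): one newline-delimited, one JSON array.
COPY (SELECT 42 AS duck UNION ALL SELECT 43) TO 'format_sketch.ndjson' (FORMAT JSON);
COPY (SELECT 42 AS duck UNION ALL SELECT 43) TO 'format_sketch_array.json' (FORMAT JSON, ARRAY true);

-- Newline-delimited input: each line holds one JSON object.
SELECT * FROM read_json_objects('format_sketch.ndjson', format = 'newline_delimited');

-- Array input: a single top-level array whose elements are the objects.
SELECT * FROM read_json_objects('format_sketch_array.json', format = 'array');
```

Both queries should yield one row per object; only the file layout differs.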
SELECT * FROM read_json_objects(['m
{"duck":42,"goose":[1,2,3]}
{"duck":43,"goose":[4,5,6],"swan":3
SELECT * FROM read_ndjson_objects('
{"duck":42,"goose":[1,2,3]}
{"duck":43,"goose":[4,5,6],"swan":3
DuckDB also supports reading JSON as a table, using the following functions:
Function | Description
Besides the maximum_object_size, format, ignore_errors and compression, these functions have additional parameters:
Name | Description | Type
auto_detect | Whether to auto-detect the names of the keys and data types of the values automatically | BOOL
columns | A struct that specifies the key names and value types contained within the JSON file (e.g., {key1: 'INTEGER', key2: 'VARCHAR'}). If auto_detect is enabled these will be inferred | STRUCT
dateformat | Specifies the date format to use when parsing dates. See Date Format | VARCHAR
maximum_depth | Maximum nesting depth to which the automatic schema detection detects types. Set to -1 to fully detect nested JSON types | BIGINT
sample_size | Option to define number of sample objects for automatic JSON type detection. Set to -1 to scan the entire input file | UBIGINT
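The parameters above can be combined as in the following sketch, which either declares the schema explicitly through columns or leaves it to auto-detection. The file name and the chosen column types are assumptions for illustration.

```sql
-- Write a small newline-delimited file (hypothetical name) so the sketch is self-contained.
COPY (SELECT 42 AS duck, 4.2 AS goose UNION ALL SELECT 43, 4.3)
    TO 'params_sketch.json' (FORMAT JSON);

-- Explicit schema: key names and value types are given in the columns struct.
SELECT *
FROM read_json('params_sketch.json',
               format = 'newline_delimited',
               columns = {duck: 'INTEGER', goose: 'DOUBLE'});

-- Automatic schema detection: scan the entire file and detect nested types fully.
SELECT *
FROM read_json('params_sketch.json',
               auto_detect = true,
               sample_size = -1,
               maximum_depth = -1);
```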
SELECT * FROM read_json('my_file1.j
duck
42
43    [4, 5, 6]    3.3
[1, 2, 3]    42
[4, 5, 6]    43
[
    {
        "duck": 42,
        "goose": 4.2
    },
    {
        "duck": 43,
        "goose": 4.3
    }
]
Can be queried exactly the same as a JSON file that contains 'unstructured' JSON, e.g.:
{
    "duck": 42,
    "goose": 4.2
}
{
    "duck": 43,
    "goose": 4.3
}
Both can be read as the table:
duck    goose
42      4.2
43      4.3
If your JSON file does not contain 'records', i.e., it contains any type of JSON other than objects, DuckDB can still read it. This is specified with the records parameter. The records parameter specifies whether the JSON contains records that should be unpacked into individual columns, i.e., reading the following file with records:
{"duck": 42, "goose": [1, 2, 3]}
{"duck": 43, "goose": [4, 5, 6]}
Results in two columns:
duck    goose
42      [1,2,3]
43      [4,5,6]
You can read the same file with records set to 'false', to get a single column, which is a STRUCT containing the data:
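A minimal sketch of the records parameter, assuming a hypothetical file name and writing the file first so the statements are self-contained:

```sql
-- Write the two records to a newline-delimited file (hypothetical name).
COPY (SELECT 42 AS duck, [1, 2, 3] AS goose
      UNION ALL
      SELECT 43, [4, 5, 6])
    TO 'records_sketch.json' (FORMAT JSON);

-- Default behavior for objects: keys are unpacked into the columns duck and goose.
SELECT * FROM read_json('records_sketch.json');

-- records set to 'false': each object is returned as one STRUCT in a single column.
SELECT * FROM read_json('records_sketch.json', records = 'false');
```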
For additional examples reading more complex data, please see the Shredding Deeply Nested JSON, One Vector at a Time blog post.
JSON Import/Export
When the json extension is installed, FORMAT JSON is supported for COPY FROM, COPY TO, EXPORT DATABASE and IMPORT DATABASE. See Copy and Import/Export.
By default, COPY expects newline-delimited JSON. If you prefer copying data to/from a JSON array, you can specify ARRAY true, e.g.,
COPY (SELECT * FROM range(5)) TO 'm
will create the following file:
[
{"range":0},
{"range":1},
{"range":2},
{"range":3},
{"range":4}
]
This can be read like so:
CREATE TABLE test (range BIGINT);
COPY test FROM 'my.json' (ARRAY tru
The format can also be detected automatically, like so:
COPY test FROM 'my.json' (AUTO_DETE
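The export and import steps can be sketched end to end as follows. The file and table names are hypothetical, and FORMAT JSON is spelled out explicitly rather than relying on the file extension.

```sql
-- Export five rows as a single JSON array (hypothetical file name).
COPY (SELECT * FROM range(5)) TO 'copy_sketch.json' (FORMAT JSON, ARRAY true);

-- Target table with one BIGINT column named range, matching the exported key.
CREATE OR REPLACE TABLE copy_sketch (range BIGINT);

-- Import by stating the array layout explicitly ...
COPY copy_sketch FROM 'copy_sketch.json' (FORMAT JSON, ARRAY true);

-- ... or by letting the layout be detected automatically.
COPY copy_sketch FROM 'copy_sketch.json' (FORMAT JSON, AUTO_DETECT true);

-- The table now holds the rows from both imports.
SELECT count(*) AS imported_rows FROM copy_sketch;
```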
Function | Description
array_to_json(list) | Alias for to_json that only accepts LIST.
row_to_json(list) | Alias for to_json that only accepts STRUCT.
json_array([any, ...]) | Create a JSON array from any number of values.
json_object([key, value, ...]) | Create a JSON object from any number of key, value pairs.
json_merge_patch(json, json) | Merge two JSON documents together.
"duck"
SELECT to_json([1, 2, 3]);
[1,2,3]
{"duck":42}
SELECT to_json(map(['duck'],[42]));
{"duck":42}
SELECT json_array(42, 'duck', NULL)
{"duck":42}
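The creation functions listed above can be tried together in a short sketch. The literal values are arbitrary and chosen only for illustration.

```sql
-- Build JSON from a STRUCT value.
SELECT to_json({duck: 42}) AS from_struct;

-- Build a JSON object from key, value pairs.
SELECT json_object('duck', 42) AS from_pairs;

-- Build a JSON array from any number of values, including NULL.
SELECT json_array(42, 'duck', NULL) AS from_values;

-- Merge two JSON documents together.
SELECT json_merge_patch('{"duck": 42}', '{"goose": 123}') AS merged;
```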
There are two extraction functions, which have their respective operators. The operators can only be used if the string is stored as the JSON logical type. These functions support the same two location notations as the previous functions.
Function    Alias
Note that the equality comparison operator (=) has a higher precedence than the -> JSON extract operator. Therefore, surround the uses of the -> operator with parentheses when making equality comparisons. For example:
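A minimal sketch of the parenthesization described above, using an arbitrary literal document:

```sql
-- Parenthesize the extraction so that it is evaluated before the comparison.
SELECT (('{"field": 42}'::JSON)->'field') = 42 AS is_match;
```

Without the outer parentheses, the comparison 'field' = 42 would bind first, which is not the intended expression.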
Warning
DuckDB's JSON data type uses 0-based indexing.
Examples:
SELECT json_extract(j, '$.family')
"anatidae"
SELECT j->'$.family' FROM example;
"duck"
SELECT j->'$.species[*]' FROM examp
["duck", "goose", "swan", null]
[duck, goose, swan, null]
SELECT j->'$.species'->0 FROM examp
"duck"
["duck", "goose"]
anatidae
SELECT j->>'$.family' FROM example;
SELECT j->>'$.species[0]' FROM exam
duck
SELECT j->'species'->>0 FROM exampl
duck
[duck, goose]
SELECT
    json_extract(j, 'family') AS fa
    json_extract(j, 'species') AS s
FROM example;
family        species
"anatidae"    ["duck","goose","swan",n
The following produces the same result but is faster and more memory-efficient:
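One way to extract several paths in a single call is to pass a list of paths to json_extract, which returns a list of results. The sketch below assumes a table shaped like the example table; the table name extract_sketch and the exact row contents are assumptions reconstructed from the outputs shown on this page.

```sql
-- Hypothetical table mirroring the example data used on this page.
CREATE OR REPLACE TABLE extract_sketch (j JSON);
INSERT INTO extract_sketch VALUES
    ('{ "family": "anatidae", "species": ["duck", "goose", "swan", null] }');

-- Extract both paths at once, then pick the results out of the returned list.
-- Note that lists use 1-based indexing, unlike the JSON data type.
WITH extracted AS (
    SELECT json_extract(j, ['family', 'species']) AS extracted_list
    FROM extract_sketch
)
SELECT
    extracted_list[1] AS family,
    extracted_list[2] AS species
FROM extracted;
```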
JSON Scalar Functions
The following scalar JSON functions can be used to gain information about the stored JSON values. With the exception of json_valid(json), all JSON functions produce an error when invalid JSON is supplied.
We support two kinds of notations to describe locations within JSON: JSON Pointer and JSONPath.
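The two notations can be compared side by side in a small sketch. The document literal is arbitrary; both expressions are intended to address the first element of the duck array.

```sql
SELECT
    -- JSON Pointer: segments separated by '/', array positions given as plain numbers.
    json_extract('{"duck": [1, 2, 3]}', '/duck/0') AS via_json_pointer,
    -- JSONPath: starts at the root '$', with '.key' and '[index]' accessors.
    json_extract('{"duck": [1, 2, 3]}', '$.duck[0]') AS via_jsonpath;
```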
Function | Description
json_array_length(json[, path]) | Return the number of elements in the JSON array json, or 0 if it is not a JSON array. If path is specified, return the number of elements in the JSON array at the given path. If path is a LIST, the result will be a LIST of array lengths.
1
the same example, we can do the following:
1
SELECT json_extract('{"duck": [1, 2
3
SELECT json_extract('{"duck.goose":
2
Examples using the anatidae biological family:
CREATE TABLE example (j JSON);
INSERT INTO example VALUES
    ('{ "family": "anatidae", "spec
{"family":"anatidae","species":["du
"anatidae"
SELECT j.species[0] FROM example;
"duck"
true
SELECT json_valid('{');
false
SELECT json_array_length('["duck",
4
4
SELECT json_array_length(j, '/speci
4
4
SELECT json_array_length(j, ['$.spe
SELECT json_type(j) FROM example;
OBJECT
SELECT json_structure(j) FROM examp
{"family":"VARCHAR","species":["VAR
["JSON"]
SELECT json_contains('{"key": "valu
true
true
true
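A compact sketch of several scalar functions applied to literal documents, so that it can be run without the example table. The literals are arbitrary and chosen only for illustration.

```sql
-- json_valid is the one function that does not raise an error on invalid input.
SELECT json_valid('{"duck": 42}') AS valid_doc,
       json_valid('{') AS invalid_doc;

-- Number of elements in a top-level JSON array.
SELECT json_array_length('["duck", "goose", "swan", null]') AS n_elements;

-- Type of the top-level value and of the value at a given path.
SELECT json_type('{"duck": [1, 2, 3]}') AS top_level_type,
       json_type('{"duck": [1, 2, 3]}', '$.duck') AS nested_type;

-- Structure (keys and inferred value types) of a document.
SELECT json_structure('{"duck": [1, 2, 3]}') AS structure;
```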
Function | Description
… of all json … aggregatio…
Examples:
CREATE TABLE example1 (k VARCHAR, v
INSERT INTO example1 VALUES ('duck'
SELECT json_group_array(v) FROM exa
[42, 7]
SELECT json_group_object(k, v) FROM
SELECT json_group_structure(j) FROM
{"family":"VARCHAR","species":["VAR
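The aggregate functions can be sketched on two small tables. The table names, the column type INTEGER, and the second key 'goose' are assumptions for illustration; only the key 'duck' and the values 42 and 7 appear in the examples above.

```sql
-- Key/value table (hypothetical name and contents).
CREATE OR REPLACE TABLE agg_sketch (k VARCHAR, v INTEGER);
INSERT INTO agg_sketch VALUES ('duck', 42), ('goose', 7);

-- Collect all values into one JSON array.
SELECT json_group_array(v) AS all_values FROM agg_sketch;

-- Collect all key, value pairs into one JSON object.
SELECT json_group_object(k, v) AS all_pairs FROM agg_sketch;

-- Combine the structures of several JSON documents into one structure.
CREATE OR REPLACE TABLE agg_docs_sketch (j JSON);
INSERT INTO agg_docs_sketch VALUES
    ('{"duck": 42}'),
    ('{"duck": 43, "goose": [1, 2, 3]}');
SELECT json_group_structure(j) AS combined_structure FROM agg_docs_sketch;
```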
Function | Description
json_transform(json, structure) | Transform json according to the specified structure.
from_json(json, structure) | Alias for json_transform.
json_transform_strict(json, structure) | Same as json_transform, but throws an error when type casting fails.
from_json_strict(json, structure) | Alias for json_transform_strict.
Examples:
CREATE TABLE example (j JSON);
INSERT INTO example VALUES
    ('{"family": "anatidae", "speci
    ('{"family": "canidae", "specie
{'family': anatidae, 'coolness': 42
{'family': canidae, 'coolness': NUL
SELECT json_transform(j, '{"family"
SELECT json_transform_strict(j, '{"
Invalid Input Error: Failed to cast
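A minimal sketch of json_transform, assuming a hypothetical table with two simple documents. The target types VARCHAR and DOUBLE are assumptions chosen for illustration.

```sql
-- Hypothetical table with one complete and one partially NULL document.
CREATE OR REPLACE TABLE transform_sketch (j JSON);
INSERT INTO transform_sketch VALUES
    ('{"family": "anatidae", "coolness": 42.42}'),
    ('{"family": "canidae", "coolness": null}');

-- Turn each document into a STRUCT with the requested field types.
SELECT json_transform(j, '{"family": "VARCHAR", "coolness": "DOUBLE"}') AS as_struct
FROM transform_sketch;

-- from_json is an alias and accepts the same arguments.
SELECT from_json(j, '{"family": "VARCHAR", "coolness": "DOUBLE"}') AS as_struct
FROM transform_sketch;
```

The strict variants take the same arguments but raise an error when a value cannot be cast to the requested type.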
Serializing and Deserializing SQL to JSON and Vice Versa
The json extension also provides functions to serialize and deserialize SELECT statements between SQL and JSON, as well as executing JSON serialized statements.
Function
json_deserialize_sql(json)
json_execute_serialized_sql(varchar)
json_serialize_sql(varchar, skip_empty := boolean, skip_null := boolean, format := boolean)
PRAGMA json_execute_serialized_sql(varchar)
SELECT json_serialize_sql('SELECT 2
'{"error":false,"statements":[{"nod
SELECT json_serialize_sql('SELECT 1
'SELECT (1 + 2)'
Example with deserialize and syntax sugar:
'SELECT (1 + 2) FROM x'
Example with execute:
SELECT * FROM json_execute_serializ
3
SELECT * FROM json_execute_serializ
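The three functions can be chained in a short round trip. This is a sketch using an arbitrary statement, SELECT 1 + 2, consistent with the outputs shown above.

```sql
-- Serialize a SELECT statement to JSON, dropping NULL and empty fields and pretty-printing.
SELECT json_serialize_sql('SELECT 1 + 2',
                          skip_null := true,
                          skip_empty := true,
                          format := true) AS serialized;

-- Serialize and immediately deserialize back to SQL text.
SELECT json_deserialize_sql(json_serialize_sql('SELECT 1 + 2')) AS round_trip;

-- Execute the serialized statement directly.
SELECT * FROM json_execute_serialized_sql(json_serialize_sql('SELECT 1 + 2'));
```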
Indexing
Warning
Equality Comparison
Warning
Currently, equality comparison of JSON files can differ based on the context. In some cases, it is based on raw text comparison, while in other cases, it uses logical content comparison.
SELECT
    a != b, -- Space is part of phy
    c != d, -- Same.
    c[0] = d[0], -- Equality becaus
    a = c[0], -- Indeed, field is e
    b != c[0], -- ... but different
FROM (
    SELECT
        '[]'::JSON AS a,
        '[ ]'::JSON AS b,
        '[[]]'::JSON AS c,
        '[[ ]]'::JSON AS d
);
(a != b)    (c != d)    (c[0] = d[0])    (a = c[0])    (b != c[0])
true        true        true             true          true
Last modified: 2024-08-02