
DuckDB: an Embeddable Analytical Database

Mark Raasveldt ([email protected]), CWI, Amsterdam
Hannes Mühleisen ([email protected]), CWI, Amsterdam
ABSTRACT

The great popularity of SQLite shows that there is a need for unobtrusive in-process data management solutions. However, there is no such system yet geared towards analytical workloads. We demonstrate DuckDB, a novel data management system designed to execute analytical SQL queries while embedded in another process. In our demonstration, we pit DuckDB against other data management solutions to showcase its performance in the embedded analytics scenario. DuckDB is available as Open Source software under a permissive license.

ACM Reference Format:
Mark Raasveldt and Hannes Mühleisen. 2019. DuckDB: an Embeddable Analytical Database. In 2019 International Conference on Management of Data (SIGMOD ’19), June 30-July 5, 2019, Amsterdam, Netherlands. ACM, New York, NY, USA, 4 pages. https://doi.org/10.1145/3299869.3320212

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
SIGMOD ’19, June 30-July 5, 2019, Amsterdam, Netherlands
© 2019 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-5643-5/19/06.
https://doi.org/10.1145/3299869.3320212

[Figure 1: Systems Landscape. Data management systems arranged along two axes, embedded vs. stand-alone and OLTP vs. OLAP; the embedded/OLAP quadrant is marked with a question mark.]

1 INTRODUCTION

Data management systems have evolved into large monolithic database servers running as stand-alone processes. This is partly a result of the need to serve requests from many clients simultaneously, and partly due to data integrity requirements. While powerful, stand-alone systems require considerable effort to set up properly, and data access is constricted by their client protocols [12]. There exists a completely separate use case for data management systems: those that are embedded into other processes, where the database system is a linked library that runs completely within a “host” process. The most well-known representative of this group is SQLite, the most widely deployed SQL database engine, with more than a trillion databases in active use [4]. SQLite strongly focuses on transactional (OLTP) workloads and contains a row-major execution engine operating on a B-Tree storage format [3]. As a consequence, SQLite’s performance on analytical (OLAP) workloads is very poor.

There is a clear need for embeddable analytical data management. This need stems from two main sources: interactive data analysis and “edge” computing. Interactive data analysis is performed using tools such as R or Python. The basic data management operators available in these environments through extensions (dplyr [14], Pandas [6], etc.) closely resemble stacked relational operators, much like in SQL queries, but lack full-query optimization and transactional storage. Embedded analytical data management is also desirable for edge computing scenarios. For example, connected power meters currently forward data to a central location for analysis. This is problematic due to bandwidth limitations, especially on radio interfaces, and also raises privacy concerns. An embeddable analytical database is well-equipped to support this use case, with data analyzed on the edge node. The two use cases of interactive analysis and edge computing appear orthogonal, yet surprisingly they yield similar requirements. For example, in both use cases portability and resource requirements are critical, and systems that are careful with both will do well in both usage scenarios.

In our previous research, we have developed MonetDBLite, an embedded analytical system derived from the MonetDB system [13]. MonetDBLite successfully proved that there is real interest in embedded analytics: it enjoys thousands of downloads per month and is used all around the world, from the Dutch central bank to the New Zealand police. However, its success also uncovered several issues that proved very complex to address in a non-purpose-built system.
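To make the in-process model concrete, the sketch below uses Python's built-in sqlite3 bindings (SQLite being the canonical embedded engine discussed above) to run an analytical aggregation entirely inside the host process, with no server and no client protocol. This is a generic illustration of embedded operation, not DuckDB's API; the table and values are invented for the example.

```python
# In-process data management: the engine is a linked library inside the
# host process. Toy "power meter" schema invented for this illustration.
import sqlite3

con = sqlite3.connect(":memory:")  # the database lives inside this process
con.execute("CREATE TABLE readings (meter_id INTEGER, kwh REAL)")
con.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(1, 0.5), (1, 0.7), (2, 1.2), (2, 0.9)],
)

# An analytical (OLAP-style) aggregation. SQLite answers it one row at a
# time over B-Tree storage, which is exactly what limits it on OLAP work.
rows = con.execute(
    "SELECT meter_id, SUM(kwh) FROM readings GROUP BY meter_id ORDER BY meter_id"
).fetchall()
print(rows)  # one (meter_id, total_kwh) tuple per meter
```

An embeddable analytical engine keeps this zero-setup, in-process interface, but replaces the row-at-a-time OLTP internals with components built for analytics.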
The following requirements for embedded analytical databases were identified:

• High efficiency for OLAP workloads, but without completely sacrificing OLTP performance. For example, concurrent data modification is a common use case in dashboard scenarios, where some threads update the data using OLTP queries while other threads simultaneously run the OLAP queries that drive visualizations.

• A high degree of stability: if the embedded database crashes, it takes the host down with it, and this must never happen. Queries need to be able to be aborted cleanly if they run out of resources, and the system needs to gracefully adapt to resource contention.

• Efficient transfer of tables to and from the database. Since both database and application run in the same process and thus address space, there is a unique opportunity for efficient data sharing which needs to be exploited.

• Practical “embeddability” and portability: the database needs to run in whatever environment the host does. Dependencies on external libraries (e.g. openssl) for either compile or run time have been found to be problematic. Signal handling, calls to exit() and modification of singular process state (locale, working directory, etc.) are forbidden.

In this demonstration, we present the capabilities of our new system, DuckDB. DuckDB is a new purpose-built embeddable relational database management system, available as Open-Source software under the permissive MIT license¹. To the best of our knowledge, there currently exists no purpose-built embeddable analytical database despite the clear need outlined above. DuckDB is no research prototype but built to be widely used, with millions of test queries run on each commit to ensure correct operation and completeness of the SQL interface. Our first-ever demonstration of DuckDB will pit it against other systems on a small device. We will allow viewers to increase the size of the dataset processed and observe various metrics, such as CPU load and memory pressure, as the dataset size changes. This will demonstrate the performance of DuckDB for embedded analytical data analysis.

¹ https://github.com/cwida/duckdb

2 DESIGN AND IMPLEMENTATION

DuckDB’s design decisions are informed by its intended use case: embedded analytics. Overall, we follow the “textbook” separation of components: parser, logical planner, optimizer, physical planner, execution engine. Orthogonal components are the transaction and storage managers. While DuckDB is first in a new class of data management systems, none of DuckDB’s components is revolutionary in its own regard. Instead, we combined methods and algorithms from the state of the art that were best suited for our use cases. Table 1 gives an overview.

    API                   C/C++/SQLite
    SQL Parser            libpg_query [2]
    Optimizer             Cost-Based [7, 9]
    Execution Engine      Vectorized [1]
    Concurrency Control   Serializable MVCC [10]
    Storage               DataBlocks [5]

    Table 1: DuckDB: Component Overview

Being an embedded database, DuckDB does not have a client protocol interface or a server process, but is instead accessed through a C/C++ API. In addition, DuckDB provides a SQLite compatibility layer, allowing applications that previously used SQLite to use DuckDB through re-linking or library overloading.

The SQL parser is derived from Postgres’ SQL parser, stripped down as much as possible [2]. This has the advantage of providing DuckDB with a full-featured and stable parser to handle one of the most volatile forms of its input, SQL queries. The parser takes a SQL query string as input and returns a parse tree of C structures. This parse tree is then immediately transformed into our own parse tree of C++ classes to limit the reach of Postgres’ data structures. This parse tree consists of statements (e.g. SELECT, INSERT) and expressions (e.g. SUM(a)+1).

The logical planner consists of two parts, the binder and the plan generator. The binder resolves all expressions referring to schema objects such as tables or views with their column names and types. The logical plan generator then transforms the parse tree into a tree of basic logical query operators such as scan, filter, project, etc. After the planning phase, we have a fully type-resolved logical query plan. DuckDB keeps statistics on the stored data, and these are propagated through the different expression trees as part of the planning process. These statistics are used in the optimizer itself, and are also used for integer overflow prevention by upgrading types when required.

DuckDB’s optimizer performs join order optimization using dynamic programming [7] with a greedy fallback for complex join graphs [11]. It flattens arbitrary subqueries as described in Neumann et al. [9]. In addition, a set of rewrite rules simplify the expression tree by performing e.g. common subexpression elimination and constant folding. Cardinality estimation is done using a combination of samples and HyperLogLog. The result of this process is the optimized logical plan for the query.
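As an illustration of the expression-rewriting step, the toy sketch below implements constant folding over a small tuple-based expression tree. DuckDB's actual rewrite rules operate on C++ expression classes; the tuple representation and the tiny operator set here are invented for the example.

```python
# Toy constant folding: an expression is either a literal, a column
# reference ("col", name), or an operator node (op, child, child).
# ("+", ("*", 2, 3), ("col", "a")) stands for 2 * 3 + a.

OPS = {"+": lambda x, y: x + y, "*": lambda x, y: x * y}

def fold_constants(expr):
    """Recursively replace operator nodes whose children are all literals."""
    if not isinstance(expr, tuple):
        return expr                      # literal constant: nothing to fold
    op, *args = expr
    if op == "col":
        return expr                      # column reference: cannot fold
    args = [fold_constants(a) for a in args]
    if op in OPS and all(isinstance(a, (int, float)) for a in args):
        return OPS[op](*args)            # all children constant: evaluate now
    return (op, *args)

# (2 * 3) + a folds to 6 + a; the column reference blocks further folding.
print(fold_constants(("+", ("*", 2, 3), ("col", "a"))))  # ('+', 6, ('col', 'a'))
```

Running such rules to a fixpoint before execution means the folded value is computed once at plan time rather than once per row.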
The physical planner transforms the logical plan into the physical plan, selecting suitable implementations where applicable. For example, a scan may decide to use an existing index instead of scanning the base tables based on selectivity estimates, or switch between a hash join and a merge join depending on the join predicates.

DuckDB uses a vectorized interpreted execution engine [1]. This approach was chosen over Just-in-Time compilation (JIT) of SQL queries [8] for portability reasons: JIT engines depend on massive compiler libraries (e.g. LLVM) with additional transitive dependencies. DuckDB uses vectors of a fixed maximum number of values (1024 by default). Fixed-length types such as integers are stored as native arrays. Variable-length values such as strings are represented as a native array of pointers into a separate string heap. NULL values are represented using a separate bit vector, which is only present if NULL values appear in the vector. This allows fast intersection of NULL vectors for binary vector operations and avoids redundant computation. To avoid excessive shifting of data within the vectors when e.g. the data is filtered, the vectors may have a selection vector, a list of offsets into the vector stating which indices of the vector are relevant [1]. DuckDB contains an extensive library of vector operations that support the relational operators; this library expands code for all supported data types using C++ code templates.

The execution engine executes the query in a so-called “Vector Volcano” model. Query execution commences by pulling the first “chunk” of data from the root node of the physical plan. A chunk is a horizontal subset of a result set, query intermediate, or base table. This node will recursively pull chunks from child nodes, eventually arriving at a scan operator which produces chunks by reading from the persistent tables. This continues until the chunk arriving at the root is empty, at which point the query is completed.

DuckDB provides ACID compliance through Multi-Version Concurrency Control (MVCC). We implement HyPer’s serializable variant of MVCC that is tailored specifically for hybrid OLAP/OLTP systems [10]. This variant updates data in place immediately and keeps previous states in a separate undo buffer for concurrent transactions and aborts. MVCC was chosen over simpler schemes such as Optimistic Concurrency Control because, even though DuckDB’s main use case is analytics, modifying tables in parallel has been an often-requested feature in the past.

For persistent storage, DuckDB uses the read-optimized DataBlocks storage layout. Logical tables are horizontally partitioned into chunks of columns, which are compressed into physical blocks using lightweight compression methods. Blocks carry min/max indexes for every column, allowing the system to quickly determine whether they are relevant to a query. In addition, blocks carry a lightweight index for every column, which allows restricting the amount of values scanned even further [5].

3 DEMONSTRATION SCENARIO

In our interactive demonstration scenarios, we would like to showcase two major advantages of DuckDB: the ability to process large data sets on restricted hardware resources, combined with the benefit of embedded operation.

[Figure 2: Demonstration setup schematic. A dial controls the dataset size; live metrics for SQLite, MonetDBLite, HyPer and DuckDB are shown on a screen.]

Our demonstration is set up on a table on which a screen, a large dial and four identical benchmark computers are laid out. Each computer runs a different DBMS: SQLite, MonetDBLite, HyPer and DuckDB. Each database is pre-loaded with the TPC-H benchmark tables, chosen because the audience is likely familiar with this schema. The computers are connected using Ethernet to a fifth “management” computer which can be used to configure a query that is run repeatedly on the four benchmark computers. The screen shows real-time metrics from all four benchmark computers, at least the query completion rate (QpS) and memory pressure. The dial on the table controls the amount of input data read for the currently configured query.

We propose two demonstration scenarios, a “teaser” and a “drilldown” scenario. For the “teaser” scenario, a suitable query is pre-configured on the benchmark computers. The audience will be invited to turn the physical dial to increase or decrease the amount of data that is read from the fact tables. This will immediately influence intermediate and result set sizes of the pre-configured query, which will in turn immediately impact the metrics shown on the screen. Figure 2 illustrates this setup.
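The performance differences this demonstration surfaces trace back to the execution models described in Section 2. As an illustration, the selection-vector technique behind DuckDB's vectorized engine can be sketched in a few lines of toy Python: a filter emits a list of surviving offsets instead of moving any data, and downstream operators compute only at those offsets. This is illustrative code, not DuckDB's templated C++ vector library; the data values are invented.

```python
# Selection vectors: filtering marks surviving row offsets instead of
# compacting the data vector, avoiding shifts and copies.

def filter_gt(vector, sel, threshold):
    """New selection vector with the offsets whose value exceeds the
    threshold; the underlying data vector is never moved or copied."""
    return [i for i in sel if vector[i] > threshold]

def vec_add(left, right, sel):
    """Add two vectors only at the selected offsets."""
    return [left[i] + right[i] for i in sel]

a = [10, 3, 25, 7, 40, 2]            # fixed-width data stays a flat array
b = [1, 1, 1, 1, 1, 1]
sel = list(range(len(a)))            # initially every index is selected

sel = filter_gt(a, sel, 5)           # WHERE a > 5 keeps offsets [0, 2, 3, 4]
print(sel)                           # [0, 2, 3, 4]
print(vec_add(a, b, sel))            # a + b for surviving rows: [11, 26, 8, 41]
```

Because one interpretation decision (which operator, which type) is amortized over a whole vector of values, the interpreter overhead that dominates row-at-a-time engines largely disappears.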
While for very small data sets all systems will show comparable behavior, only DuckDB will be able to continue functioning for larger ones. SQLite will begin to suffer from its row-based execution model, and MonetDBLite begins to suffer from excessive intermediate result materialization due to its bulk processing model. While HyPer is extremely fast in processing queries, it will not be able to transfer result sets as quickly as DuckDB using its socket client protocol [12].

For the “drilldown” scenario, we invite the audience to propose their own query to be configured into the benchmark computers. This allows direct appraisal of DuckDB’s performance, without the demonstration authors being able to cherry-pick queries where DuckDB excels. Again, the audience member who proposed the query will then be able to turn the dial to increase the amount of data read by the query and observe the impact on the four systems in real time.

4 CURRENT STATE AND NEXT STEPS

As of this writing, DuckDB runs all TPC-H queries and all but two TPC-DS queries. We expect complete TPC-DS coverage by the time the demonstration is presented. DuckDB also already completes most of SQLite’s SQL logic test suite, which contains millions of queries. Immediate next steps for DuckDB are completion of the DataBlocks storage scheme and of subquery folding, both currently in branches. A buffer manager is also not yet implemented, but will be. DuckDB already supports inter-query parallelism; intra-query parallelism will be added as well. We plan to implement a work-stealing scheduler to balance resources between short- and long-running queries. A special consideration is also to allow balancing resource usage with the host application, a particular issue for embedded operation. As with MonetDBLite, we will implement the database APIs of R, Python, etc.

A more advanced future direction is self-checking. We have learned to distrust the hardware the database is running on. This is particularly relevant in the edge computing use case, where hardware failures are expected to be commonplace. One approach is to keep checksums on all persistent and intermediate data and piggy-back checksum verification on scan operators. This might be possible without a significant performance impact. A vectorized engine is particularly suited for this, since a chunk of data typically fits in the CPU cache and additional passes do not require RAM access. Another approach to increasing trust in the hardware is inspired by video game developers, who periodically run sanity-check computations to ensure correct operation of CPU and RAM.

Acknowledgements

We would like to thank all past, current and future contributors to DuckDB at the CWI Database Architectures Group and elsewhere. We are also particularly indebted to the TUM database group for their papers on query optimization, window functions, storage and concurrency control that we used to implement DuckDB.

REFERENCES
[1] Peter A. Boncz, Marcin Zukowski, and Niels Nes. 2005. MonetDB/X100: Hyper-Pipelining Query Execution. In CIDR 2005, Second Biennial Conference on Innovative Data Systems Research, Asilomar, CA, USA, January 4-7, 2005. 225–237. https://cidrdb.org/cidr2005/papers/P19.pdf
[2] Lukas Fittl. 2019. C library for accessing the PostgreSQL parser outside of the server environment. https://github.com/lfittl/libpg_query.
[3] Richard Hipp. 2019. Database File Format. https://www.sqlite.org/fileformat.html.
[4] Richard Hipp. 2019. Most Widely Deployed and Used Database Engine. https://www.sqlite.org/mostdeployed.html.
[5] Harald Lang, Tobias Mühlbauer, Florian Funke, et al. 2016. Data Blocks: Hybrid OLTP and OLAP on Compressed Storage using both Vectorization and Compilation. In Proceedings of the 2016 International Conference on Management of Data, SIGMOD Conference 2016, San Francisco, CA, USA, June 26 - July 01, 2016. 311–326. https://doi.org/10.1145/2882903.2882925
[6] Wes McKinney. 2010. Data Structures for Statistical Computing in Python. In Proceedings of the 9th Python in Science Conference, Stéfan van der Walt and Jarrod Millman (Eds.). 51–56.
[7] Guido Moerkotte and Thomas Neumann. 2008. Dynamic programming strikes back. In Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2008, Vancouver, BC, Canada, June 10-12, 2008. 539–552. https://doi.org/10.1145/1376616.1376672
[8] Thomas Neumann. 2011. Efficiently Compiling Efficient Query Plans for Modern Hardware. PVLDB 4, 9 (2011), 539–550. https://doi.org/10.14778/2002938.2002940
[9] Thomas Neumann and Alfons Kemper. 2015. Unnesting Arbitrary Queries. In Datenbanksysteme für Business, Technologie und Web (BTW), 16. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS), 4.-6.3.2015 in Hamburg, Germany. Proceedings. 383–402. https://dl.gi.de/20.500.12116/2418
[10] Thomas Neumann, Tobias Mühlbauer, and Alfons Kemper. 2015. Fast Serializable Multi-Version Concurrency Control for Main-Memory Database Systems. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31 - June 4, 2015. 677–689. https://doi.org/10.1145/2723372.2749436
[11] Thomas Neumann and Bernhard Radke. 2018. Adaptive Optimization of Very Large Join Queries. In Proceedings of the 2018 International Conference on Management of Data (SIGMOD ’18). ACM, New York, NY, USA, 677–692. https://doi.org/10.1145/3183713.3183733
[12] Mark Raasveldt and Hannes Mühleisen. 2017. Don’t Hold My Data Hostage - A Case For Client Protocol Redesign. PVLDB 10, 10 (2017), 1022–1033. https://doi.org/10.14778/3115404.3115408
[13] Mark Raasveldt and Hannes Mühleisen. 2018. MonetDBLite: An Embedded Analytical Database. CoRR abs/1805.08520 (2018). arXiv:1805.08520 https://arxiv.org/abs/1805.08520
[14] Hadley Wickham, Romain François, Lionel Henry, and Kirill Müller. 2018. dplyr: A Grammar of Data Manipulation. https://CRAN.R-project.org/package=dplyr. R package version 0.7.8.
