Benchmarking a Sensor Billing Application Using GridDB and MariaDB
October 3, 2018
Revision 1.0
Table of Contents
Executive Summary
Introduction
Environment
Azure Configuration
Software Version and Configuration
Test Methodology
Test Design
GridDB Schema
SQL Schema
Methodology
Benchmark Results
Ingest
Extract
Resource Usage
Load Average
Memory Usage
Tabular Results
Conclusion
Appendices
SQL Schema
GridDB Schema
Executive Summary
A common use case for GridDB is using sensor data to perform billing functions. This
whitepaper builds a sample IoT billing application benchmark that can be used to compare the
performance of GridDB against a relational database management system (RDBMS). MariaDB
was chosen as the RDBMS due to its popularity and free and open source nature.
Relational databases have been the traditional choice for building such applications, but as
the results demonstrate, MariaDB’s performance suffers as the number of records grows. One
reason is the data model: rather than storing all sensor read records in one table, GridDB uses
containers. Each GridDB container holds data for only one device, which allows for
faster and more consistent query times as the number of devices increases.
Introduction
NoSQL databases were developed to overcome the performance and scalability issues that
organizations faced as their datasets grew into the category of “big data”. It is exceedingly
difficult for a typical relational database management system (RDBMS) to scale out, that is, to
add computational nodes to increase performance. Instead, such systems require their
administrators to scale up by adding more CPU cores and memory to their systems.
GridDB is developed by Toshiba Digital Solutions Corporation and can be used as an in-memory
database, or as a hybrid composition. GridDB has many features, including a unique
Key-Container model which may utilize any-key Collections or specialized TimeSeries
containers.
MariaDB is a free and open source relational SQL database that is a derivative of MySQL. MySQL
is the world’s most popular [1] open source database, and since its introduction MariaDB has
supplanted MySQL in many Linux distributions.
[1] https://db-engines.com/en/ranking
Environment
Azure Configuration
The benchmarks were performed using Microsoft Azure virtual machines in the WestUS region.
The database server was a single B2ms instance; the clients were a mix of three B2ms and two A2
instances.
The B2ms instances have a dual-core Intel(R) Xeon(R) E5-2673 v3 CPU running at 2.40GHz and 4GB of
memory, while the A2 instances have a dual-core Intel(R) Xeon(R) E5-2660 CPU and 3.5GB of
memory.
All instances are based on the RogueWave Software CentOS 7.5 image using Java 1.8.0 update
181.
GridDB 4.0.0 Community Edition was installed on both Azure servers using RPM packages
downloaded from https://griddb.net. MariaDB version 5.5.56 was installed on the database
server via CentOS’s default YUM repositories. MySQL Connector 8.0.11 was used to create a
JDBC connection to MariaDB.
GridDB was configured with a storeMemoryLimit of 1024MB (the amount of memory it will
use for cached data) and a concurrency level of 4. MariaDB used the default configuration
except that max_connections was increased to 320.
Test Methodology
Test Design
The goal of this testing is to fairly compare GridDB and a relational database in a “real world
use case” with two phases: in the first, sensor data is collected from IoT devices or other
sources, and in the second, an aggregation function is used to provide a billing amount.
The application has been separated into two test cases: the first test case is a load or ingestion
test that shows the performance of updating and writing new records to the database; the
second test case is an extraction or aggregation test, which performs an aggregation on a subset
of the data generated in the load phase.
The Ingest test performs the ingest operations as quickly as possible where each thread is
responsible for one device. Each insert requires one update to METERS and one write to
METER_READS. The timestamp for each new record is incremented by one hour. One year of
data (8760 records per device) is inserted in the ingestion phase and then one month (744
records) of data is aggregated for each device in the extraction test.
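The record counts above follow directly from the one-hour timestamp step. The following is a minimal sketch of that arithmetic only; the class and method names are hypothetical and are not taken from the benchmark source, and the year 2018 is assumed purely for illustration (any non-leap year gives the same count).

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.YearMonth;

// Hypothetical helper, not part of the benchmark application: it only
// reproduces the record counts quoted in the text.
final class ReadingCounts {
    // Hourly readings in one calendar year (8760 for a non-leap year).
    static long hourlyReadingsInYear(int year) {
        LocalDateTime start = LocalDateTime.of(year, 1, 1, 0, 0);
        return Duration.between(start, start.plusYears(1)).toHours();
    }

    // Hourly readings in one calendar month (744 for a 31-day month).
    static long hourlyReadingsInMonth(int year, int month) {
        return YearMonth.of(year, month).lengthOfMonth() * 24L;
    }

    public static void main(String[] args) {
        System.out.println(hourlyReadingsInYear(2018));     // prints 8760
        System.out.println(hourlyReadingsInMonth(2018, 1)); // prints 744
    }
}
```

A 31-day month is the largest billing window, so 744 records is the upper bound on the rows aggregated per device.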
The benchmark application was implemented in Java and based on GridDB user feedback. We
elected to ingest one year of data while performing monthly billing calculations.
GridDB Schema
SQL Schema
Methodology
Since performance often fluctuates when running applications on cloud services such as Azure,
each test was repeated three times.
The first test uses five instances for generating database operations with thirty-two threads per
instance, and generates a year’s worth of data. Each thread is designed to insert
records into the database as fast as possible, with the aim of measuring the database’s overall
efficiency and performance when writing and updating records. The median of the three runs was
taken as the result.
Fixstars attempted to find optimal conditions during initial testing. While GridDB was relatively
insensitive to both the number of hosts and the number of threads, MariaDB performed best
with 5 client nodes and 32 threads per node. It was also found that an individual
connection for every thread was faster for both GridDB and MariaDB than a single
connection shared between all threads.
For each test, the database server first had its data deleted and was then restarted; only
then was the workload application started concurrently on all of the client hosts. For
the extraction test, devices were both loaded and queried in batches of 160 (5 nodes × 32
threads) at a time.
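The batch size and the median-of-three reporting described in this section can be sketched as follows. This is an illustrative helper under stated assumptions, not the benchmark's own code: the class name, method names, and the sample run values in main are hypothetical.

```java
import java.util.Arrays;

// Hypothetical summary helper; names and sample values are assumptions.
final class RunSummary {
    static final int CLIENT_NODES = 5;      // client hosts, from the text
    static final int THREADS_PER_NODE = 32; // threads per host, from the text

    // One device per thread, so a batch is nodes x threads devices.
    static int devicesPerBatch() {
        return CLIENT_NODES * THREADS_PER_NODE;
    }

    // Number of batches needed to cover all devices (rounds up).
    static int batchCount(int devices) {
        int batch = devicesPerBatch();
        return (devices + batch - 1) / batch;
    }

    // Median of three repeated runs: sort and take the middle value.
    static double medianOfThree(double a, double b, double c) {
        double[] runs = {a, b, c};
        Arrays.sort(runs);
        return runs[1];
    }

    public static void main(String[] args) {
        System.out.println(devicesPerBatch()); // prints 160
        System.out.println(batchCount(1600));  // prints 10
        // Made-up run results, used only to show the call.
        System.out.println(medianOfThree(3.0, 1.0, 2.0)); // prints 2.0
    }
}
```

Taking the median rather than the mean keeps a single unusually slow or fast cloud run from skewing the reported value.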
Benchmark Results
Ingest
The first test measured the number of operations per second that the database can complete, where
more is better. Each operation updates the Meter record (or adds it if it does not exist)
and writes a new Meter_Read record to the database. GridDB was able to process 31,916
operations per second, about thirteen times more than MariaDB’s 2,423 operations per second.
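To make the unit of measurement concrete, the sketch below models one ingest operation (an update-or-add of the meter followed by an appended reading) and the operations-per-second calculation. It uses plain in-memory collections as stand-ins for the two data sets; it does not use the GridDB or JDBC APIs, and the class, fields, and stored columns are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for the two data sets; all names are hypothetical.
final class IngestSketch {
    // meter id -> timestamp of its latest reading (assumed column)
    private final Map<Long, Long> meters = new HashMap<>();
    // each entry is {meterId, timestampMillis, value}
    private final List<long[]> meterReads = new ArrayList<>();

    // One benchmark operation: update the meter (or add it), then append a reading.
    void ingest(long meterId, long timestampMillis, long value) {
        meters.put(meterId, timestampMillis);
        meterReads.add(new long[] {meterId, timestampMillis, value});
    }

    int meterCount() { return meters.size(); }

    int readCount() { return meterReads.size(); }

    // Throughput as completed operations per second of wall-clock time.
    static double opsPerSecond(long operations, long elapsedMillis) {
        return operations * 1000.0 / elapsedMillis;
    }

    public static void main(String[] args) {
        IngestSketch db = new IngestSketch();
        db.ingest(1L, 0L, 5L);
        db.ingest(1L, 3_600_000L, 7L); // same meter, one hour later
        System.out.println(db.meterCount() + " meter, " + db.readCount() + " reads");
        // Ratio of the two reported throughputs, roughly 13.17.
        System.out.println(31916.0 / 2423.0);
    }
}
```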
Extract
The extract test measured a SUM aggregation of one month’s data for each device in the
database, comparing how long MariaDB and GridDB each take to run the
query against one device. With 1600 devices, MariaDB takes roughly twice as long as
GridDB to perform the aggregation. As the number of devices increases,
MariaDB’s performance worsens while GridDB’s remains consistent.
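The container-per-device idea can be illustrated with a small in-memory model: each device gets its own time-ordered map, so a monthly sum touches only that device's rows regardless of how many other devices exist. This is a conceptual sketch only, with hypothetical names; it is not the GridDB API and says nothing about how either database is implemented internally.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Conceptual model of one container per device; names are hypothetical.
final class PerDeviceContainers {
    static final long HOUR_MS = 3_600_000L;

    // device id -> (timestamp -> reading); each inner map plays the role of one container
    private final Map<Long, TreeMap<Long, Double>> containers = new HashMap<>();

    void put(long deviceId, long timestampMs, double value) {
        containers.computeIfAbsent(deviceId, id -> new TreeMap<>()).put(timestampMs, value);
    }

    // Sums readings with startMs <= timestamp < endMs for a single device.
    double sum(long deviceId, long startMs, long endMs) {
        TreeMap<Long, Double> rows = containers.get(deviceId);
        if (rows == null) {
            return 0.0; // unknown device: nothing to bill
        }
        double total = 0.0;
        for (double v : rows.subMap(startMs, true, endMs, false).values()) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        PerDeviceContainers store = new PerDeviceContainers();
        for (long device = 0; device < 2; device++) {
            for (int hour = 0; hour < 8760; hour++) {
                store.put(device, hour * HOUR_MS, 1.0);
            }
        }
        // First 744 hourly readings of device 0, each worth 1.0.
        System.out.println(store.sum(0L, 0L, 744 * HOUR_MS)); // prints 744.0
    }
}
```

In this model the cost of sum depends on the size of one device's map and the width of the time window, not on the total number of devices, which mirrors the consistency described above.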
Resource Usage
MariaDB has a higher load average [2] than GridDB while all tests are running. Both databases use
all available memory on the server, while the GridDB client uses significantly less memory.
[2] Understand Linux Load Averages:
https://www.tecmint.com/understand-linux-load-averages-and-monitor-performance/
Load Average
Memory Usage
Tabular Results
All tabular results are listed in the order they are shown.
Conclusion
GridDB outperforms MariaDB for both the ingestion and extraction/aggregation workloads
while maintaining a lower load average and memory usage. This makes GridDB well suited for
billing applications, which are typically based on data collected from a large number of sources.
It should be reiterated that this configuration uses only one server instance, which is the typical
configuration for MariaDB. It is, however, not an ideal configuration for GridDB, which would
see even greater performance and reliability with multiple server nodes. GridDB’s ability to
scale out increases reliability and allows GridDB to grow with your data, while relational
databases are limited to scaling up and to the number of cores and the amount of memory and disk
that will fit in one physical system.
Appendices
SQL Schema
GridDB Schema
// Container Key: METERS
class Meter {
@RowKey
public long id;