User Guide
v2.2
Copyright © 2007 - 2010 Eric Long, Chris Henson, Mark Hanes, Greg Wilmer
Permission to use, copy, modify, and distribute the SymmetricDS 2 User Guide Version 2.2 for any purpose and
without fee is hereby granted in perpetuity, provided that the above copyright notice and this paragraph appear
in all copies.
Table of Contents
Preface
1. Introduction
1.1. What is SymmetricDS?
1.2. Background
1.3. SymmetricDS Features
1.3.1. Notification Schemes
1.3.2. Two-Way Table Synchronization
1.3.3. Data Channels
1.3.4. Transaction Awareness
1.3.5. Data Filtering and Rerouting
1.3.6. HTTP(S) Transport
1.3.7. Remote Management
1.4. System Requirements
1.5. What's new in SymmetricDS 2
2. Hands-on Tutorial
2.1. Installing SymmetricDS
2.2. Creating and Populating Your Databases
2.3. Starting SymmetricDS
2.4. Registering a Node
2.5. Sending an Initial Load
2.6. Pulling Data
2.7. Pushing Data
2.8. Verifying Outgoing Batches
2.9. Verifying Incoming Batches
3. Planning an Implementation
3.1. Identifying Nodes
3.2. Organizing Nodes
3.3. Defining Node Groups
3.4. Linking Nodes
3.5. Choosing Data Channels
3.6. Defining Data Changes to be Captured and Routed
3.6.1. Defining Triggers
3.6.2. Defining Routers
3.6.3. Mapping Triggers to Routers
3.6.3.1. Planning Initial Loads
3.6.3.2. Circular References and "Ping Back"
3.6.4. Planning for Registering Nodes
4. Configuration
4.1. Node Properties
4.2. Node
4.3. Node Group
4.4. Node Group Link
4.5. Channel
4.6. Triggers and Routers
Symmetric DS v2.2 ii
SymmetricDS 2 User Guide
Preface
SymmetricDS is an open-source, web-enabled, database-independent data synchronization software
application. It uses web and database technologies to replicate tables between relational databases in near
real time. The software was designed to scale for a large number of databases, work across
low-bandwidth connections, and withstand periods of network outage.
This User Guide introduces SymmetricDS and its uses for data synchronization. It is intended for users
who want to become familiar with the software quickly, configure it, and use its many features. This
version of the guide was generated on 2011-04-15 at 11:54:23.
Chapter 1. Introduction
This User Guide introduces both basic and advanced concepts in the configuration of SymmetricDS.
By the end of this chapter, you will have a better understanding of SymmetricDS' capabilities and many
of its basic concepts.
1.1. What is SymmetricDS?
A single installation of SymmetricDS attached to a target database is called a node. A node is initialized
by a properties file and is configured by inserting configuration data into a series of database tables. It
then creates database triggers on the application tables to be synchronized so that database events are
captured for delivery to other SymmetricDS nodes.
In most databases, the transaction id is also captured by the database triggers so that the insert, update,
and delete events can be replicated transactionally via the transport layer to other nodes. The transport
layer is typically a CSV protocol over HTTP or HTTPS.
SymmetricDS supports synchronization across different database platforms through the concept of
Database Dialects. A Database Dialect is an abstraction layer that SymmetricDS uses to insulate the main
synchronization logic from database-specific implementation details.
SymmetricDS is extendable through extension points. Extension points are custom, reusable pieces of Java
code that are configured via XML. They hook into key points in the life-cycle of a synchronization
to allow custom behavior to be injected, such as publishing data
to other sources, transforming data, and taking different actions based on the content or status of a
synchronization.
1.2. Background
The idea of SymmetricDS was born from a real-world need. Several years ago, several of the original
developers were implementing a commercial Point of Sale (POS) system for a large retailer. The
development team came to the conclusion that the software available for trickling back transactions to
corporate headquarters (frequently known as the 'central office' or 'general office') did not meet the
project needs. The list of project requirements made finding the ideal solution difficult:
• Sending and receiving data with up to 2000 stores during peak holiday loads.
• Supporting one database platform at the store and a different one at the central office.
• Synchronizing some data in one direction, and other data in both directions.
• Preparing the store database with an initial load of data from the central office.
The team ultimately created a custom solution that met the requirements and led to a successful project.
From this work came the knowledge and experience that SymmetricDS benefits from today.
processed independently so inventory can get through despite the large amount of item data.
Channels are discussed in more detail in Section 3.5, Choosing Data Channels (p. 19).
• As data changes are loaded in the target database, a class implementing IDataLoaderFilter can
change the data in a column or route it somewhere else. One possible use might be to route credit
card data to a secure database and blank it out as it loads into a centralized sales database. The filter
can also prevent data from reaching the database altogether, effectively replacing the default data
loading process.
• Columns can be excluded from synchronization so they are never recorded when the table is
changed. As data changes are loaded into the target database, a class implementing IColumnFilter
can remove a column altogether from the synchronization. For example, an employee table may be
synchronized to a retail store database, but the employee's password is only synchronized on the
initial insert.
• As data changes are extracted from the source database, a class implementing the
IExtractorListener interface is called to filter data or route it somewhere else. By default,
SymmetricDS provides a handler that transforms and streams data as CSV. Optionally, an alternate
implementation may be provided to take some other action on the extracted data.
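As a rough illustration of the filtering idea in the first bullet above, the following Python sketch blanks a sensitive column, diverts the original value elsewhere, and can keep a row out of the target database entirely. It is not the IDataLoaderFilter interface, which is a Java extension point. The function name, the table names (sale_tender, audit_log), the column name credit_card_number, and the in-memory "secure store" are all assumptions made for the example.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for a secure database; not part of SymmetricDS.
secure_store: List[Dict[str, str]] = []


def filter_row(table: str, row: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Return the row to load, or None to keep it out of the target database."""
    if table == "audit_log":  # hypothetical table that should never be loaded
        return None
    if "credit_card_number" in row:  # hypothetical sensitive column
        # Route the sensitive value elsewhere, then blank it in the row
        # that continues on to the centralized sales database.
        secure_store.append(
            {"table": table, "credit_card_number": row["credit_card_number"]}
        )
        row = dict(row, credit_card_number="")
    return row
```

A caller would pass each incoming row through filter_row and load only the rows for which it returns a value.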
from the Java JConsole or through an application server. Functions include opening registration,
reloading data, purging old data, and viewing batches. A number of configuration and runtime properties
are available to be viewed as well.
SymmetricDS also provides functionality to send SQL events through the same synchronization
mechanism that is used to send data. The data payload can be any SQL statement. The event is processed
and acknowledged just like any other event type.
1.4. System Requirements
Any database with trigger technology and a JDBC driver has the potential to run SymmetricDS. The
database is abstracted through a Database Dialect in order to support specific features of each database.
The following Database Dialects have been included with this release:
• MySQL version 5.0.2 and above
• HSQLDB 1.8
• H2 1.x
See Appendix C, Database Notes (p. 93), for compatibility notes and other details for your specific
database.
1.5. What's new in SymmetricDS 2
The first significant architectural change involves SymmetricDS's use of triggers. In 1.x, triggers capture
and record data changes as well as the nodes to which the changes must be applied as row inserts into the
DATA_EVENT table. Thus, the number of row-inserts grows linearly with the number of client nodes.
This can lead to an obvious performance issue as the number of nodes increases. In addition, the problem
is made worse at times due to synchronizing nodes updating the same DATA_EVENT table as part of the
batching process while the row-inserts are being created.
In SymmetricDS 2, triggers capture only data changes, not the node-specific details. The node-specific
row-inserts are replaced with a new routing mechanism that does both the routing and the batching of
data on one thread. Thus, the real-time inserts into DATA_EVENT by applications using synchronized
tables have been eliminated, and database performance is therefore improved. The database contention on
DATA_EVENT has also been eliminated, since the router job is the only thread inserting data into that
table. The only other access to the DATA_EVENT table is from selects by synchronizing nodes.
• Applications updating database tables that are being synchronized to a large number of nodes will
not degrade in performance as more nodes are added, and
• There should be almost no database contention on the DATA_EVENT table, unlike the possible
contention in 1.x.
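The difference in trigger cost can be made concrete with a small count. The Python sketch below is a simplified model and not a measurement: it assumes that in 1.x each captured change costs one row for the change itself plus one DATA_EVENT row per target node, all written inside the application's transaction, while in SymmetricDS 2 the trigger writes only the change. The function name is invented for the example.

```python
def trigger_inserts_per_change(version: int, client_nodes: int) -> int:
    """Rows written inside the application's own transaction for one change."""
    if version == 1:
        # Assumed 1.x model: the change itself plus one DATA_EVENT row per node.
        return 1 + client_nodes
    # SymmetricDS 2 model: only the change is captured; the per-node rows are
    # written later by the routing job, outside the application's transaction.
    return 1
```

Under these assumptions, a change destined for 2000 stores costs 2001 row inserts in the application's transaction in 1.x and a single insert in SymmetricDS 2.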
Because routing no longer takes place in the SymmetricDS database triggers, a new mechanism for
routing was needed. In SymmetricDS 1.x, the node_select expression was used for specifying the desired
data routing. It was a SQL expression that qualified the insert into DATA_EVENT from the
SymmetricDS triggers. In SymmetricDS 2 there is a new extension point called the data router. Data
routers are configured in the router table with a router_type and a router_expression. Several different
routers have been provided to serve the majority of users' routing needs, but the framework is in place for
a SymmetricDS programmer to develop domain- or application-specific routers. See Section 4.6.2, Router
(p. 28) for a complete list of provided routers.
Since the routing and capturing of data are now performed with two separate mechanisms, the two
concepts have been separated into separate configuration tables in the database, with a join table
(TRIGGER_ROUTER) specifying the relationships between routing (ROUTER) and capturing of data
(TRIGGER). This solves a long standing issue with some databases which only allow one trigger per
table. On those database platforms, we can now route data in multiple directions since we only require
one SymmetricDS trigger to capture data. This also helps performance in those scenarios, since we only
capture the data once instead of once per routing instance.
As part of the new routing job, we have introduced another new extension point to allow more flexibility
in the way data events get batched. A batch is the unit by which captured data is sent and committed on
target nodes. In SymmetricDS 2, batching is now configured on the channel configuration table. This
provides additional flexibility for batching:
• Batching can have the traditional SymmetricDS 1.x behavior of batching up to a max batch size,
but never splitting a database transaction across batches.
• Batching can be tied completely to database transactions, with one batch per transaction.
• Batching can ignore database transactions altogether and always batch based on a max batch size.
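The three batching options above can be sketched in a few lines of Python. This only illustrates the described behavior under stated assumptions and is not SymmetricDS code. The mode labels ("default", "transactional", "nontransactional"), the function name, and the (data_id, transaction id) event tuples are chosen for the example. Events from one transaction are assumed to arrive contiguously, and the first option is read as keeping each transaction whole even if that exceeds the max batch size.

```python
from typing import List, Optional, Tuple

Event = Tuple[int, str]  # (data_id, transaction id); a hypothetical shape


def batch_events(events: List[Event], mode: str, max_batch_size: int) -> List[List[int]]:
    """Group captured events (already ordered by data_id) into batches."""
    batches: List[List[int]] = []
    current: List[int] = []
    prev_tx: Optional[str] = None
    for data_id, tx in events:
        new_tx = prev_tx is not None and tx != prev_tx
        if mode == "transactional":
            # One batch per database transaction.
            close = new_tx
        elif mode == "nontransactional":
            # Ignore transactions; cut purely on size.
            close = len(current) >= max_batch_size
        else:
            # "default": cut on size, but only between transactions.
            close = new_tx and len(current) >= max_batch_size
        if close and current:
            batches.append(current)
            current = []
        current.append(data_id)
        prev_tx = tx
    if current:
        batches.append(current)
    return batches
```

With a max batch size of 2 and three transactions of 3, 1, and 2 events, the first mode yields two batches of three events each, the second yields one batch per transaction, and the third yields three batches of two events.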
Another significant change to note in SymmetricDS 2 is the removal of the incoming and outgoing batch
history tables. This change was made because it was found that over 95% of the time the statistics the end
user truly wanted to see were those for the most recent synchronization attempt, and because the
outgoing batch history table was difficult to query. The most valuable information in the batch history
tables, the batch statistics, has been moved over to the batch tables themselves. The statistics in the
batch tables now always represent the latest synchronization attempt.
Chapter 2. Hands-on Tutorial
Now that several of the features of SymmetricDS have been discussed, a quick working example of
SymmetricDS is in order. This section contains a hands-on tutorial that demonstrates how to synchronize
a sample database between two running instances of SymmetricDS. This example models a retail
business that has a central office database (called "root") and multiple retail store databases (called
"client"). For the tutorial, we will have only one "client", as shown in Figure 2.1.
The root SymmetricDS instance sends changes to the client for item data, such as item number,
description, and price. The client SymmetricDS sends changes to the root for sale transaction data, such
as time of sale and items sold. The sample configuration specifies synchronization with a pull method for
the client to receive data from root, and a push method for the root to receive data from client.
3. Creating sample tables for client and root and sample data for the root,
7. Verifying information about the batches that were sent and received.
2. Unzip the file in any directory you choose. This will create a symmetric-ds-2.x.x-server
subdirectory, which corresponds to the version you downloaded.
3. Edit the database properties in the following property files for the root (central office) and client
(store) nodes:
• samples/root.properties
• samples/client.properties
4. Set the following properties in both files to specify how to connect to the database:
5. Next, set the following property in the client.properties file to specify where the root node can
be contacted:
For the tutorial, the client database starts out empty, and the node is not registered. Registration
is the process where the node receives its configuration and stores it in its database. The
configuration describes which database tables to synchronize and to which nodes. When an
unregistered node starts up, it will register with the node specified by the registration URL. The
registration node centrally controls nodes on the network by allowing registration and returning
configuration. In this tutorial, the registration node is the root node, which also participates in
synchronization with other nodes.
See Appendix C, Database Notes (p. 93), for compatibility with your specific database.
First, create the sample tables in the root node database, load the sample data, and load the sample
configuration.
1. Open a command prompt and navigate to the samples subdirectory of your SymmetricDS
installation.
2. Create the sample tables in the root database by executing the following command:
Note that the warning messages from the command are safe to ignore.
3. Next, create the SymmetricDS tables in the root node database. These tables will contain the
configuration for synchronization. The following command uses the auto-creation feature to
create all the necessary SymmetricDS system tables.
4. Finally, load the sample data and configuration into the root node database by executing:
We have now created the root database tables and populated them with sample data. Next, we create the
sample tables in the client node database to prepare it for receiving data.
1. Open a command prompt and navigate to the samples subdirectory of your SymmetricDS
installation.
Note that the warning messages from the command are safe to ignore.
2. Find the sales tables that sync from client to root: sale_transaction and sale_return_line_item.
The root node server starts up and creates all the triggers that were configured by the sample
configuration. It listens on port 8080 for synchronization and registration requests.
The client node server starts up and uses the auto-creation feature to create the SymmetricDS
system tables. It begins polling the root node in order to register. Since registration is not yet
open, the client node receives an authorization failure (HTTP response of 403).
Tip
If you want to change the port number used by SymmetricDS, you also need to set the sync.url
runtime property to match. The default value is:
sync.url=http://localhost:8080/sync
The registration is now opened for a node group called "store" with an external identifier of "1".
This information matches the settings in client.properties for the client node. Each node is
assigned to a node group and is given an external ID that makes sense for the application. In this
tutorial, we have retail stores that run SymmetricDS, so we named our node group "store" and
we used numeric identifiers starting with "1". More information about node groups will be
covered in the next chapter.
3. Watch the logging output of the client node to see it successfully register with the root node. The
client is configured to attempt registration once per minute. Once registered, the root and client
are enabled for synchronization!
With this command, the root node queues up an initial load for the client node that will be sent
the next time the client performs its pull. The initial load includes data for each table that is
configured for synchronization.
3. Watch the logging output of both nodes to see the data transfer. The client is configured to pull
data from the root every minute.
Next, we will make a change to the item data in the central office (we'll add a new item), and observe the
data being pulled down to the store.
1. Open an interactive SQL session with the root database.
insert into item (item_id, price_id, name) values (110000055, 55, 'Soft Drink');
Once the statements are committed, the data change is captured by SymmetricDS and queued for
the client node to pull.
3. Watch the logging output of both nodes to see the data transfer. The client is configured to pull
data from the root every minute.
4. Verify that the new data arrives in the client database using another interactive SQL session.
insert into sale_transaction (tran_id, store, workstation, day, seq) values (1000, '1', '3',
'2007-11-01', 100);
Once the statements are committed, the data change is captured and queued for the client node to
push.
3. Watch the logging output of both nodes to see the data transfer. The client is configured to push
data to the root every minute.
are categories assigned to tables for the purpose of independent synchronization and control. Batches for
a channel are not created when a batch in the channel is in error status.
1. Open an interactive SQL session with either the root or client database.
Each row represents a row of data that was changed. The event_type is "I" for insert, "U" for
update, or "D" for delete. For insert and update, the captured data values are listed in row_data.
For update and delete, the primary key values are listed in pk_data.
3. Verify that the data change was routed to a node, using the data_id from the previous step:
When the batched flag is set, the data change is assigned to a batch using a batch_id that is used
to track and synchronize the data. Batches are created and assigned during a push or pull
synchronization.
4. Verify that the data change was batched, sent, and acknowledged, using the batch_id from the
previous step:
A batch represents a collection of changes to be sent to a node. The batch is created during a
push or pull synchronization, when the status is set to "NE" for new. The receiving node
acknowledges the batch with a status of "OK" for success or "ER" for error.
Understanding these three tables, along with a fourth table discussed in the next section, is key to
diagnosing any synchronization issues you might encounter. As you work with SymmetricDS, either
when experimenting or starting to use SymmetricDS on your own data, spend time monitoring these
tables to better understand how SymmetricDS works.
1. Open an interactive SQL session with either the root or client database.
2. Verify that the batch was acknowledged, using a batch_id from the previous section:
A batch represents a collection of changes loaded by the node. The sending node that created the
batch is recorded. The status is either "OK" for success or "ER" for error.
Chapter 3. Planning an Implementation
The previous chapters presented a high-level introduction to some basic concepts in SymmetricDS,
some of its features, and a tutorial demonstrating a basic, working example of SymmetricDS
in action. This chapter focuses on the key considerations and decisions one must make when planning a
SymmetricDS implementation. As needed, basic concepts will be reviewed or introduced throughout this
chapter. By the end of the chapter you should be able to proceed with implementing your planned
design. This chapter intentionally avoids discussing the underlying database tables that capture the
configuration resulting from your analysis and design process. Implementation of your design, along with
discussion of the tables backing each concept, is covered in Chapter 4, Configuration (p. 24).
When needed, we will rely on an example of a typical use of SymmetricDS in retail situations. This
example retail deployment of SymmetricDS might include many point-of-sale workstations located at
stores that may have intermittent network connection to a central location. These workstations might have
point-of-sale software that uses a local relational database. The database is populated with items, prices and
tax information from a centralized database. The point-of-sale software looks up item information from
the local database and also saves sale information to the same database. The persisted sales need to be
propagated back to the centralized database.
Each node of SymmetricDS can be either embedded in another application, run stand-alone, or even run
in the background as a service. If desired, nodes can be clustered to help disperse load if they send and/or
receive large volumes of data to or from a large number of nodes.
Individual nodes are easy to identify when planning your implementation. If a database exists in your
domain that needs to send or receive data, there needs to be a corresponding SymmetricDS instance (a
node) responsible for managing the synchronization for that database.
Our retail example, as shown in Figure 3.1, represents a tree hierarchy with a single central office node
connected by lines to one or more child nodes (the POS workstations). Information flows from the
central office node to an individual register and vice versa, but never flows between registers.
More complex organizations can also be used. Consider, for example, if the same retail example is
expanded to include store servers in each store to perform tasks such as opening the store for the day,
reconciling registers, assigning employees, etc. One approach to this new configuration would be to
create a three-tier hierarchy (see Figure 3.2). The highest tier, the centralized database, connects with
each store server's database. The store servers, in turn, communicate with the individual point-of-sale
workstations at the store. In this way data from each register could be accumulated at the store server,
then sent on to the central office. Similarly, data from the central office can be staged in the store server
and then sent on to each register, filtering the register's data based on which register it is.
Figure 3.2. Three Tiered, In-Store Server, Retail Store Deployment Example
One final example, shown in Figure 3.3, again extending our original two-tier retail use case, would be to
organize stores by "region" in the world. This three-tier architecture would introduce new regional servers
(and corresponding regional databases) which would consolidate information specific to the stores each
regional server is responsible for. The tiers in this case are therefore the central office server, regional
servers, and individual store registers.
Figure 3.3. Three Tiered, Regional Server, Retail Store Deployment Example
These are just three common examples of how one might organize nodes in SymmetricDS. While the
examples above were for the retail industry, the same organizations could apply to a variety of application
domains.
example, when it comes time to decide where to route data captured by SymmetricDS, the routing is
configured by Node Group.
• "store" to represent the store server that interacts with store workstations and sends and receives
data from a central office server.
• "region" to represent a regional server that interacts with store workstations and sends and
receives data from a central office server.
Considerable thought should be given to how you define the Node Groups. Groups should be created for
each set of nodes that synchronize common tables in a similar manner. Also, give your Node Groups
meaningful names, as they will appear in many, many places in your implementation of SymmetricDS.
Note that there are other mechanisms in SymmetricDS to route to individual nodes or smaller subsets of
nodes within a Node Group, so do not choose Node Groups based on needing only subsets of data at
specific nodes. For example, although you could, you would not want to create a Node Group for each
store even though different tax rates need to be routed to each store. Each store needs to synchronize the
same tables to the same groups, so 'store' would be a good choice for a Node Group.
For our retail store example, there are two Node Group Links defined. For the first link, the "store" Node
Group pushes data to the "corp" central office Node Group. The second defines a "corp" to "store" link as
a pull. Thus, the store nodes will periodically pull data from the central office, but when it comes time to
send data to the central office a store node will do a push.
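One consequence of this pair of links is that the store side opens every connection. The Python sketch below makes that explicit. It is a minimal model under stated assumptions: the tuple layout, the "push" and "pull" labels, and the function name are invented for the example and are not the values stored in the SymmetricDS configuration tables.

```python
from typing import List, Tuple

# (source group, target group, how data moves from source to target);
# a hypothetical representation of the two links described above.
Link = Tuple[str, str, str]

links: List[Link] = [
    ("store", "corp", "push"),  # stores push their data to the central office
    ("corp", "store", "pull"),  # central office data is pulled by the stores
]


def initiator(link: Link) -> str:
    """Return the node group that opens the connection for a link."""
    source_group, target_group, action = link
    # A push is started by the source; a pull is started by the target.
    return source_group if action == "push" else target_group
```

For both links the initiator is the "store" group, so in this arrangement the central office never has to open a connection to a store.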
SymmetricDS supports this by allowing tables being synchronized to be grouped together into Channels
of data. Several aspects of SymmetricDS's synchronization behavior are controlled at the
Channel level. For example, Channels provide a processing order when synchronizing, a limit on the
amount of data that will be batched together, and isolation from errors in other channels. By categorizing
data into channels and assigning them to TRIGGERs, the user gains more control and visibility into the
flow of data. In addition, SymmetricDS allows synchronization to be enabled, suspended, or
scheduled by Channel. The frequency of synchronization can also be controlled at the channel
level.
Choosing Channels is fairly straightforward and can be changed over time, if needed. Think about the
differing "types" of data present in your application, the volume of data in the various types, etc. What
data is considered must-have and can't be delayed due to a high volume load of another type of data? For
example, you might place employee-related data, such as clocking in or out, on one channel, but sales
transactions on another. We will define which tables belong to which channels in the next sections.
Important
Be sure that, when defining Channels, all tables related by foreign keys are included in the
same channel.
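A planned channel assignment can be checked against this rule before any configuration is written. The Python sketch below is one possible check, not a SymmetricDS feature. It assumes you can list your tables' foreign keys as (child, parent) pairs, and the function name and channel names are hypothetical.

```python
from typing import Dict, List, Tuple


def cross_channel_foreign_keys(
    table_channel: Dict[str, str],
    foreign_keys: List[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Return the (child, parent) pairs whose tables sit on different channels."""
    return [
        (child, parent)
        for child, parent in foreign_keys
        if table_channel[child] != table_channel[parent]
    ]
```

An empty result means every pair of related tables shares a channel. Any pair that is returned points at a table that should be moved.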
• a SQL select statement that can be used to hold data needed for routing (known as External Data)
As you define your triggers, consider which data changes are relevant to your application and which ones
are not. Consider under what special conditions you might want to route data, as well. For our retail
example, we likely want to have triggers defined for updating, inserting, and deleting pricing information
in the central office so that the data can be routed down to the stores. Similarly, we need triggers on sales
transaction tables such that sales information can be sent back to the central office.
Before we discuss Routers and Trigger Routers, we should probably take a break and discuss the process
SymmetricDS uses to keep track of the changes and routing. As we stated, SymmetricDS relies on
auto-created database triggers to capture and record relevant data changes into a table, the DATA table.
After the data is captured, a background process chooses the nodes that the data will be synchronized to.
This is called routing and it is performed by the Routing Job. Note that the Routing Job does not actually
send any data. It just organizes and records the decisions on where to send data in two "staging" tables,
DATA_EVENT and OUTGOING_BATCH.
Now we are ready to discuss Routers. The router itself is what defines the configuration of where to send
a data change. Each Router you define can be associated with or assigned to any number of Triggers
through a join table that defines the relationship. Routers are defined in the SymmetricDS table named
ROUTER. For each router you define, you will need to specify:
• the target table on the destination node to route the data
• the source node group and target node group for the nodes to route the data to
For now, do not worry about the specific routing types. They will be covered later. For your design,
simply note the information and decisions needed to determine the list of nodes to route to. You
will find later that there is incredible flexibility and functionality available in routers. For example, you
will find you can:
• send the changes to all nodes that belong to the target node group defined in the router.
• compare old or new column values to a constant value or the value of a node's identity.
• execute a SQL expression against the database to select nodes to route to. This SQL expression can
be passed values of old and new column values.
• execute a Bean Shell expression in order to select nodes to route to. The Bean Shell expression can
use the old and new column values.
• publish data changes directly to a messaging solution instead of transmitting changes to registered
nodes.
For each of your Triggers, decide which Router matches the behavior needed for that Trigger. These
Trigger Router combinations will be used to define a mapping between your Triggers and Routers when
you implement your design.
SymmetricDS provides the ability to "load" or "seed" a node's database with specific sets of data from its
parent node. This concept is known as an Initial Load of data and is used to start off most synchronization
scenarios. The Trigger Router mapping defines how initial loads can occur, so now is a good time to plan
how your Initial Loads will work. Using our retail example, consider a new store being opened. Initially,
you would like to pre-populate a store database with all the item, pricing, and tax data for that specific
store. This is achieved through an initial load. As part of your planning, be sure to consider which tables, if
any, will need to be loaded initially. SymmetricDS can also perform an initial load on a table with just a
subset of data. Initial Loads are further discussed in Section 4.6.3.1, Initial Load (p. 34).
When routing data, SymmetricDS by default checks each data change and will not route a data change
back to a node if it originated the change to begin with. This prevents the possibility of data changes
resulting in an infinite loop of changes under certain circumstances. You may find that, for some reason,
you need SymmetricDS to go ahead and send the data back to the originating node - a "ping back". As
part of the planning process, consider whether you have a special case for needing ping back. Ping Back
control is further discussed in Section 4.6.3.3, Enabling "Ping Back" (p. 36).
The following are some options on ways you might register nodes:
• The tutorial uses the command line utility to register each individual node.
• A JMX interface provides the same interface that the command line utility does. JMX can be
invoked programmatically or via a web console.
• Both the utility and the JMX method register a node by inserting into two tables. A script can be
written to directly register nodes by directly inserting into the database.
• SymmetricDS can be configured to auto register nodes. This means that any node that asks for a
Chapter 4. Configuration
Chapter 3 introduced numerous concepts and the analysis and design needed to create an implementation
of SymmetricDS. This chapter re-visits each analysis step and documents how to turn a SymmetricDS
design into reality through configuration of the various SymmetricDS tables. In addition, several
advanced configuration options, not presented previously, will also be covered.
• TRIGGER - specifies tables, channels, and conditions for which changes in the database should be
captured
• ROUTER - specifies the routers defined for synchronization, along with other routing details
During start up, triggers are verified against the database, and database triggers are installed on tables that
require data changes to be captured. The Route, Pull and Push Jobs begin running to synchronize changes
with other nodes.
Each node requires properties that allow it to connect to a database and register with a parent node. To
give a node its identity, the following properties are used:
group.id
The node group that this node is a member of. Synchronization is specified between node groups,
which means you only need to specify it once for multiple nodes in the same group.
external.id
The external id for this node has meaning to the user and provides integration into the system where it
is deployed. For example, it might be a retail store number or a region number. The external id can be
used in expressions for conditional and subset data synchronization. Behind the scenes, each node has
a unique sequence number for tracking synchronization events. That makes it possible to assign the
same external id to multiple nodes, if desired.
sync.url
The URL where this node can be contacted for synchronization. At startup and during each heartbeat,
the node updates its entry in the database with this URL.
When a new node is first started, it has no information about synchronizing. It contacts the registration
server in order to join the network and receive its configuration. The configuration for all nodes is stored
on the registration server, and the URL must be specified in the following property:
registration.url
The URL where this node can connect for registration to receive its configuration. The registration
server is part of SymmetricDS and is enabled as part of the deployment.
Important
Note that a registration server node is defined as one whose registration.url is either (a)
blank, or (b) identical to its sync.url.
When deploying to an application server, it is common for database connection pools to be found in the
Java naming directory (JNDI). In this case, set the following property:
db.jndi.name
The name of the database connection pool to use, which is registered in the JNDI directory tree of the
application server. It is recommended that this DataSource is NOT transactional, because
SymmetricDS will handle its own transactions.
For a deployment where the database connection pool should be created using a JDBC driver, set the
following properties:
db.driver
The class name of the JDBC driver.
db.url
The JDBC URL used to connect to the database.
db.user
The database username, which is used to login, create, and update SymmetricDS tables.
db.password
The password for the database user.
4.2. Node
A node, a single instance of SymmetricDS, is defined in the NODE table. Two other tables play a direct
role in defining a node, as well. The first is NODE_IDENTITY. The only row in this table is inserted in
the database when the node first registers with a parent node. In the case of a root node, the row is
entered by the user. The row is used by a node instance to determine its node identity.
The following SQL statements set up a top-level registration server as a node identified as "00000" in the
"corp" node group.
The second table, NODE_SECURITY, has rows created for each child node that registers with the node,
assuming auto-registration is enabled. If auto registration is not enabled, you must create a row in NODE
and NODE_SECURITY for the node to be able to register. You can also, with this table, manually cause
a node to re-register or do a re-initial load by setting the corresponding columns in the table itself.
Registration is discussed in more detail in Section 4.7, Opening Registration (p. 37).
4.5. Channel
By categorizing data into channels and assigning them to TRIGGERs, the user gains more control and
visibility into the flow of data. In addition, SymmetricDS allows for synchronization to be enabled,
suspended, or scheduled by channels as well. The frequency of synchronization and the order in which data is
synchronized are also controlled at the channel level.
The following SQL statements set up channels for a retail store. An "item" channel includes data for items
and their prices, while a "sale_transaction" channel includes data for ringing sales at a register.
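The two channels might be defined along these lines. This is a sketch: it assumes the default sym_ table prefix, the processing order and batch size values are illustrative only, and the column list is limited to settings named in this guide. Your version may require additional columns.

```sql
-- Item and pricing data, batched with the default algorithm.
insert into sym_channel
  (channel_id, processing_order, max_batch_size, batch_algorithm, enabled, description)
values
  ('item', 10, 1000, 'default', 1, 'Item and pricing data');

-- Register sales, one batch per database transaction.
insert into sym_channel
  (channel_id, processing_order, max_batch_size, batch_algorithm, enabled, description)
values
  ('sale_transaction', 1, 1000, 'transactional', 1, 'Retail sale transactions from a register');
```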
Batching is the grouping of data, by channel, to be transferred and committed at the client together. There
are three different out-of-the-box batching algorithms which may be configured in the batch_algorithm
column on channel.
default
All changes that happen in a transaction are guaranteed to be batched together. Multiple transactions
will be batched and committed together until there is no more data to be sent or the max_batch_size is
reached.
transactional
Batches will map directly to database transactions. If there are many small database transactions, then
there will be many batches. The max_batch_size column has no effect.
nontransactional
Multiple transactions will be batched and committed together until there is no more data to be sent or
the max_batch_size is reached. The batch will be cut off at the max_batch_size regardless of whether
it is in the middle of a transaction.
4.6.1. Trigger
SymmetricDS captures synchronization data using database triggers. SymmetricDS' Triggers are defined
in the TRIGGER table. Each record is used by SymmetricDS when generating database triggers.
Database triggers are only generated when a trigger is associated with a ROUTER whose
source_node_group_id matches the node group id of the current node.
The following SQL statement defines a trigger that will capture data for a table named "item" whenever
data is inserted, updated, or deleted. The trigger is assigned to a channel also called 'item'.
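A sketch of such a trigger definition follows. It assumes the default sym_ table prefix and relies on the sync_on_insert, sync_on_update, and sync_on_delete columns defaulting to enabled. Set them explicitly if your version does not default them.

```sql
-- Capture all changes to the "item" table on the "item" channel.
insert into sym_trigger
  (trigger_id, source_table_name, channel_id, last_update_time, create_time)
values
  ('item', 'item', 'item', current_timestamp, current_timestamp);
```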
Warning
Note that many databases allow for multiple triggers of the same type to be defined. Each
database defines the order in which the triggers fire differently. If you have additional triggers
beyond those SymmetricDS installs on your table, please consult your database documentation
to determine if there will be issues with the ordering of the triggers.
4.6.2. Router
Routers provided in the base implementation currently include:
• Default Router - a router that sends all data to all nodes that belong to the target node group
defined in the router.
• Column Match Router - a router that compares old or new column values to a constant value or the
value of a node's external_id or node_id.
• Sub-select Router - a router that executes a SQL expression against the database to select nodes to
route to. This SQL expression can be passed values of old and new column values.
• Bean Shell Router - a router that executes a BSH expression in order to select nodes to route to.
The BSH expression can use the old and new column values.
• Xml Publishing Router - a router that publishes data changes directly to a messaging solution
instead of transmitting changes to registered nodes. This router must be configured manually in
XML as an extension point.
The mapping between the set of triggers and set of routers is many-to-many. This means that one trigger
can capture changes and route to multiple locations. It also means that one router can be defined and
associated with many different triggers.
The simplest router is a router that sends all the data that is captured by its associated triggers to all the
nodes that belong to the target node group defined in the router. A router is defined as a row in the
ROUTER table. It is then linked to triggers in the TRIGGER_ROUTER table.
The following SQL statement defines a router that will send data from the 'corp' group to the 'store' group.
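A router of this kind might be defined as sketched here, assuming the default sym_ table prefix. Leaving router_type unset is assumed to select the default router.

```sql
-- Route everything captured by the linked triggers from "corp" to "store".
insert into sym_router
  (router_id, source_node_group_id, target_node_group_id, create_time, last_update_time)
values
  ('corp-2-store', 'corp', 'store', current_timestamp, current_timestamp);
```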
The following SQL statement maps the 'corp-2-store' router to the item trigger.
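The mapping could be expressed as in this sketch, again assuming the default sym_ table prefix. The initial_load_order value of 1 is illustrative.

```sql
-- Link the "item" trigger to the "corp-2-store" router.
insert into sym_trigger_router
  (trigger_id, router_id, initial_load_order, create_time, last_update_time)
values
  ('item', 'corp-2-store', 1, current_timestamp, current_timestamp);
```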
Sometimes data must be routed based on the current value or the old
value of a column in the table that is being routed. Column routers are configured by setting the
router_type column on the ROUTER table to column and setting the router_expression column to an
equality expression that represents the expected value of the column.
The first part of the expression is always the column name. The column name should always be defined
in upper case. The upper case column name prefixed by OLD_ can be used for a comparison being done
with the old column data value.
The second part of the expression can be a constant value, a token that represents another column, or a
token that represents some other SymmetricDS concept. Token values always begin with a colon (:).
Consider a table that needs to be routed to all nodes in the target group only when a status column is set to
'OK.' The following SQL statement will insert a column router to accomplish that.
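A sketch of such a column router, assuming the default sym_ table prefix; the router_id is a hypothetical name:

```sql
-- Route only rows whose STATUS column equals the constant OK.
insert into sym_router
  (router_id, source_node_group_id, target_node_group_id, router_type,
   router_expression, create_time, last_update_time)
values
  ('corp-2-store-ok', 'corp', 'store', 'column',
   'STATUS=OK', current_timestamp, current_timestamp);
```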
Consider a table that needs to be routed to all nodes in the target group only when a status column
changes values. The following SQL statement will insert a column router to accomplish that.
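One way to express "the status changed" is to compare the new value with the old one through the OLD_ prefixed token. The sketch assumes that the != operator is accepted in a column router expression; the router_id is hypothetical.

```sql
-- Route only when STATUS differs from its previous value.
insert into sym_router
  (router_id, source_node_group_id, target_node_group_id, router_type,
   router_expression, create_time, last_update_time)
values
  ('corp-2-store-status', 'corp', 'store', 'column',
   'STATUS!=:OLD_STATUS', current_timestamp, current_timestamp);
```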
Consider a table that needs to be routed to only nodes in the target group whose STORE_ID column
matches the external id of a node. The following SQL statement will insert a column router to accomplish
that.
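A sketch of that router, using the :EXTERNAL_ID token to stand for each candidate node's external id; the router_id is hypothetical.

```sql
-- Route each row to the node whose external id equals the row's STORE_ID.
insert into sym_router
  (router_id, source_node_group_id, target_node_group_id, router_type,
   router_expression, create_time, last_update_time)
values
  ('corp-2-store-id', 'corp', 'store', 'column',
   'STORE_ID=:EXTERNAL_ID', current_timestamp, current_timestamp);
```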
• EXTERNAL_ID
• NODE_GROUP_ID
Consider a table that needs to be routed to a redirect node defined by its external id in the
REGISTRATION_REDIRECT table. The following SQL statement will insert a column router to
accomplish that.
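A sketch of a redirect-based column router follows. The :REDIRECT_NODE token name is an assumption and should be confirmed against the list of supported tokens for your version; the router_id is hypothetical.

```sql
-- Route each row to the redirect node registered for the row's STORE_ID.
insert into sym_router
  (router_id, source_node_group_id, target_node_group_id, router_type,
   router_expression, create_time, last_update_time)
values
  ('corp-2-store-redirect', 'corp', 'store', 'column',
   'STORE_ID=:REDIRECT_NODE', current_timestamp, current_timestamp);
```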
More than one column may be configured in a router_expression. When more than one column is
configured, all matches are added to the list of nodes to route to. The following is an example where the
STORE_ID column may contain the STORE_ID to route to or the constant of ALL which indicates that
all nodes should receive the update.
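Such a router might be written as sketched here, assuming that alternatives in a router_expression are joined with the keyword or; the router_id is hypothetical.

```sql
-- Route to every node when STORE_ID is ALL, otherwise to the matching store.
insert into sym_router
  (router_id, source_node_group_id, target_node_group_id, router_type,
   router_expression, create_time, last_update_time)
values
  ('corp-2-store-all', 'corp', 'store', 'column',
   'STORE_ID=ALL or STORE_ID=:EXTERNAL_ID', current_timestamp, current_timestamp);
```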
The NULL keyword may be used to check if a column is null. If the column is null, then data will be
routed to all nodes who qualify for the update. The following is an example where the STORE_ID
column is used to route to a set of nodes who have a STORE_ID equal to their EXTERNAL_ID, or to all
nodes if the STORE_ID is null.
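A sketch of that configuration, assuming that alternatives are joined with the keyword or; the router_id is hypothetical.

```sql
-- Route to every node when STORE_ID is null, otherwise to the matching store.
insert into sym_router
  (router_id, source_node_group_id, target_node_group_id, router_type,
   router_expression, create_time, last_update_time)
values
  ('corp-2-store-null', 'corp', 'store', 'column',
   'STORE_ID=NULL or STORE_ID=:EXTERNAL_ID', current_timestamp, current_timestamp);
```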
A lookup table may contain the id of the node where data needs to be routed. This could be an existing
table or an ancillary table that is added specifically for the purpose of routing data. Lookup table routers
are configured by setting the router_type column on the ROUTER table to lookuptable and setting a list of
configuration parameters in the router_expression column.
KEY_COLUMN
This is the name of the column on the table that is being routed. It will be used as a key into the
lookup table.
LOOKUP_KEY_COLUMN
This is the name of the column that is the key on the lookup table.
EXTERNAL_ID_COLUMN
This is the name of the column that contains the external_id of the node to route to on the lookup
table.
Note that the lookup table will be read into memory and cached for the duration of a routing pass for a
single channel.
Consider a table that needs to be routed to a specific store, but the data in the changing table only contains
brand information. In this case, the STORE table may be used as a lookup table.
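A lookup table router for this case might look like the sketch below. Several things are assumptions: the LOOKUP_TABLE parameter that names the lookup table, the BRAND_ID column shared by both tables, the STORE_ID column on STORE, and the use of line breaks to separate parameters. The router_id is hypothetical.

```sql
-- Look up the store for each changed row by brand, then route to that store.
insert into sym_router
  (router_id, source_node_group_id, target_node_group_id, router_type,
   router_expression, create_time, last_update_time)
values
  ('corp-2-store-brand', 'corp', 'store', 'lookuptable',
   'LOOKUP_TABLE=STORE
KEY_COLUMN=BRAND_ID
LOOKUP_KEY_COLUMN=BRAND_ID
EXTERNAL_ID_COLUMN=STORE_ID', current_timestamp, current_timestamp);
```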
Sometimes routing decisions need to be made based on data that is not in the current row being
synchronized. Consider an example where an Order table and an OrderLineItem table need to be routed to
a specific store. The Order table has a column named order_id and STORE_ID. A store node has an
external_id that is equal to the STORE_ID on the Order table. OrderLineItem, however, only has a
foreign key to its Order of order_id. To route OrderLineItems to the same nodes that the Order will be
routed to, we need to reference the master Order record.
There are two possible ways to route the OrderLineItem in SymmetricDS. One is to configure a
'subselect' router_type on the ROUTER table and the other is to configure an external_select on the
TRIGGER table.
A 'subselect' is configured with a router_expression that is a SQL select statement which returns a result
set of the node_ids that need to be routed to. Column tokens can be used in the SQL expression and will be
replaced with row column data. The overhead of using this router type is high because the 'subselect'
statement runs for each row that is routed. It should not be used for tables that have a lot of rows that are
updated. It also has the disadvantage that if the Order master record is deleted, then no results would be
returned and routing would not happen. The router_expression is appended to the following SQL
statement in order to select the node ids.
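The base statement is believed to have roughly the following shape. Treat the exact text as an assumption and confirm it for your version. The alias c is what a router_expression uses to refer to candidate node rows.

```sql
select c.node_id
  from sym_node c
 where c.node_group_id = :NODE_GROUP_ID
   and c.sync_enabled = 1
   and /* the router_expression is appended here */
```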
Continuing the Order and OrderLineItem example, the following SQL statement will insert a subselect
router that routes OrderLineItem rows to the node of the matching Order.
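For the Order and OrderLineItem example described earlier in this section, a subselect router might be sketched as follows. The alias c is assumed to refer to the candidate node row. The table name order follows the text and would need quoting or renaming on most databases. The router_id is hypothetical.

```sql
-- Route each OrderLineItem to the store that owns its master Order.
insert into sym_router
  (router_id, source_node_group_id, target_node_group_id, router_type,
   router_expression, create_time, last_update_time)
values
  ('corp-2-store-order', 'corp', 'store', 'subselect',
   'c.external_id in (select STORE_ID from order where order_id = :ORDER_ID)',
   current_timestamp, current_timestamp);
```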
Alternatively, when using an external_select on the TRIGGER table, data is captured in the
EXTERNAL_DATA column of the DATA table at the time a trigger fires. The EXTERNAL_DATA can
then be used for routing by using a router_type of 'column'. The advantage of this approach is that it is
very unlikely that the master Order record will have been deleted at the time any DML occurs on the
OrderLineItem table. It also is a bit more efficient than the 'subselect' approach, although the triggers
produced do run the extra external_select inline with application database updates.
In the following example, the STORE_ID is captured from the Order table in the EXTERNAL_DATA
column. EXTERNAL_DATA is always available for routing as a virtual column in a 'column' router. The
router is configured to route based on the captured EXTERNAL_DATA to all nodes whose external_id
matches. Note that other supported node attribute tokens can also be used for routing.
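A sketch of the two pieces follows. The $(curTriggerValue) and $(curColumnPrefix) placeholders are assumed to be trigger template variables that resolve to the current row reference on your database. The channel and the trigger and router ids are hypothetical, and the table name order would need quoting on most databases.

```sql
-- Capture the STORE_ID of the master Order whenever an OrderLineItem changes.
insert into sym_trigger
  (trigger_id, source_table_name, channel_id, external_select,
   last_update_time, create_time)
values
  ('orderlineitem', 'orderlineitem', 'orderlineitem',
   'select STORE_ID from order where order_id=$(curTriggerValue).$(curColumnPrefix)order_id',
   current_timestamp, current_timestamp);

-- Route on the captured value as if it were a column named EXTERNAL_DATA.
insert into sym_router
  (router_id, source_node_group_id, target_node_group_id, router_type,
   router_expression, create_time, last_update_time)
values
  ('corp-2-store-ext', 'corp', 'store', 'column',
   'EXTERNAL_DATA=:EXTERNAL_ID', current_timestamp, current_timestamp);
```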
When more flexibility is needed in the logic to choose the nodes to route to, a Bean Shell router
may be used. Bean Shell is a Java-like scripting language. Documentation for the Bean Shell scripting
language can be found at http://www.beanshell.org.
The router_type for a Bean Shell router is 'bsh'. The router_expression is a valid Bean Shell script that:
• adds node ids to the 'targetNodes' collection which is bound to the script
• returns true to indicate that all nodes should be routed or returns false to indicate that no nodes
should be routed
Also bound to the script evaluation is a list of 'nodes'. The list of 'nodes' is a list of eligible Node objects.
The current data column values and the old data column values are bound to the script evaluation as Java
object representations of the column data. The columns are bound using the uppercase names of the
columns. Old values are bound to uppercase representations that are prefixed with 'OLD_'.
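As a sketch of the first form, suppose node ids are built from two columns of the routed table, STORE_ID and WORKSTATION_NUMBER, joined by a hyphen. That id scheme, the two column names, and the router_id are all hypothetical. The script adds the computed node id to targetNodes.

```sql
-- Bean Shell router that adds one computed node id to the target list.
insert into sym_router
  (router_id, source_node_group_id, target_node_group_id, router_type,
   router_expression, create_time, last_update_time)
values
  ('corp-2-store-bsh', 'corp', 'store', 'bsh',
   'targetNodes.add(STORE_ID + "-" + WORKSTATION_NUMBER);',
   current_timestamp, current_timestamp);
```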
The same could also be accomplished by simply returning the node id. The last line of a bsh script is
always the return value.
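A sketch of the return-value form, for a hypothetical scheme in which a node id is STORE_ID and WORKSTATION_NUMBER joined by a hyphen. The column names and router_id are assumptions.

```sql
-- Bean Shell router whose last expression evaluates to the target node id.
insert into sym_router
  (router_id, source_node_group_id, target_node_group_id, router_type,
   router_expression, create_time, last_update_time)
values
  ('corp-2-store-bsh-return', 'corp', 'store', 'bsh',
   'STORE_ID + "-" + WORKSTATION_NUMBER',
   current_timestamp, current_timestamp);
```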
The following example will synchronize to all nodes if the FLAG column has changed, otherwise no
nodes will be synchronized.
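A sketch of such a script, relying on the bound FLAG and OLD_FLAG values; the router_id is hypothetical. The expression evaluates to true when the value changed, which routes to all nodes, and to false otherwise.

```sql
-- Bean Shell router that routes to all nodes only when FLAG has changed.
insert into sym_router
  (router_id, source_node_group_id, target_node_group_id, router_type,
   router_expression, create_time, last_update_time)
values
  ('corp-2-store-flag', 'corp', 'store', 'bsh',
   'FLAG != null && !FLAG.equals(OLD_FLAG)',
   current_timestamp, current_timestamp);
```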
An initial load is the process of seeding tables at a target node with data from its parent node. When a
node connects and data is extracted, after it is registered and if an initial load was requested, each table
that is configured to synchronize to the target node group will be given a reload event in the order defined
by the end user. A SQL statement is run against each table to get the data load that will be streamed to the
target node. The selected data is filtered through the configured router for the table being loaded. If the
data set is going to be large, then SQL criteria can optionally be provided to pare down the data that is
An initial load cannot occur until after a node is registered. An initial load is requested by setting the
initial_load_enabled column on NODE_SECURITY to 1 on the row for the target node in the parent
node's database. The next time the target node synchronizes, reload batches will be inserted. At the same
time reload batches are inserted, all previously pending batches for the node are marked as successfully
sent.
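Requesting an initial load could look like the sketch below, run against the parent node's database. It assumes the default sym_ table prefix; the node id '001' is a hypothetical example.

```sql
-- Ask for an initial load to be sent to node 001 on its next synchronization.
update sym_node_security
   set initial_load_enabled = 1
 where node_id = '001';
```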
Important
Note that if the parent node that a node is registering with is not a registration server node (as
can happen with a registration redirect or certain non-tree structure node configurations) the
parent node's NODE_SECURITY entry must exist at the parent node and have a non-null
value for column initial_load_time. Nodes can't be registered to non-registration-server nodes
without this value being set one way or another (i.e., manually, or as a result of an initial load
occurring at the parent node).
SymmetricDS recognizes that an initial load has completed when the initial_load_time column on the
target node is set to a non-null value.
An initial load is accomplished by inserting reload batches in a defined order according to the
initial_load_order column on TRIGGER_ROUTER. Initial load data is always queried from the source
database table. All data is passed through the configured router to filter out data that might not be targeted
at a node.
An efficient way to select a subset of data from a table for an initial load is to provide an
initial_load_select clause on TRIGGER_ROUTER. This clause, if present, is applied as a where clause
to the SQL used to select the data to be loaded. If an initial_load_select clause is provided, data will not
be passed through the configured router during initial load. In cases where custom routing is done using a
feature like Section 4.6.2.4, Relational Router (p. 32), an initial_load_select clause will always need to
be provided because the router would not function properly with initial load data.
One example of the use of an initial load select would be if you wished to only load data created more
recently than the start of year 2011. Say, for example, the column created_time contains the creation date.
Your initial_load_select would read created_time > {ts '2011-01-01 00:00:00.0000'} (using whatever
timestamp format works for your database). This then gets applied as a where clause when selecting data
from the table.
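Setting that clause might look like the following sketch. It assumes the default sym_ table prefix; the trigger_id and router_id values are hypothetical. Single quotes inside the clause are doubled because the clause is itself stored as a SQL string literal.

```sql
-- Limit the initial load of one trigger/router pair to rows created in 2011 or later.
update sym_trigger_router
   set initial_load_select = 'created_time > {ts ''2011-01-01 00:00:00.0000''}'
 where trigger_id = 'sale_transaction'
   and router_id = 'store-2-corp';
```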
Important
When providing an initial_load_select be sure to test out the criteria against production data
in a query browser. Do an explain plan to make sure you are properly using indexes.
Occasionally the decision of what data to load initially results in additional triggers. These triggers,
known as Dead Triggers, are configured such that they do not capture any data changes. In other words,
the sync_on_insert, sync_on_update, and sync_on_delete properties for the Trigger are all set to false.
However, since the Trigger is specified, it
will be included in the initial load of data for target Nodes.
Why might you need a Dead Trigger? A dead Trigger might be used to load a read-only lookup table, for
example. It could also be used to load a table that needs to be populated with example or default data. Another
use is a recovery load of data for tables that have a single direction of synchronization. For example, a
retail store records sales transactions that synchronize in one direction by trickling back to the central
office. If the retail store needs to recover all the sales transactions from the central office, they can be sent
as part of an initial load from the central office by setting up dead Triggers that "sync" in that direction.
The following SQL statement sets up a non-syncing dead Trigger that sends the sale_transaction table to
the "store" Node Group from the "corp" Node Group during an initial load.
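A sketch of a dead Trigger and its mapping follows. It assumes the default sym_ table prefix, a 'sale_transaction' channel, and an existing 'corp-2-store' router from "corp" to "store". The trigger_id and the initial_load_order value are hypothetical.

```sql
-- A trigger that captures nothing: all three sync_on_* flags are off.
insert into sym_trigger
  (trigger_id, source_table_name, channel_id,
   sync_on_insert, sync_on_update, sync_on_delete,
   last_update_time, create_time)
values
  ('sale_transaction_dead', 'sale_transaction', 'sale_transaction',
   0, 0, 0, current_timestamp, current_timestamp);

-- Map it to the corp-to-store router so the table is included in initial loads.
insert into sym_trigger_router
  (trigger_id, router_id, initial_load_order, create_time, last_update_time)
values
  ('sale_transaction_dead', 'corp-2-store', 100, current_timestamp, current_timestamp);
```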
As discussed in Section 3.6.3.2, Circular References and "Ping Back" (p. 22), SymmetricDS, by default,
avoids circular data changes. When a trigger fires as a result of SymmetricDS itself (such as the case
when sync on incoming batch is set), it records the originating source node of the data change in
source_node_id. During routing, if routing results in sending the data back to the originating source node,
the data is not routed by default. If instead you wish to route the data back to the originating node, you
can set the ping_back_enabled column for the particular trigger/router combination. This will
cause the router to "ping" the data back to the originating node when it usually would not.
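Enabling ping back for one combination might look like this sketch, assuming the default sym_ table prefix. The trigger_id and router_id values are examples only.

```sql
-- Allow changes on the "item" trigger to be routed back to the node they came from.
update sym_trigger_router
   set ping_back_enabled = 1
 where trigger_id = 'item'
   and router_id = 'corp-2-store';
```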
SymmetricDS allows you to have multiple nodes with the same external_id. Out of the box,
openRegistration will open a new registration if a registration already exists for a node with the same
external_id. A new registration means a new node with a new node_id and the same external_id will be
created. If you want to re-register the same node you can use the reOpenRegistration() JMX method
which takes a node_id as an argument.
Chapter 5. Advanced Topics
This chapter focuses on a variety of topics, including deployment options, jobs, clustering, encryption,
synchronization control, and configuration of SymmetricDS.
By default, only the columns that changed will be updated in the target system.
More complex conflict resolution strategies can be accomplished by using the IDataLoaderFilter
extension point which has access to both old and new data.
A node will always push and pull data to other node groups according to the node group link
configuration. A node can only push data to and pull data from other nodes that are represented in the NODE table in its
database and have sync_enabled = 1. Because of this, a tree-like hierarchy of nodes can be created by
having only a subset of nodes belonging to the same node group represented at the different branches of
the tree.
If auto registration is turned off, then this setup must occur manually by opening registration for the
desired nodes at the desired parent node and by configuring each node's registration.url to be the parent
node's URL. The parent node is always tracked by the setting of the parent's node_id in the
created_at_node_id column of the new node. When a node registers and downloads its configuration it is
always provided the configuration for nodes that might register with the node itself based on the Node
Group Links defined in the parent node.
When deploying a multi-tiered system it may be advantageous to have only one registration server, even
though the parent node of a registering node could be any of a number of nodes in the system. In
SymmetricDS the parent node is always the node that a child registers with. The
REGISTRATION_REDIRECT table allows a single node, usually the root server in the network, to
redirect registering nodes to their true parents. It does so based on a mapping found in the table of the
external id (registrant_external_id) to the parent's node id (registration_node_id).
For example, if it is desired to have a series of regional servers that workstations at retail stores get
assigned to based on their external_id, the store number, then you might insert into
REGISTRATION_REDIRECT the store number as the registrant_external_id and the node_id of the
assigned region as the registration_node_id. When a workstation at the store registers, the root server
sends an HTTP redirect to the sync_url of the node that matches the registration_node_id.
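Such a redirect row might be inserted as sketched here, assuming the default sym_ table prefix. The store number '4001' and the regional node id 'region-east' are hypothetical values.

```sql
-- Send workstations of store 4001 to the regional server "region-east" to register.
insert into sym_registration_redirect
  (registrant_external_id, registration_node_id)
values
  ('4001', 'region-east');
```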
Important
Please see Section 4.6.3.1, Initial Load (p. 34) for important details around initial loads and
registration when using registration redirect.
5.2. Jobs
The SymmetricDS software allows for outgoing and incoming changes to be synchronized to/from other
databases. The node that initiates a synchronization connection is the client, and the node receiving a
connection is the host. Because synchronization is configurable to push or pull in either direction, the
same node can act as either a client or a host in different circumstances.
The SymmetricDS software consists of a series of background jobs, managers, Servlets, and services
wired together via dependency injection using the Spring Framework.
As a client, the node runs the router job, push job and pull job on a timer thread. The router job uses
services to create batches that are targeted at certain nodes. The push job uses services to extract and
stream data to another node (that is, it pushes data). The response from a push is a list of batch
acknowledgements to indicate that data was loaded. The pull job uses services to load data that is streamed
from another node (i.e., it pulls data). After loading data, a second connection is made to send a list of
batch acknowledgements.
As a host, the node waits for incoming connections that pull, push, or acknowledge data changes. The
push Servlet uses services to load data that is pushed from a client node. After loading data, it responds
with a list of batch acknowledgements. The pull Servlet uses services to extract and stream data back to
the client node. The ack Servlet uses services to update the status of data that was loaded at a client node.
The router job batches and routes data.
By default, data is extracted from the source database into memory until a threshold size is reached. If the
threshold size is reached, data is streamed to a temporary file in the JVM's default temporary directory.
Next, the data is streamed to the target node across the transport layer. The receiving node will cache the
data in memory until the threshold size is reached, writing to a temporary file if necessary. Finally, the
data is loaded into the target database by the data loader. This step-by-step approach allows for extract
time, transport time, and load time to all be measured independently. It also allows database resources to
be used more efficiently.
The transport manager handles the incoming and outgoing streams of data between nodes. The default
transport is based on a simple implementation over HTTP. An internal transport is also provided. It is
possible to add other implementations, such as a socket-based transport manager.
The StandaloneSymmetricEngine is a wrapper API that can be used to directly start the client services only.
The SymmetricWebServer is a wrapper API that can be used to directly start both the client and host services
inside a Jetty web container. The SymmetricLauncher provides command line tools to work with and start
SymmetricDS.
5.2.1.1. Overview
The SymmetricDS-created database triggers cause data to be captured in the DATA table. The next step in
the synchronization process is to process the change data to determine which nodes, if any, the data
should be routed to. This step is performed by the Route Job. In addition to determining which nodes data
will be sent to, the Route Job is also responsible for determing how much data will be batched together
for transport. It is a single background task that inserts into DATA_EVENT and OUTGOING_BATCH.
At a high level, the Route Job is straightforward. It collects a list of data ids from DATA which haven't
yet been routed (see Section 5.2.1.2, Data Gaps (p. 41) for much more detail about this step), one channel
at a time, up to a limit specified by the channel configuration (max_data_to_route, on CHANNEL). The
data is then batched based on the batch_algorithm defined for the channel and as documented in
Section 4.5, Channel (p. 27). Note that, for the default batching algorithm, there may actually be more
than max_data_to_route rows included, depending on the transaction boundaries. The mapping of data to specific
nodes, organized into batches, is then recorded in OUTGOING_BATCH with a status of "RT" in each
case (representing the fact that the Route Job is still running). Once the routing algorithms and batching
are completed, the batches are organized with their corresponding data ids and saved in DATA_EVENT.
Once DATA_EVENT is updated, the rows in OUTGOING_BATCH are updated to a status of "NE"
(New).
On the surface, the first Route Job step of collecting unrouted data ids seems simple: assign sequential
data ids for each data row as it's inserted and keep track of which data id was last routed and start from
there. The difficulty arises, however, due to the fact that there can be multiple transactions inserting into
DATA simultaneously. As such, a given section of rows in the DATA table may actually contain "gaps"
in the data ids when the Route Job is executing. Most of these gaps are only temporary: they fill in at
some point after routing and need to be picked up by the next run of the Route Job. Thus, the Route Job
needs to remember to route the filled-in gaps. Worse yet, some of these gaps are actually permanent and
result from a transaction that is rolled back for some reason. In this case, the Route Job must continue to
watch for the gap to fill in and, at some point, give up and assume the gap is permanent and
can be skipped. All of this must be done in a fashion that guarantees that gaps are routed when they
fill in while also keeping routing as efficient as possible.
SymmetricDS handles the issue of data gaps by making use of a table, DATA_GAP, to record gaps found
in the data ids. In fact, this table completely defines the entire range of data that can be routed at any point
in time. For a brand new instance of SymmetricDS, this table is empty and SymmetricDS creates a gap
starting from a data id of zero and ending with a very large number (defined by routing.largest.gap.size).
At the start of a Route Job, the list of valid gaps (gaps with a status of 'GP') is collected, and each gap is
evaluated in turn. If a gap is sufficiently old (as defined by routing.stale.dataid.gap.time.ms), the gap is
marked as skipped (status of 'SK') and will no longer be evaluated in future Route Jobs. Note that the 'last'
gap (the one with the highest starting data id) is never skipped. If not skipped, then DATA_EVENT is
searched for data ids present in the gap. If one or more data ids are found in DATA_EVENT, then the
current gap is marked with a status of OK, and new gap(s) are created to represent the data ids still
missing in the gap's range. This process is repeated for all gaps. If the very last gap contained data, a new gap
starting from the highest data id and ending at (highest data id + routing.largest.gap.size) is then
created. The result is an updated list of gaps which may contain new data to be routed.
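The step that replaces a gap with the smaller gaps still missing from its range can be sketched as follows. This is a simplified, in-memory illustration rather than the actual routing code: the GapSplitter and Gap names are hypothetical, gap boundaries are assumed to be inclusive, and the database access against DATA_GAP and DATA_EVENT is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;

/** Hypothetical sketch of splitting one gap around the data ids found inside it. */
class GapSplitter {

    /** An inclusive range of data ids that have not been seen yet (assumed semantics). */
    static final class Gap {
        final long start;
        final long end;

        Gap(long start, long end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + "," + end + "]";
        }
    }

    /** Returns the ranges of the gap that are still missing after foundIds are accounted for. */
    static List<Gap> remainingGaps(Gap gap, SortedSet<Long> foundIds) {
        List<Gap> result = new ArrayList<>();
        long next = gap.start; // lowest id not yet accounted for
        // Only ids inside the gap matter; subSet's upper bound is exclusive.
        for (long id : foundIds.subSet(gap.start, gap.end + 1)) {
            if (id > next) {
                result.add(new Gap(next, id - 1));
            }
            next = id + 1;
        }
        if (next <= gap.end) {
            result.add(new Gap(next, gap.end));
        }
        return result;
    }
}
```

For a gap covering ids 10 through 20 in which ids 12, 13, and 17 have been found, the remaining gaps are 10 to 11, 14 to 16, and 18 to 20. The sketch does not guard against overflow when a gap ends at the largest possible long value.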
The Route Job determines which nodes data will be sent to, as well as how much data will be batched
together for transport. When the start.route.job SymmetricDS property is set to true, the frequency at which
routing occurs is controlled by the job.routing.period.time.ms property. Each time data is routed, the DATA_REF
table is updated with the id of the last contiguous data row to have been processed. This is done so that the
query to find unrouted data is efficient.
After data is routed, it awaits transport to the target nodes. Transport can occur when a client node is
configured to pull data or when the host node is configured to push data. These events are controlled by
the Push and the Pull Jobs. When the start.pull.job SymmetricDS property is set to true, the frequency
at which data is pulled is controlled by the job.pull.period.time.ms property. When the start.push.job SymmetricDS
property is set to true, the frequency at which data is pushed is controlled by the job.push.period.time.ms property.
Data is extracted by channel from the source database's DATA table at an interval controlled by the
extract_period_millis column on the CHANNEL table. The last_extract_time is always recorded, by
channel, on the NODE_CHANNEL_CTL table for the host node's id. When the Pull and Push Jobs run, if
the extract period has not passed according to the last extract time, then the channel will be skipped for
this run. If the extract_period_millis is set to zero, data extraction will happen every time the jobs run.
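The extract period rule described above amounts to a simple time comparison. The following sketch shows one way to express it; the class and method names are hypothetical, and treating a missing last extract time as "extract now" is an assumption, not documented behavior.

```java
/** Hypothetical sketch of the per-channel extract period check. */
class ExtractPeriod {

    /**
     * @param lastExtractTimeMillis last recorded extract time, or null if none (assumed to mean "extract")
     * @param extractPeriodMillis   value of extract_period_millis for the channel
     * @param nowMillis             current time
     */
    static boolean shouldExtract(Long lastExtractTimeMillis, long extractPeriodMillis, long nowMillis) {
        // A period of zero means the channel is extracted every time the job runs.
        if (extractPeriodMillis <= 0 || lastExtractTimeMillis == null) {
            return true;
        }
        return nowMillis - lastExtractTimeMillis >= extractPeriodMillis;
    }
}
```

A channel with a 60000 ms period that was last extracted at time 1000 would be skipped by a job running at time 30000 and extracted by a job running at time 61000.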
SymmetricDS also provides the ability to configure windows of time when synchronization is allowed.
This is done using the NODE_GROUP_CHANNEL_WINDOW table. A list of allowed time windows
can be specified for a node group and a channel. If one or more windows exist, then data will only be
extracted and transported if the time of day falls within the window of time specified. The configured
times are always for the target node's local time. If the start_time is greater than the end_time, then the
window crosses over to the next day.
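The rule for windows that cross midnight can be illustrated with java.time.LocalTime. This is a sketch only: the SyncWindow class is hypothetical, and including both the start and end times in the window is an assumption.

```java
import java.time.LocalTime;

/** Hypothetical sketch of a time-of-day window check, including windows that wrap past midnight. */
class SyncWindow {

    static boolean inWindow(LocalTime start, LocalTime end, LocalTime now) {
        if (start.isAfter(end)) {
            // The window crosses over to the next day, for example 22:00 to 02:00.
            return !now.isBefore(start) || !now.isAfter(end);
        }
        // Ordinary window within a single day, for example 08:00 to 17:00.
        return !now.isBefore(start) && !now.isAfter(end);
    }
}
```

With a window of 22:00 to 02:00, both 23:30 and 01:00 fall inside the window, while 12:00 does not. The time passed in as now would need to be the target node's local time, as stated above.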
All data loading may be disabled by setting the dataloader.enable property to false. This has the effect of
not allowing incoming synchronizations, while allowing outgoing synchronizations. All data extractions
may be disabled by setting the dataextractor.enable property to false. These properties can be controlled
by inserting into the root server's PARAMETER table. These properties affect every channel with the
exception of the 'config' channel.
A configuration entry in Trigger without any history in Trigger Hist results in a new trigger being created
(N). The Trigger Hist stores a hash of the underlying table, so any alteration to the table causes the trigger
to be rebuilt (S). When the last_update_time is changed on the Trigger entry, the configuration change
causes the trigger to be rebuilt (C). If an entry in Trigger Hist is missing the corresponding database
trigger, the trigger is created (T).
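The four rebuild reasons above can be summarized as a small decision function. The sketch below is illustrative only: the TriggerSync class, the boolean inputs, the NONE value, and the order in which the conditions are checked are assumptions made for the example.

```java
/** Hypothetical sketch mapping the conditions described above to the reason codes N, S, C, and T. */
class TriggerSync {

    enum Reason { N, S, C, T, NONE }

    static Reason rebuildReason(boolean hasHistory, boolean tableChanged,
                                boolean configChanged, boolean dbTriggerExists) {
        if (!hasHistory) {
            return Reason.N; // no Trigger Hist entry yet: new trigger
        }
        if (tableChanged) {
            return Reason.S; // stored table hash no longer matches the table
        }
        if (configChanged) {
            return Reason.C; // last_update_time changed on the Trigger entry
        }
        if (!dbTriggerExists) {
            return Reason.T; // history exists but the database trigger is missing
        }
        return Reason.NONE;  // nothing to do (value added for the sketch)
    }
}
```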
The process of examining triggers and rebuilding them is automatically run during startup and each night
by the SyncTriggersJob. The user can also manually run the process at any time by invoking the
syncTriggers() method over JMX. The SyncTriggersJob is enabled by default to run at 15 minutes past
midnight. If SymmetricDS is being run from a collection of servers (multiple instances of the same Node
running against the same database), then locking should be enabled to prevent database contention. The
following runtime properties control the behavior of the process.
start.synctriggers.job
Whether the sync triggers job is enabled for this node. [ Default: true ]
job.synctriggers.aftermidnight.minutes
If scheduled, the sync triggers job will run nightly. This is how long after midnight that job will run.
[ Default: 15 ]
cluster.lock.during.sync.triggers
Indicate if the sync triggers job is clustered and requires a lock before running. [ Default: false ]
The following is an example extension point configuration that will publish four tables in XML with a
root tag of 'sale'. Rows are grouped into one XML message when they belong to the same batch and have
the same values for the columns named in the groupByColumnNames property.
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.springframework.org/schema/beans
        http://www.springframework.org/schema/beans/spring-beans.xsd">

    <bean id="configuration-publishingFilter"
        class="org.jumpmind.symmetric.integrate.XmlPublisherDataLoaderFilter">
        <property name="xmlTagNameToUseForGroup" value="sale"/>
        <property name="tableNamesToPublishAsGroup">
            <list>
                <value>SALE_TX</value>
                <value>SALE_LINE_ITEM</value>
                <value>SALE_TAX</value>
                <value>SALE_TOTAL</value>
            </list>
        </property>
        <property name="groupByColumnNames">
            <list>
                <value>STORE_ID</value>
                <value>BUSINESS_DAY</value>
                <value>WORKSTATION_ID</value>
                <value>TRANSACTION_ID</value>
            </list>
        </property>
        <property name="publisher">
            <bean class="org.jumpmind.symmetric.integrate.SimpleJmsPublisher">
                <property name="jmsTemplate" ref="definedSpringJmsTemplate"/>
            </bean>
        </property>
    </bean>
</beans>
The publisher property on the XmlPublisherDataLoaderFilter takes an interface of type IPublisher. The
implementation demonstrated here publishes to JMS using Spring's JMS template. Other implementations
of IPublisher could publish the XML to other targets, such as an HTTP server or the file system, or
secure copy it to another server.
        <data key="STORE_ID">001</data>
        <data key="BUSINESS_DAY">2010-01-22</data>
        <data key="WORKSTATION_ID">003</data>
        <data key="TRANSACTION_ID">1234</data>
        <data key="AMOUNT">1.33</data>
    </row>
    <row entity="SALE_TOTAL" dml="I">
        <data key="STORE_ID">001</data>
        <data key="BUSINESS_DAY">2010-01-22</data>
        <data key="WORKSTATION_ID">003</data>
        <data key="TRANSACTION_ID">1234</data>
        <data key="AMOUNT">21.33</data>
    </row>
</sale>
To publish JMS messages during routing, the same pattern is valid, with the exception that the extension
point would be the XmlPublisherDataRouter and the router would be configured by setting the
router_type of a ROUTER to the Spring bean name of the registered extension point. Of course, the
router would need to be linked through TRIGGER_ROUTERs to each TRIGGER table that needs to be
published.
• Web Archive (WAR)
This option means packaging a WAR file and deploying to your favorite web server, like Apache
Tomcat. It's a little more work, but you can configure the web server to do whatever you need.
SymmetricDS can also be embedded in an existing web application, if desired.
• Standalone
This option means running the sym command line, which launches the built-in Jetty web server.
This is a simple option because it is already provided, but you lose the flexibility to configure the
web server any further.
• Embedded
This option means you must write a wrapper Java program that runs SymmetricDS. You would
probably use Jetty web server, which is also embeddable. You could bring up an embedded
database like Derby or H2. You could configure the web server, database, or SymmetricDS to do
whatever you needed, but it's also the most work of the three options discussed thus far.
• Grails Application
A Grails SymmetricDS plugin is provided at the default Grails plugin site. This option ends up
being a WAR deployment, but allows for the use of the Grails SDK for configuring and building
the deployment. The plugin also provides Gorm (Hibernate) access to many of the core database
tables.
The deployment model you choose depends on how much flexibility you need versus how easy you want
it to be. Both Jetty and Tomcat are excellent, scalable web servers that compete with each other and have
great performance. Most people choose either the Standalone or Web Archive with Tomcat 5.5 or 6.
Deploying to Tomcat is a good middle-of-the-road decision that requires a little more work for more
flexibility.
Next, we will go into a little more detail on the first three deployment options listed above.
A war file can be generated using the standalone installation's sym utility and the --create-war option. The
command requires the name of the war file to generate. It essentially packages up the web directory, the
conf directory and includes an optional properties file. Note that if a properties file is included, it will be
copied to WEB-INF/classes/symmetric.properties. This is the same location conf/symmetric.properties
would have been copied to. The generated war distribution uses the same web.xml as the standalone
deployment.
5.4.2. Standalone
A standalone service can use the sym command line options to start a server. An embedded instance of
Jetty is used to service web requests for all the servlets.
This example starts the SymmetricDS server on port 8080 with the startup properties found in the
root.properties file.
5.4.3. Embedded
A Java application with the SymmetricDS Java Archive (JAR) library on its classpath can use the
SymmetricWebServer to start the server.
import org.jumpmind.symmetric.SymmetricWebServer;

/**
 * Start an engine that is configured by two properties files. One is
 * packaged with the application and contains overridden properties that are
 * specific to the application. The other is found in the application's
 * working directory. It can be used to set up environment-specific
 * properties.
 */
public static void main(String[] args) throws Exception {
    // 'node' is the SymmetricWebServer instance; its construction is not
    // shown in this excerpt.

    // this will create the database, sync triggers, start jobs running
    node.start(8080);
}
This example starts the SymmetricDS server on port 8080 with startup properties found in two locations.
The first file, my-application.properties, is packaged in the application to provide properties that override
the SymmetricDS default values.
bin\install_service.bat
The service configuration is found in conf/sym_service.conf. Edit this file if you want to change the
default port number (8080), initial memory size (256 MB), log file size (10 MB), or other settings. When
started, the server will look in the conf directory for the symmetric.properties file and the log4j.xml file.
Logging for standard out, error, and application is written to the logs directory.
Most configuration changes do not require the service to be re-installed. To un-install the service, use the
provided script:
bin\uninstall_service.bat
An init script is provided to work with standard Unix run configuration levels. The sym_service.initd file
follows the Linux Standard Base specification, which should work on many systems, including Fedora
and Debian-based distributions. To install the script, copy it into the system init directory:
cp bin/sym_service.initd /etc/init.d/sym_service
Edit the init script to set the SYM_HOME variable to the directory where SymmetricDS is located. The
init script calls the sym_service executable.
/usr/lib/lsb/install_initd sym_service
/usr/lib/lsb/remove_initd sym_service
Use the service command to start, stop, and query the status of the service:
/etc/init.d/sym_service start
/etc/init.d/sym_service stop
/etc/init.d/sym_service status
5.7. Clustering
A single SymmetricDS node may be clustered across a series of instances, creating a web farm. A node
might be clustered to provide load balancing and failover, for example.
When clustered, a hardware load balancer is typically used to round robin client requests to the cluster.
The load balancer should be configured for stateless connections. Also, the sync.url SymmetricDS property
(discussed in Section 4.1, Node Properties (p. 24)) should be set to the URL of the load
balancer.
If the cluster will be running any of the SymmetricDS jobs, then the cluster.lock.enabled property should
be set to true. By setting this property to true, SymmetricDS will use a row in the LOCK table as a
semaphore to make sure that only one instance at a time runs a job. When a lock is acquired, a row is
updated in the lock table with the time of the lock and the server id of the locking job. The lock time is set
back to null when the job is finished running. Another instance of SymmetricDS cannot acquire a lock
until the locking instance (according to the server id) releases the lock. If an instance is terminated while
the lock is still held, an instance with the same server id is allowed to reacquire the lock. If the locking
instance remains down, the lock can be broken after a period of time, specified by the
cluster.lock.timeout.ms property, has expired. Note that if the job is still running and the lock expires,
two jobs could be running at the same time which could cause database deadlocks.
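The locking rules above can be modeled in memory as follows. This is a sketch of the semantics only, not the SymmetricDS implementation, which stores the lock in a row of the LOCK table; the ClusterLock class and its method signatures are hypothetical.

```java
/** Hypothetical in-memory model of one row of the LOCK table. */
class ClusterLock {
    private Long lockTimeMillis;    // null while no job holds the lock
    private String lockingServerId; // server id of the most recent holder

    synchronized boolean acquire(String serverId, long nowMillis, long timeoutMillis) {
        boolean free = lockTimeMillis == null;
        // An instance with the same server id may reacquire a lock it left behind.
        boolean sameServer = serverId.equals(lockingServerId);
        // A lock held longer than the timeout may be broken by another instance.
        boolean expired = !free && nowMillis - lockTimeMillis > timeoutMillis;
        if (free || sameServer || expired) {
            lockTimeMillis = nowMillis;
            lockingServerId = serverId;
            return true;
        }
        return false;
    }

    synchronized void release(String serverId) {
        // Only the current holder sets the lock time back to null.
        if (serverId.equals(lockingServerId)) {
            lockTimeMillis = null;
        }
    }
}
```

In this model, the timeout argument plays the role of cluster.lock.timeout.ms and the serverId argument the role of the locking server id. As noted above, breaking an expired lock while the original job is still running can leave two jobs running at once.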
By default, the locking server id is the hostname of the server. If two clustered instances are running on
the same server, then the cluster.server.id property may be set to indicate the name that the instance
should use for its server id.
When deploying SymmetricDS to an application server like Tomcat or JBoss, no special session
clustering needs to be configured for the application server.
following command:
sym -e secret
The text is encrypted by the cipher defined as alias "sym.secret" in the Java keystore. The keystore is
specified by the "sym.keystore.file" system property, which defaults to security/keystore. If a cipher is
not found, a default cipher using Triple DES with a random password is generated.
registration.url
This is the URL where the node will connect for registration when it first starts up. To protect the
registration with SSL, you specify "https" in the URL.
For incoming HTTPS connections, SymmetricDS depends on the webserver where it is deployed, so the
webserver must be configured for HTTPS. As a standalone deployment, the "sym" launcher command
provides options for enabling HTTPS support.
5.9.2. Tomcat
If you deploy SymmetricDS to Apache Tomcat, it can be secured by editing the
TOMCAT_HOME/conf/server.xml configuration file. There is already a line that can be uncommented and
changed to the following:
5.9.3. Keystores
When SymmetricDS connects to a URL with HTTPS, Java checks the validity of the certificate using the
built-in trusted keystore located at JRE_HOME/lib/security/cacerts. The "sym" launcher command
overrides the trusted keystore to use its own trusted keystore instead, which is located at security/cacerts.
This keystore contains the certificate aliased as "sym" for use in testing and easing deployments. The
trusted keystore can be overridden by specifying the javax.net.ssl.trustStore system property.
When SymmetricDS is run as a secure server with the "sym" launcher, it accepts incoming requests using
the key installed in the keystore located at security/keystore. The default key is provided for convenience
of testing, but should be re-generated for security.
keytool -keystore keystore -alias sym -genkey -keyalg RSA -validity 10950
http.basic.auth.password
password for client node basic authentication. [ Default: ]
The SymmetricDS Standalone and Embedded Server also support basic authentication. This feature is
enabled by specifying the basic authentication username and password using the following startup
parameters:
embedded.webserver.basic.auth.username
username for basic authentication for an embedded server or standalone server node. [ Default: ]
embedded.webserver.basic.auth.password
password for basic authentication for an embedded server or standalone server node. [ Default: ]
If the server node is deployed to Tomcat or another application server as a WAR or EAR file, then basic
authentication is set up with the standard configuration in the web.xml file.
5.11. IP Filtering
SymmetricDS supports restricting IP addresses of clients that are allowed to connect to servers. The
following filtering functionality is supported for IPv4 addresses (IPv6 is currently not supported).
• CIDR (Classless Inter-Domain Routing) notation
• Wildcarding
• Range
• Literal
The basis for implementing CIDR notation is defining the IP address block and the significant bits of that
address that are to be checked. The filter must be a well-formatted IP address ending with a “/”
followed by a numeric value between 0 and 32. The use of “0” denotes that all IP addresses are allowed
(in which case it's fairly pointless to enable the filtering framework), and “32” signifies that only the
preceding IP address would be authorized. In the latter case, a Literal Filter string would be
recommended as it is significantly more obvious that only that address is allowed.
#
# Filter string definition to restrict connecting client
# IP addresses
#
ip.filters=10.10.4.32/27, 10.5.0.0/16
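To make the CIDR check concrete, the following sketch compares the significant bits of a client address against a filter such as 10.10.4.32/27. It illustrates the general technique and is not the filter code shipped with SymmetricDS; the CidrFilter class and its methods are hypothetical.

```java
/** Hypothetical sketch of matching an IPv4 address against a CIDR filter. */
class CidrFilter {

    /** Packs a dotted-quad IPv4 address into a 32-bit int. */
    static int toInt(String ipv4) {
        String[] octets = ipv4.trim().split("\\.");
        if (octets.length != 4) {
            throw new IllegalArgumentException("Not an IPv4 address: " + ipv4);
        }
        int value = 0;
        for (String octet : octets) {
            int n = Integer.parseInt(octet);
            if (n < 0 || n > 255) {
                throw new IllegalArgumentException("Octet out of range: " + octet);
            }
            value = (value << 8) | n;
        }
        return value;
    }

    static boolean matches(String cidrFilter, String clientIp) {
        String[] parts = cidrFilter.trim().split("/");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Not a CIDR filter: " + cidrFilter);
        }
        int bits = Integer.parseInt(parts[1]);
        if (bits < 0 || bits > 32) {
            throw new IllegalArgumentException("Significant bits must be 0 to 32: " + bits);
        }
        // Java uses only the low 5 bits of a shift count, so /0 needs an explicit zero mask.
        int mask = (bits == 0) ? 0 : (-1 << (32 - bits));
        return (toInt(parts[0]) & mask) == (toInt(clientIp) & mask);
    }
}
```

With the filter 10.10.4.32/27, the address 10.10.4.40 matches and 10.10.4.64 does not, because only the first 27 bits are compared. A /32 filter matches exactly one address, which is why a Literal Filter is the clearer choice in that case.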
5.11.3. Wildcarding
The wildcard notation allows all values for a specific piece of an IP address to be valid (0 to 255 for IPv4
addresses). This is denoted with a “*” within the specific piece (octet for IPv4) of an IP address. The
wildcard character is the only allowable character within that piece of the address (no other characters,
including whitespace, are allowed).
Wildcard filters may be combined with Range Filters. They may NOT be combined with CIDR Filters.
#
# Filter string definition to restrict connecting client
# IP addresses
#
ip.filters=10.10.*.40
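A wildcard filter can be evaluated octet by octet, as in the sketch below. The WildcardFilter class is hypothetical and handles only wildcard and literal octets; range pieces are deliberately not handled here and would cause a NumberFormatException.

```java
/** Hypothetical sketch of matching an IPv4 address against a wildcard filter. */
class WildcardFilter {

    static boolean matches(String filter, String clientIp) {
        String[] filterOctets = filter.trim().split("\\.");
        String[] ipOctets = clientIp.trim().split("\\.");
        if (filterOctets.length != 4 || ipOctets.length != 4) {
            return false;
        }
        for (int i = 0; i < 4; i++) {
            if (filterOctets[i].equals("*")) {
                continue; // any value from 0 to 255 is accepted for this octet
            }
            if (Integer.parseInt(filterOctets[i]) != Integer.parseInt(ipOctets[i])) {
                return false;
            }
        }
        return true;
    }
}
```

With the filter 10.10.*.40, the address 10.10.7.40 matches, while 10.10.7.41 and 10.11.7.40 do not.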
#
# Filter string definition to restrict connecting client
# IP addresses
#
ip.filters=10.10.40-20.200-1
5.11.6. Configuration
IP filter strings are configured by defining the following property in the SymmetricDS
configuration (one of the SymmetricDS properties files). One need only define the ip.filters property and
assign it a comma (“,”) delimited string of filter tokens to provide to the filter framework.
#
# Filter string definition to restrict connecting client
# IP addresses
#
ip.filters=10.10.4.32/27, 100.50-40.10-5.*, 35.58.124.89
Important
Note that there is obvious overlap between some of the filtering notations and, hence, their
functionality. The Wildcarding and Range Filter functionality exists to provide workarounds
for scenarios where CIDR Filter notation and Literal Filters will not suffice.
Warning
Take care in defining your filter string as it is possible to overlap filters. Also, as with the
definition of any other property in the SymmetricDS configuration, if the property is defined
in multiple properties files the property file that is read in last will override any previous filter
string definitions.
Chapter 6. Extending SymmetricDS
SymmetricDS may be extended via a plug-in like architecture where extension point interfaces may be
implemented by a custom class and registered with the synchronization engine. All supported extension
points extend the IExtensionPoint interface. The currently available extension points are documented in
the following sections.
When the synchronization engine starts up, a Spring post processor searches the Spring
ApplicationContext for any registered classes which implement IExtensionPoint. An IExtensionPoint
designates whether it should be auto registered or not. If the extension point is to be auto registered then
the post processor registers the known interface with the appropriate service.
/**
 * Only apply this extension point to the 'root' node group.
 */
public String[] getNodeGroupIdsToApplyTo() {
    return new String[] { "root" };
}
SymmetricDS will look for Spring configured extensions in the application Classpath by importing any
Spring XML configuration files found matching the following pattern:
META-INF/services/symmetric-*-ext.xml. When packaged in a jar file the META-INF directory should be at
the root of the jar file. When packaged in a war file, the META-INF directory should be in the
WEB-INF/classes directory.
6.1. IParameterFilter
Parameter values can be specified in code using a parameter filter. Note that there can be only one
parameter filter per engine instance. The IParameterFilter replaces the deprecated IRuntimeConfig from
prior releases.
/**
 * Only apply this filter to stores
 */
public String[] getNodeGroupIdsToApplyTo() {
    return new String[] { "store" };
}
6.2. IDataLoaderFilter
Data can be filtered as it is loaded into the target database. It can also be filtered when it is extracted from
the source database. As data is loaded into the target database, a filter can change the data in a column or
save it somewhere else. It can also specify by the return value of the function call that the data loader
should continue on and load the data (by returning true) or ignore it (by returning false). One possible use
of the filter might be to route credit card data to a secure database and blank it out as it loads into a
less-restricted reporting database.
An IDataLoaderContext is passed to each of the callback methods. A new context is created for each
synchronization. The context provides methods to look up column indexes by column name, get table
metadata, and access old data if the sync_column_level flag is enabled. The context also provides a
means to share data during a synchronization between different rows of data that are committed in a
database transaction and are in the same channel. It does so by providing a context cache which can be
populated by the extension point.
Many times the IDataLoaderFilter will be combined with the IBatchListener. The XmlPublisherFilter (in
the org.jumpmind.symmetric.ext package) is a good example of using the combination of the two extension
points in order to create XML messages to be published to JMS.
A class implementing the IDataLoaderFilter interface is injected onto the DataLoaderService in order to
receive callbacks when data is inserted, updated, or deleted.
The filter class is specified as a Spring-managed bean. A custom Spring XML file is specified as follows
in a jar at META-INF/services/symmetric-myfilter-ext.xml.
</beans>
6.3. ITableColumnFilter
Implement this extension point to filter out specific columns from use by the dataloader. Only one
column filter may be added per target table.
6.4. IBatchListener
This extension point is called whenever a batch has completed loading but before the transaction has
committed.
6.5. IAcknowledgeEventListener
Implement this extension point to receive callback events when a batch is acknowledged. The callback for
this listener happens at the point of extraction.
6.6. IReloadListener
Implement this extension point to listen in and take action before or after a reload is requested for a Node.
The callback for this listener happens at the point of extraction.
6.7. IExtractorFilter
This extension point is called after data has been extracted, but before it has been streamed. It has the
ability to inspect each row of data to take some action and indicate, if necessary, that the row should not
be streamed.
6.8. ISyncUrlExtension
This extension point is used to select an appropriate URL based on the URI provided in the sync_url
column of sym_node.
To use this extension point, configure the sync_url for a node with the protocol of ext://beanName. The
beanName is the name you give the extension point in the extension xml file.
6.9. INodeIdGenerator
This extension point allows SymmetricDS users to implement their own algorithms for how node ids and
passwords are generated or selected during the registration process. There may be only one node
generator per SymmetricDS instance.
6.10. ITriggerCreationListener
Implement this extension point to get status callbacks during trigger creation.
6.11. IBatchAlgorithm
Implement this extension point and set the name of the Spring bean on the batch_algorithm column of the
Channel table to use. This extension point gives fine grained control over how a channel is batched.
6.12. IDataRouter
Implement this extension point and set the name of the Spring bean on the router_type column of the
Router table to use. This extension point gives the ability to programmatically decide which nodes data
should be routed to.
6.13. IHeartbeatListener
Implement this extension point to get callbacks during the heartbeat job.
6.14. IOfflineClientListener
Implement this extension point to get callbacks for offline events on client nodes.
6.15. IOfflineServerListener
Implement this extension point to get callbacks for offline events detected on a server node during
monitoring of client nodes.
6.16. INodePasswordFilter
Implement this extension point to intercept the saving and rendering of the node password.
6.17. IServletExtension
Implement this extension point to allow additional Servlets to be registered with SymmetricDS. This is
probably only useful if SymmetricDS is running in standalone or embedded mode.
Chapter 7. Administration
update SYM_TRIGGER
   set channel_id = 'price',
       last_update_by = 'jsmith',
       last_update_time = current_timestamp
 where source_table_name = 'price_changes';
All configuration should be managed centrally at the registration node. If enabled, configuration changes
will be synchronized out to client nodes. When trigger changes reach the client nodes the Sync Triggers
Job will run automatically.
Centrally, the trigger changes will not take effect until the Sync Triggers Job runs. Instead of waiting for
the Sync Triggers Job to run overnight after making a Trigger change, you can invoke the syncTriggers()
method over JMX or simply restart the SymmetricDS server.
If this behavior is not desired, the feature can be turned off using a parameter. Custom triggers may be
added to the sym_* tables when the auto syncing feature is disabled.
SymmetricDS proxies all of its logging through Commons Logging. When deploying to an application
server, if Log4J is not being leveraged, then the general rules for Commons Logging apply.
Monitoring and administrative operations can be performed using Java Management Extensions (JMX).
SymmetricDS uses MX4J to expose JMX attributes and operations that can be accessed from the built-in
web console, Java's jconsole, or an application server. By default, the web management console can be
opened from the following address:
http://localhost:31416/
Using the Java jconsole command, SymmetricDS is listed as a local process named SymmetricLauncher.
In jconsole, SymmetricDS appears under the MBeans tab under the name defined by the engine.name
property. The default value is SymmetricDS.
SymmetricDS creates these temporary files in the directory specified by the java.io.tmpdir Java System
property. When SymmetricDS starts up, stranded temporary files are always cleaned up. Files will only
be stranded if the SymmetricDS engine is force killed.
The location of the temporary directory may be changed by setting the Java System property passed into
the Java program at startup. For example,
-Djava.io.tmpdir=/home/.symmetricds/tmp
• DATA
• DATA_EVENT
• OUTGOING_BATCH
• INCOMING_BATCH
• ???
The purge job is enabled by the start.purge.job SymmetricDS property. The job runs periodically
according to the job.purge.period.time.ms property. The default period is to run every ten minutes.
Two retention period properties indicate how much history SymmetricDS will retain before purging. The
purge.retention.minutes property indicates the period of history to keep for synchronization tables. The
default value is 5 days. The statistic.retention.minutes property indicates the period of history to keep
for statistics. The default value is also 5 days.
The purge properties should be adjusted according to how much data is flowing through the system and
the amount of storage space the database has. For an initial deployment it is recommended that the purge
properties be kept at the defaults, since it is often helpful to be able to look at the captured data in order to
triage problems and profile the synchronization patterns. When scaling up to more nodes, it is
recommended that the purge parameters be scaled back to 24 hours or less.
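Because the retention properties are expressed in minutes, it can help to work out the values and the resulting cutoff time explicitly. The sketch below is only an illustration: the PurgeCutoff class is hypothetical, and the assumption is that rows older than the cutoff become eligible for purging.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical sketch: derive a purge cutoff from a retention period in minutes. */
class PurgeCutoff {

    /** Converts a retention period in days to the minutes value used by the properties. */
    static long daysToMinutes(long days) {
        return Duration.ofDays(days).toMinutes();
    }

    /** Rows older than the returned instant are assumed to be eligible for purging. */
    static Instant cutoff(Instant now, long retentionMinutes) {
        return now.minus(Duration.ofMinutes(retentionMinutes));
    }
}
```

The default retention of 5 days corresponds to 7200 minutes, and the 24 hour setting suggested above for larger deployments corresponds to 1440 minutes.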
Appendix A. Data Model
What follows is the complete SymmetricDS data model. Note that all tables are prepended with a
configurable prefix so that multiple instances of SymmetricDS may coexist in the same database. The
default prefix is sym_.
SymmetricDS configuration is entered by the user into the data model to control the behavior of what
data is synchronized to which nodes.
At runtime, the configuration is used to capture data changes and route them to nodes. The data changes
are placed together in a single unit called a batch that can be loaded by another node. Outgoing batches
are delivered to nodes and acknowledged. Incoming batches are received and loaded. History is recorded
for batch status changes and statistics.
A.1. NODE
Representation of an instance of SymmetricDS that synchronizes data with one or more additional nodes.
Each node has a unique identifier (nodeId) that is used when communicating, as well as a domain-specific
identifier (externalId) that provides context within the local system.
NODE_GROUP_ID VARCHAR X The node group that this node belongs to, such
(50) as 'store'.
CREATED_AT_NODE_ID VARCHAR The node_id of the node where this node was
(50) created. This is typically filled automatically
with the node_id found in node_identity where
registration was opened for the node.
A.2. NODE_SECURITY
Security features like node passwords and open registration flag are stored in the node_security table.
INITIAL_LOAD_TIME TIMESTAMP The timestamp when this node started the initial
load.
CREATED_AT_NODE_ID VARCHAR X The node_id of the node where this node was
(50) created. This is typically filled automatically
with the node_id found in node_identity where
registration was opened for the node.
A.3. NODE_IDENTITY
After registration, this table will have one row representing the identity of the node. For a root node, the
row is entered by the user.
A.4. NODE_GROUP
A category of Nodes that synchronizes data with one or more NodeGroups. A common use of
NodeGroup is to describe a level in a hierarchy of data synchronization.
A.5. NODE_GROUP_LINK
A source node_group sends its data updates to a target NodeGroup using a pull, push, or custom
technique.
A.6. NODE_HOST
Representation of a physical workstation or server that is hosting the SymmetricDS software. In a
clustered environment there may be more than one entry per node in this table.
A.7. NODE_HOST_CHANNEL_STATS
START_TIME TIMESTAMP PK X
END_TIME TIMESTAMP PK X
DATA_ROUTED BIGINT 0 Indicate the number of data rows that have been
routed during this period.
DATA_UNROUTED BIGINT 0
DATA_EVENT_INSERTED BIGINT 0 Indicate the number of data rows that have been
DATA_EXTRACTED BIGINT 0
DATA_BYTES_EXTRACTED BIGINT 0
DATA_EXTRACTED_ERRORS BIGINT 0
DATA_BYTES_SENT BIGINT 0
DATA_SENT BIGINT 0
DATA_SENT_ERRORS BIGINT 0
DATA_LOADED BIGINT 0
DATA_BYTES_LOADED BIGINT 0
DATA_LOADED_ERRORS BIGINT 0
A.8. NODE_HOST_STATS
START_TIME TIMESTAMP PK X
END_TIME TIMESTAMP PK X
NODES_PULLED BIGINT 0
TOTAL_NODES_PULL_TIME BIGINT 0
NODES_PUSHED BIGINT 0
TOTAL_NODES_PUSH_TIME BIGINT 0
NODES_REJECTED BIGINT 0
NODES_REGISTERED BIGINT 0
NODES_LOADED BIGINT 0
NODES_DISABLED BIGINT 0
PURGED_DATA_EVENT_ROWS BIGINT 0
PURGED_BATCH_OUTGOING_ROWS BIGINT 0
PURGED_BATCH_INCOMING_ROWS BIGINT 0
TRIGGERS_CREATED_COUNT BIGINT
TRIGGERS_REBUILT_COUNT BIGINT
TRIGGERS_REMOVED_COUNT BIGINT
A.9. NODE_HOST_JOB_STATS
JOB_NAME VARCHAR PK X
(50)
START_TIME TIMESTAMP PK X
END_TIME TIMESTAMP PK X
PROCESSED_COUNT BIGINT 0
A.10. CHANNEL
This table represents a category of data that can be synchronized independently of other channels.
Channels allow control over the type of data flowing and prevent one type of synchronization from
contending with another.
MAX_DATA_TO_ROUTE INTEGER 100000 X The maximum number of data rows to route for
a channel at a time.
USE_OLD_DATA_TO_ROUTE INTEGER (1) 1 X Indicates whether to read the old data during
routing.
USE_ROW_DATA_TO_ROUTE INTEGER (1) 1 X Indicates whether to read the row data during
routing.
BATCH_ALGORITHM VARCHAR default X The algorithm to use when batching data on this
(50) channel. Possible values are: 'default',
'transactional', and 'nontransactional'
A.11. NODE_CHANNEL_CTL
Used to ignore or suspend a channel. A channel that is ignored will have its data_events batched and they
will immediately be marked as 'OK' without sending them. A channel that is suspended is skipped when
batching data_events.
LAST_EXTRACT_TIME TIMESTAMP Record the last time data was extracted for a node
and a channel.
A.12. NODE_GROUP_CHANNEL_WINDOW
An optional window of time for which a node group and channel will be active.
END_TIME TIME PK X The end time for the active window. Note that
if the end_time is less than the start_time then
the window crosses a day boundary.
ENABLED INTEGER (1) 0 X Enable this window. If this is set to '0' then this
window is ignored.
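The day-boundary rule for end_time can be sketched in Python as follows. This is only an illustration of the comparison, not the SymmetricDS implementation; the function name window_contains is hypothetical, and treating both boundaries as inclusive is an assumption.

```python
from datetime import time

def window_contains(start_time, end_time, now):
    """Return True if `now` lies inside the window.

    When end_time is earlier than start_time the window is taken to
    cross midnight, as described for the END_TIME column.
    """
    if start_time <= end_time:
        return start_time <= now <= end_time
    # Window wraps past midnight: late evening or early morning both match.
    return now >= start_time or now <= end_time

# A window from 22:00 to 02:00 covers 23:30 and 01:00 but not noon.
print(window_contains(time(22, 0), time(2, 0), time(23, 30)))
print(window_contains(time(22, 0), time(2, 0), time(12, 0)))
```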
A.13. TRIGGER
Configures database triggers that capture changes in the database. Configuration of which triggers are
generated for which tables is stored here. Triggers are created in a node's database if the
source_node_group_id of a router is mapped to a row in this table.
SOURCE_TABLE_NAME VARCHAR X The name of the source table that will have a
(50) trigger installed to watch for data changes.
A.14. ROUTER
Configure a type of router from one node group to another. Note that routers are mapped to triggers
through trigger_routers.
TARGET_TABLE_NAME VARCHAR Optional name for a target table. Only use this
(50) if the target table name is different than the
source.
SYNC_ON_UPDATE INTEGER (1) 1 X Flag that indicates that this router should route
updates.
SYNC_ON_INSERT INTEGER (1) 1 X Flag that indicates that this router should route
inserts.
A.15. TRIGGER_ROUTER
Map a trigger to a router.
PING_BACK_ENABLED INTEGER (1) 0 X When enabled, the node will route data that
originated from a node back to that node. This
attribute is only effective if
sync_on_incoming_batch is set to 1.
A.16. PARAMETER
Provides a way to manage most SymmetricDS settings in the database.
A.17. REGISTRATION_REDIRECT
Provides a way for a centralized registration server to redirect registering nodes to their prospective
parent node in a multi-tiered deployment.
A.18. REGISTRATION_REQUEST
Audits when a node registers or attempts to register.
A.19. TRIGGER_HIST
A history of a table's definition and the trigger used to capture data from the table. When a database
trigger captures a data change, it references a trigger_hist entry so it is possible to know which columns
the data represents. trigger_hist entries are made during the sync trigger process, which runs at each
startup, each night in the syncTriggersJob, or any time the syncTriggers() JMX method is manually
invoked. A new entry is made when a table definition or a trigger definition is changed, which causes a
database trigger to be created or rebuilt.
SOURCE_TABLE_NAME VARCHAR X The name of the source table that will have a
(50) trigger installed to watch for data changes.
NAME_FOR_UPDATE_TRIGGER VARCHAR X The name used when the update trigger was
(50) created.
NAME_FOR_INSERT_TRIGGER VARCHAR X The name used when the insert trigger was
(50) created.
NAME_FOR_DELETE_TRIGGER VARCHAR X The name used when the delete trigger was
(50) created.
A.20. DATA
The captured data change that occurred to a row in the database. Entries in data are created by database
triggers.
EVENT_TYPE CHAR (1) X The type of event captured by this entry. For
triggers, this is the change that occurred, which
is 'I' for insert, 'U' for update, or 'D' for delete.
Other events include: 'R' for reloading the entire
table (or subset of the table) to the node; 'S' for
running dynamic SQL at the node, which is
used for adhoc administration.
CHANNEL_ID VARCHAR The channel that this data belongs to, such as
(20) 'prices'
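The EVENT_TYPE codes described for the DATA table can be illustrated with a small Python sketch that tallies captured events by kind. The names EVENT_NAMES and summarize_events are hypothetical, and grouping unrecognized codes under "other" is an assumption made for the example.

```python
from collections import Counter

# Codes as described for the EVENT_TYPE column.
EVENT_NAMES = {"I": "insert", "U": "update", "D": "delete", "R": "reload", "S": "sql"}

def summarize_events(event_types):
    """Count captured events by name; unknown codes are grouped under 'other'."""
    return Counter(EVENT_NAMES.get(code, "other") for code in event_types)

print(summarize_events(["I", "I", "U", "D", "S"]))
```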
A.21. DATA_REF
Used only when routing.data.reader.type is set to 'ref.' Table that tracks the last known data_id that has
been processed. This table is used so that joins to find unprocessed data can be better optimized.
A.22. DATA_GAP
Used only when routing.data.reader.type is set to 'gap.' Table that tracks gaps in the data table so that they
may be processed efficiently, if data shows up. Gaps can show up in the data table if a database
transaction is rolled back.
END_ID INTEGER PK X The last missing data_id from the data table
where a gap is detected. If the start_id is the last
data_id inserted plus one, then this field is filled
in with a -1.
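To make the idea of a gap concrete, the Python sketch below finds missing data_id ranges in a list of ids and appends the open-ended gap whose end is -1. It is a simplified illustration and not the routing code itself; the function names are hypothetical.

```python
def find_gaps(data_ids):
    """Return (start_id, end_id) pairs for ids missing between the smallest and largest id."""
    gaps = []
    ordered = sorted(set(data_ids))
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > 1:
            gaps.append((previous + 1, current - 1))
    return gaps

def gaps_with_tail(data_ids):
    """Add the trailing gap that starts after the last id and has an end_id of -1."""
    return find_gaps(data_ids) + [(max(data_ids) + 1, -1)]

# Ids 3-4 and 7-8 never arrived, for example because transactions rolled back.
print(gaps_with_tail([1, 2, 5, 6, 9]))
```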
A.23. DATA_EVENT
Represents routing of a data row to one or more nodes. Entries in data_event are created by database
triggers.
A.24. OUTGOING_BATCH
Used for tracking the sending of a collection of data to a node in the system. A new outgoing_batch is
created and given a status of 'NE'. After sending the outgoing_batch to its target node, the status becomes
'SE'. The node responds with either a success status of 'OK' or an error status of 'ER'. An error while
sending to the node also results in an error status of 'ER' regardless of whether the node sends that
acknowledgement.
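The status life cycle described above can be read as a small state machine. The Python sketch below encodes it. The transition table is an interpretation of the paragraph rather than code taken from SymmetricDS, and allowing a batch in error to be sent again is an assumption.

```python
# Allowed status transitions, as read from the description above.
TRANSITIONS = {
    "NE": {"SE", "ER"},  # newly created: sent, or failed while sending
    "SE": {"OK", "ER"},  # sent: acknowledged as OK or as an error
    "ER": {"SE", "ER"},  # assumption: a batch in error may be retried
    "OK": set(),         # successfully acknowledged batches are final
}

def advance(status, new_status):
    """Return new_status if the transition is allowed, otherwise raise ValueError."""
    if new_status not in TRANSITIONS[status]:
        raise ValueError("cannot move batch from %s to %s" % (status, new_status))
    return new_status

status = advance("NE", "SE")
status = advance(status, "OK")
print(status)
```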
STATUS CHAR (2) The current status of the batch, which can be newly
created (NE), sent to a node (SE),
acknowledged as successful (OK), or in error
(ER).
LOAD_FLAG INTEGER (1) 0 A flag that indicates that this batch is part of an
initial load.
ERROR_FLAG INTEGER (1) 0 A flag that indicates that this batch was in error
during the last synchronization attempt.
INSERT_EVENT_COUNT BIGINT 0 X The number of insert events that are part of this
batch.
DELETE_EVENT_COUNT BIGINT 0 X The number of delete events that are part of this
batch.
OTHER_EVENT_COUNT BIGINT 0 X The number of other event types that are part of
this batch. This includes any event types that
are not a reload, insert, update or delete event
type.
SQL_CODE INTEGER 0 X For a status of error (ER), this is the error code
from the database that is specific to the vendor.
LAST_UPDATE_HOSTNAME VARCHAR The host name of the process that last did work
(255) on this batch.
A.25. INCOMING_BATCH
The incoming_batch is used for tracking the status of loading an outgoing_batch from another node. Data
is loaded and committed at the batch level. The status of the incoming_batch is either successful (OK) or
error (ER).
STATUS CHAR (2) The current status of the batch can be loading
ERROR_FLAG INTEGER (1) 0 A flag that indicates that this batch was in error
during the last synchronization attempt.
MISSING_DELETE_COUNT BIGINT 0 X The number of times a delete did not affect the
database because the row was already deleted.
SQL_CODE INTEGER 0 X For a status of error (ER), this is the error code
from the database that is specific to the vendor.
LAST_UPDATE_HOSTNAME VARCHAR The host name of the process that last did work
(255) on this batch.
A.26. LOCK
Contains semaphores that are set when processes run, so that only one server can run a process at a time.
Enable this feature by using the cluster.lock.during.xxxx parameters.
LAST_LOCKING_SERVER_ID VARCHAR The server id of the process that last did work
(255) on this batch.
Appendix B. Parameters
There are two kinds of parameters that can be used to configure the behavior of SymmetricDS: Startup
Parameters and Runtime Parameters. Startup Parameters are required to be in a system property or a
property file, while Runtime Parameters can also be found in the Parameter table from the database.
Parameters are re-queried from their source at a configured interval and can also be refreshed on demand
by using the JMX API. The following table shows the source of parameters and the hierarchy of
precedence.
symmetric.properties N Provided by the end user in the current system user's user.home
directory.
named properties file 1 N Provided by the end user as a Java system property (i.e.
-Dsymmetric.override.properties.file.1=file://my.properties) or in the
constructor of a SymmetricEngine .
named properties file 2 N Provided by the end user as a Java system property (i.e.
-Dsymmetric.override.properties.file.2=classpath://my.properties) or
in the constructor of a SymmetricEngine .
Java System Properties N Any SymmetricDS property can be passed in as a -D property to the
runtime. It will take precedence over any properties file property.
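The layering of parameter sources can be pictured with Python's ChainMap, where the first mapping that contains a key wins. This sketch assumes, as the descriptions above suggest, that Java system properties override every properties file and that files listed later override earlier ones. The function name resolve and the sample values are hypothetical.

```python
from collections import ChainMap

def resolve(system_properties, *property_files):
    """Combine parameter sources; property_files are ordered lowest precedence first."""
    return ChainMap(system_properties, *reversed(property_files))

user_home_file = {"db.user": "symmetric", "group.id": "store"}
override_file_1 = {"group.id": "corp"}
system_properties = {"db.user": "admin"}  # e.g. passed as -Ddb.user=admin

params = resolve(system_properties, user_home_file, override_file_1)
print(params["db.user"], params["group.id"])
```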
[ Default: ]
db.driver
The class name of the JDBC driver. If db.jndi.name is set, this property is ignored.
[ Default: com.mysql.jdbc.Driver ]
db.url
The JDBC URL used to connect to the database. If db.jndi.name is set, this property is ignored.
[ Default: jdbc:mysql://localhost/symmetric ]
db.user
The database username, which is used to login, create, and update SymmetricDS tables. To use an
encrypted username, see Section 5.8, Encrypted Passwords (p. 49) . If db.jndi.name is set, this
property is ignored. [ Default: symmetric ]
db.password
The password for the database user. To use an encrypted password, see Section 5.8, Encrypted
Passwords (p. 49) . If db.jndi.name is set, this property is ignored. [ Default: ]
db.pool.initial.size
The initial size of the connection pool. If db.jndi.name is set, this property is ignored. [ Default: 5 ]
db.pool.max.active
The maximum number of connections that will be allocated in the pool. If db.jndi.name is set, this
property is ignored. [ Default: 10 ]
db.pool.max.wait.millis
This is how long a request for a connection from the datasource will wait before giving up. If
db.jndi.name is set, this property is ignored. [ Default: 30000 ]
db.pool.min.evictable.idle.millis
This is how long a connection can be idle before it will be evicted. If db.jndi.name is set, this property
is ignored. [ Default: 120000 ]
db.spring.bean.name
The name of a Spring bean to use as the DataSource. If you want to use a different DataSource other
than the provided DBCP version that SymmetricDS uses out of the box, you may set this to be the
Spring bean name of your DataSource.
db.sql.query.timeout.seconds
The timeout in seconds for queries running on the database. [ Default: 300 ]
db.tx.timeout.seconds
This is how long the default transaction time is. This needs to be fairly big to account for large data
loads. [ Default: 7200 ]
db.jdbc.streaming.results.fetch.size
This is the default fetch size for streaming result sets into memory from the database.
[ Default: 1000 ]
db.default.schema
This is the schema that will be used for metadata lookup. Some dialects automatically figure this out
using database specific SQL to get the current schema. [ Default: ]
db.metadata.ignore.case
Indicates that case should be ignored when looking up references to tables using the metadata api.
[ Default: true ]
auto.config.database
If this is true, the configuration and runtime tables used by SymmetricDS are automatically created
during startup. [ Default: true ]
auto.upgrade
If this is true, when symmetric starts up it will try to upgrade tables to latest version. [ Default: true ]
auto.sync.configuration
If this is true, create triggers for the SymmetricDS configuration table that will synchronize changes
to node groups that pull from the node where this property is set. [ Default: true ]
https.allow.self.signed.certs
If this is true, a Symmetric client node will accept self-signed certificates. [ Default: true ]
http.basic.auth.username
If specified, a Symmetric client node will use basic authentication when communicating with its
server node using the given user name. [ Default: ]
http.basic.auth.password
If specified, the password used for basic authentication. [ Default: ]
embedded.webserver.basic.auth.username
If specified, the username for basic authentication for an embedded server or standalone server node.
Specifying the username and password is all that's needed to enable basic authentication for an
embedded server or standalone server node. [ Default: ]
embedded.webserver.basic.auth.password
If specified, the password for basic authentication for an embedded server or standalone server node.
[ Default: ]
https.verified.server.names
A list of comma separated server names that will always verify when using https. This is useful if the
URL's hostname and the server's identification hostname don't match exactly using the default rules
for the JRE. A special value of "all" may be specified to allow all hostnames to verify. [ Default: ]
sync.table.prefix
When symmetric tables are created and accessed, this is the prefix to use for the table name.
[ Default: sym ]
engine.name
This is the engine name. This should be set if you have more than one engine running in the same
JVM. It is used to name the JMX management bean. [ Default: Default ]
start.push.job
Whether the push job is enabled for this node. [ Default: true ]
start.pull.job
Whether the pull job is enabled for this node. [ Default: true ]
start.purge.job
Whether the purge job is enabled for this node. [ Default: true ]
start.synctriggers.job
Whether the sync triggers job is enabled for this node. [ Default: true ]
start.heartbeat.job
Whether the heartbeat job is enabled for this node. The heartbeat job simply inserts an event to update
the heartbeat_time column on the node table for the current node. [ Default: true ]
start.watchdog.job
Whether the watchdog job is enabled for this node. The watchdog job monitors child nodes to detect
if they are offline. Refer to Section 6.15, IOfflineServerListener (p. 59) for more information.
[ Default: true ]
job.purge.period.time.ms
This is how often the purge job will be run. [ Default: 600000 ]
job.statflush.period.time.ms
This is how often accumulated statistics will be flushed out to the database from memory.
[ Default: 600000 ]
web.base.servlet.path
The base servlet path to use when embedding SymmetricDS within another web application.
[ Default: ]
auto.reload
If this is true, a reload is automatically sent to nodes when they register. [ Default: false ]
auto.update.node.values.from.properties
Update the node row in the database from the local properties during a heartbeat operation.
[ Default: true ]
http.download.rate.kb
This is the download rate for the HTTP symmetric transport. A value of -1 means full throttle.
[ Default: -1 ]
http.concurrent.workers.max
This is the number of HTTP concurrent push/pull requests symmetric will accept. This is controlled
by the NodeConcurrencyFilter. The maximum number of database connections in the database pool
should be set to twice this number. [ Default: 20 ]
offline.node.detection.period.minutes
This is the minimum number of minutes that a child node has been offline before taking action. Refer
to Section 6.15, IOfflineServerListener (p. 59) for more information. [ Default: 120 ]
outgoing.batches.peek.ahead.window.after.max.size
This is the maximum number of events that will be peeked at to look for additional transaction rows
after the max batch size is reached. The more concurrency in your db and the longer the transaction
takes the bigger this value might have to be. [ Default: 100 ]
incoming.batches.skip.duplicates
Whether or not to skip duplicate batches that are received. A duplicate batch is identified by the batch
ID already existing in the incoming batch table. If this happens, it means an acknowledgement was
lost due to failure or there is a bug. Accepting a duplicate batch in this case can mean overwriting data
with old data. Another cause of duplicates is when the batch sequence number is reset, which might
happen in a lab environment. Skipping a duplicate batch in this case would prevent data changes
from loading. Generally, in a production environment, this setting should be true. [ Default: true ]
num.of.ack.retries
This is the number of times we will attempt to send an ACK back to the remote node when pulling
and loading data. [ Default: 5 ]
time.between.ack.retries.ms
This is the amount of time to wait between trying to send an ACK back to the remote node when
pulling and loading data. [ Default: 5000 ]
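The interaction of num.of.ack.retries and time.between.ack.retries.ms can be sketched as a simple retry loop. This Python example is illustrative only: send_ack stands in for whatever actually transmits the acknowledgement, and treating the retry count as the total number of attempts is an assumption.

```python
import time

def send_ack_with_retries(send_ack, retries=5, wait_ms=5000, sleep=time.sleep):
    """Call send_ack() until it returns True or the attempts are used up.

    Returns the attempt number that succeeded, or None if every attempt failed.
    The sleep function is injectable so the loop can be exercised without waiting.
    """
    for attempt in range(1, retries + 1):
        if send_ack():
            return attempt
        if attempt < retries:
            sleep(wait_ms / 1000.0)  # pause before the next attempt
    return None
```

A caller would pass a function that posts the acknowledgement to the remote node and returns True on success.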
dataextractor.enabled
Enable or disable all data extraction at a node for all channels other than the config channel.
[ Default: true ]
dataloader.enabled
Enable or disable all data loading at a node for all channels other than the config channel.
[ Default: true ]
dataloader.enable.fallback.update
If an insert is received, but the row already exists, then try an update instead. [ Default: true ]
dataloader.enable.fallback.insert
If an update is received, but it affects no rows, then try to insert instead. [ Default: true ]
dataloader.allow.missing.delete
If a delete is received, but it affects no rows, then continue. [ Default: true ]
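The three fallback settings above can be illustrated against an in-memory SQLite database. The sketch below is a simplified stand-in for the data loader and not its actual code; the item table and the function names are hypothetical.

```python
import sqlite3

def load_insert(conn, item_id, name, fallback_update=True):
    """Insert a row; if it already exists, optionally update it instead."""
    try:
        conn.execute("INSERT INTO item (item_id, name) VALUES (?, ?)", (item_id, name))
    except sqlite3.IntegrityError:
        if not fallback_update:
            raise
        conn.execute("UPDATE item SET name = ? WHERE item_id = ?", (name, item_id))

def load_update(conn, item_id, name, fallback_insert=True):
    """Update a row; if no row was affected, optionally insert it instead."""
    cursor = conn.execute("UPDATE item SET name = ? WHERE item_id = ?", (name, item_id))
    if cursor.rowcount == 0 and fallback_insert:
        conn.execute("INSERT INTO item (item_id, name) VALUES (?, ?)", (item_id, name))

def load_delete(conn, item_id, allow_missing_delete=True):
    """Delete a row; a delete that affects no rows is an error unless allowed."""
    cursor = conn.execute("DELETE FROM item WHERE item_id = ?", (item_id,))
    if cursor.rowcount == 0 and not allow_missing_delete:
        raise LookupError("delete affected no rows for item_id %s" % item_id)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (item_id INTEGER PRIMARY KEY, name TEXT)")
load_insert(conn, 1, "Soft Drink")
load_insert(conn, 1, "Diet Soft Drink")  # row exists, so this becomes an update
load_update(conn, 2, "Juice")            # no row matched, so this becomes an insert
load_delete(conn, 99)                    # nothing to delete, silently continues
```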
cluster.server.id
Set this if you want to give your server a unique name to be used to identify which server did what
action. Typically useful when running in a clustered environment. This is currently used by the
cluster.lock.timeout.ms
Time limit of lock before it is considered abandoned and can be broken. [ Default: 1800000 ]
cluster.lock.enabled
[ Default: false ]
initial.load.delete.first
Set this if tables should be purged prior to an initial load. [ Default: false ]
initial.load.create.first
Set this if tables (and their indexes) should be created prior to an initial load. [ Default: false ]
http.timeout.ms
Sets both the connection and read timeout on the internal HttpURLConnection. [ Default: 600000 ]
http.compression
Whether or not to use compression over HTTP connections. Currently, this setting only affects the
push connection of the source node. Compression on a pull is enabled using a filter in the web.xml for
the PullServlet. [ Default: true ]
web.compression.disabled
Disable compression from occurring on Servlet communication. This property only affects the
outbound HTTP traffic streamed by the PullServlet and PushServlet. [ Default: false ]
compression.level
Set the compression level this node will use when compressing synchronization payloads. Valid
values include: NO_COMPRESSION = 0, BEST_SPEED = 1, BEST_COMPRESSION = 9,
DEFAULT_COMPRESSION = -1 [ Default: -1 ]
compression.strategy
Set the compression strategy this node will use when compressing synchronization payloads. Valid
values include: FILTERED = 1, HUFFMAN_ONLY = 2, DEFAULT_STRATEGY = 0 [ Default: 0 ]
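The numeric values listed for compression.level and compression.strategy coincide with the constants of the zlib library, so their effect can be tried out with Python's zlib module. This is a stand-alone sketch and does not show how SymmetricDS applies the settings internally; the function name compress_payload is hypothetical.

```python
import zlib

def compress_payload(data, level=-1, strategy=0):
    """Compress bytes using a numeric level and strategy like those listed above."""
    compressor = zlib.compressobj(
        level, zlib.DEFLATED, zlib.MAX_WBITS, zlib.DEF_MEM_LEVEL, strategy
    )
    return compressor.compress(data) + compressor.flush()

payload = b"insert, 55, 0.65, 0.55\n" * 200
fast = compress_payload(payload, level=zlib.Z_BEST_SPEED)
small = compress_payload(payload, level=zlib.Z_BEST_COMPRESSION)
print(len(payload), len(fast), len(small))
```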
stream.to.file.enabled
Save data to the file system before transporting it to the client or loading it to the database if the
number of bytes is past a certain threshold. This allows for better compression and better use of
database and network resources. Statistics in the batch tables will be more accurate if this is set to true
because each timed operation is independent of the others. [ Default: true ]
stream.to.file.threshold.bytes
If stream.to.file.enabled is true, then the threshold number of bytes at which a file will be written is
controlled by this property. Note that for a synchronization the entire payload of the synchronization
will be buffered in memory up to this number (at which point it will be written and continue to stream
to disk) [ Default: 32767 ]
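Buffering in memory up to a threshold and then spilling to disk is the same pattern that Python's tempfile.SpooledTemporaryFile provides, which makes it a convenient way to picture the behavior. The sketch below is an analogy and not the SymmetricDS implementation; buffer_payload is a hypothetical name.

```python
import tempfile

THRESHOLD_BYTES = 32767  # mirrors the default of stream.to.file.threshold.bytes

def buffer_payload(chunks, threshold=THRESHOLD_BYTES):
    """Collect chunks in memory, rolling over to a temporary file past the threshold."""
    buffer = tempfile.SpooledTemporaryFile(max_size=threshold)
    for chunk in chunks:
        buffer.write(chunk)
    buffer.seek(0)  # rewind so the caller can read the payload back
    return buffer

small = buffer_payload([b"batch, 100\n"])
large = buffer_payload([b"x" * 1024] * 64)  # 64 KB, beyond the threshold
print(len(small.read()), len(large.read()))
```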
job.random.max.start.time.ms
When starting jobs, symmetric attempts to randomize the start time to spread out load. This is the
maximum wait period before starting a job. [ Default: 10000 ]
purge.retention.minutes
This is the retention for how long synchronization data will be kept in the SymmetricDS
synchronization tables. Note that data will be purged only if the purge job is enabled. [ Default: 7200 ]
statistic.retention.minutes
This is the retention for how long statistic data will be kept in the SymmetricDS statistic table. Note
that data will be purged only if the purge job is enabled. [ Default: 7200 ]
job.route.period.time.ms
This is how often the route job will be run. [ Default: 10000 ]
job.push.period.time.ms
This is how often the push job will be run. [ Default: 60000 ]
job.pull.period.time.ms
This is how often the pull job will be run. [ Default: 60000 ]
job.synctriggers.aftermidnight.minutes
If scheduled, the sync triggers job will run nightly. This is how long after midnight that job will run.
[ Default: 15 ]
schema.version
This is a hook to give the user a mechanism to indicate the schema version that is being synchronized.
This property is only valid if you use the default IRuntimeConfiguration implementation.
[ Default: ? ]
registration.url
The URL where this node can connect for registration to receive its configuration. This property is
only valid if you use the default IRuntimeConfiguration implementation. [ Default: ]
sync.url
The URL where this node can be contacted for synchronization.
[ Default: http://localhost:8080/sync ]
group.id
The node group id for this node. [ Default: default ]
external.id
The secondary identifier for this node that has meaning to the system where it is deployed. While the
node id is a generated sequence number, the external ID could have meaning in the user's domain,
such as a retail store number. [ Default: ]
transport.type
Specify the transport type. Supported values currently include: http, internal. [ Default: http ]
hsqldb.initialize.db
If using the HsqlDbDialect, this property indicates whether Symmetric should setup the embedded
database properties or if an external application will be doing so. [ Default: true ]
Appendix C. Database Notes
Each database management system has its own characteristics that result in varying feature coverage in
SymmetricDS. The following table shows which features are available by database.
HSQLDB 1.8 Y Y Y Y Y
HSQLDB 2.0 N Y Y Y Y
H2 1.x Y Y Y Y Y
Firebird 2.0 Y Y Y Y Y
Informix 11 N Y Y Y N
C.1. Oracle
On Oracle Real Application Clusters (RAC), sequences should be ordered so data is processed in the
correct order. To offset the performance cost of ordering, the sequences should also be cached.
While BLOBs are supported on Oracle, the LONG data type is not. LONG columns cannot be accessed
from triggers.
Note that while Oracle supports multiple triggers of the same type to be defined, the order in which the
triggers occur appears to be arbitrary.
The SymmetricDS user generally needs privileges for connecting and creating tables (including indexes),
triggers, sequences, and procedures (including packages and functions). The following is an example of
the needed grant statements:
Partitioning the DATA table by channel can help insert, routing and extraction performance on
concurrent, high throughput systems. TRIGGERs should be organized to put data that is expected to be
inserted concurrently on separate CHANNELs. The following is an example of partitioning. Note that
both the table and the index should be partitioned. The default value allows for more channels to be added
without having to modify the partitions.
C.2. MySQL
MySQL supports several storage engines for different table types. SymmetricDS requires a storage engine
that handles transaction-safe tables. The recommended storage engine is InnoDB, which is included by
default in MySQL 5.0 distributions. Either select the InnoDB engine during installation or modify your
server configuration. To make InnoDB the default storage engine, modify your MySQL server
configuration file (my.ini on Windows, my.cnf on Unix):
default-storage_engine = innodb
Alternatively, you can convert tables to the InnoDB storage engine with the following command:
On MySQL 5.0, the SymmetricDS user needs the SUPER privilege in order to create triggers.
On MySQL 5.1, the SymmetricDS user needs the TRIGGER and CREATE ROUTINE privileges in order
to create triggers and functions.
C.3. PostgreSQL
Starting with PostgreSQL 8.3, SymmetricDS supports the transaction identifier. Binary Large Object
(BLOB) replication is supported for both byte array (BYTEA) and object ID (OID) data types.
In order to function properly, SymmetricDS needs to use session variables. On PostgreSQL, session
variables are enabled using a custom variable class. Add the following line to the postgresql.conf file of
PostgreSQL server:
custom_variable_classes = 'symmetric'
This setting is required, and SymmetricDS will log an error and exit if it is not present.
Before database triggers can be created in PostgreSQL, the plpgsql language handler must be installed
on the database. The following statements should be run by the administrator on the database:
C.5. HSQLDB
HSQLDB was implemented with the intention that the database be run embedded in the same JVM
process as SymmetricDS. Instead of dynamically generating static SQL-based triggers like the other
databases, HSQLDB triggers are Java classes that re-use existing SymmetricDS services to read the
configuration and insert data events accordingly.
The transaction identifier support is based on SQL events that happen in a 'window' of time. The
trigger(s) track when the last trigger fired. If a trigger fired within X milliseconds of the previous firing,
then the current event gets the same transaction identifier as the last. If the time window has passed, then
a new transaction identifier is generated.
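The time-window rule can be sketched in a few lines of Python. This is an illustration of the idea only: the class name, the use of a simple counter for identifiers, and the millisecond timestamps passed in by the caller are all assumptions made for the example.

```python
import itertools

class WindowedTransactionIds:
    """Assign the same transaction id to trigger firings that occur close together."""

    def __init__(self, window_ms):
        self.window_ms = window_ms
        self._last_fired_ms = None
        self._counter = itertools.count(1)
        self._current_id = None

    def next_id(self, fired_at_ms):
        # Start a new transaction id when this is the first firing or the
        # window since the previous firing has passed.
        if self._last_fired_ms is None or fired_at_ms - self._last_fired_ms > self.window_ms:
            self._current_id = next(self._counter)
        self._last_fired_ms = fired_at_ms
        return self._current_id

ids = WindowedTransactionIds(window_ms=50)
print([ids.next_id(t) for t in (0, 10, 40, 200, 220)])
```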
C.6. H2
The H2 database allows only Java-based triggers. Therefore the H2 dialect requires that the SymmetricDS
jar file be in the database's classpath.
Currently, the DB2 Dialect for SymmetricDS does not provide support for transactional synchronization.
Large objects (LOB) are supported, but are limited to 16,336 bytes in size. The current features in the
DB2 Dialect have been tested using DB2 9.5 on Linux and Windows operating systems.
There is currently a bug with the retrieval of auto increment columns with the DB2 9.5 JDBC drivers that
causes some of the SymmetricDS configuration tables to be rebuilt when auto.config.database=true. The
DB2 9.7 JDBC drivers seem to have fixed the issue. They may be used with the 9.5 database.
A system temporary tablespace with too small of a page size may cause the following trigger build errors:
Simply create a system temporary tablespace that has a bigger page size. A page size of 8k will probably
suffice.
C.9. Firebird
The Firebird Dialect requires the installation of a User Defined Function (UDF) library in order to
provide functionality needed by the database triggers. SymmetricDS includes the required UDF library,
called SYM_UDF, in both source form (as a C program) and as pre-compiled libraries for both Windows
and Linux. The SYM_UDF library is copied into the UDF folder within the Firebird installation
directory.
cp databases/firebird/sym_udf.so /opt/firebird/UDF
The Jaybird JDBC driver was used during testing, but the user must download the driver and place it in
the SymmetricDS "lib" folder.
C.10. Informix
The Informix Dialect was tested against Informix Dynamic Server 11.50, but older versions may also
work. You need to download the Informix JDBC Driver (from the IBM Download Site) and put the
Make sure your database has logging enabled, which enables transaction support. Enable logging when
creating the database, like this:
Appendix D. Data Format
The SymmetricDS Data Format is used to stream data from one node to another. The data format reader
and writer are pluggable with an initial implementation using a format based on Comma Separated
Values (CSV). Each line in the stream is a record with fields separated by commas. String fields are
surrounded with double quotes. Double quotes and backslashes used in a string field are escaped with a
backslash. Binary values are represented as a string with hex values in "\0xab" format. The absence of
any value in the field indicates a null value. Extra spacing is ignored and lines starting with a hash are
ignored.
The first field of each line gives the directive for the line. The following directives are used:
nodeid, {node_id}
Identifies which node the data is coming from. Occurs once in CSV file.
binary, {BASE64|NONE|HEX}
Identifies the type of decoding the loader needs to use to decode binary data in the payload. This
varies depending on what database is the source of the data.
channel, {channel_id}
Identifies which channel a batch belongs to. The SymmetricDS data loader expects the channel to be
specified before the batch.
batch, {batch_id}
Uniquely identifies a batch. Used to track whether a batch has been loaded before. A batch of -9999 is
considered a virtual batch and will be loaded, but will not be recorded in incoming_batch.
create, {xml}
Optional notation that instructs the data loader to run the accompanying DdlUtils XML table
definition in order to create a database table.
commit, {batch_id}
An indicator that the batch has been transmitted and the data can be committed to the database.
nodeid, 1001
channel, pricing
binary, BASE64
batch, 100
schema,
catalog,
table, item_selling_price
keys, price_id
columns, price_id, price, cost
insert, 55, 0.65, 0.55
schema,
catalog,
table, item
keys, item_id
columns, item_id, price_id, name
insert, 110000055, 55, "Soft Drink"
delete, 110000001
schema,
catalog,
table, item_selling_price
update, 55, 0.75, 0.65, 55
commit, 100
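A stream like the one above can be split into directives and fields with Python's csv module. The sketch below is a minimal reader for illustration, not the SymmetricDS data loader; parse_stream is a hypothetical name. One known limitation is that the csv module cannot tell a null (an absent value) from an empty quoted string.

```python
import csv

def parse_stream(text):
    """Yield (directive, fields) pairs, skipping blank lines and '#' comment lines."""
    lines = [
        line for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    # Backslash escapes quotes and backslashes; spaces after commas are ignored.
    reader = csv.reader(lines, escapechar="\\", doublequote=False, skipinitialspace=True)
    for row in reader:
        yield row[0].strip(), [field.strip() for field in row[1:]]

sample = 'nodeid, 1001\nbatch, 100\ninsert, 110000055, 55, "Soft Drink"\ncommit, 100\n'
for directive, fields in parse_stream(sample):
    print(directive, fields)
```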