Infinispan Data Grid Platform Definitive Guide - Sample Chapter
Understanding Transactions and Concurrency
In the previous chapter, we introduced the Ticket Monster application to learn
about JBoss Developer Framework and JBoss-related technology. We added some
requirements in order to analyze how Infinispan can be used to improve scalability.
In this chapter, we will talk about transactions and how you can work with
transactions with Infinispan. We'll discuss the following topics:
Transaction fundamentals
Transaction models
Designing your cache application and the transaction configuration that connects
your application to a data store for optimal performance isn't easy. This chapter will
help you identify the best concurrency model and help you determine how to better
manage your transactions.
To begin with, we are going to cover the basic concepts required to understand
how Infinispan is structured and how applications can use it.
Transaction fundamentals
Before we start to learn how Infinispan deals with transactions, let's learn the basics
about transactions in order to extract the best from this feature.
By definition, a transaction allows you to set the boundaries of a user-defined series
of logically related read or write operations (get() or put()). All changes brought
by a write operation (update, remove, or add an entry in the cache) are either undone
or made permanent at the same time.
Transaction processing divides work into individual, atomic operations and helps
isolate one transaction from another in a multiuser application. Each transaction
must finish as a complete unit, with either success or failure, and must never
finish in an intermediate state.
To execute all your data grid operations inside a transaction, you have to mark the
boundaries of that transaction. You must start the transaction and at some point
commit the changes. Generally, you conclude a transaction with the COMMIT or
ROLLBACK statements, to ensure that all or none of your changes are stored in
the grid.
The commit process writes the data into the cache and makes it visible to other
users. Rolling back discards any changes that have been made to any cache entry
since the beginning of the current transaction.
Let's exemplify this concept with the classical example of a bank transfer
operation, in which a customer performs a transfer of $50,000 between accounts.
The bank application needs to perform an operation that consists of three separate
but related actions, as follows:
1. Debit the source account by the required amount of $50,000.
2. Credit the target account by the required amount of $50,000.
3. Write a log entry recording the transfer.
When viewed as three separate operations, it's not difficult to imagine a disaster.
These operations should be performed as a unit. Imagine that for some reason the
debit operation succeeds, and the bank's central server fails just after the debit
operation, before the credit operation completes. This kind of error can cause a
great deal of damage to the bank's image.
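The all-or-nothing behavior of the transfer can be sketched in plain Java. This is only a minimal illustration that uses the JDK and not the Infinispan API; the class AtomicTransferSketch, the transfer method, and the account names are hypothetical. It applies the three steps to a private working copy and publishes them together, so a failure at any step leaves the original balances untouched.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the Infinispan API.
class AtomicTransferSketch {

    /**
     * Applies debit, credit, and log entry as one unit of work.
     * Changes are made on a working copy and published only if every step succeeds.
     */
    static boolean transfer(Map<String, Long> accounts, StringBuilder log,
                            String source, String target, long amount) {
        Map<String, Long> working = new HashMap<>(accounts); // private working copy
        try {
            long sourceBalance = working.get(source);
            if (sourceBalance < amount) {
                throw new IllegalStateException("insufficient funds");
            }
            working.put(source, sourceBalance - amount);          // 1. debit
            working.put(target, working.get(target) + amount);    // 2. credit (fails if target is unknown)
            String entry = source + "->" + target + ":" + amount; // 3. log entry
            // "Commit": publish all changes together
            accounts.putAll(working);
            log.append(entry).append('\n');
            return true;
        } catch (RuntimeException e) {
            // "Rollback": the working copy is simply discarded
            return false;
        }
    }

    public static void main(String[] args) {
        Map<String, Long> accounts = new HashMap<>();
        accounts.put("source", 80_000L);
        accounts.put("target", 10_000L);
        StringBuilder log = new StringBuilder();
        System.out.println("committed: " + transfer(accounts, log, "source", "target", 50_000L));
        System.out.println("committed: " + transfer(accounts, log, "source", "unknown", 10_000L));
        System.out.println(accounts);
    }
}
```

The second call in main fails at the credit step, after the debit has already been applied to the working copy, yet the published balances stay as they were after the first transfer.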
It is important to understand that transactions in Infinispan are not like transactions
in a relational database product. Infinispan is an in-memory database that uses
volatile storage for in-memory data, which means that if the application goes down,
all the data cached in memory is lost.
Chapter 7
It's the responsibility of the cache to recreate data upon startup, either from
backup nodes or from other persistent storage systems, if you have a cache
loader/store configured.
Another important responsibility of the data grid is to protect your information and
to guarantee that the data on different grid nodes remains consistent,
which is part of the ACID criteria commonly applied to transactional database
management systems.
In a distributed environment, things get more complicated. Infinispan uses a
two-phase commit (also known as 2PC) to manage all the cluster nodes that are
participating in a distributed transaction. 2PC is a well-known practice for ensuring
the atomicity of a transaction that accesses multiple resource managers.
We covered ACID and two-phase commit in Chapter 2, Barriers to Scaling Data.
Before asking any of the instances to commit the changes, the transaction
manager must first ask all the participants of the grid to prepare to
commit. When a participant instance acknowledges that it is ready
(or prepared) to commit the transaction, it indicates that it can commit the
transaction. If any of the participants fails to prepare, the transaction manager
will roll back the whole transaction.
If all participants confirm the first phase (prepare), then in the second phase
the transaction manager asks all the resources to commit the transaction.
The commit is not expected to fail at this point because, as we said, a participant
that cannot prepare causes the transaction to be rolled back.
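The two phases can be sketched as a small coordinator in plain Java. This is an illustration of the general 2PC idea under simplifying assumptions (a single thread, no network, no failures during the commit phase), not Infinispan's implementation; the Participant interface, the Node class, and the runTransaction method are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

class TwoPhaseCommitSketch {

    /** A resource taking part in the transaction (hypothetical interface). */
    interface Participant {
        boolean prepare();   // phase 1: vote yes (true) or no (false)
        void commit();       // phase 2: make the changes permanent
        void rollback();     // undo any prepared work
    }

    /** In-memory participant that records the last state it reached. */
    static class Node implements Participant {
        final boolean canPrepare;
        String state = "ACTIVE";

        Node(boolean canPrepare) { this.canPrepare = canPrepare; }

        public boolean prepare() {
            if (canPrepare) { state = "PREPARED"; }
            return canPrepare;
        }
        public void commit() { state = "COMMITTED"; }
        public void rollback() { state = "ROLLED_BACK"; }
    }

    /** Coordinator: commits only if every participant votes yes in the prepare phase. */
    static boolean runTransaction(List<? extends Participant> participants) {
        for (Participant p : participants) {
            if (!p.prepare()) {
                // One "no" vote is enough to roll back the whole transaction
                for (Participant q : participants) { q.rollback(); }
                return false;
            }
        }
        for (Participant p : participants) { p.commit(); }
        return true;
    }

    public static void main(String[] args) {
        List<Node> healthy = Arrays.asList(new Node(true), new Node(true));
        System.out.println(runTransaction(healthy) + " " + healthy.get(0).state);
        List<Node> failing = Arrays.asList(new Node(true), new Node(false));
        System.out.println(runTransaction(failing) + " " + failing.get(0).state);
    }
}
```

Note that the first node in the failing run had already prepared successfully; it is still rolled back because the second node could not prepare.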
The transaction manager can coordinate a transaction that spans several resources.
Infinispan will perform the following operation for every transaction:
However, before you obtain a TransactionManager object, you have to specify the
settings for a transaction either declaratively or programmatically. In either case,
the result is the same.
If you want to use a configuration file, you can define the transactional characteristics
of the cache through the <transaction> tag, under a cache configuration for a
specific cache.
DummyTransactionManagerLookup: A TransactionManagerLookup implementation
for testing. It is not recommended for use in the production
environment, and it has important limitations related to concurrent
transactions and recovery.
Most cache operations within the scope of a JTA Transaction will propagate a
runtime CacheException or any of its subclasses on failure, causing the transaction
to be automatically marked for rollback for all application exceptions.
Transactional modes
The transaction mode attribute is slightly different in Infinispan 6 and 7. In Infinispan
6, the transactionMode attribute configures whether the cache is transactional
or not.
In the next example, we can see a sample configuration using Infinispan 6:
<transaction transactionMode="NON_TRANSACTIONAL"/>
<transaction transactionMode="TRANSACTIONAL"/>
Starting with Infinispan 5.1, you cannot mix transactional and non-transactional
operations anymore; you have to decide which cache mode you want to use.
There are many reasons to follow this path, but one of the most important reasons is
the clean semantics on how concurrency is managed between multiple requestors for
the same cache entry. If we were allowed to mix transactional and non-transactional
operations, we would run the risk of experiencing unexpected behavior, such as
deadlocks, when transactional and non-transactional code interact.
In Infinispan 7, we have the mode attribute, which configures the transaction type
for the cache to one of the following modes: NONE, BATCH, NON_XA, NON_DURABLE_XA,
and FULL_XA.
The code in Infinispan 7 is as follows:
<transaction
    transaction-manager-lookup="org.infinispan.transaction.lookup.GenericTransactionManagerLookup"
    mode="NON_XA"/>
On a transactional cache with auto commit enabled, any call performed outside a
transaction's scope is transparently wrapped within a transaction. Before the call,
Infinispan adds the logic for starting a transaction and performs a commit after
the call.
In Infinispan 6, when using the autocommit mode, you can
sacrifice data consistency for better performance by using
the use1PcForAutoCommitTransactions attribute on
the <transaction> element. If you set it to true, Infinispan
will commit the transaction in a single phase, reducing the
number of Remote Procedure Calls (RPCs) from two to one.
Transactional models
A transactional cache in Infinispan supports two different transactional models,
optimistic and pessimistic.
The optimistic model refers to an approach in which transactions are allowed to
proceed, with conflicts resolved as late as possible, by deferring lock acquisition for
the transaction until prepare time. The entry is not locked immediately when
it is accessed by a transaction, which means that the cache entries remain available to
other transactions for concurrent access, opening up the possibility of conflicts.
At commit time, when the entry is about to be updated in the grid, Infinispan will
compare the version of the current object to the version that was initially saved at
the moment the entry was first requested in the transaction. If both versions differ
from each other, Infinispan will consider that a conflict exists, and will mark the
transaction for rollback. This avoids deadlocks, optimizes the lock acquisition time,
and increases throughput significantly.
On the other hand, the pessimistic model refers to an approach in which potential
conflicts are detected and resolved earlier. Cluster-wide locks are acquired for every
write operation and are only released after the transaction commits.
Optimistic transaction
As we said, during optimistic transactions, locks are acquired during the prepare
phase and are held until the transaction commits (or rolls back).
Optimistic transactions are recommended when the probability of two different
users changing the same data in parallel is low.
The following diagram shows how the optimistic lock works:
The diagram shows the users, John and Alice, issuing a put operation to the
k1 object in a different transaction context. We can see that John invoked the
put() method first, but Alice performed a commit operation in her transaction
(Transaction 02) before John saved his changes.
After invoking the commit operation, the Transaction Manager starts the prepare
phase and locks the k1 object, and at the end of the commit phase, it releases the
lock, making the k1 object stale to John. When he tries to commit his transaction,
Infinispan identifies a conflict, throws an exception, and marks the transaction
for rollback.
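The scenario with John and Alice can be sketched with a version number per entry. This is a minimal, single-node illustration in plain Java of the version comparison described above, not Infinispan's implementation; the Store and Versioned classes and the commit method are hypothetical, and a boolean return value stands in for the exception and rollback.

```java
import java.util.HashMap;
import java.util.Map;

class OptimisticCommitSketch {

    /** A value together with the version it had when it was stored. */
    static final class Versioned {
        final String value;
        final long version;
        Versioned(String value, long version) { this.value = value; this.version = version; }
    }

    /** Minimal single-node store (hypothetical). */
    static final class Store {
        private final Map<String, Versioned> data = new HashMap<>();

        synchronized Versioned read(String key) { return data.get(key); }

        /** Succeeds only if the entry still has the version the transaction first saw. */
        synchronized boolean commit(String key, long versionSeen, String newValue) {
            Versioned current = data.get(key);
            long currentVersion = (current == null) ? 0L : current.version;
            if (currentVersion != versionSeen) {
                return false; // conflict: another transaction committed first
            }
            data.put(key, new Versioned(newValue, currentVersion + 1));
            return true;
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        store.commit("k1", 0L, "initial");
        long johnSaw = store.read("k1").version;   // John starts working on k1
        long aliceSaw = store.read("k1").version;  // Alice starts working on k1
        boolean aliceOk = store.commit("k1", aliceSaw, "alice"); // Alice commits first
        boolean johnOk = store.commit("k1", johnSaw, "john");    // John's copy is now stale
        System.out.println("Alice committed: " + aliceOk + ", John committed: " + johnOk);
    }
}
```

No lock is held while John and Alice work on k1; the conflict only surfaces when John tries to commit against a version that has moved on.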
In Infinispan 6, optimistic transactions can be enabled by specifying the lockingMode
attribute in the configuration file, as you can see in the next example:
<namedCache name="transactionalOptimistic">
  <transaction transactionMode="TRANSACTIONAL"
               lockingMode="OPTIMISTIC"/>
</namedCache>
In Infinispan 7, you can enable optimistic transactions by specifying the locking
attribute within the <transaction> element. The following Infinispan 7
configuration creates a cache with an optimistic locking scheme:
<local-cache name="transactionCache">
  <transaction
      transaction-manager-lookup="org.infinispan.transaction.lookup.JBossStandaloneJTAManagerLookup"
      mode="NON_XA" locking="OPTIMISTIC"/>
</local-cache>
Pessimistic transaction
Pessimistic transactions prevent other concurrent transactions from modifying
the same entry. Infinispan obtains the lock on an entry key at the time the entry is written.
The following diagram shows how the pessimistic lock works:
In our example, when the cache.put(k1,v1) method returns, the k1 object will be
automatically locked in the transaction, preventing concurrent transactions from
updating it. Concurrent transactions are allowed only to read the k1 object, but not
update it, which is the case with Alice's transaction. The lock is released when the
transaction completes via a commit or rollback operation.
You can enable or disable pessimistic transactions in the configuration file by
changing the corresponding locking attribute.
The following sample code shows how to configure pessimistic locking with
Infinispan 6:
<namedCache name="transactional">
  <transaction transactionMode="TRANSACTIONAL"
               lockingMode="PESSIMISTIC"/>
</namedCache>
Batch mode
Infinispan provides several methods for putting data in the cache, including the
standard map operations cache.put(...), cache.putAll(...), and
cache.putIfAbsent(...). The last one is an overloaded form of
ConcurrentMap.putIfAbsent(), which only stores the value if no value is stored
under the same key.
However, these methods will result in a separate network call for each operation,
which is not suitable for scenarios where large amounts of data must be loaded
into the data store, especially for caches in replication mode. This is the case, for
instance, when building a mirror site or importing data into the cache when transaction
control is not important.
For these cases, Infinispan provides the ability to batch multiple cache operations
through the interface org.infinispan.commons.api.BatchingCache that provides
the startBatch() method to start the batch process, and endBatch(boolean) to
complete the process.
The Infinispan batching mode allows atomicity and other transactional characteristics,
but doesn't provide full JTA or XA capabilities.
In the batching mode, all configuration options related to the transaction,
such as syncRollbackPhase, syncCommitPhase, useEagerLocking, and
eagerLockSingleNode, are applied as well. Internally, the batching process starts a
JTA transaction using a simple internal TransactionManager implementation without
recovery. All the entries in that scope are queued on the same instance, and the
changes are sent around the cluster together as part of the completion process,
reducing the replication overhead for each update in the batch.
When you use the batch mode, there is no transaction
manager defined.
After this, configure your cache to use batching. Perhaps the easiest way to illustrate
this is to demonstrate a simple scenario showing how to import a CSV file into
Infinispan, in order to prepopulate a cache before the application makes use of it.
First, we created a CSV file in the resource folder with the name
csv_guest_list.csv, with the following content:
ID,first_name,last_name,document_number,birth_date
1,John,Wayne,832218,19801112
2,Eddy,Murphy,822712,19901003
3,Fred,Mercury,872211,19640321
4,Juliette,Lewis,862211,19720804
5,Kate,Moss,872911,19790413
The content of the CSV file is a guest list for a given event. Next, we created a POJO
class Guest for the imported data and a utility class GuestListImporter to import
CSV files.
Finally, we can use the batching process by calling startBatch() and endBatch(),
as highlighted in the following example:
Cache<Integer, Guest> cache =
    container.getCache("batchingCacheWithEvictionAndPassivation");
List<Guest> guests =
    new GuestListImporter().parseGuestFile("guest_list.csv");
try {
    cache.startBatch();
    for (Guest guest : guests) {
        // do some processing
        cache.put(guest.getId(), guest);
    }
    assertEquals(5, guests.size()); // expected value first, then the actual value
    cache.endBatch(true);           // complete the batch
} catch (Exception ex) {
    cache.endBatch(false);          // discard all changes made in the batch
}
Note that the endBatch() method receives a Boolean parameter, which completes the
batch if true; otherwise, it will discard all changes made in the batch.
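The effect of the Boolean parameter can be illustrated with a small stand-in for the cache. This is a plain Java sketch, assuming that queued writes only become visible when the batch completes; the BatchSketch class is hypothetical and models only the visible behavior, not Infinispan's BatchingCache implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a cache with batching; for illustration only.
class BatchSketch<K, V> {
    private final Map<K, V> store = new LinkedHashMap<>(); // stands in for the cache
    private Map<K, V> pending;                             // writes queued by the open batch

    void startBatch() { pending = new LinkedHashMap<>(); }

    void put(K key, V value) {
        if (pending != null) {
            pending.put(key, value);   // queued until the batch ends
        } else {
            store.put(key, value);     // no batch in progress: write through
        }
    }

    void endBatch(boolean successful) {
        if (pending == null) {
            throw new IllegalStateException("no batch in progress");
        }
        if (successful) {
            store.putAll(pending);     // apply all queued writes together
        }
        pending = null;                // on false, the queued writes are discarded
    }

    int size() { return store.size(); }

    public static void main(String[] args) {
        BatchSketch<Integer, String> cache = new BatchSketch<>();
        cache.startBatch();
        cache.put(1, "John");
        cache.put(2, "Eddy");
        cache.endBatch(true);
        cache.startBatch();
        cache.put(3, "Fred");
        cache.endBatch(false);
        System.out.println("entries stored: " + cache.size());
    }
}
```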
Transaction recovery
Although XA transactions possess the ACID characteristics that guarantee
the atomicity of operations, our system must still be able to handle failures, which
can occur at any time due to an unexpected server crash or network loss, in order to
guarantee the consistency of customer data.
To guarantee transaction consistency, Infinispan supports transaction recovery,
a well-known feature of XA transactions, present in the specification published
by the Open Group.
Let's suppose a situation where a customer buys a ticket for a specific show from
Ticket Monster, but we have to save the ticket in two different nodes in Infinispan.
The Transaction Manager will be responsible for communicating with both resources
that are in use.
When the transaction manager commits, in phase one, the transaction manager
asks both resources to prepare the commit. Then both resources verify that they can
persist the data, and each resource sends an acknowledgement to the coordinator.
In the second phase, when the resources are requested to commit, for some reason one
of them fails to complete the commit operation, thus leaving the cache data in
an inconsistent state.
In situations like this, Infinispan supports automatic transaction recovery
coordinated by the Transaction Manager, to make sure data in both resources ends
up being consistent.
The Transaction Manager works with Infinispan to determine whether any transaction is in
an in-doubt state, that is, prepared but not committed. If there are no pending
transactions, it proceeds normally; otherwise, the Transaction Manager
requests the Infinispan cluster to complete the commit or force the rollback to release
any resource.
There are some cases where Infinispan will not be able to recover all transactions
in an in-doubt state, and recovery cannot be completed. For these cases,
Infinispan can list the transactions that require manual intervention.
As a system administrator, you can configure Infinispan to receive notifications
about these cases that require manual intervention by e-mail or log alerts, which
require some configuration on the transaction manager.
The following diagram shows a graphical illustration of the concept, with a node
failure in the originator:
From the image you can see that the changed data is held by the Recovery Manager
only for in-doubt transactions, being removed for successfully completed transactions
after the commit or rollback phase is complete.
You can enable transaction recovery per cache level through XML configuration. If you
are using Infinispan 6, you can enable transaction recovery by adding a <recovery>
element, a child element of the <transaction> parent element, as follows:
<transaction
    transactionManagerLookupClass="org.infinispan.transaction.lookup.GenericTransactionManagerLookup"
    transactionMode="TRANSACTIONAL">
  <recovery enabled="true" recoveryInfoCacheName="recoveryCache"/>
</transaction>
All in-doubt transaction data will be backed up in a local cache specified through the
recoveryInfoCacheName configuration attribute, if available, which allows the data to
be evicted to disk through the cache loader, as with a normal cache, in case it gets too big.
Although the XA specification allows recovery to run in a different
process, today Infinispan can only run the recovery process in the
same process where the Infinispan instance exists.
A future release is planned for Hot Rod clients to support transactions. We will see
Hot Rod in action in Chapter 9, Server Modules.
If locking is not available and several users access a distributed cache concurrently,
concurrency problems may occur if their transactions access the same data at the
same time.
Concurrency problems include:
Dirty Reads: This means that a client can read uncommitted changes made
by another client, which is shown in the following figure:
Lost Updates: This may happen when two or more clients, during their
individual transactions, select the same entry and change its value. The
transactions are independent and unaware of each other; a lost update
happens when the last update to the entry overwrites updates performed by
other clients, which results in lost data, as shown in the following figure:
To overcome these phenomena, the ANSI/ISO SQL standard defines four levels of
transaction isolation, each providing a different level of access control, some more
restrictive than others.
As you can deduce by the ANSI/ISO standard, isolation levels have their origins in
relational databases. The defined transaction isolation levels are:
READ COMMITTED: This level means that the read operation can see only
data committed before the operation began, because the read lock is released
immediately after operation. But, during the write operation, the system
keeps the lock until the transaction commits. Higher concurrency is possible
when using this level. This mode prevents dirty reads but allows problems
of lost updates, nonrepeatable reads, and phantom reads to occur.
REPEATABLE READ: This level means that every lock acquired during
a transaction is held until the end of the transaction. This mode prevents
the problem of non-repeatable reads, because once data has been written,
no other transaction can read it, but it allows problems related to phantom
reads to occur.
SERIALIZABLE: This is the most restrictive isolation level. This level requires
read and write locks to be released at the end of the transaction and does not
allow the insertion of new entries into the range that is locked. This mode
prevents all concurrency problems, but it can cause serialization failures.
Please consult the java.sql.Connection API
for more references.
As we have seen before, Infinispan uses the two-phase commit (2PC) protocol, in
order to coordinate with all the processes that participate in a distributed transaction.
However, there are also costs related to 2PC in a replicated cache. First, there is the
cost of the memory consumed by the replicated entries, which can be reduced
drastically by changing the cache mode to distribution. Also, the more objects you
have in your data grid, the more expensive it becomes to ensure the
consistency of these objects.
You might be saying that the distribution clustering mode can overcome these
disadvantages, by configuring the Infinispan cache in such a way that each item is
replicated in a limited number of nodes. But, in the distribution mode, we have an
additional overhead associated with the coordination costs.
For these reasons, Infinispan has opted for relaxing consistency, ensuring weaker
semantics in order to allow more efficient implementations.
Specifically, Infinispan supports the following isolation levels: READ COMMITTED and
REPEATABLE READ. But remember, transactions in Infinispan are not like transactions
in a relational database product.
In Infinispan, READ COMMITTED and REPEATABLE READ work slightly
differently than they do in databases. In READ COMMITTED, reads can happen anytime,
while in REPEATABLE READ, once the data has been written to the cache, no
other transaction can read it so there's no chance of later re-reading the data
under the same key and getting back a completely different value.
Infinispan has only been able to provide all of this thanks to Multi-Version
Concurrency Control (MVCC), which is the subject of the next section.
Isolation levels define how much of a concurrent write a reader can see. Depending on
the isolation level you choose, you will get a different behavior in how the state is
committed back to the cache.
The default isolation level used by Infinispan is READ_COMMITTED, which also
performs the best, and is generally good enough for most applications.
Let's take a look at a more detailed example that shows the difference between
the two isolation levels. With READ_COMMITTED, if the key is updated by another
transaction between two consecutive read operations on the same key, the second
read will return the new value, which is shown in the following figure:
As you can see, this diagram shows an example of a possible scenario
with the READ_COMMITTED isolation level, and at the end of the diagram, in step 7,
the second read returns the new value v2.
However, if we were using the REPEATABLE_READ isolation level, step 7 would still
return v1. So, if you want to read the same entry multiple times within a transaction,
we recommend that you use REPEATABLE_READ. The REPEATABLE_READ isolation level
also allows for an additional safety check known as Write Skew Check.
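The observable difference between the two levels can be sketched in plain Java. This illustration only models what a reader sees in the scenario above; it is not how Infinispan implements isolation, and the Tx class, the Isolation enum, and the use of a shared map as the committed state are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class IsolationSketch {
    enum Isolation { READ_COMMITTED, REPEATABLE_READ }

    /** Read-only transaction context over a shared map of committed values (hypothetical). */
    static final class Tx {
        private final Map<String, String> committed;
        private final Isolation isolation;
        private final Map<String, String> firstReads = new HashMap<>();

        Tx(Map<String, String> committed, Isolation isolation) {
            this.committed = committed;
            this.isolation = isolation;
        }

        String get(String key) {
            if (isolation == Isolation.READ_COMMITTED) {
                return committed.get(key);   // always the latest committed value
            }
            // REPEATABLE_READ: remember the value returned by the first read of each key
            if (!firstReads.containsKey(key)) {
                firstReads.put(key, committed.get(key));
            }
            return firstReads.get(key);
        }
    }

    public static void main(String[] args) {
        Map<String, String> committed = new ConcurrentHashMap<>();
        committed.put("k1", "v1");
        Tx readCommitted = new Tx(committed, Isolation.READ_COMMITTED);
        Tx repeatableRead = new Tx(committed, Isolation.REPEATABLE_READ);
        readCommitted.get("k1");      // first read in each transaction
        repeatableRead.get("k1");
        committed.put("k1", "v2");    // another transaction commits a new value
        System.out.println("READ_COMMITTED second read:  " + readCommitted.get("k1"));
        System.out.println("REPEATABLE_READ second read: " + repeatableRead.get("k1"));
    }
}
```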
In the classical literature, the term 'write skew' refers to an anomaly that can arise
with Snapshot Isolation (SI).
Snapshot Isolation is a guarantee that a transaction will always
read data from a snapshot of the cache store data as of the time
the transaction started, which is called its Start-Timestamp.
In the context of Infinispan, you have seen that Infinispan does not implement
Snapshot Isolation, but rather a weaker and more efficient consistency level. The key
difference with respect to SI is that the grid stores a single version of each
entry, and there is no guarantee that the reads of a given transaction will return
data from the same snapshot.
Infinispan provides a reliable mechanism of data versioning to improve write
skew checks when using optimistic transactions, REPEATABLE_READ, and a clustered
cache. To enable write skew check, set the <locking> element's writeSkewCheck
attribute to true in the config file. The following table describes the attributes of the
<locking> element:
| Attribute | Type | Default | Description |
|---|---|---|---|
| concurrencyLevel (ISPN 6), concurrency-level (ISPN 7) | int | 32 (ISPN 6), 1000 (ISPN 7) | |
| isolationLevel (ISPN 6), isolation (ISPN 7) | REPEATABLE_READ or READ_COMMITTED | READ_COMMITTED | This defines the isolation level for the cache. As we said, Infinispan supports only the REPEATABLE_READ and READ_COMMITTED isolation levels. |
| lockAcquisitionTimeout (ISPN 6), acquire-timeout (ISPN 7) | long | 10000 (ISPN 6), that is, 10 seconds; 15000 (ISPN 7), that is, 15 seconds | This defines the maximum time (in milliseconds) to attempt a lock acquisition. |
| useLockStriping (ISPN 6), striping (ISPN 7) | Boolean | false | If set to true, Infinispan will maintain a pool of shared locks to be shared by the entries that have to be locked. If set to false, a lock will be created on request, per entry in the cache. Lock striping can help to control the memory footprint of your cache, but may reduce concurrency. |
| writeSkewCheck (ISPN 6), write-skew (ISPN 7) | Boolean | false | This setting is only relevant when the isolation level is REPEATABLE_READ. If set to true, when Infinispan identifies a version conflict (write skew check), it will raise an exception. Otherwise, if during the commit phase the writer thread discovers that the working entry is different from the underlying entry, Infinispan will overwrite the underlying entry with the working entry. |
Lock striping provides a highly scalable locking mechanism and helps control
memory footprint, but it may reduce concurrency in the system, and you run the
risk of blocking unrelated entries that share the same lock.
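The trade-off can be seen in a small plain Java sketch of a striped lock. This illustrates the general technique of mapping keys to a fixed pool of locks by hash code; it is not Infinispan's lock container, and the StripedLockSketch class and its method names are hypothetical.

```java
import java.util.concurrent.locks.ReentrantLock;

class StripedLockSketch {
    private final ReentrantLock[] stripes;

    /** Creates a fixed pool of locks; the pool size plays the role of the concurrency level. */
    StripedLockSketch(int concurrencyLevel) {
        stripes = new ReentrantLock[concurrencyLevel];
        for (int i = 0; i < stripes.length; i++) {
            stripes[i] = new ReentrantLock();
        }
    }

    /** Maps a key to one of the shared locks using its hash code. */
    int stripeIndex(Object key) {
        return Math.floorMod(key.hashCode(), stripes.length);
    }

    ReentrantLock lockFor(Object key) {
        return stripes[stripeIndex(key)];
    }

    public static void main(String[] args) {
        StripedLockSketch locks = new StripedLockSketch(4);
        // Integer keys hash to their own value, so 1 and 5 land on the same stripe
        System.out.println("stripe for key 1: " + locks.stripeIndex(1));
        System.out.println("stripe for key 5: " + locks.stripeIndex(5));
        System.out.println("same lock: " + (locks.lockFor(1) == locks.lockFor(5)));
    }
}
```

With only four stripes, memory use is bounded no matter how many entries exist, but a transaction that locks key 1 also blocks writers of key 5, even though the two entries are unrelated.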
Lock striping is disabled by default in Infinispan. To enable lock striping, set the
useLockStriping attribute to true in the config file; you can then tune the size
of the segment used by lock striping through the concurrencyLevel attribute of the
locking configuration element.
For Infinispan 6 the configuration is as follows:
<namedCache name="transactionCacheWithLocking">
  <jmxStatistics enabled="true"/>
  <transaction
      transactionManagerLookupClass="org.infinispan.transaction.lookup.JBossStandaloneJTAManagerLookup"
      transactionMode="TRANSACTIONAL"
      lockingMode="PESSIMISTIC"/>
  <locking isolationLevel="READ_COMMITTED"
           writeSkewCheck="false" concurrencyLevel="5000"
           useLockStriping="true"/>
</namedCache>
The way you configure locking in Infinispan 7 is quite similar to the earlier version:
<local-cache name="transactionCacheWithLocking">
  <transaction
      transaction-manager-lookup="org.infinispan.transaction.lookup.JBossStandaloneJTAManagerLookup"
      mode="NON_XA"
      locking="PESSIMISTIC"/>
  <locking isolation="READ_COMMITTED" write-skew="false"
           concurrency-level="5000" striping="true"/>
</local-cache>
When lock striping is enabled, the shared lock for an entry is selected based on the
hash code of the entry's key. If lock striping is disabled, a lock is created per
entry in the cache, which can increase memory usage.
Previously, in Infinispan 4.x lock striping was enabled by default.
From Infinispan 5.0, due to potential deadlocks, this mechanism
is disabled by default.
The Infinispan cache interface includes basic locking methods, which allow cache
users to lock cache entries eagerly during a transaction.
On lock calls, Infinispan attempts to lock the requested cache keys across the
cluster nodes of the grid, and at the commit (or rollback) phase, Infinispan
releases all locks held by the transaction, regardless of success or failure.
A cache object can be locked explicitly by the lock method:
tx.begin()
cache.lock(K1)     // explicitly acquire the lock on K1
cache.put(K1,VX)
tx.commit()        // all locks held by the transaction are released
Implicit locking obtains access rights to cache entries, as they are needed by an
application. In general, the implicit locking offered by Infinispan provides a level
of concurrency that is sufficient for most applications.
Infinispan will implicitly obtain the appropriate locks for your application at the
point at which they are needed, as cache entries are accessed for write operations.
In the following sample transaction, we can see one transaction running in one of
the cache nodes:
tx.begin()
cache.put(K1,V1)
cache.put(K2,V2)
cache.put(K1,VX)
tx.commit()
In a nutshell, for implicit eager locking, Infinispan checks for each modification
whether the cache entry is locked locally. If the entry is locked, it means that the
entry is also locked in the grid; otherwise, if the entry is not locked, Infinispan
sends a cluster-wide lock request to acquire the lock.
You can also lock a single remote node; however, this configuration only applies
in distributed mode, and it makes the number of remote locks acquired always 1,
regardless of the configured number of owners.
Infinispan guarantees data consistency when a single
node lock is used. The lock for a given key is always deterministically
acquired on the same node of the cluster, regardless of where
the transaction originates.
Lock timeouts
Once you have a lock, you can hold it to execute your required operations, and
then, when you finish your tasks, you can release the lock for another process to
use. You can define the limit of time a cache client can spend waiting to acquire
a lock; if a lock request does not return before the specified timeout limit, one of
the transactions will roll back, allowing the other to continue working.
You can raise the lock acquisition timeout (LAT) above its default of 10
seconds in the <locking> element of your default or named cache configuration.
To set the LAT to 20 seconds in Infinispan 6, add the following code:
<locking lockAcquisitionTimeout="20000"/>
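The effect of a lock acquisition timeout can be illustrated with the JDK's own locks. This plain Java sketch assumes a single JVM and uses java.util.concurrent.locks.ReentrantLock rather than Infinispan's lock manager; the LockTimeoutSketch class and the acquireWhileHeld method are hypothetical. A second thread waits for a lock that is never released during the wait and gives up once the timeout expires.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class LockTimeoutSketch {

    /** Returns true if a second thread obtains the lock within timeoutMillis while it is held. */
    static boolean acquireWhileHeld(long timeoutMillis) {
        final ReentrantLock lock = new ReentrantLock();
        final boolean[] acquired = new boolean[1];
        lock.lock(); // the first "transaction" holds the lock and has not committed yet
        try {
            Thread second = new Thread(() -> {
                try {
                    // Wait up to the timeout, then give up instead of blocking forever
                    if (lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                        acquired[0] = true;
                        lock.unlock();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            second.start();
            try {
                second.join(); // the holder keeps the lock for the whole wait
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        } finally {
            lock.unlock();
        }
        return acquired[0];
    }

    public static void main(String[] args) {
        System.out.println("second transaction got the lock: " + acquireWhileHeld(200));
    }
}
```

In a cache, the waiting transaction would roll back at this point, which is why a timeout that is too short causes unnecessary rollbacks and one that is too long keeps callers blocked.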
Deadlock detection
One risk that comes with the use of explicit locks is the occurrence of
deadlocks, which can occur when two or more concurrent users are each waiting
for an object that has been locked by another one of them.
The following situation illustrates a deadlock. Imagine we have two transactions,
and each transaction has a lock on the entry it attempts to update; the two
transactions proceed in parallel without committing. However,
each transaction then tries to update the cache entry held by the other transaction. As
a consequence, both of them will be blocked, because neither transaction will be
able to retrieve the entry it needs in order to proceed or terminate.
The following diagram depicts the scenario where two simple transactions, both
trying to lock two of the same entries, can get into a deadlock situation:
Transaction 01 starts by successfully acquiring the lock on entry key K2, with
the intent to change it later. Likewise, Transaction 02 successfully acquires the
lock on entry key K1.
Now, in order to continue its processing, Transaction 01 tries to acquire the lock
on K1 as well. But Transaction 02 already holds the lock on K1, so Transaction 01
has to wait until Transaction 02 finishes.
In this scenario, Transaction 01 patiently waits for Transaction 02 to finish and
release K1. But at the same time, Transaction 02, still holding K1, tries to
acquire K2 and cannot get it. This is the simplest and most common deadlock
scenario: two or more clients (threads) wait forever because of a circular
locking dependency between them.
User experience may suffer, because every subsequent request to the affected cache
entries will freeze for the duration of the deadlock, which might last up to the LAT.
Neither transaction can obtain the key it needs in order to proceed or terminate.
The only way out of the deadlock is to break some of the locks by sacrificing at
least one transaction, so that the other can complete successfully. There's also
a chance that both Transaction 01 and Transaction 02 roll back by timing out.
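The K1/K2 scenario can be reproduced with plain JDK locks. The following is a minimal sketch, not Infinispan code: DeadlockTimeoutSketch, runTx, and simulate are hypothetical names, and the two transactions are given different timeouts (200 ms and 5 seconds) purely so that the outcome is predictable. With one shared timeout, as with a cache-wide LAT, both could time out.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class DeadlockTimeoutSketch {

    // One "transaction": lock `first`, wait until both transactions hold their
    // first lock, then wait at most timeoutMillis for `second`.
    static String runTx(ReentrantLock first, ReentrantLock second,
                        CountDownLatch bothHoldFirst, long timeoutMillis) {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await(); // from here on, each holds what the other needs
            if (second.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                try {
                    return "committed";
                } finally {
                    second.unlock();
                }
            }
            return "rolled back"; // timed out waiting for the second lock
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "rolled back";
        } finally {
            first.unlock(); // releasing the first lock lets the other proceed
        }
    }

    // Returns {outcome of Transaction 01, outcome of Transaction 02}.
    static String[] simulate() {
        ReentrantLock k1 = new ReentrantLock();
        ReentrantLock k2 = new ReentrantLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        String[] outcome = new String[2];
        // Transaction 01 locks K2 then K1; Transaction 02 locks K1 then K2.
        Thread tx1 = new Thread(() -> outcome[0] = runTx(k2, k1, bothHoldFirst, 200));
        Thread tx2 = new Thread(() -> outcome[1] = runTx(k1, k2, bothHoldFirst, 5000));
        tx1.start();
        tx2.start();
        try {
            tx1.join();
            tx2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return outcome;
    }

    public static void main(String[] args) {
        String[] outcome = simulate();
        System.out.println("Transaction 01: " + outcome[0]);
        System.out.println("Transaction 02: " + outcome[1]);
    }
}
```

Transaction 01 is sacrificed when its shorter wait expires, and releasing K2 is what allows Transaction 02 to finish. Until that moment, both threads are frozen.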
By default, deadlock detection is disabled, but in Infinispan 6, you can enable it
for individual caches, under the namedCache configuration element, by adding
the following:
<deadlockDetection enabled="true" spinDuration="1000"/>
In Infinispan 7, you can enable deadlock detection by specifying the
deadlock-detection-spin attribute, which defines how often (in milliseconds) lock
acquisition is attempted within the maximum time allowed to acquire a particular lock:
<local-cache deadlock-detection-spin="1000"/>
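To give an intuition for what detecting a deadlock means, here is a generic wait-for graph check in plain Java. It is a textbook technique shown for illustration only and should not be read as Infinispan's actual algorithm; WaitForGraphSketch and its methods are hypothetical names, and the sketch assumes each transaction waits for at most one lock holder at a time.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class WaitForGraphSketch {

    // waitsFor.get(tx) is the transaction holding the lock that tx is waiting for.
    private final Map<String, String> waitsFor = new HashMap<>();

    // Records that `waiter` is blocked on a lock held by `holder`.
    void waiting(String waiter, String holder) {
        waitsFor.put(waiter, holder);
    }

    // Records that `tx` is no longer waiting (it got its lock or gave up).
    void done(String tx) {
        waitsFor.remove(tx);
    }

    // Follows the chain of "waits for" edges starting at tx.
    // Returns true only if the chain leads back to tx itself.
    boolean isInCycle(String tx) {
        Set<String> seen = new HashSet<>();
        String current = tx;
        while (current != null && seen.add(current)) {
            current = waitsFor.get(current);
        }
        return tx.equals(current);
    }

    public static void main(String[] args) {
        WaitForGraphSketch graph = new WaitForGraphSketch();
        graph.waiting("T1", "T2"); // Transaction 01 waits for K1, held by 02
        System.out.println(graph.isInCycle("T1"));
        graph.waiting("T2", "T1"); // Transaction 02 waits for K2, held by 01
        System.out.println(graph.isInCycle("T1"));
    }
}
```

Once a cycle is found, one participant can be rolled back immediately rather than leaving every participant blocked until the lock acquisition timeout expires.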
One indication that you may need to enable deadlock detection is a large number
of transactions rolling back with TimeoutException messages. A TimeoutException
might be caused by other factors too; however, during deadlocks Infinispan will
always throw a TimeoutException.
You should also consider using deadlock detection when you have high contention
on a set of keys. There are other ways to analyze whether deadlock detection is
appropriate, but the best method is to monitor and benchmark the server from
outside. You can use JMX to monitor and get statistical information, such as the
number of deadlocks detected, using the DeadlockDetectingLockManager MBean. We
will see monitoring and management aspects in detail in the next chapter.
Note: Deadlock detection works only on a per-cache basis; a deadlock that
spans multiple caches won't be detected.
Data versioning
For an efficient write skew check, you can configure your Infinispan cache to
enable versioning and the write skew check explicitly, using the <versioning>
section in your configuration file. Versioning allows concurrency to be managed
through MVCC.
The <versioning> element defines only two attributes. The enabled attribute
determines whether versioning is enabled (it is disabled by default), while the
versioningScheme attribute defines the versioning scheme Infinispan should
use. The possible values are SIMPLE and NONE; the default value is NONE.
When Infinispan is operating in local mode, it can perform an adequate write skew
check by comparing Java object references; however, this technique is useless in
a distributed environment, where a more reliable form of versioning is necessary
to provide reliable write skew checks.
The org.infinispan.container.versioning.SimpleClusteredVersion class implements
the org.infinispan.container.versioning.EntryVersion interface and provides a
simple cluster-aware versioning scheme, backed by a long version attribute that
is incremented each time the cache entry is updated.
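The idea of a version counter backing a write skew check can be sketched in plain Java. This is a simplified, single-JVM illustration, not the SimpleClusteredVersion implementation: VersionedStoreSketch, Versioned, and writeIfUnchanged are hypothetical names, and version 0 is assumed to mean "no entry yet".

```java
import java.util.HashMap;
import java.util.Map;

class VersionedStoreSketch {

    // A value together with the version it had when it was written.
    static final class Versioned {
        final String value;
        final long version;

        Versioned(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<String, Versioned> entries = new HashMap<>();

    // Returns the current value and version, or null if the key is absent.
    synchronized Versioned read(String key) {
        return entries.get(key);
    }

    // Writes newValue only if the entry still has the version the caller read.
    // Pass 0 as expectedVersion when the entry did not exist at read time.
    synchronized boolean writeIfUnchanged(String key, long expectedVersion, String newValue) {
        Versioned current = entries.get(key);
        long currentVersion = (current == null) ? 0L : current.version;
        if (currentVersion != expectedVersion) {
            return false; // write skew: someone else updated the entry
        }
        // Each successful update increments the version by one.
        entries.put(key, new Versioned(newValue, currentVersion + 1));
        return true;
    }

    public static void main(String[] args) {
        VersionedStoreSketch store = new VersionedStoreSketch();
        store.writeIfUnchanged("seat", 0, "free");
        long seenByA = store.read("seat").version; // transaction A reads
        long seenByB = store.read("seat").version; // transaction B reads
        System.out.println(store.writeIfUnchanged("seat", seenByB, "sold to B"));
        System.out.println(store.writeIfUnchanged("seat", seenByA, "sold to A"));
    }
}
```

Because the comparison uses a number rather than an object reference, the same check still makes sense when the reader and the writer are on different nodes, which is the reason a versioning scheme is needed in a cluster.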
By default, versioning is disabled, so take care if you are using
transactions with write skew checks and REPEATABLE_READ as the
isolation level: without versioning, the check is not reliable in a cluster.
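Alternatively, you can enable versioning programmatically by adding the
following code:
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.configuration.cache.VersioningScheme;
import org.infinispan.transaction.TransactionMode;
import org.infinispan.transaction.lookup.GenericTransactionManagerLookup;

Configuration config = new ConfigurationBuilder()
    // Enable the simple, cluster-aware versioning scheme
    .versioning()
        .enable()
        .scheme(VersioningScheme.SIMPLE)
    // Make the cache transactional and look up the JTA transaction manager
    .transaction()
        .transactionMode(TransactionMode.TRANSACTIONAL)
        .transactionManagerLookup(new GenericTransactionManagerLookup())
        .autoCommit(true)
    .build();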
Summary
In this chapter, we looked at how Infinispan deals with transactions. We started
with an introduction to transaction fundamentals and a glimpse of JTA integration,
and then saw how to design your application to use the different transactional
models, optimistic and pessimistic.
In the second part of the chapter, we took a deeper look at the concurrency control
mechanisms that ensure data integrity, such as Multi-Version Concurrency Control
(MVCC), isolation levels, and locking control.
Now that you know how to configure different transaction strategies for your cache,
it's time to learn how to monitor problems in production and how to manage your
cache instances.