CenteraExerciser UserGuide
User Guide
Created: 28/Jul/2003
Copyright
Copyright (c) 2001-2004 EMC Corporation
All Rights Reserved
This document contains the intellectual property of EMC Corporation.
Table of contents
1 About
1.1 Introduction
1.1.1
1.2 Assumptions
1.3 Terminology
1.4 References
2 Installation
2.1
2.2
2.2.1
3.1.1 Argument rules
3.1.2
3.1.3 Argument defaults
3.2 Basic examples
3.2.1
3.2.2
4 Logging/Statistics
4.1 LOGGING
4.2 Statistic files
4.2.1
5 Local parameters
5.1
5.2
5.3
5.4
6 HOW TO
6.1
6.2
6.3 How to use the name space scheme (Content Addressed Collision Avoidance)
6.4
6.5
6.5.1
6.5.2
6.5.3 Rules for the creation and write operations (What happens if?)
6.6
6.7
6.7.1 Ramping examples
6.7.2
6.7.3
6.8 How to control X objects written and\or read per every X seconds to\from Centera
6.8.1
6.9 How to control the percentage of files that are written and read to and from Centera
6.9.1
6.10
6.11
6.11.1
6.11.2 How to have the tool use a different number of threads for each asynchronous operation
6.11.2.1 Example of using a different number of threads for different operations
6.11.2.2
6.12
6.13 How to have the tool operate for a user specified amount of time
6.13.1
6.14
6.15 How to control the sizes of the files written to Centera (incremental and/or random)
6.15.1
6.15.2
6.15.3
6.16
6.16.1
6.16.2
6.16.3
1 About
This document details how to use the CenteraExerciser tool.
1.1 Introduction
The CenteraExerciser tool is used for both performance measurement and load
testing of a Centera platform.
Some of the basic abilities of the CenteraExerciser tool follow:
Synchronous & Asynchronous, single and multi threaded abilities:
1) Writing X number of files to Centera.
2) Reading X number of files from Centera.
3) Partially reading X number of files from Centera.
4) Deleting X number of files from Centera.
5) Backing up the C-Clips and associated blobs from Centera to disk.
6) Restoring backed up C-Clips and associated blobs to Centera, from disk.
Synchronous, single threaded abilities:
1) Querying Centera for C-Clips that a user wrote to Centera using the
CenteraExerciser tool.
IMPORTANT NOTE:
By default the CenteraExerciser tool uses only one (1) FPPool for all the operations
it performs. This pool is opened at the start of the test, shared by all threads, and
closed at the end of the test. To change this default behavior, see the
explanations of the -usemultipools and -usepoolperfile boolean switches in section
3.1.2.
1.1.1
1.2 Assumptions
The reader is familiar with Centera and has a general understanding of how it
works.
1.3 Terminology
Switch
Value
    A command line argument that follows a switch. Values do not have minus
    signs preceding them. Values are pieces of information that make certain
    switches meaningful to the tool. For example, the -address switch would
    have no meaning to the tool if it did not have a value following it:
    -address 10.15.54.101
Boolean switch
CDF (C-Clip)
Blob
Global parameter
    A switch and its associated value that affect all of the user's
    requested operations.
Local parameter
1.4 References
2 Installation
In order to use the CenteraExerciser tool, you must have JRE 1.4.1 or higher
installed on the machine that is used to run the tool.
NOTE:
The .so libraries and modules must match the specific platform. For example,
the libFPLibrary32.so and libPAI_module32.so files that are built for a Solaris
platform do not work on a Linux platform.
Furthermore, the libFPLibrary32.so and libPAI_module32.so files that are built for
Solaris 5.8 do not work with Solaris 5.6.
b. For Solaris 5.8 & Linux (using bash), open a shell in the directory
where you put the required files and type the following:
1) export LD_LIBRARY_PATH=.
c. For Solaris 5.6 (using tcsh), open a shell in the directory where you
put the required files and type the following:
1) setenv LD_LIBRARY_PATH .
d. For AIX-5.1 (using ksh), open a shell in the directory where you put the
required files and type the following:
1) export LIBPATH=.
e. For AIX-4.3 (using bsh), open a shell in the directory where you put the
required files and type the following:
1) LIBPATH=.
2) export LIBPATH
f.
Once you complete the directions that are appropriate for the platform you are
using, you are ready to use the tool.
Make sure you read section 2.2.1 below for special Unix cases that may apply.
2.2.1
2) Some versions of the SDK ship with .a files rather than .so or .sl files. In these
cases, follow the same procedures described up to this point, but use the
.a files in place of the .so or .sl files.
For example, SDK version 2.3.218 for the AIX platforms ships with the following
two files (notice that they are .a and not .so):
a) libFPLibrary32.a.2.3.218
b) libPAI_module32.a.2.3.28
These files are used exactly as described in the above sections,
except that the .a files are used instead of the .so or .sl files.
3.1.1 Argument rules
The specific order of the switches and boolean switches is not important as long
as the following rules are adhered to.
1) The -address switch is required, and a value (a valid IP address of a cluster)
MUST follow this switch.
2) The very first argument must be a switch (a boolean switch is OK), but it cannot be a
value. See section 1.3 (Terminology) for definitions of switch, boolean switch,
and value.
3) If the switch is NOT a boolean switch, a value MUST always follow it. For
example:
a) switch/value example : -address 10.15.54.101 (This is OK)
b) boolean/boolean example: -address 10.15.54.101 -memfile -del (This is OK)
c) switch/boolean example : -address -memfile (! THIS WILL NOT WORK !)
In the above examples, -address is the switch and 10.15.54.101 is a value.
Example a is OK because -address is a switch that references a value
(10.15.54.101) and the value follows its related switch.
Example b is OK because it is in the form
<switch> <value> <boolean> <boolean>. Boolean switches do not reference
values, so the tool does not expect a value to follow them. In fact,
it is illegal for a value to follow a boolean switch. This is the reason why rule #4 below
was created.
Finally, example c will not work because -address is a switch that needs to
reference a value, and in example c no value follows the -address
switch. Always remember that a non-boolean switch must be immediately
followed by its value.
4) A value can never follow a boolean switch. For example:
a) boolean/value example : -memfile c:\temp (! THIS WILL NOT WORK !)
5) Two values immediately following each other are not allowed. For example:
a) -address 10.15.54.101 c:\temp\logfile1.log
In this example, the (-address 10.15.54.101) part is OK, but the value
c:\temp\logfile1.log is not allowed because it follows the value
10.15.54.101.
Summary of rules:
The first argument MUST be a switch, either boolean or non-boolean. The value of
a switch must immediately follow the switch. Boolean switches do not
take values because their presence alone carries the meaning. Two boolean
switches next to each other ARE allowed. Two values next to each other are NOT
allowed.
If the above rules are broken the tool will notify the user with an error message at
the command line.
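The argument rules can be illustrated with a small checker. This is only a sketch of the rules as stated in this section, not the tool's actual parser: the function names are hypothetical, the set of boolean switches is collected from the switches this guide describes as boolean, and it assumes that any argument starting with a minus sign is a switch.

```python
# Boolean switches named in this guide (assumed complete for this sketch).
BOOLEAN_SWITCHES = {
    "-memfile", "-memfiledcr", "-del", "-randompartial", "-saveclips",
    "-usemultipools", "-usepoolperfile", "-useroundrobin", "-timefirstbyte",
    "-nologs", "-nostats", "-writedups", "-ignorewriteerrors",
}


def is_switch(arg):
    # Assumption: a switch starts with a minus sign, a value does not.
    return arg.startswith("-")


def needs_value(arg):
    # A non-boolean switch must be followed by a value.
    return is_switch(arg) and arg not in BOOLEAN_SWITCHES


def check_arguments(args):
    """Return a list of rule violations (empty when the arguments are valid)."""
    errors = []
    if not args or not is_switch(args[0]):
        errors.append("first argument must be a switch")
    if "-address" not in args:
        errors.append("-address switch is required")
    previous = None
    for arg in args:
        if is_switch(arg):
            # Rule 3: a non-boolean switch cannot be followed by another switch.
            if previous is not None and needs_value(previous):
                errors.append(f"{previous} needs a value")
        elif previous in BOOLEAN_SWITCHES:
            # Rule 4: a value can never follow a boolean switch.
            errors.append(f"value {arg} follows boolean switch {previous}")
        elif previous is not None and not is_switch(previous):
            # Rule 5: two values cannot follow each other.
            errors.append(f"value {arg} follows another value")
        previous = arg
    # A non-boolean switch at the very end has no value either.
    if previous is not None and needs_value(previous):
        errors.append(f"{previous} needs a value")
    return errors
```

For instance, check_arguments(["-address", "-memfile"]) reports that -address needs a value, which corresponds to example c under rule 3.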
3.1.2
-order
The allowable value for this switch is one or more predefined requests. If more than
one request makes up the value, they must be separated by commas. Spaces are
NOT allowed between the commas.
The value of this switch represents the order the user wants the tool to execute the
requested operations.
If this switch is not provided on the command line the tool will automatically
perform a write operation and then a read operation.
As mentioned above, the value of this switch is restricted to a known list of options.
The options that are allowed to be in the make up of the value of this switch are:
w (stands for Write)
rd (stands for Read)
rrd (stands for RawRead)
p (stands for Purge)
d (stands for Delete)
c (stands for Create) (See section 6.5 for a full understanding)
s (stands for Sleep)
These options can be used by themselves, or in combinations, and are not case
sensitive.
A user can attach special conditions, called local parameters, to most of the above
options. (See section 5.)
These options can be used more than once in the make up of the value. There is
one simple rule that must be followed when combining these options to make up
the value of this switch:
RULE: If the option ro is used in the make up of this switch's value, the option rrd
MUST be used in the make up of the value somewhere BEFORE the use of the ro
option.
Example of rule:
java -jar CenteraExerciser.jar -address <cluster IP address> -order rrd,ro
It is NOT mandatory that the rrd option immediately precede the ro option as long
as it precedes it.
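The rrd-before-ro rule can be sketched as a short check on the -order value. This is an illustration only, under stated assumptions: the function names are hypothetical, the option names are the ones that appear in this guide, and local parameters are handled only by dropping a trailing parenthesized part such as (30), so a local parameter that itself contains a comma is outside the scope of this sketch.

```python
import re

# Option names that appear in this guide (assumed complete for this sketch).
KNOWN_OPTIONS = {"w", "rd", "rrd", "p", "d", "c", "s", "ro", "q"}


def parse_order(value):
    """Split an -order value into lowercase option names."""
    names = []
    for item in value.split(","):
        # Drop a trailing "(...)" such as the 30 in s(30); options are not
        # case sensitive, so normalise to lowercase.
        name = re.sub(r"\(.*\)$", "", item).lower()
        if name not in KNOWN_OPTIONS:
            raise ValueError(f"unknown option: {item!r}")
        names.append(name)
    return names


def check_order(value):
    """Return the option names, or raise if ro appears before any rrd."""
    names = parse_order(value)
    seen_rrd = False
    for name in names:
        if name == "rrd":
            seen_rrd = True
        elif name == "ro" and not seen_rrd:
            raise ValueError("ro requires an rrd somewhere before it")
    return names
```

With this sketch, check_order("rrd,d,ro") is accepted because rrd precedes ro even though it does not immediately precede it, while check_order("ro,rrd") raises an error.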
Meaning of the allowable predefined options:
The value of this switch can be any combination of the options mentioned above.
VALUE        MEANING    Able to be threaded?
                        (can more than 1 thread
                        perform the operation?)
w            write      yes
rd           read       yes
rrd          rawread    yes
p            purge      yes
ro           rawopen    yes
q            query      no
d            delete     yes
c OR c(#)    create     no
s(#)         sleep      no
VALUE          MEANING
w,rd,d
w,rrd d,rd
w,q
c,w,s(30),d    create one file, then write that file, then sleep for 30 seconds,
               then delete the file
c(10),w
Remember that the main options can be used by themselves (stand alone), or in
any combination with the others or themselves, as long as the one rule is followed
(see the explanation of the rule above, under this explanation of the -order switch).
NOTE: See sections 6.2 and 6.1 in order to understand how to write directories and
set retention periods.
-log
The value of this switch is a string that tells the software the place and name on
the hard drive where to create the log file.
Example:
-log c:\temp\CenteraExerciser.log
The path given, not including the name part, must exist before the log file can be
created there.
If the -log switch is not used, the log file is created in the current directory with
the name CenteraExerciser.log.
-embedthreshold
This switch controls the global embedded blob feature of the SDK. The value of
this switch is a threshold in bytes to determine whether or not files written to
Centera are embedded in the CDF. By default, the threshold is 0 bytes, so no
embedding is performed.
Example:
-embedthreshold 10240
All files written to Centera that are smaller than 10 KB (10240 bytes) will be embedded in the CDF.
-store
The value of this switch is a string that tells the software the place on the hard
drive to store/create the unique data files that will be written to the cluster, before
they are actually written.
Example:
-store c:\temp
The path given must exist before the data files can be created there. If this switch
is not present, the directory will default to the current directory.
-files
The value of this switch must be a whole number and represents how many files
the tool is to use for the requested operations.
-size
The value of this switch must be a whole number. The number given represents
the size of each file written.
The units of this whole number are determined by the value given for the -units
switch.
If this switch is not present, the value will default to 1.
For example, if the desired file size is 10 KB, a user would use this switch as follows:
java -jar CenteraExerciser.jar -address <cluster_address> -size 10
Since the default units are KB, the -units switch with a value is not needed,
although it could be used if a user wanted to do some extra typing.
If the desired file size is 32 megabytes, a user would use the switch in combination
with the -units switch as follows:
java -jar CenteraExerciser.jar -address <cluster_address> -size 32 -units MB
This switch can also be used to tell the tool to either use files that incrementally
increase in size from a min size to a max size, or to randomly use files from a min
size to a max size.
To do so, one of the following predefined templates must be used:
1) -size <#Units>,<#Units>
2) -size <#Units>,<#Units>,random
3) -size <#Units>,<#Units>,random_#
The # sign in the above templates must be replaced with a whole number. The
Units word must be replaced with one of the following units representation:
KB
MB
GB
See the section in this document on how to use the size switch.
-units
The value of this switch is a string that tells the software what units to assign to
the value of the size switch.
Only one of the following allowed values can be given for any one test:
1) KB
2) MB
3) GB
If this switch is not present, then the units will default to KB.
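The way the -size and -units values combine can be sketched as a small conversion. The function name is hypothetical, and the sketch assumes binary units (1 KB = 1024 bytes), which matches the -embedthreshold example where 10240 bytes is called 10K; the guide does not state the multiplier explicitly.

```python
# Assumption: binary units, consistent with "10240 bytes = 10K" in this guide.
UNIT_BYTES = {"KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def file_size_bytes(size=1, units="KB"):
    """Size in bytes of each written file; defaults mirror -size 1 and -units KB."""
    if units not in UNIT_BYTES:
        raise ValueError(f"units must be one of {sorted(UNIT_BYTES)}")
    return size * UNIT_BYTES[units]
```

Under these assumptions, the command with -size 32 -units MB corresponds to file_size_bytes(32, "MB"), and a command with neither switch corresponds to file_size_bytes().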
-threads
EMC Centera Development Group
The value of this switch must be a whole number. This number represents the
number of threads the tool uses to perform each thread-able operation.
-maxconnections
The value of this switch must be a whole number. This number represents how
many socket connections will be able to be open at any one time during an
operation. The max value cannot exceed 999.
-retries
The value of this switch must be a whole number. This number represents how
many times the software should try to re-connect to the IP address given, and how
many times to attempt an API call, before giving up if previous attempts have failed.
If this switch is not present, the tool will use the SDK default.
-sleep
The value of this switch must be a whole number. This number represents how
much time (in milliseconds) should pass before attempting to retry the connection
to the cluster if the previous attempt has failed, or retry the API call if it has failed.
If this switch is not present, the tool will use the SDK default.
-calc
The value of this switch is a string that tells the software how to calculate the
data's address (MD5) while writing files to Centera. Inputs to this switch can be
only one of the following for any one test:
Option    Meaning
SC
CC
UF
NC        No Check
SCNC
CCNC
UFNC
-offset
The value of this switch must be a whole number. This number represents from
which byte to start reading the data. If this switch is used without the length
switch, the remaining bytes of the file from the offset point will be read.
Note:
If both this switch and the -randompartial switch are used together (which would
not make sense), the -randompartial switch will be ignored and this -offset
switch will take precedence.
-length
The value of this switch must be a whole number. This number will represent how
many bytes to read from the point of the value set by the offset switch, or from
byte 0 if the offset switch is not given. The units of this length switch are always
in bytes.
If this -length switch is given on the command line with its value, and the -offset
switch is not given, the tool will only read the first <value of this switch> bytes of
the data, starting at byte 0. For example, if reading a 2KB file with this -length
switch given a value of 1024 and the -offset switch not given, only bytes
0 to 1023 of the 2KB file will be read.
If the value that is provided with this length switch is greater than the length of
the file being read, the tool will read the file starting at the offset (if given) or 0 (if
offset is not given) to the end of the file.
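The interaction between -offset and -length described above can be written down as a short calculation. This is a sketch of the documented behavior, not the tool's code; the function name is hypothetical, and offsets at or beyond the end of the file are not covered because the guide does not describe that case.

```python
def partial_read_range(file_length, offset=None, length=None):
    """Return (first_byte, last_byte), inclusive, of the bytes that are read."""
    # Without -offset, reading starts at byte 0.
    start = 0 if offset is None else offset
    if length is None:
        # Without -length, the rest of the file from the offset is read.
        end = file_length
    else:
        # A length that runs past the end of the file is cut off at the end.
        end = min(start + length, file_length)
    return start, end - 1
```

For the 2KB example above, partial_read_range(2048, length=1024) gives bytes 0 to 1023.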
-clipfile
The value of this switch points to a file on a hard drive that contains a list of C-Clip
IDs. Each C-Clip ID in that file must be on its own line.
If the value of this switch points to a clip file that was created by using the
boolean switch -saveclips, or the switch -saveclipsas, then the format of the clip
file is already correct and there is no need to check that each C-Clip ID
is on its own line.
The presence of this switch and its value tells the tool to perform reads using the
C-Clip IDs that are contained in the file that is pointed to by the value of this
switch.
-testname
The value of this switch should be a String that represents the name a user would
like to associate to the test that is to be run. This String value will show up in
both the log file and the statistical files.
-operatefor
The value of this switch is a mix of a whole number and a letter that represents
how long the tool should run its tests.
The value template for this switch is as follows:
-operatefor <#Units>
Note: The angle brackets are not required or allowed.
Where the # sign is replaced by a whole number, and the word Units is replaced
by one of the following predefined time units:
Allowable time units    Meaning
                        seconds
                        minutes
h                       hours
-writefilesto
The value of this switch should be a String that represents a fully qualified
directory name.
All files created by the tool will be written to the directory specified instead of to the
cluster.
The directory specified does not have to exist as long as the path leading up to the
final directory exists. The final directory will be created automatically by the tool if
it does not already exist.
Example:
java -jar CenteraExerciser.jar -address <cluster_address> -files 10 -writefilesto X:\users\Donp
This example will create 10 files of 1 KB size and write them to the directory
users\Donp that is located on a drive that is mapped as X:
NOTE: The -address <IP> is not used for writing the files, but it is still required
because the tool cannot operate without it. A valid IP address
must be passed as the value of this switch. When using the -writefilesto
switch, the tool will first establish a connection with the given IP address and do
nothing else with that connection except close it when the tool is finished.
Important note about this switch: This writefilesto is only intended to be used
to write files to a hard disk location. The CenteraExerciser tool cannot read the
files back that it writes using this switch. The main reason this switch was added
is to give QA a quick way to test writes ONLY to a Storigen gateway. If a user
attempts to read or manipulate the files written by this switch, using the
CenteraExerciser tool, an error message gets thrown that looks something like the
following:
java.lang.NullPointerException
at com.emc.centera.qa.core.CenteraSDK.retrieve(CenteraSDK.java:1840)
at com.emc.centera.qa.utils.ReadThread.run(ReadThread.java:239)
-saveclipsas
NOTE: This switch is an alternate to the saveclips boolean switch.
This switch and its value tell the software to save the C-Clip IDs that are created
when writing the Clip to the cluster.
The C-Clip IDs are saved to a file on a hard drive. The name of the C-Clip ID file
that is created is the value that is provided with this switch.
Example of use:
java -jar CenteraExerciser.jar -address <cluster_address> -files 10 -saveclipsas myClips.txt
Note the value myClips.txt. This name can have a fully qualified path attached.
This will be the name of the clip file that is stored on the hard drive. This file will
be stored in the current directory that the tool is operated from if a fully qualified
path is not specified, or in the last directory of the fully qualified path if one is
provided.
NOTE: The resulting file of this saveclipsas switch should be used as the value of
the clipfile switch.
-saveclips
This switch is a boolean switch and cannot have a value associated with it. The
presence of this switch tells the software to save the C-Clip IDs that are created
when writing the Clip to the cluster. The C-Clip IDs will be saved to a file on a hard
drive. If this switch is used, the name of the C-Clip ID file that is created will be
made up of a date and time, and will ALWAYS end with the letters
C-Clips.txt.
The following example shows the resulting file that is created when this
-saveclips switch is used.
Example of clip file name (the date and time will differ):
java -jar CenteraExerciser.jar -address <cluster_address> -saveclips
resulting file:
2004-04-22_10-11-49-C-Clips.txt
NOTE: In this example, there are the letters C-Clips.txt at the end of the file
name. The file created by using this switch will be placed in the same directory
that the general log file is placed in.
NOTE: The resulting file of this saveclips boolean switch should be used as the
value of the clipfile switch.
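The naming pattern of the -saveclips file can be illustrated as follows. The format string is inferred from the single example name shown above (2004-04-22_10-11-49-C-Clips.txt), so treat it as an assumption; the function name is hypothetical.

```python
from datetime import datetime


def clip_file_name(when):
    """Build a clip file name in the style of the -saveclips example."""
    # Assumed layout: year-month-day, underscore, hour-minute-second,
    # followed by the fixed ending "-C-Clips.txt".
    return when.strftime("%Y-%m-%d_%H-%M-%S") + "-C-Clips.txt"
```

A timestamp of 22 April 2004, 10:11:49 reproduces the example file name.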
-threaddelay
The value of this switch represents the amount of time in milliseconds that should
pass between the start of each thread. This is a one time delay and only happens
when starting each thread for the very first time.
-usemultipools
This switch is a boolean switch and cannot have a value associated with it. The
presence of this switch tells the software to use a different FPPool for each thread
of the same operation.
CenteraExerciser will determine how many threads it must use, and create a pool
of FPPools that contain one pool for each thread of the same operation.
For example, if the tool is asked to perform a write and a read operation (that is,
two operations), and the tool is also told to use 10 threads, the tool will create a
pool of 10 FPPools. Each thread of the write operation will take and use one
FPPool from the pool to perform its work. When the writes are done, the read
operation will reuse the same 10 FPPools, with each read thread taking one to do
its work.
IMPORTANT NOTE:
By default the CenteraExerciser tool only uses one (1) pool for all the operations it
performs. This pool is opened at the start and closed at the end of the test.
-usepoolperfile
This switch is a boolean switch and cannot have a value associated with it. The
presence of this switch tells the software to use a new FPPool for each file object a
thread works on.
A file object is defined as a single file, or a directory. For example, if the tool is
asked to write a whole directory using the order w(file=<path to dir>), then the
directory pointed to is considered to be a file object. On the other hand, if the tool
is asked to write an entire directory using the order w(dir=<path to dir>), then each
file in the directory is considered by the tool to be a file object, not the whole
directory.
See section 6.2 in this document for an understanding of the difference between
the two ways to write whole directories.
When the tool is operating in this mode, each thread will open a new pool for each
file object it works on, and then close the pool when finished.
This functionality differs from the usemultipools functionality because the
usemultipools functionality only opens as many pools as the max number of a
thread type. Those pools are open at the beginning of the tool test, and only closed
at the end of the tool test. (See the explanation of the usemultipools boolean
switch above.)
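The difference between the three pool modes can be summarised by counting how many FPPools get opened during a test. This is a back-of-the-envelope sketch based only on the descriptions above; the function name, the mode strings, and the two list parameters are hypothetical and are not part of the tool.

```python
def pools_opened(mode, threads_per_operation, file_objects_per_operation):
    """Estimate how many FPPools are opened over a whole test.

    threads_per_operation: thread count for each requested operation.
    file_objects_per_operation: file objects handled by each operation.
    """
    if mode == "default":
        # One shared pool, opened at the start and closed at the end.
        return 1
    if mode == "usemultipools":
        # One pool per thread, reused across operations, so the largest
        # thread count decides how many pools exist.
        return max(threads_per_operation)
    if mode == "usepoolperfile":
        # A new pool is opened (and closed) for every file object worked on.
        return sum(file_objects_per_operation)
    raise ValueError(f"unknown mode: {mode!r}")
```

For a write and a read of 100 files each with 10 threads, this gives 1 pool by default, 10 pools with -usemultipools, and 200 pool opens with -usepoolperfile.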
IMPORTANT NOTE:
By default the CenteraExerciser tool only uses one (1) pool for all the operations it
performs. This pool is opened at the start and closed at the end of the test.
-useroundrobin
This switch is a boolean switch and cannot have a value associated with it. The
presence of this switch tells the software to select the next AN (access node) in a
round robin fashion, effectively disabling load balancing in the SDK.
Note that the use of this switch is only effective when the CenteraExerciser tool is
used with SDK version 2.0.233 or higher.
If this switch is used with an SDK version lower than 2.0.233, the tool will just
ignore this switch and will not throw an error.
-timefirstbyte
This switch is a boolean switch and cannot have a value associated with it. The
presence of this switch tells the software to perform a time to first byte test.
If this switch is used, the tool will write as many unique bytes (up to 256) as the
value for the files switch specifies, using as many threads as the value of the
threads switch specifies, read those files back, then delete the written files from the
cluster so the test can be performed again without having to worry about the files
already existing on the Centera.
The writes and reads of this test are timed just like any other operation. The
important thing to remember about using this switch is that regardless of what
value is used in the order switch, or if the clipfile switch, or memfile boolean
switch is used, using this timefirstbyte switch will cause the tool to only perform a
write of the files, then a read of the same files, then delete the written files.
Example of use:
java -jar CenteraExerciser.jar -address <cluster_address> -files 10 -threads 5 -timefirstbyte
The statistics of this test can be found in the CenteraExerciserStats_Write.xls and
CenteraExerciserStats_Read.xls files.
-memfile
This switch is a boolean switch and cannot have a value associated with it. The
presence of this switch tells the software to create all the files that need to be
written to Centera in the computer memory instead of on the hard drive before
actually doing the writes to Centera.
By using this switch, the total test time, NOT THE TIMED portion of the test,
should be quicker. The use of this switch will also allow for testing of larger files.
If this switch is used along with the -store switch, the -del switch, or both, the
-memfile switch will take precedence and the other two switches will be
ignored because they do not make sense when using -memfile.
Using this boolean switch will cause MemFile to automatically check the data read
back from Centera to make sure it matches the data that was written to Centera.
If the user of this tool is measuring performance, this switch should not be used.
Use the memfiledcr boolean switch instead. See the explanation below.
IMPORTANT NOTE:
As of version 1.1.8 of the CenteraExerciser tool, the MemFile option
can be requested either globally or locally.
To request it globally, just include this boolean switch on the command line. To
request it locally for the write and/or read operations, you must attach a local
parameter to either or both of those operations. See section 5.1.
One possible use for the local memfile feature is to write using MemFiles and read
without using MemFiles. This would allow the written MemFiles to be read back to
disk. Normally, if the tool is using the global memfile option during reads, the
read files are not read to disk but rather just read in memory. See section 5.1 for
details on how to request local MemFile usage for the write and/or read
operations.
-memfiledcr (don't check reads)
This switch is a boolean switch and cannot have a value associated with it. The
only difference between this boolean switch and the -memfile boolean switch is that
this switch prevents the MemFiles from checking the data retrieved from
Centera against the data that was written to Centera.
In other words, this boolean switch tells MemFile "don't check reads". The
-memfile switch does the check by default; this one does not.
Users who are measuring performance should use this switch
instead of the -memfile boolean switch.
IMPORTANT NOTE:
As of version 1.1.8 of the CenteraExerciser tool, this MemFile "don't
check reads" option can be requested either globally or locally. See the IMPORTANT
NOTE under the explanation of the -memfile boolean switch above.
-del
This switch is a boolean switch and cannot have a value associated with it. The
presence of this switch tells the software to delete the intermediate files that are
created on the hard drive (if MemFile is not being used) as the source for the
writes to Centera, after the writes are finished.
The files that are deleted are not the copies stored on the cluster. Rather, they
are the files that are created on the hard drive prior to the writes and used as the
source data that is written to Centera.
-randompartial
This switch is a boolean switch and cannot have a value associated with it. The
presence of this switch tells the software to do a random partial read of the data.
The offset, and length, of the partial read will be calculated by the software
randomly, and recorded in the log file.
If this switch is present along with the -offset and/or -length switch, this switch
will be ignored, and a non-random read will be performed instead. In order to
achieve a random partial read, this switch must be used without either the
java -jar CenteraExerciser.jar -address <cluster_address> -files 10 -writedups
This example would write 10 unique files that have a size of 1 KB each. After all ten
files have been written to the cluster, the tool will attempt to write the same 10 files
to the cluster a second time.
-ignorewriteerrors
This switch is a boolean switch and cannot have a value associated with it. The
presence of this switch tells the software to ignore any errors encountered during
the write process.
By default, if a write error is encountered during the write process, the tool will
continue to try to write all the files. When all the threads are finished, if
any of the writes failed, the tool will report the error(s) and exit
with an error code of 1.
If this -ignorewriteerrors switch is used, the tool will log any errors
encountered during the write process, and continue with the rest of the operations
that were requested, such as reads, deletes, etc.
NOTE: This will most certainly result in more error messages being posted to both
the standard output and the log files.
3.1.3 Argument defaults
The only required elements to the command line are the address switch and its
corresponding value which is <The IP ADDRESS OF THE CLUSTER>.
If a switch, its value, or a boolean switch is not present on the command line, a
default value will be used.
The following is a list of command line arguments and their defaults.
Switch Name          Value                                      Defaulted Value
-address             <Cluster IP address>                       NONE
-order               1) w                                       w,rd
                     2) rd
                     3) d
                     4) rrd
                     5) ro
                     6) s(#) where # is a whole number
                     7) c or c(#) where # is a whole number
                     8) p
                     In any comma separated combination
                     (no spaces) as long as the rule of the
                     -order switch is followed (see section
                     3.1.2).
                     Example of comma separated list value:
                     w,rd,w,rd,p
                     This example would write then read then
                     write then read then rawread then purge.
-log                 c:\temp\MyLog.log                          ./CenteraExerciser.log
-embedthreshold                                                 0 bytes
-store               Example: c:\temp                           ./
-units               KB or MB or GB                             KB
-maxconnections      during an operation.                       100
-retries                                                        SDK default
                                                                SDK default
-calc                1) CC                                      CC
                     2) SC
                     3) UF
                     4) NC
                     5) CCNC
                     6) SCNC
                     7) UFNC
                     The choice of one of these values tells
                     the tool how to calculate the MD5 for
                     the blob (see -calc in section 3.1.2
                     for more details).
-offset                                                         -1
                                                                not present
-writefilesto                                                   not present
-saveclipsas         May be a fully qualified path              not present
-threaddelay
-memfile                                                        false
-memfiledcr                                                     false
-del                                                            false
-randompartial                                                  false
-saveclips                                                      false
-usemultipools                                                  false
-usepoolperfile                                                 false
-nologs                                                         false
-writedups                                                      false
-ignorewriteerrors                                              false
-nostats                                                        false
3.2.1
3) java -jar CenteraExerciser.jar -address <cluster_address> -order w -files 20 -threads 6 -del
4) java -jar CenteraExerciser.jar -address <cluster_address> -order w -files 20 -threads 6 -size 4 -del
5) java -jar CenteraExerciser.jar -address <cluster_address> -order w -files 20 -threads 6 -size 4 -units MB -del
6)
Example#1
This example first creates one unique file of 1KB size on the client's
local hard drive. Then it writes that file to Centera using one thread.
Example#2
This example first creates twenty unique files of 1KB size each, on the
client's local hard drive. Then it writes those files to Centera using one
thread.
Example#3
This example first creates twenty unique files of 1KB size each, on the
client's local hard drive. Then it writes those files to Centera using six
threads.
Example#4
This example first creates twenty unique files of 4KB size each, on the
client's local hard drive. Then it writes those files to Centera using six
threads. Then it deletes the local files from the client's hard drive.
Example#5
This example first creates twenty unique files of 4MB size each, on
the client's local hard drive. Then it writes those files to Centera using
six threads. Then it deletes the local files from the client's hard drive.
Example#6
This example first creates twenty unique files of 4MB size each, on
the client's local hard drive. Then it writes those files to Centera using
six threads. Then it writes the resultant C-ClipIDs to a file by the
name of myClips.txt in the directory c:\temp\clipDir on the local
client machine. Then it deletes the local files from the client's hard
drive.
Example#7
This example first creates twenty unique files of 4MB size each, in the
client's memory. Then it writes those files to Centera using six threads.
Then it writes the resultant C-ClipIDs to a file by the name of
myClips.txt in the directory c:\temp\clipDir on the local client
machine. Because the memfile boolean switch is used, there are no
local files to delete.
If the myClips.txt file already exists in the directory, the new
C-ClipIDs are appended to it.
3.2.2
NOTE:
The quotation marks around the value of the order switch in the above examples
are not required for the provided examples.
However, quotation marks are required to be around the value of the order switch
on most platforms when a user attaches local parameters to any of the requested
operations. For this reason, they are introduced here.
Local parameters are discussed in section 5.
Example#1
This example first creates one unique file of 1KB size on the client's
local hard drive. Then it writes that file to Centera using one thread.
Then it reads that same file back from Centera and stores it on the
client's local hard drive in a directory named retrieve.
The retrieve directory is located as a subdirectory in either the
directory the tool is being operated from, or under the directory that
is pointed to by the value of the store switch.
Example#2
This example first creates twenty unique files of 1KB size each, on the
client's local hard drive. Then it writes those files to Centera using
one thread. Then it reads those same files back from Centera, also
using one thread, and stores them on the client's local hard drive in a
directory named retrieve.
The retrieve directory is located as a subdirectory in either the
directory the tool is being operated from, or under the directory that
is pointed to by the value of the store switch.
Example#3
This example first creates twenty unique files of 1KB size each, on the
client's local hard drive. Then it writes those files to Centera using six
threads. Then it reads those same files back from Centera, also using
six threads, and stores them on the client's local hard drive in a
directory named retrieve. Then it deletes all the local files, including
the retrieved ones in the retrieve directory, from the client.
Example#4
This example first creates twenty unique files of 1KB size each, on the
client's local hard drive. Then it writes those files to Centera using six
threads. Then it reads those same files back from Centera, also using
six threads, and stores them on the client's local hard drive in a
directory named retrieve. Then it saves the resultant C-ClipIDs of
the files that are written to a file on the client's hard drive by the
name of myClips.txt. Then it deletes all the local files, including the
retrieved ones in the retrieve directory, from the client.
Example#5
This example reads the files that are related to the C-ClipIDs that are
stored in the referenced clipfile using six threads. As the reads take
place, the read files are put in the retrieve directory.
Example#6
This example reads the files that are related to the C-ClipIDs that are
stored in the referenced clipfile using six threads. As the reads take
place, the tool compares the read files against the written files that
relate to the read C-ClipIDs, checking for corruption.
The read files are read and compared in memory and no files are
retrieved to the client's hard drive.
Example#7
This example first creates twenty unique files of 4MB size each, on
the client's local hard drive. Then it writes those files to Centera using
six threads. Then it reads those same files, and the files relating to the
C-ClipIDs contained in the referenced myClips.txt clipfile, back from
Centera, also using six threads, and stores them on the client's local
hard drive in a directory named retrieve. Then it saves the resultant
C-ClipIDs of the files that are written during this run to the already
existing myClips.txt clipfile (appending to it).
Logging/Statistics
4.1 LOGGING
A log file records all the passed-in parameter information along with the total
time (in milliseconds) it took for the test to finish, the throughput (in MBps), and
the total number of bytes transferred. It is created either in the current directory the
tool is running from, or in a directory of the user's choice by using the log switch.
This log file is appended to after each test, preserving the data from the previous tests.
The log file also records the FPLibrary error codes if errors are encountered
during a write or read.
Name of log files:
If the log switch is not used, the name of this log file will be CenteraExerciser.log,
and it will be located in the current directory.
If the log switch is used, the name and the location will be whatever is given in
the value of that switch.
The statistical results for the writes are stored in a file by the name of:
CenteraExerciserStats_Write_#_.xls
The statistical results for the reads are stored in a file by the name of:
CenteraExerciserStats_Read_#_.xls
The statistical results for the backed up (RawRead) data files are stored in a file by
the name of:
CenteraExerciserStats_RawRead_#_.xls
The statistical results for the purged data files are stored in a file by the name of:
CenteraExerciserStats_Purge_#_.xls
The statistical results for the deleted data files are stored in a file by the name of:
CenteraExerciserStats_Delete_#_.xls
The statistical results for the restored data files (RawOpen) from hard drive to
cluster, are stored in a file by the name of:
CenteraExerciserStats_RawOpen_#_.xls
NOTE: The # in the above names will be replaced by an actual number such as 0, 1,
2, 3, and so on, depending on whether files of the same name already exist in the directory.
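The numbering in the NOTE above can be sketched as follows. This is a minimal illustration, not the tool's actual code: it assumes the tool picks the lowest number not already used by a file of the same name, which this guide does not state explicitly. The function name next_stats_file and its parameters are hypothetical.

```python
def next_stats_file(existing_names, operation):
    """Return the first statistics file name that is not already taken.

    existing_names: names of the files already present in the directory.
    operation: the operation part of the name, e.g. "Write" or "Read".
    Assumption: the lowest unused number is chosen.
    """
    index = 0
    while f"CenteraExerciserStats_{operation}_{index}_.xls" in existing_names:
        index += 1
    return f"CenteraExerciserStats_{operation}_{index}_.xls"
```

With an empty directory the first write test would produce CenteraExerciserStats_Write_0_.xls under this assumption.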
The statistical files include the raw data that the tool collected during the operation.
The results are calculated by Excel formulas that the tool embeds into the
statistical files. There should be no questions as to what formulas
are being used to obtain the results, because the formulas can be seen in Excel.
4.2.1
Click No when/if this message appears. When the SummaryStats.xls file is
opened and the files it references have not been opened before it, #REF! is
displayed in the affected cells of the file:
Name of Test   Date & Time of log   # of Files   Total bytes
#REF!          #REF!                #REF!        #REF!
#REF!          #REF!                #REF!        #REF!
#REF!          #REF!                #REF!        #REF!
#REF!          #REF!                #REF!        #REF!
Local parameters
As explained in the terminology section 1.3, local parameters are a way the user is
able to specifically set options for individual types of operations.
NOTE:
Local parameters are sometimes referred to as special requests and this document
uses the two terms interchangeably.
The examples in section 3.2 above all use global parameters, which means that all
the parameters used pertain to all of the requested operations.
For example, in the following command line:
java -jar CenteraExerciser.jar address <IP> -order w,rd files 3 threads 2
The values assigned to the files and threads switches pertain to both the write
and read operations that are requested in the makeup of the value of the order
switch. In other words, both the write and read operations will work on 3 files
using 2 threads.
If a user wants to have the write operations write 10 files using 8 threads, and read
those files using only 2 threads, local parameters have to be used.
The following example shows what is described in the above paragraph:
java -jar CenteraExerciser.jar address <IP> -order "w(tm=8),rd(tm=2)" files 10
At this point in the document, you are not expected to understand the values
of the local parameters, only the way in which to attach them to an individual type
of operation.
To attach local parameters to an individual operation, opening and closing
parentheses are used immediately following the requested operation, and before
the comma that separates the next requested operation (if there is one).
The types of operations that are allowed to have local parameters attached to them
are:
1) w
2) rd
3) rrd
4) ro
5) d
6) p
Each type of operation has predefined local parameters that a user is able to attach
to that specific operation. There are some local parameters that can be used with
all the above mentioned operations, and others that can be used with a select few.
The next section lists the predefined local parameters and the operations they can
be attached to.
L-Value   Meaning          R-Value      What it does
ti        thread initial   An integer
tm        thread max       An integer
ri        ramp interval    An integer
rt        ramp time        An integer
The following table lists the predefined local parameters that can only be attached
to the write and read operations.
Local parameters that can be used only on write and read operations
Request/Parameter: <L-Value>=<R-Value>
mem
  Meaning: MemFile
  R-Value: NO VALUE (this is a boolean parameter)
oi
  Meaning: object interval
  R-Value: An integer
ot
  Meaning: object time
  R-Value: An integer
percent
  R-Value: An integer
The following table lists the predefined local parameters that can only be attached
to the write operations.
Request/Parameter
ret
  Meaning: Retention
  R-Value: An integer
  What it does: The R-Value assigned to this L-Value represents the number of
  seconds that need to pass before the files that are being written can be deleted.
embed
  Meaning: Embed blob in CDF
  R-Value: NO VALUE (this is a boolean parameter)
  What it does: Overrides the global embedded blob threshold and embeds the blob
  within the CDF for this particular write operation.
dir
  Meaning: Directory
  R-Value: A String
  What it does: The R-Value assigned to this L-Value represents a directory. The
  R-Value can be either just a directory name or a fully qualified path. If just a
  name is given, the tool will look in the current directory for the directory name
  given. The tool will write all the files contained in the directory given to Centera.
file
  Meaning: File
  R-Value: A String
  What it does: The R-Value assigned to this L-Value represents a file object. The
  file object can be either a directory or a single file. If the R-Value represents a
  directory, all the files, subdirectories, and all other files and subdirectories
  continuing all the way through the hierarchy, will be written to Centera. If the
  R-Value only represents a single file, that file will be written to Centera.
eb
  Meaning: Extended Blob
  R-Value: One of two String values: on OR off
ec
  Meaning: Extended Clip
  R-Value: One of two String values: on OR off
edd
  Meaning: Enable duplicate detection
  R-Value: The only allowable value for this is the String on
cl
  Meaning: Clip
  R-Value: An integer
  What it does: ... that the tool will use during the write operations.
tg
  Meaning: Tag
  R-Value: An integer
memdcr
  Meaning: MemFile don't check reads
  R-Value: NO VALUE (this is a boolean parameter)
os=
  Meaning: Offset
  R-Value: A positive long value, or zero to start at the beginning
len=
  Meaning: Length
  R-Value: A positive long value. If you wish to read to the end of the stream,
  do not provide this local param.
The template to follow when attaching more than one local parameter to the same
operation is:
<operation>(<local param>&<local param>&<local param>)
Remember to encapsulate the entire value of the order switch in quotation marks
when using local parameters.
Examples:
1) java -jar CenteraExerciser.jar address <IP> -order "w(dir=c:\temp)"
2) java -jar CenteraExerciser.jar address <IP> -order "w(dir=c:\temp&ret=180&embed)"
3) java -jar CenteraExerciser.jar address <IP> -order "w(tm=12),rd(tm=3)" files 50
4) java -jar CenteraExerciser.jar address <IP> -order "w(tm=12),rd" files 50 threads 30
5) java -jar CenteraExerciser.jar address <IP> -order "w(tm=12),rd(tm=4)" files 50 threads 30
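Example#1
This example writes all the files that are contained in the c:\temp
directory to Centera.
Example#2
This example writes all the files that are contained in the c:\temp
directory to Centera and sets a retention period of 3 minutes (180 seconds)
on all the files. It will also embed the blob data within the CDF.
Example#3
This example writes 50 files to Centera using 12 threads, and then
reads those files back from Centera using 3 threads.
Example#4
This example writes 50 files to Centera using 12 threads, and then
reads those files back from Centera using 30 threads.
Since the read operation has no local thread parameter attached to it,
the global thread parameter of 30 is used for the reads.
Example#5
This example writes 50 files to Centera using 12 threads, and then
reads those files back from Centera using 4 threads.
Since the read operation has a local parameter attached to it that tells
it to use 4 threads, the global thread parameter of 30 is not used.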
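The template for attaching local parameters can be illustrated with a small parser. This is only a sketch of the syntax described in this section, written in Python for illustration; it is not how the tool itself parses the order switch. The function names are hypothetical, and the s(#) and c(#) forms are not handled specially here.

```python
import re

# One operation token such as "w", "rd" or "w(tm=12&ret=180&embed)".
_OP_PATTERN = re.compile(r"^([A-Za-z]+)(?:\((.*)\))?$")


def split_operations(order_value):
    """Split the order value on commas that are outside parentheses."""
    parts, depth, current = [], 0, []
    for ch in order_value:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def parse_order(order_value):
    """Return a list of (operation, local_params) pairs."""
    if " " in order_value:
        raise ValueError("spaces are not allowed in the value of the order switch")
    operations = []
    for token in split_operations(order_value):
        match = _OP_PATTERN.match(token)
        if match is None:
            raise ValueError(f"malformed operation: {token!r}")
        name, raw_params = match.group(1), match.group(2)
        params = {}
        if raw_params:
            # Local parameters are separated by "&".
            for item in raw_params.split("&"):
                key, sep, value = item.partition("=")
                # A parameter without "=" (for example embed) is boolean.
                params[key] = value if sep else True
        operations.append((name, params))
    return operations
```

For instance, parse_order("w(tm=12),rd") yields a write operation carrying tm=12 and a read operation with no local parameters, so the read falls back to the global switches.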
HOW TO
If w(file=<fully qualified file object name>) is used, the tool writes the file object
that is pointed to by the portion <fully qualified file object name>.
A file object is either a directory containing one or more files and subdirectories, or
it is a single file.
Whether file= points to a single file or a whole directory, only one C-ClipID will
be returned. If it points to a whole directory, the single C-ClipID
can be referenced to retrieve all the files and subdirectories that were written.
If file=<fully qualified file object name> points to a directory, not only will all
the files in that directory be written, but also all of its subdirectories and
their files, and so on down through the hierarchy.
Example using w(file=<full path to directory>):
java -jar CenteraExerciser.jar address <IP address> -order "w(file=C:\temp\dirOfFiles)"
NOTE: Spaces are NOT allowed when specifying special requests. For that matter,
spaces are not allowed anywhere in the makeup of the value for the order switch.
Attaching local parameters to one operation does not affect how you continue to
use additional operational requests in the value makeup of the order switch. For
example: If a read and a purge are desired after writing a directory of files by the
name of myFilesToWrite, it would be done as follows:
java -jar CenteraExerciser.jar address <IP address> -order "w(dir=c:\myFilesToWrite),rd,p"
The following is a comparison example without the local parameters attached to
the write.
The following example writes, reads, and purges just like the above example, only
the tool automatically generates its own file to write instead of using the files in
the myFilesToWrite directory.
java -jar CenteraExerciser.jar address <IP address> -order w,rd,p
ec=on
Use
When this request is used, the clip MD5 will be stored using an
extended value that will uniquely identify this clip from all others. This
is used to help eliminate the possibility of clip collision.
ec=off
Do not use
When this request is used, the tool will not use the extended name
space scheme while writing the clips for the files it writes. In the
absence of any ec=<value>, the tool will automatically default to not
using the extended name space scheme for the clips.
eb=on
Use
When this request is used, the blob MD5 will be stored using an
extended value that will uniquely identify this blob from all others. This
is used to help eliminate the possibility of clip collision.
eb=off
Do not use
When this request is used, the tool will not use the extended name
space scheme while writing the blobs for the files it writes. In the
absence of any eb=<value>, the tool will automatically default to not
using the extended name space scheme for the blobs.
Examples:
1) java -jar CenteraExerciser.jar address <IP> -order w
2) java -jar CenteraExerciser.jar address <IP> -order "w(ec=off&eb=off)"
3) java -jar CenteraExerciser.jar address <IP> -order "w(ec=on)"
4) java -jar CenteraExerciser.jar address <IP> -order "w(ec=on&eb=off)"
5) java -jar CenteraExerciser.jar address <IP> -order "w(eb=on)"
6) java -jar CenteraExerciser.jar address <IP> -order "w(ec=off&eb=on)"
7) java -jar CenteraExerciser.jar address <IP> -order "w(ec=on&eb=on),w(eb=on)"
8) java -jar CenteraExerciser.jar address <IP> -order "w(ec=on&eb=on),w(ec=off&eb=on)"
Example#1
This example will write one file of 1KB size and not use CACA for
either the clip or the blob.
Example#2
This example will do the exact same thing as example #1. This
example only shows that a user can explicitly specify what the
defaults already are without any problems.
Example#3
This example will write one file of 1KB size and use CACA for the clip
only.
Example#4
This example will do the exact same thing as example #3. This
example only shows that a user can explicitly specify what the
defaults already are without any problems.
Example#5
This example will write one file of 1KB size and use CACA for the blob
only.
Example#6
This example will do the exact same thing as example #5. This
example only shows that a user can explicitly specify what the
defaults already are without any problems.
Example#7
This example will first write one file of 1KB size using CACA for both
the clip and the blob. Once the first write operation is finished, a
second write operation will be started. The second write operation will
write one 1KB file using CACA for the blob only.
Example#8
This example will do the exact same thing as example #7. This
example only shows that a user can explicitly specify what the
defaults already are without any problems.
NOTE:
The position of ec=<value> and eb=<value> for the local parameters is not
important.
For example:
java -jar CenteraExerciser.jar address <IP> -order "w(eb=on&ec=off)"
is exactly the same as:
java -jar CenteraExerciser.jar address <IP> -order "w(ec=off&eb=on)"
Example#2
This example shows two write operations. The first write operation
writes one file of 1K size to Centera using duplicate detection. The
second write operation writes one file of 1K size to Centera NOT using
duplicate detection. Please note that there is no special request
for edd= that explicitly allows the user to turn it off, such as edd=off.
If edd=on is not provided, the tool will not enable duplicate
detection. Not providing it is the same as turning it off.
NOTE:
The edd=on local parameter is not allowed to be used with the eb=on local
parameter. If the two are found to be included as local parameters for the same
write operation, the following error will be thrown, and the tool will exit:
ERROR: IllegalWriteRequest eb=on and edd=on cannot both be set for the same
write operation!
There will also be a stack trace accompanying this error.
c(#)
The next sections explain the differences between the two options above.
6.5.1
This example causes the tool to first create one data file, then write that data file
that is created, then read the data file that was written, then create a new data file,
then write that new data file that is created for the second write request. Note that
the second new data file that is created is created because of the w (write request)
and not the c (create request).
Example 3:
java -jar CenteraExerciser.jar address <IP> -order c,w,rd,c,w
This example causes the tool to first create one data file, then write that data file
that is created, then read the data file that was written, then create a different
data file, then write the different data file that is created. Note that the two data
files that are created in this example are because of the c (create requests) that
are present.
Example 4:
java -jar CenteraExerciser.jar address <IP> -order w,rd,c,w
This example accomplishes the same result as example 3. This example causes the
tool to first create one data file, then write that data file that is created, then read
the data file that was written, then create a different data file, then write the
different data file that is created.
If the tool encounters a letter w (write operation) and there is no letter c
anywhere preceding the write request, the tool will automatically create the data file(s)
before it writes.
6.5.2
java -jar CenteraExerciser.jar address <IP> -order c,w,rd,c(8),w
This example causes the tool to first create one data file, then write that data file
that is created, then read the data file that was written, then create 8 different
data files, then write the 8 different data files that are created. Notice the mixing of
c and c(#) although the first c is not required.
Example 4:
java -jar CenteraExerciser.jar address <IP> -order w,rd,c(4),w,c,w files 16
Pay special attention to this example. It may not work the way one would first
expect.
This example causes the tool to first create 16 data files, then write those data files
that are created, then read those data files that were written, then create 4
different data files, then write those 4 different data files that are created, then
create 16 more different data files, then write those data files. The reason for this
is explained in section 6.5.3 below.
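The behaviour shown in the examples above can be sketched as a small expansion routine. This is an illustration inferred only from those examples, not the tool's implementation: it assumes that c creates as many files as the global files switch says, that c(#) creates # files, and that a w with no create pending creates the global number of files first. The function name expand_order is hypothetical, and only the c, c(#), w, and rd requests are handled.

```python
import re


def expand_order(order_value, files=1):
    """Expand an order value into the (step, file_count) pairs it implies."""
    steps = []
    pending = None   # files created but not yet written
    written = None   # files handled by the most recent write
    for token in order_value.split(","):
        create = re.fullmatch(r"c(?:\((\d+)\))?", token)
        if create:
            # c uses the global files value, c(#) uses its own number.
            pending = int(create.group(1)) if create.group(1) else files
            steps.append(("create", pending))
        elif token == "w":
            if pending is None:
                # No create request precedes this write: create automatically.
                pending = files
                steps.append(("create", pending))
            steps.append(("write", pending))
            written, pending = pending, None
        elif token == "rd":
            steps.append(("read", written))
        else:
            raise ValueError(f"request not covered by this sketch: {token!r}")
    return steps
```

Calling expand_order("w,rd,c(4),w,c,w", files=16) reproduces the sequence described for Example 4: create 16, write 16, read 16, create 4, write 4, create 16, write 16.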
6.5.3
This example causes the tool to write a file, then sleep for 60 seconds, then read
the file that was written.
There are no restrictions on where the s(#) can be placed in the makeup of the
order switch.
What is needed?
Explanation
Self-explanatory
Self-explanatory
Number of threads to
increment by.
6.7.1 Ramping examples
A typical way for a user to control ramping for a particular type of operation is
to attach local parameters to the value that represents that operation in the
order switch of the tool.
The following example shows a command line that asks the tool to write 100 files,
but the tool will start writing these files using only 1 thread. Every 3 seconds the
tool allows 2 more threads to start writing files. This ramping continues until all 10
threads have passed the ramp control unless the threads that previously passed
finish the work before the remaining threads are allowed to pass, in which case the
remaining threads are allowed to exit the ramp control and die.
java -jar CenteraExerciser.jar address <IP> -order "w(ti=1&tm=10&ri=2&rt=3)" files 100 threads 10
The reader should focus on the following part of the above command line:
w(ti=1&tm=10&ri=2&rt=3)
There are local parameters attached to the write operation in the above command
line. These same local parameters are also allowed to be attached to the other
threading operations.
See section 5 for details on the local parameters used here.
The order of the above local parameters is not important and can be applied in any
order.
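The ramp described above can be worked through with a short calculation. This is a sketch of the schedule only, under the assumption that no thread finishes its work early; the function name ramp_schedule is hypothetical and the code is not taken from the tool.

```python
def ramp_schedule(ti, tm, ri, rt):
    """Return (seconds_elapsed, threads_allowed) pairs until tm is reached.

    ti: initial threads, tm: maximum threads,
    ri: threads added per step, rt: seconds between steps.
    """
    if ri <= 0 or rt <= 0:
        raise ValueError("ri and rt must be positive")
    if ti > tm:
        raise ValueError("ti must not be greater than tm")
    schedule = [(0, ti)]
    elapsed, running = 0, ti
    while running < tm:
        elapsed += rt
        # Never let more than tm threads pass the ramp control.
        running = min(running + ri, tm)
        schedule.append((elapsed, running))
    return schedule
```

For w(ti=1&tm=10&ri=2&rt=3) this gives 1 thread at 0 seconds, then 3, 5, 7, and 9 threads at 3, 6, 9, and 12 seconds, and all 10 threads at 15 seconds.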
6.7.2
Since no local parameters exist in the CenteraExerciser tool that allow the user to
control the number of files being worked on, the global number is always used (files 100 in this example).
6.7.3
ti
  Required?: NO
  Default:
    1 if this request, the tm request, and the threads switch are not present.
    The value of the tm request if a tm request is present and this request is
    not present.
    The value of the global threads switch if both this request and the tm
    request are not present and the threads switch is present.
  Will throw error:
    If the value for this request is > the value that is set for the tm request.
    If this request is present, and the local max request is not present, and the
    global request is present, and the value for this request is > the global value.
tm
  Required?: NO
  Default:
    1 if this request, the ti request, and the threads switch are not present.
    The value of the ti request if this request is not present and the ti request
    is present.
    The value of the global threads switch if both this request and the ti
    request are not present and the threads switch is present.
  Will throw error:
    If the value for this request is < the value that is set for the ti request.
ri
  Required?: NO
  Default: N/A
  Will throw error:
    If this request is present and the rt request is not present.
rt
  Required?: NO
  Default: N/A
  Will throw error:
    If this request is present and the ri request is not present.
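The default and error rules in the table above can be restated as a small function. This is a sketch that follows the table as written; the function name resolve_threads and its parameters are hypothetical, and the error messages are illustrative rather than the tool's actual output.

```python
def resolve_threads(ti=None, tm=None, ri=None, rt=None, threads=None):
    """Return the effective (ti, tm) pair, or raise ValueError.

    ti, tm, ri, rt are the local requests; threads is the global switch.
    None means the request or switch is not present.
    """
    # ri and rt are only valid together.
    if (ri is None) != (rt is None):
        raise ValueError("ri and rt must both be present or both be absent")
    # ti present, tm absent, global present: ti may not exceed the global value.
    if ti is not None and tm is None and threads is not None and ti > threads:
        raise ValueError("ti is greater than the global threads value")
    if ti is None and tm is None:
        # Fall back to the global threads switch, or to 1 if it is absent.
        ti = tm = threads if threads is not None else 1
    elif ti is None:
        ti = tm
    elif tm is None:
        tm = ti
    if ti > tm:
        raise ValueError("ti is greater than tm")
    return ti, tm
```

For example, resolve_threads(tm=12, threads=30) returns (12, 12): the local tm request takes precedence over the global threads switch, and ti defaults to the tm value.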
6.8.1
The following is an example that shows how to use the local parameters shown in
the table above. This example only shows the write operation. The read operation is
done the exact same way. Also, both the write and read operations can have these
local parameters attached in the same test.
The example, shown above, writes 2 objects every 3 seconds to Centera using 6
threads. The write operation finishes once all 100 files are written.
Because the default units are being used in this example, the files are
all 1 KB in size. Since all the files are <= 100 MB, each file will have 2 blobs associated
with it (1 for the clip, and 1 for the file data). In this example the tool can ensure
that the user's request of 2 objects will be sent to the API call that writes the data
every 3 seconds. The tool cannot guarantee that the SDK will complete the transfer
of the objects to Centera in the allotted time.
In this next example the user asks the tool to write 3 objects every 2 seconds. This
is a good example that illustrates how the tool calculates and figures out when it
allows objects to be sent to the API call that will start writing them.
java -jar CenteraExerciser.jar address <IP> -order "w(oi=3&ot=2)" files 100 threads 6
Again, the file sizes of the files in the above example are all 1 KB. Because of this
fact, there are 2 objects per every 1 file (1 clip, 1 data object). So for every thread
that writes a file, it will be writing 2 objects. The tool makes no attempt to extract
the objects from the files, and then dole them out (3 objects per thread in this
case) to be written. This would be a very tedious process and would result in using
many more system resources than necessary.
What the tool does instead of extracting the objects from the files is to allow
threads to pass the object control point, keeping track of how many objects are in
each thread, until the total number of objects that pass are >= the amount of
objects that are allowed to pass in the given time interval. It then calculates how
much extra time it needs to add to the original time interval before more threads
are allowed to pass.
For the above example, 2 threads will pass the control point and be allowed to start
the write process. In other words, 4 objects are allowed to be sent to the API call
within the first 2 second time interval. The tool will calculate how many extra
objects passed the point (1 in this example), figure out the time interval for each
object (0.5 in this example), and add the extra time interval to the original user
provided time interval. This new time interval is used and must expire before the
tool will allow anymore files to be written.
Calculations based on above example:
3 objects per 2 seconds must be maintained
4 objects are allowed to pass within the first 2 seconds.
2/4 = 0.5 seconds per object.
1 extra object was allowed to pass so 0.5 seconds is added to the original interval
of 2 seconds and that is the amount of time the tool will wait before letting more
files be written.
The end result for this example is that the tool is allowing 4 objects to be written
every 2.5 seconds instead of 3 objects every 2 seconds.
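The calculation above can be reproduced with a few lines of code. This is a sketch that follows the arithmetic exactly as this section presents it, assuming every file carries the same number of objects (2 in the example); the function name adjusted_interval is hypothetical and the code is not taken from the tool.

```python
import math


def adjusted_interval(oi, ot, objects_per_file=2):
    """Return (objects_passed, seconds_to_wait) for one control interval.

    oi: objects requested per interval, ot: interval length in seconds.
    """
    # Whole threads (files) pass the control point until at least oi objects passed.
    threads_passed = math.ceil(oi / objects_per_file)
    objects_passed = threads_passed * objects_per_file
    extra_objects = objects_passed - oi
    # Time per object, as computed in the text: interval / objects that passed.
    seconds_per_object = ot / objects_passed
    return objects_passed, ot + extra_objects * seconds_per_object
```

With oi=3 and ot=2 this returns 4 objects and a wait of 2.5 seconds, matching the result stated above. With oi=2 and ot=3 no extra objects pass, so the interval stays at 3 seconds.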
6.9 How to control the percentage of files that are written and
read to and from Centera
The CenteraExerciser tool has the ability to control the percent of writes and reads
to and from Centera.
This control may not be precise. The tool does not guarantee that the
percentages the user requests for the operations will be met exactly, but the
result should be close.
In order to use this functionality of the tool, the following rules must be adhered
to:
1) The tool must be asked to operate in asynchronous mode (using the asynch
switch).
2) The total percent between the write operation and the read operation must be
100.
3) The user must set a specific amount of time for the tool to operate by using the
operatefor switch correctly.
4) Delete operations are not allowed and therefore must not be requested.
5) Object control (section 6.8 above) is not allowed to be applied to the same
operations.
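The five rules above can be summarized as a checklist in code. This is only an illustrative sketch: the function name, its arguments, and the messages are hypothetical, and it assumes the percentage is supplied through a local parameter named percent, which this guide's parameter table suggests but does not confirm.

```python
def check_percent_request(asynch, operatefor, operations):
    """Return a list of rule violations (empty if the request is acceptable).

    asynch: True if asynchronous mode was requested.
    operatefor: the operatefor value (e.g. "10h"), or None if not given.
    operations: dict mapping an operation (e.g. "w", "rd", "d") to its
                local parameters, e.g. {"w": {"percent": "20"}}.
    """
    problems = []
    if not asynch:
        problems.append("asynchronous mode is required")
    # Missing operations count as 0 percent.
    total = sum(int(operations.get(op, {}).get("percent", 0)) for op in ("w", "rd"))
    if total != 100:
        problems.append("write and read percentages must total 100")
    if not operatefor:
        problems.append("operatefor must be set")
    if "d" in operations:
        problems.append("delete operations are not allowed")
    for op in ("w", "rd"):
        params = operations.get(op, {})
        if "percent" in params and ("oi" in params or "ot" in params):
            problems.append(f"object control cannot be combined with percent on {op}")
    return problems
```

A request with only a write operation at 60 percent would be reported as violating rule 2, which is the situation described in Example#3 below.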
6.9.1
Example#1
This example will concurrently write and read to and from Centera for
a period of ten hours.
The tool will attempt to make sure that during the ten hour time
frame, 20% of the operations (between the writes and reads) will be
writes, and 80% will be reads.
The write operation will use 12 threads, and the read operation will
use 25 different threads.
Example#2
Example#3
This example will produce a command line error before the test can
start because there is no read operation requested, and the total
percentage requested of the write operation is less than 100.
Example#4
This example fails for the same reason explained in example#3 above
except it is the write operation that is missing.
Example#5
The actual percentages achieved between the write and read operation is posted in
the log file.
6.10
The above command line not only writes 10 files to Centera, but it also writes 10
Clips to Centera. This is not a new concept introduced by the CenteraExerciser
tool, it is how the SDK fundamentally works. In order to write a file, or object, to
Centera, it must be written in terms of a Clip. (If you do not understand why this
is, you may want to familiarize yourself with how Centera works before continuing
with this section.)
To sum up this default functionality of the tool:
Every file requested to be written to Centera is written using a separate Clip, thus
causing each Clip written to Centera from the tool to contain only one (1) user tag.
This default functionality can be changed by adding special requests, or local
parameters (see section 5), to the write operation.
The following table lists the special requests, or local parameters, required to
control how many user tags are written per clip.
Local parameter (L-Value)   Meaning   R-Value      Default R-Value   What it does
cl                          clip      An integer
tg                          tag       An integer
Examples:
1) java -jar CenteraExerciser.jar address <IP> -order "w(cl=5&tg=7)"
2) java -jar CenteraExerciser.jar address <IP> -order "w(cl=5&tg=7)" files 4
3) java -jar CenteraExerciser.jar address <IP> -order "w(tg=7)"
4) java -jar CenteraExerciser.jar address <IP> -asynch "w(cl=5&tg=7)" operatefor 1h
5) java -jar CenteraExerciser.jar address <IP> -asynch "w(tg=7)" operatefor 1h
Example#1
Example#2
Example#3
Example#4
Example#5
6.11
6.11.1
The reason for this is that in order for the read operations to do work on a file,
that file must first be written. It would also not be good if the tool deleted a file
before it was written or read, so the delete operation waits for a file that has had work
done on it by a higher level operation before it deletes that file.
This is not to say that all of one type of operation will finish before the other
operations start.
What happens in the above example is that the tool puts all the files into a queue. The
write operations take files from the queue and write them to Centera. When a file
has finished being written, the C-ClipID of that file is put into the queue for another
operation to use (in this case the read operation). Once a C-ClipID becomes
available to read, it will get read. The write operations could very well still be going
on (using other files) while the read (and any other) operations are in progress.
When the read operation is finished with its particular C-ClipID, that ID will be
marked for deletion and the delete operation will delete it. All this continues until
all the files have been worked on by all the operations.
This is why the finish order of the operations can always be expected to be an exact
order. A user should be able to look at a command line that requests an
asynchronous operation and logically conclude which operation will finish first,
second, third, etc., even though they are all operating asynchronously.
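The queue hand-off described above can be sketched with standard producer and consumer threads. This is a minimal illustration of the pattern, not the tool's implementation: no data is sent to Centera, each stage simply records that it handled an item, and the function name run_pipeline and its parameters are hypothetical.

```python
import queue
import threading


def run_pipeline(file_names, writers=2, readers=2, deleters=1):
    """Pass every file through write, read, and delete stages concurrently.

    Returns the (operation, file) events in the order they were recorded.
    """
    to_write, to_read, to_delete = queue.Queue(), queue.Queue(), queue.Queue()
    events = []
    lock = threading.Lock()
    stop = object()  # sentinel that tells a worker thread to exit

    def worker(operation, source, sink):
        while True:
            item = source.get()
            if item is stop:
                break
            with lock:
                events.append((operation, item))
            # Hand the finished item to the next operation, if there is one.
            if sink is not None:
                sink.put(item)

    for name in file_names:
        to_write.put(name)

    stages = [
        ("write", to_write, to_read, writers),
        ("read", to_read, to_delete, readers),
        ("delete", to_delete, None, deleters),
    ]
    started = []
    for operation, source, sink, count in stages:
        threads = [
            threading.Thread(target=worker, args=(operation, source, sink))
            for _ in range(count)
        ]
        for thread in threads:
            thread.start()
        started.append((source, threads))

    # Shut the stages down in order: a stage is stopped only after the
    # previous stage has finished feeding its queue.
    for source, threads in started:
        for _ in threads:
            source.put(stop)
        for thread in threads:
            thread.join()
    return events
```

In the returned events, reads of some files can appear before writes of other files, but for any single file the write always precedes the read, and the read always precedes the delete.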
6.11.2
6.11.2.1
The above example #2 writes then reads the same 50 files. The write process writes
the files using 12 threads (tm=12), and the read process reads the files using 5
threads (tm=5). In this example, all the 50 files will first be written before the read
operations start.
It is important to realize that all the threaded operations have the ability to be
individually controlled like the above examples. See section 3.2 under the -order
switch for a list of the types of operations that can be multi threaded.
6.11.2.2
The above example writes 300 files to Centera while at the same time performing
reads from Centera. This example also saves the IDs of the written clips to a
ClipFile named myClips.txt. The myClips.txt file is populated with ClipIDs each
time the tool finishes writing 100 files, and again when the total number of
files has been reached.
6.12
6.13 How to have the tool operate for a user specified amount of time
The tool does not guarantee that all operations will end at the exact moment the
specified time expires. The tool does guarantee that at least the specified time
will elapse before the operations end. This is because the tool will not abruptly
interrupt a single operation that has already started.
To explain the above paragraph further, let's look at an example.
Let's assume that we tell the tool to continue to write and read 20 files using 14
threads for 1 hour. Let's further assume that our request is ordered and not
asynchronous. The following command line shows this request:
java -jar CenteraExerciser.jar -address <IP> -order w,rd -files 20 -threads 14 -operatefor 1h
As you can see above, the operatefor switch and its value of 1h (one hour) tell
the tool to run the operations continually for one hour. As stated earlier, the
value for this switch can be in seconds (s), minutes (m), or hours (h). The
numeric portion of the value can be any positive integer, and there must not be a
space between the integer and the time units.
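The value rules above (a positive integer immediately followed by s, m, or h) can be checked with a short sketch such as the following. The function name parse_operatefor is hypothetical, and treating the unit letters as lower case only is an assumption; the tool's own parser may behave differently.

```python
import re

# Seconds per unit, using the three unit letters named in this guide.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_operatefor(value):
    """Convert a value such as '1h' or '90s' into a number of seconds."""
    match = re.fullmatch(r"([0-9]+)([smh])", value)
    if match is None:
        raise ValueError("expected <positive integer><s|m|h> with no space: %r" % value)
    amount = int(match.group(1))
    if amount == 0:
        raise ValueError("the numeric portion must be a positive integer")
    return amount * _UNIT_SECONDS[match.group(2)]
```

For example, parse_operatefor("1h") yields 3600 seconds, while "1 h" is rejected because of the space.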
Let's fast forward to just before the end of the hour given in the above example.
Let's say that there are 0.5 seconds left before the one hour expires. Let us
also assume that the tool is performing the write operations at this point in time,
and the 7th thread grabs the 10th file, of the 20 the tool needs to write this round,
and starts the BlobWrite() on this file.
Now let's say that the remaining 0.5 seconds pass before another write thread
can grab a file and start the BlobWrite() process. The remainder of the write
threads (6 of them) will die because the time is up. The thread that started the
BlobWrite() process will continue to write the file until finished, and then it will die
(this could be some time afterwards, depending on how long the write takes). All the
read threads will die immediately and never read any of the previous 10 files that
were written, because the time has expired.
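The "at least the specified time" guarantee follows from checking the deadline only between operations, never during one. The sketch below shows that pattern with a fake clock so that the outcome is deterministic. The names FakeClock and operate_for are made up for this illustration and are not part of the tool.

```python
class FakeClock:
    """Deterministic stand-in for a wall clock (an assumption of this sketch)."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


def operate_for(duration_s, operation, clock):
    """Repeat `operation` until `duration_s` has elapsed on `clock`.

    The deadline is checked only before an operation starts, so an operation
    that is already in flight always runs to completion.
    """
    deadline = clock.time() + duration_s
    completed = 0
    while clock.time() < deadline:
        operation()  # never interrupted, even if the deadline passes meanwhile
        completed += 1
    return completed, clock.time()
```

With a 10 second limit and an operation that takes 3 seconds, the loop starts operations at 0, 3, 6, and 9 seconds, so four operations complete and the run ends at 12 seconds, which is after the limit but never before it.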
As the above paragraphs imply, the operations of the above example will loop
continually for the one hour. The following is the cycle for the above example:
1) Create 20 files
2) Write 20 files using 14 threads
3) Read the 20 files that were written in step 2 above using 14 threads
4) Record the log information for both the writes and the reads
5) Record the statistical information for both the writes and reads
6) Start over again at step 1 until time expires.
The point is that, depending on the order given, more of one operation may be
performed than another (for example, more writes than reads or deletes).
The log file will list the total number of files that are operated on for each
operation.
On the other hand, if the asynch switch is used instead of the order switch in the
above example, both the write and read operations will operate concurrently for the
user specified time.
The following is the cycle when the asynch switch is used in place of the order
switch in the above example.
1) Create 50 files. (The files switch is ignored)
2) Write those 50 files until either all the files are written or the time expires. If
all 50 files have been written and the time has not expired, go back to step 1.
3) At the same time step 2 is going on, keep reading the files that are ready to be
read.
4) Stop when time expires.
6.13.1
#############################################
#               START OF TEST               #
#############################################
Creating files...
Source dir : c:\temp\trash\
c:\temp\trash\retrieve
: 371
: 10240
: 0.0263
Deleting clips...
Delete time (ms)         : 1102
Number of clips Deleted  : 10
Creating files...
Source dir               : c:\temp\trash\
Parameter summary:
Version of tool   : BETA_2.0.158
Test name         : not given
Test date & time  : 04-16-06-50-12-552
---------- Read params ----------
Total files read    : 10
Max read threads    : 2
Init read threads   : 2
Read ramp interval  : 0
Read ramp time      : 0
1) Two statistical write files, one for each time the write process finished.
2) One statistical read file. (Only one read process happened in the above example.)
3) One statistical delete file. (Only one delete process happened in the above
example.)
6.14
6.15 How to control the sizes of the files written to Centera (incremental
and/or random)
6.15.1
Using the above example, one hundred files will be created with the following sizes
(in KB, because those are the units used in the example):
100, 103, 106, 109, ..., 400
Notes:
1) Regardless of the increment size, the min value will always be the first file size
created, and max value will always be the last file size created as long as the
number of files is >= 2.
2) Regardless of the increment size, the total number of files created never
exceeds the user specified file number.
3) If a min and max are given and the number of files is 1, then only the min file
size is used.
4) If in write mode, the files that are created are written to Centera in the order
created. In other words starting at the min working up to the max. For obvious
reasons, when using multiple threads the tool only guarantees that the created
files are stored in the write queue in the order created, but because of the
nature of threads they may not be written in the exact order.
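The notes above can be illustrated with a small sketch. How the tool actually derives the increment is not stated in this section, so the step used here (the whole-number result of (max - min) / (files - 1)) is purely an assumption chosen because it reproduces the 100, 103, 106, 109, ..., 400 sequence of the example. The function name incremental_sizes is hypothetical.

```python
def incremental_sizes(min_size, max_size, count):
    """Return `count` file sizes that start at min_size and end at max_size."""
    if count < 1 or min_size > max_size:
        raise ValueError("need count >= 1 and min_size <= max_size")
    if count == 1:
        return [min_size]  # note 3: only the min size is used
    # Assumed increment: integer step that spreads the files across the range.
    step = (max_size - min_size) // (count - 1)
    sizes = [min_size + i * step for i in range(count - 1)]
    sizes.append(max_size)  # note 1: the max is always the last size created
    return sizes
```

Calling incremental_sizes(100, 400, 100) returns exactly 100 sizes, beginning with 100, 103, 106, 109 and ending with 400.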
EXAMPLE OF COMMAND LINE USE:
java -jar CenteraExerciser.jar -address <IP> -files 100 -size 100KB,400KB
In the above command line, the switch size controls the sizes of the unique files
that are created by the tool.
In the above example, the value to the size switch is 100KB,400KB.
Let's take a closer look at this value.
The 100KB portion of the value is the minimum file size to be created. This
minimum portion is followed by the maximum file size to be created, 400KB.
The min and max values are separated by a comma (,). No spaces are allowed
anywhere in the value.
When using this incremental file size functionality, the general format of the
size switch value is as follows:
-size <min file size #><units for min size>,<max file size #><units for max size>
The minimum file size must always precede the maximum file size and be less
than or equal to the maximum file size.
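As a rough illustration of the format rules above, the following sketch validates a size value and converts it to bytes. The function name parse_size_range is hypothetical. Only the KB unit appears in this section, so the unit table (B, KB, MB, GB with 1 KB = 1024 bytes) is an assumption and may not match the units the tool really accepts.

```python
import re

# Assumed unit table; only "KB" is confirmed by the examples in this guide.
_UNIT_BYTES = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_size_range(value):
    """Parse '<min><units>,<max><units>' into (min_bytes, max_bytes)."""
    match = re.fullmatch(r"([0-9]+)([A-Z]+),([0-9]+)([A-Z]+)", value)
    if match is None:
        raise ValueError("expected <min><units>,<max><units> with no spaces: %r" % value)
    min_num, min_unit, max_num, max_unit = match.groups()
    if min_unit not in _UNIT_BYTES or max_unit not in _UNIT_BYTES:
        raise ValueError("unknown size unit in %r" % value)
    min_bytes = int(min_num) * _UNIT_BYTES[min_unit]
    max_bytes = int(max_num) * _UNIT_BYTES[max_unit]
    if min_bytes > max_bytes:
        raise ValueError("the minimum size must not exceed the maximum size")
    return min_bytes, max_bytes
```

Under these assumptions, "100KB,400KB" parses to (102400, 409600), while "100KB, 400KB" (with a space) and "400KB,100KB" (min greater than max) are both rejected.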
6.15.2
size switch, the exact same random files will be created every time. Note that the
randomness is dependent not only on the seed, but also on the min and max file
sizes.
NOTE:
The underscore and seed portion of the random is not required. If it is not
provided, the tool will use the current time in milliseconds as the seed for the test
and publish this seed in the log file so it can be used again if needed.
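The seed behaviour described above (the same seed reproduces the same sizes, and the seed falls back to the current time in milliseconds when omitted) can be sketched as follows. The function name random_sizes is hypothetical, and Python's random generator is used here only for illustration; it will not produce the same numbers as the tool does for a given seed.

```python
import random
import time


def random_sizes(min_size, max_size, count, seed=None):
    """Return (seed, sizes); the same seed and range always give the same sizes."""
    if seed is None:
        # No seed given: use the current time in milliseconds, as the guide describes.
        seed = int(time.time() * 1000)
    rng = random.Random(seed)
    sizes = [rng.randint(min_size, max_size) for _ in range(count)]
    return seed, sizes  # the seed is returned so that it can be logged and reused
```

Returning the seed alongside the sizes mirrors the tool publishing the seed in the log file, so a run that used a time-based seed can still be repeated later.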
6.15.3
6.16
6.16.1
1) The first and last lines of any ClipFile must each be a whole number. These
numbers are the times, in milliseconds since midnight Jan 1, 1970, at
which the first and last C-ClipIDs contained within the ClipFile were
written to Centera.
2) All other lines contain the C-ClipIDs. (One per line)
As long as you maintain the required format of the ClipFile, you can insert any
valid C-ClipID into it, and the CenteraExerciser tool can be used to Read, Delete,
Purge, and/or Query the Centera using the C-ClipIDs contained in the ClipFile.
EASY TIP:
If the ClipFile is not going to be used as an input for a query operation, the first
and last lines of the ClipFile can be any whole numbers.
For Example:
Let's say you have a set of non-indigenous C-ClipIDs and you want to perform
any of the operations that the CenteraExerciser tool can do except query. All you
need to do to create a ClipFile is the following:
1) Open a blank document in a text editor.
2) On the first line of the blank document, put the number 1.
3) On the second line of the document put the number 2.
4) Put all your C-ClipIDs on individual lines between the number 1 and 2 that
you put into the document in steps two and three above.
5) Save your document.
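The manual steps above can also be scripted. The sketch below builds and checks ClipFile text in the format described in this section (a whole number on the first and last lines, one C-ClipID per line in between). The function names and the placeholder IDs used in the usage note are made up for illustration, and the default values 1 and 2 follow the easy tip, so a file built this way is not suitable as input for a query operation.

```python
import re


def build_clipfile_text(clip_ids, first_ms=1, last_ms=2):
    """Return ClipFile text: a number, one C-ClipID per line, then a number."""
    lines = [str(first_ms)] + list(clip_ids) + [str(last_ms)]
    return "\n".join(lines) + "\n"


def parse_clipfile_text(text):
    """Return (first_ms, clip_ids, last_ms) or raise ValueError on a bad format."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) < 2:
        raise ValueError("a ClipFile needs at least a first and a last line")
    if not (re.fullmatch(r"[0-9]+", lines[0]) and re.fullmatch(r"[0-9]+", lines[-1])):
        raise ValueError("the first and last lines must be whole numbers")
    return int(lines[0]), lines[1:-1], int(lines[-1])
```

For instance, build_clipfile_text(["ID_ONE", "ID_TWO"]) produces four lines: 1, ID_ONE, ID_TWO, and 2. The result can be saved with any file name and passed to the tool as a ClipFile.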
6.16.2
6.16.3
If the above command lines are run in order, the results are as follows.
Example#1
This example will write 1000 1KB files to Centera, saving the
resultant C-ClipIDs to a file on the hard drive (in the current
directory) by the name of myClips.txt
Example#2
This example will write 500 1KB files to Centera, saving the resultant
C-ClipIDs to the already existing ClipFile that was created in
example#1. The 500 C-ClipIDs from this command line are
appended to the existing ClipFile.
If the user does not want the 500 C-ClipIDs from this command line to
be appended to the existing ClipFile, then a different name needs to
be given as the value of the saveclipsas switch.
Example#3
This example will read the files referred to by the C-ClipIDs contained
in the myClips.txt ClipFile, and then query for those C-ClipIDs.
Example#4
This example will write 15 files to Centera and save the resultant
C-ClipIDs of these 15 files to the already existing myClips.txt ClipFile
(created in example#1). The C-ClipIDs of these 15 files are appended to
the already existing ClipFile.
The example will then read all the files associated with all the
C-ClipIDs (the ones that were written in examples 1, 2, and 4).