StorNext® 4.0 User Guide
6-01658-09 Rev A, March 2010, Product of USA.
Quantum Corporation provides this publication “as is” without warranty of any kind, either express or
implied, including but not limited to the implied warranties of merchantability or fitness for a particular
purpose. Quantum Corporation may revise this publication from time to time without notice.
COPYRIGHT STATEMENT
Copyright 2010 by Quantum Corporation. All rights reserved.
Your right to copy this manual is limited by copyright law. Making copies or adaptations without prior
written authorization of Quantum Corporation is prohibited by law and constitutes a punishable
violation of the law.
TRADEMARK STATEMENT
Quantum, the Quantum logo, DLT, DLTtape, the DLTtape logo, Scalar, and StorNext are registered
trademarks of Quantum Corporation, registered in the U.S. and other countries.
Backup. Recovery. Archive. It’s What We Do., the DLT logo, DLTSage, DXi, DXi-Series, Dynamic
Powerdown, FastSense, FlexLink, GoVault, MediaShield, Optyon, Pocket-sized. Well-armored, SDLT,
SiteCare, SmartVerify, StorageCare, Super DLTtape, SuperLoader, and Vision are trademarks of Quantum.
LTO and Ultrium are trademarks of HP, IBM, and Quantum in the U.S. and other countries. All other
trademarks are the property of their respective companies.
Specifications are subject to change without notice.
Chapter 1 Introduction 1
About StorNext File System 1
About StorNext Storage Manager 2
About Distributed LAN Clients 2
About Licensing 3
Purpose of This Guide 3
How This Guide is Organized 3
Notes, Cautions, and Warnings 4
Document Conventions 5
About StorNext File System
StorNext File System streamlines processes and facilitates faster job completion by enabling multiple business applications to work from a single, consolidated data set. Using SNFS, applications running on different operating systems (Windows, Linux, Solaris, HP-UX, AIX, and Mac OS X) can simultaneously access and modify files on a common, high-speed SAN storage pool.
This centralized storage solution eliminates slow LAN-based file transfers between workstations and dramatically reduces delays caused by single-server failures. In high availability (HA) configurations, a redundant server is available to access files, pick up the processing requirements of a failed system, and carry on processing.
About StorNext Storage Manager
StorNext Storage Manager enhances the StorNext solution by reducing the cost of long-term data retention, without sacrificing accessibility. SNSM sits on top of SNFS and utilizes intelligent data movers to transparently locate data on multiple tiers of storage. This enables customers to store more files at a lower cost, without having to reconfigure applications to retrieve data from disparate locations. Instead, applications continue to access files normally and SNSM automatically handles data access, regardless of where the file resides.
As data movement occurs, SNSM also performs a variety of data protection services to guarantee that data is safeguarded both on site and off site.
About Distributed LAN Clients
StorNext supports distributed LAN clients. Unlike a traditional StorNext SAN client, a distributed LAN client does not connect directly to StorNext via fibre channel or iSCSI, but rather across a LAN through a gateway system called a distributed LAN server. The distributed LAN server is itself a directly connected StorNext client, but it processes requests from distributed LAN clients in addition to running applications.
Any number of distributed LAN clients can connect to multiple distributed LAN servers. StorNext File System supports distributed LAN client environments in excess of 1000 clients.
Besides the obvious cost-savings benefit of using distributed LAN clients, you may also see performance improvements.
Distributed LAN clients must be licensed in the same way as StorNext
SAN clients. When you request your permanent StorNext license, you
will need to specify the number of distributed LAN clients you plan to
use. Naturally, you can always purchase additional distributed LAN client
licenses as your needs expand. For more information about StorNext
licensing, see Step 2: License on page 19.
StorNext provides distributed LAN client information via the status
monitors on the StorNext home page. More detailed information is
available through the Clients Report and the Distributed LAN Client
Performance Report. For more information about StorNext reports, see
StorNext Reports on page 201.
Before you can fully use distributed LAN clients, you must first configure
a distributed LAN server and distributed LAN clients as described in the
StorNext Installation Guide.
About Licensing
Beginning with StorNext 4.0, licensing has changed significantly compared to previous releases. Multiple licenses are now required for various StorNext features, as well as to perform an upgrade to a new release.
If you have not already installed StorNext 4.0 (or upgraded from a previous release), be sure to read the procedure in the section Step 2: License on page 19 before you proceed.
Document Conventions
This guide uses the following document conventions to help you
recognize different types of information.
This section describes how to access and navigate through the StorNext
GUI.
This chapter includes the following topics:
• Accessing the StorNext GUI
• The StorNext Home Page
2 In the browser’s Address field, type the full address of the machine
and its port number, and then press Enter. For example: http://
<machine name>:<port number>. Use the name of the machine and
port number you copied when you installed the StorNext software.
Note: Typically, the port number is 81. If port 81 is in use, use the next unused port number (for example, 82, 83, and so on).
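If you are not sure whether port 81 is already in use on the server, one generic way to check (a sketch that assumes a Linux metadata controller with netstat installed; the port number shown is an example) is to run the following command on the server and look for a LISTEN entry:
netstat -an | grep ':81 '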
After you enter the machine name and port number, the following
window appears:
6 On this screen you can determine if the StorNext File System and
Storage Manager components are currently started. If not, click
Start for each component to start them.
7 Click the home (house) icon in the upper right corner to go to the
StorNext Home Page.
Note: When you log into StorNext for the first time, you might see a
message warning you about a security certificate. Refer to the
Quantum Knowledge Base for a permanent workaround to this
issue. For a temporary solution, create a certificate exception
that will allow you to log into StorNext without seeing the
warning message during subsequent logins.
StorNext Monitors
The StorNext Home Page displays the following status and capacity monitors, which are used to show the current state of the StorNext system:
• File Systems Capacity Monitor
• Libraries Capacity Monitor
• Storage Disks Capacity Monitor
• Tape Drive Status
• Policy Capacity Monitor
Use these monitors to view current statistics of managed or unmanaged
file systems and configured libraries and/or drives, including file system,
library, and drive information. Each of the status monitors provides an
at-a-glance view of the total number of components (file systems,
libraries, storage disks, or tape drives) and the current state of the file
system: green for normal, yellow for warning, and red for error.
The information shown in the monitors is refreshed periodically. You can
specify the refresh rate by choosing the desired interval from the
Refresh Rate list:
• No Refresh
• 30 seconds
• 1 minute
• 2 minutes
• 5 minutes
• 10 minutes
• The number of store candidates, which are files selected for storage
to secondary media.
• The number of files that have been stored and meet the criteria to
become a truncation candidate.
• Current status (Error, Warning or Normal)
Note: The home page status and capacity monitors are intended to give you an approximate at-a-glance view of all the file systems, libraries, storage disks, etc. on your system.
StorNext Home Page Dropdown Menus
The dropdown menu options located in the bar at the top of every page allow you to access StorNext setup, tools, service, and reporting options.
The StorNext home page contains these dropdown menus and menu options:
Step 1: Welcome
The first screen in the Configuration Wizard is the Welcome screen. This
screen shows disks and libraries that are currently available for StorNext
usage. As you add new disks and libraries, the information on this
screen is updated.
If desired, you can manually update the screen by clicking Refresh.
When you are ready to proceed to the next step, click Next in the left
column.
Step 2: License
Use the Enter License wizard to enter license strings for the StorNext
products you have purchased. You must have a license to configure or
use StorNext products or features.
If the license.dat file does not contain permanent licenses, StorNext
produces an auto-generated license with an expiration date for all
StorNext products and features except Deduplication. In some cases
Quantum may provide evaluation licenses for features. Evaluation
licenses also have a fixed expiration date.
Beyond the evaluation period, you must have a permanent license to
configure or use StorNext features.
Here is a list of StorNext licenses:
• File System: A File System license enables you to create and modify
StorNext-supported file systems.
• LAN Client: You must have a Distributed LAN Client license for each
LAN client you use with StorNext (in addition to any SAN clients).
2 Read the license carefully, and then click Accept. The Setup >
License > Entry screen appears.
License Expiration and Limits
Each StorNext feature license has a license expiration date (shown in the Expires column) and a limit (shown in the Limit column).
Following is an explanation of the limit for each feature as it pertains to
licensing:
• File System: The number displayed is the maximum number of SAN
clients allowed
• LAN Clients: The number displayed is the maximum number of LAN
clients allowed
• Storage Manager: The number displayed is the licensed capacity
for files being managed by the SNSM software.
Note: The capacity for storage disks does not include “dead
space” but does include all data on the file system where
the storage disk has been configured, not just files copied
to the file system by the Storage Manager. To maximize the
licensed Storage Manager capacity, the storage disk file
systems should be restricted to Storage Manager data only.
If your storage disk contains ‘user’ data you should
consider moving that data to an alternate location prior to
performing a StorNext upgrade.
• Replication: Unlimited
Updating Licenses
You will need to update a license if the license expires or if your configuration changes (for example, if you add additional clients or increase capacity).
Licensing and Upgrades
The new licensing implementation affects StorNext upgrades, both for release 4.0 and future releases. Be aware of the following upgrade-related implications and plan accordingly:
• A non-expired Maintenance license is required to perform a
StorNext upgrade. This means you must contact Quantum Technical
Support for a Maintenance license before you can upgrade to
StorNext 4.0.
• The Maintenance license provided by Quantum Technical Support
must be put into place prior to the upgrade, or you will not be
allowed to proceed with the upgrade. This step is done by manually
editing the license.dat file because the StorNext GUI has not
been installed and therefore you cannot enter licenses through the
GUI. This applies to upgrades to StorNext 4.0 and also to future
upgrades.
• For future upgrades, for any StorNext feature or component
currently in use, you must have a license in place prior to the
upgrade. Otherwise, you will not be allowed to proceed with the
upgrade.
• For future upgrades, after an upgrade you will still be allowed to
run StorNext if the Maintenance license expires. However, no future
upgrades will be allowed.
• The Maintenance license must remain in place even after expiration
to allow the StorNext software to run, because it indicates which
version of the software was purchased.
• If you are ready to upgrade and then notice that the Storage
Manager capacity has been exceeded, you can follow the procedure
below to free up capacity to bring it under the licensed value. These
steps will clean up “dead space” on tape media, and do not apply to
storage disks.
2 To add a new name server, enter the IP address in the field to the left
of the Add button. The new name server appears in the list of
available name servers.
3 When a message informs you that Name Servers has been updated,
click OK to continue.
4 If there are previously configured name servers, you can specify the
order in which name servers are used. To set the order, select a
server and then click Move Up or Move Down until the selected
server is in the correct order.
A green check mark icon under the Enabled column heading indicates
that the server is currently enabled as a name server. A red X icon
indicates that the server is not currently enabled.
Deleting a Name Server
To delete a name server, select the name server you want to delete and then click Delete. Finalize the deletion by clicking Apply.
the Setup menu.) The Setup > File System screen displays all
currently configured file systems. (If you are running the
Configuration Wizard for the first time, there will be no existing file
systems.)
From this screen you can view, add, edit, or delete a file system. For
information on these procedures, see the online help.
2 Click New to add a new file system. The Setup > File System > New
Screen appears.
3 Click New. The Setup > Storage Destinations > Library > New Screen
appears.
4 Enter the fields at the top of the screen. (For detailed information
about what to enter on this screen, see the online help.)
5 In the Drives section, select a tape drive to add to your new library,
or click Scan to have StorNext discover available drives for you.
6 In the Media section, view media available for use. (Click Scan to
have StorNext discover available media for you.)
7 Click Apply.
8 After a message informs you that the library was successfully
created, click OK.
9 Repeat steps 3 - 8 to add additional tape drives and media to the
new library.
Viewing an Existing Library
Follow this procedure to view details for a previously created library:
1 Choose Storage Destinations from the Setup menu. If necessary, click the Library tab. The Setup > Storage Destinations > Library screen appears. (See Figure 11.)
2 Select the library whose information you want to view.
3 Click View, or choose View from the actions dropdown list. The
library detail screen appears.
Editing a Library
Follow this procedure to edit parameters for an existing library:
1 If you have not already done so, choose Storage Destinations from
the Setup menu and then click the Library tab.
2 Select the library you want to edit.
3 Click Edit, or choose Edit from the actions dropdown list. After you
select this option StorNext scans the library, which could take some
time to complete depending on your configuration.
4 If desired, click Scan to have StorNext scan the library for drives.
5 Select the tape drives you want included in your library, or click All
to include all available tape drives. (To exclude all drives, click None.)
6 Click Apply.
7 When a confirmation message appears, click Yes to proceed, or No
to abort.
8 After a message informs you that the library was successfully
modified, click OK.
Performing Other Library Actions
Towards the middle of the Setup > Storage Destinations > Library screen is a dropdown list of actions you can perform for libraries.
Select the library for which you want to perform the action, and then
choose one of these options from the Select Action dropdown list:
• Audit: Select this option to perform an audit on the selected library.
An audit is a physical check of each library component to verify its
integrity and make sure the database and library are synchronized.
Quantum recommends that you audit the library after each restore.
• Remap-Audit: Select this option to perform an audit on the
selected library with a physical inventory of the library. This option
synchronizes the StorNext databases with the library databases.
• Online: Select this option to set the library online. No additional
actions are required.
• Offline: Select this option to take the library offline. No additional
actions are required.
• Drives Online: Select this option to place the drives in the library
online. No additional actions are required.
• Drives Offline: Select this option to take the drives in the library
offline. No additional actions are required.
• Add Media Bulkload: Select this option to add media to the library
via the bulk loading method.
• Add Media Mailbox: Select this option to add media to the library
through the library's mailbox.
Storage Disk Overview
Storage disks are external devices on UNIX-based file systems that can be used for long-term data storage. Storage disks function and operate the same way as physical tape media.
When a storage disk is configured, the StorNext Storage Manager
moves data to storage disks for long-term retention in addition to, or
instead of tape. This enables users to leverage the specialized third-party
functionality of appliances or store small files that might take longer to
retrieve from tape. Many users will still use tape for long-term storage
and vaulting, but storage disk can be used to create tape-free archives.
Here are a few ways in which storage disks differ from tape media, aside from the obvious cost-saving benefit:
• A storage disk either belongs to no policy class, or belongs to a
single policy class
• A storage disk can store file copies only with the same copy ID.
Note: Before you create a storage disk, the disks you plan to use must
reside in an existing, mounted file system.
Adding a New Storage Disk
Follow this procedure to add a new storage disk.
1 Click the Storage Disk tab. The Setup > Storage Destinations > Storage Disk Screen appears.
2 Click New. The Storage Destinations > Storage Disk > New
Screen appears.
3 Enter the fields on the screen. (For detailed information about what
to enter on this screen, see the online help.)
4 Click Apply.
5 Repeat steps 2 - 4 to add additional storage disks.
Viewing Existing Storage Disks
Follow this procedure to view a list of previously configured storage disks.
1 Choose Storage Destinations from the Setup menu.
2 Click the Storage Disk tab. Any previously configured storage disks
are displayed.
Editing a Storage Disk
Follow this procedure to edit a currently configured storage disk.
1 If you have not already done so, choose Storage Destinations from
the Setup menu and then click the Storage Disk tab.
2 Select the storage disk whose information you want to edit.
3 Click Edit.
4 Modify any of the fields you entered when creating the storage disk.
(For field information, see the online help or the descriptions in
Adding a New Storage Disk on page 39.)
5 Click Apply.
6 When a confirmation message appears, click Yes to proceed, or No
to abort.
7 After a message informs you that the storage disk was successfully
modified, click OK.
Deleting a Storage Disk
Follow this procedure to delete a currently configured storage disk.
1 If you have not already done so, choose Storage Destinations from
the Setup menu and then click the Storage Disk tab.
2 Select the storage disk you want to delete.
3 Click Delete.
4 When a confirmation message appears, click Yes to proceed with
the deletion or No to abort.
5 After a message informs you that the storage disk was successfully
deleted, click OK.
Adding a New Data Replication Host
Follow this procedure to add a new replication target host.
1 Click the Replication Targets tab. The Setup > Storage Destinations > Replication / Deduplication Screen appears.
Editing a Data Replication Host
Follow this procedure to edit an existing data replication target.
1 If you have not already done so, click the Replication Targets tab.
2 If necessary, click the plus sign (+) beside the Replication Targets
heading in the box titled Replication Target Configuration.
3 Select the replication target you want to edit.
4 Click Edit.
Adding a New Mount Point
Follow this procedure to add a new mount point to a replication target.
1 If you have not already done so, choose Storage Destinations on the left side of the screen. (Alternatively, choose Storage Destinations from the Setup menu.)
2 Click the Replication Targets tab.
3 Select the replication target (host) to which you would like to add a
mount point. (You might need to click the dash to the left of the
Replication Targets heading to display the available hosts.)
4 Click Add Mount Point.
5 Click Scan Host to identify available mount points on the selected
host.
6 At the Mount Point field, select a mount point and then click Add.
7 Repeat steps 3 - 6 to add additional mount points.
8 Click Apply to save the changes.
Enabling Data Deduplication
The Deduplication tab enables you to create a blockpool on a specified file system.
To create the blockpool, select the desired file system from the
dropdown list next to the Blockpool File System label, and then click
Apply.
Note: The blockpool should not be placed on a file system that will be
used as the HA shared file system. This is a requirement even if
you do not plan to use the StorNext Deduplication feature.
Adding a Storage Manager Storage Policy
Follow this procedure to add a new Storage Manager storage policy:
1 When the Configuration Wizard is displayed, choose Storage Policy on the left side of the screen. (Alternatively, choose Storage Policy from the Setup menu.) The Setup > Storage Policy Screen appears.
For instructions on what to enter on this screen, see the online help.
Adding a Replication Storage Policy
The steps for creating a replication storage policy are described in Step 4: Create a Replication Storage Policy on page 131.
Viewing a Storage Policy
To view storage policy details for a Storage Manager or Replication policy, do the following:
1 From the Setup > Storage Policy screen, select the storage policy
you wish to view.
2 Click View.
If you are editing a Storage Manager policy, you can edit fields on
the General, Relocation, Steering, Schedule and Associated
Directories tabs. For more information about fields on these tabs,
see the online help.
If you are editing a Replication global policy, you can edit fields on
Deduplication, Truncation, Outbound Replication, Inbound
Replication, and Blackout tabs. If you are editing a Replication
target policy, you can modify fields on only the Inbound Replication
tab. For more information about fields on these tabs, see the online
help.
4 Click Apply to save changes and return to the Setup > Storage
Policy screen, or Cancel to abort.
Note: The Email Server option does not configure your email server.
Instead, it allows you to specify a previously configured email
server so StorNext knows which server is responsible for
processing notification messages. Before you use the Email
Server option, make sure your email SMTP server is already
configured.
Adding an Email Server
Follow this procedure to add a new email server.
1 When the Configuration Wizard is displayed, choose Email on the
left side of the screen. (Alternatively, choose Email Server from the
Setup menu.) The Setup > Email Server Screen appears.
Note: In order for this feature to work properly, make sure you have
specified a configured email server as described in Adding an
Email Server on page 55.
3 Click New. The Setup > Email Notifications > New screen
appears.
4 Complete the fields for the new email recipient. (For detailed
information about what to enter on this screen, see the online
help.)
5 Click Apply to save your changes.
6 When the confirmation message appears, click Yes to proceed or No
to abort.
7 When a message informs you that the email notification recipient
was successfully added, click OK to return to the Setup > Email
Notifications screen.
Viewing Email Recipient Information
Follow this procedure to view details for an existing email recipient.
1 If you have not already done so, when the Configuration Wizard is displayed, choose Email Notifications on the left side of the screen. (Alternatively, choose Email Notifications from the Setup menu.)
2 On the Setup > Email Notifications screen, review the list of
current email recipients.
3 Select the recipient whose information you want to view, and then
click View.
Editing an Email Recipient
Follow this procedure to edit information for a previously entered email recipient.
1 If you have not already done so, when the Configuration Wizard is displayed, choose Email Notifications on the left side of the screen. (Alternatively, choose Email Notifications from the Setup menu.)
2 On the Setup > Email Notifications screen, select the recipient
whose information you want to edit and then click Edit.
3 Modify any of the fields on the screen. (For detailed information
about what to enter on this screen, see the online help.)
4 When you are finished making modifications, click Apply to save
your changes and return to the Setup > Email Notifications
screen. (To exit without saving, click Cancel.)
Deleting an Email Recipient
Follow this procedure to delete a previously entered email recipient.
1 If you have not already done so, when the Configuration Wizard is displayed, choose Email Notifications on the left side of the screen. (Alternatively, choose Email Notifications from the Setup menu.)
2 On the Setup > Email Notifications screen, review the list of
current email recipients.
3 Select the recipient you want to delete and then click Delete.
4 When the confirmation message appears, click Yes to proceed or No
to abort the deletion.
5 When a message informs you that the email notification recipient
was successfully deleted, click OK to return to the Setup > Email
Notifications screen.
Step 9: Done
The last step in the Configuration Wizard is to click Done to indicate
that you have completed all configuration steps.
On this screen you can also convert to a high availability (HA)
configuration by clicking Convert to HA. Clicking this button is the
same as choosing HA > Convert from the Tools menu. For information
about entering the fields on this screen and converting to an HA system,
see Converting to HA on page 193.
In addition to the basic file system tasks described for the Configuration
Wizard in Step 4: File System on page 29, the Tools > File System menu
contains additional options that enable you to perform the following
file system-related tasks:
• Label Disks: Apply EFI or VTOC label names for disk devices in your
StorNext libraries
• Check File System: Run a check on StorNext file systems prior to
expanding or migrating the file system
• Affinities: Allocate additional storage to a file system by creating a
new stripe group in the file system configuration file, and assigning
new disks to the stripe group
• Migrate Data: Move data files from a source file system to a
destination stripe group, freeing stripe groups so they can be
removed from an existing StorNext file system
• Truncation Parameters: Enter truncation parameters for your file
systems in order to free up file storage that isn’t being actively used
Label Disks
Each drive used by StorNext must be labeled. (A new drive must be
labeled only one time.) You can label a drive from any StorNext server or
client that has a fibre channel (FC) connection to the drive.
There are two types of label:
• EFI labels are required if you plan to create LUNs that are larger than
2TB. (For Solaris, EFI labels are also required for LUNs with a raw
capacity greater than 1TB.) EFI labels will not work with the IRIX
operating system.
• VTOC labels were used for all operating systems in previous
StorNext and Xsan releases, and are still required for the SGI IRIX
operating system, Solaris releases prior to Solaris 10 Update 2, and
LUNs less than 1TB.
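Although the procedures in this section use the GUI, you can also inspect existing disk labels from the command line with the cvlabel utility. As a minimal sketch (verify the options against the cvlabel man page for your release), the following command lists each disk device visible to the host along with its current label:
cvlabel -l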
Labeling a Device
Follow this procedure to label any new or unused devices, or relabel a device that has been unlabeled.
1 Choose Label Disk from the Tools > File System menu. The Tools > Label Disks screen appears.
2 Select the disk devices to which you want to apply labels. (Click All
to select all available disks.) If a disk device already has a label,
continuing with this procedure overwrites the existing label.
Note: If you later unlabel a device and then decide to make the
unlabeled device usable by the StorNext File System, you must
first relabel the device. The relabeling process is identical to
labeling initially.
Unlabeling a Device
Follow this procedure to remove a label from a previously labeled device. If you unlabel a device and then decide later to make the unlabeled device usable by the StorNext File System, you must first relabel the device. The relabeling process is identical to labeling initially as described in Labeling a Device.
Note: You cannot remove the label from a disk device that has been
previously assigned to a file system. You can identify these
devices by the file system name under the Filesystem heading.
1 If you have not already done so, choose Label Disk from the Tools
> File System menu.
2 Select the disk devices from which you want to remove labels. (Click
All to select all available disks.)
3 Click Unlabel.
4 When the confirmation message appears, click OK to verify that you
want to unlabel the selected disk(s). (Click Cancel to abort without
unlabelling the disk.)
Caution: When you unlabel a device, all data on that device will be
lost. Additionally, the unlabeled device will no longer be
used by the file system until it is relabeled.
1 Choose Check File System from the Tools > File System menu. The
Tools > Check > [file system name] screen appears.
Note: If the file system you select is currently started and mounted,
the check will be automatically performed in read-only mode.
In read-only mode on a live file system (started and mounted),
you could receive false errors.
Viewing and Deleting a Check Report
After you have run at least one file system check, information about the process appears at the bottom of the screen: the file system name, the time the check was initiated and completed, and the status of the check. To view details about a specific check, select the desired check at the bottom of the screen and then click Report. When you are finished viewing the report, click Done to return to the previous screen.
To delete a check report from the list, select the check you want to
delete and then click Delete. To delete all previously run checks listed,
click Delete All.
File System Check Output Files
If you do not want to use StorNext to view output from the file system check, you can view output in two files:
• /usr/cvfs/data/<fsname>/trace/cvfsck-<timestamp>
For example: /usr/cvfs/data/snfs1/trace/cvfsck-02_22_2010-12_15_19
• /usr/adic/gui/logs/jobs/CHECK_FS-<timestamp>-<jobid>
For example: /usr/adic/gui/logs/jobs/CHECK_FS-20100222_121519-77
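If you prefer the command line, a minimal sketch for displaying the most recent check report (assuming a file system named snfs1, as in the example above) is:
cat $(ls -t /usr/cvfs/data/snfs1/trace/cvfsck-* | head -1)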
Affinities
This section describes StorNext’s “stripe group affinity” feature, and also
provides some common use cases.
A stripe group is a collection of LUNs (typically disks or arrays) across
which data is striped. Each stripe group also has a number of associated
attributes, including affinity and exclusivity.
An affinity is used to steer the allocation of a file’s data onto a set of
stripe groups. Affinities are referenced by their name, which may be up
to eight characters long. An affinity may be assigned to a set of stripe
groups, representing a named pool of space, and to a file or directory,
representing the logical point in the file system and directing the
storage to use the designated pool.
Exclusivity means a stripe group has both an affinity and the exclusive
attribute, and can have its space allocated only by files with that affinity.
Files without a matching affinity cannot allocate space from an exclusive
stripe group. Files with an affinity that is exclusive cannot be stored on
other stripe groups without that affinity. If the exclusive stripe group(s)
become filled, no more files with that affinity can be stored.
Affinities for stripe groups are defined in the file system configuration
file. Although stripe groups can be created by adding one or more
Affinity lines to the configuration file’s StripeGroup section, Quantum
recommends using the StorNext GUI to add stripe groups. A stripe
group may have multiple affinities, and an affinity may be assigned to
multiple stripe groups.
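For illustration only, an Affinity entry in a legacy-style stripe group definition might look like the following sketch. The stripe group name, disk name, and affinity shown are placeholders, the exact syntax varies by StorNext release, and Quantum recommends making these changes through the GUI rather than by hand:
[StripeGroup AudioPool]
Status UP
Exclusive Yes
Affinity AUDIO
Read Enabled
Write Enabled
Node CvfsDisk2 0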
Allocation Strategy
StorNext has multiple allocation strategies which can be set at the file system level. These strategies control where a new file's first blocks will be allocated. Affinities modify this behavior in two ways:
• A file with an affinity will be allocated only on a stripe group with
matching affinity.
• A stripe group with an affinity and the exclusive attribute will be
used only for allocations by files with matching affinity.
Once a file has been created, StorNext attempts to keep all of its data on
the same stripe group. If there is no more space on that stripe group,
data may be allocated from another stripe group. If the file has an
affinity, only stripe groups with that affinity will be considered; if all
stripe groups with that affinity are full, new space may not be allocated
for the file, even if other stripe groups are available.
Example Use Cases
Affinities can be used to segregate audio and video files onto their own stripe groups. For example:
• Create one or more stripe groups with an AUDIO affinity and the
exclusive attribute.
• Create one or more stripe groups with a VIDEO affinity and the
exclusive attribute.
• Create one or more stripe groups with no affinity (for non-audio,
non-video files).
• Create a directory for audio using ‘cvmkdir -k AUDIO audio’.
• Create a directory for video using ‘cvmkdir -k VIDEO video’.
Files created within the audio directory will reside only on the AUDIO
stripe group. (If this stripe group fills, no more audio files can be
created.)
Files created within the video directory will reside only on the VIDEO
stripe group. (If this stripe group fills, no more video files can be
created.)
To reserve high-speed disk for critical files:
• Create a stripe group with a FAST affinity and the exclusive
attribute.
• Label the critical files or directories with the FAST affinity.
The disadvantage here is that the critical files are restricted to only using
the fast disk. If the fast disk fills up, the files will not have space
allocated on slow disks.
To reserve high-speed disk for critical files, but allow them to grow onto
slow disks:
• Create a stripe group with a FAST affinity and the exclusive
attribute.
• Create all of the critical files, preallocating at least one block of
space, with the FAST affinity. (Or move them using snfsdefrag, after
ensuring they are non-empty.)
• Remove the FAST affinity from the critical files.
Because files will allocate from their existing stripe group, even if they
no longer have a matching affinity, the critical files will continue to
grow on the FAST stripe group. Once this stripe group is full, they can
allocate space from other stripe groups, since they do not have an
affinity.
This will not work if critical files may be created later, unless there is a
process to move them to the FAST stripe group, or an affinity is set on
the critical files by inheritance but removed after their first allocation (to
allow them to grow onto non-FAST groups).
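As a hedged command-line sketch of the workflow above (the directory, file system path, and file names are placeholders):
• Create a directory that inherits the FAST affinity: cvmkdir -k FAST /stornext/snfs1/critical
• Copy the existing, non-empty critical files into that directory so their first blocks are allocated on the FAST stripe group: cp /data/critical/* /stornext/snfs1/critical/
• Remove the FAST affinity from those files, for example with snfsdefrag as noted above (see the snfsdefrag man page for the exact options in your release).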
4 At the File System field, select the file system to which you want to
associate the new affinity.
5 Click Apply to create the affinity.
6 When a message notifies you that the affinity was successfully
created, click OK to continue.
Migrate Data
Migrating file system data refers to moving data files from a file
system’s source stripe group to all the other stripe groups on the same
file system, and then freeing the source stripe group so it can be
removed from the file system. You can select the source stripe group
only, not the destination stripe group(s). Files will be moved randomly
to new stripe groups while respecting their affinity rules (if any). When
migrating, make sure the source stripe group is completely empty when
the process completes, because source files that are updated while the
file system is running may be left behind, requiring a second iteration of
the migration.
During file system migration, you indicate a file system from which to
move data.
StorNext then moves all data of the same type (either data or metadata) from the source file system to the specified destination stripe group. During the move the file system remains online and read/write operations occur normally.
The time it takes to complete the migration process depends on the
amount of data being moved between source file system and target
stripe groups. When moving a data stripe group, the file system
continues to run during the move. StorNext does not block any new
read/write requests, or block updates to existing files on the source file
system. All operations (including metadata operations) are handled
normally, but no new writes are allowed to the source stripe group,
which will be marked read-only.
Use the following procedure to perform file system migration.
1 Choose Migrate Data from the Tools > File System menu. The
Tools > File System > Migrate screen appears.
2 Select the target file system from which files will be migrated.
3 Select the destination stripe group to which files will be migrated.
4 Click Migrate.
Truncation Parameters
The Truncation Parameters screen enables you to view or change the
following information pertinent to the truncation feature as it pertains
to StorNext Storage Manager:
• Run: Indicates the current status of the truncation feature: Online or Offline.
• Mount: Indicates whether the file system is currently mounted.
• File System: Displays the name of the truncation-enabled file
system.
• Mount Point: Shows the mount point for the truncation-enabled
file system
• Truncation Parameters: Shows the current truncation setting, such
as Time-based 75%.
Figure 39 Truncation Parameters Screen
The Tools > Storage Manager menu contains options that enable you
to perform the following Storage Manager-related tasks:
• Storage Components: View your system's libraries, storage disks,
and tape drives, and place those devices online or offline
• Drive Pool: View, add, edit, or delete drive pools (groups of tape
drives allocated for various administrator-defined storage tasks)
• Media Actions: Perform various actions on the storage media in
your library
• Library Operator Interface: The StorNext Library Operator Interface
allows you to perform media-related actions remotely from the
library
• Software Requests: View current software requests in progress or
cancel a request
• Scheduler: Schedule tasks to run automatically based on a specified
schedule
• Alternate Retrieval Location: Specify a remote retrieval location to
use in situations where files stored on tape or a storage disk cannot
be accessed.
• Distributed Data Mover (DDM): Spread the distribution of data
across several machines rather than the primary server.
Storage Components
The Tools menu's Storage Components option enables you to view your
system's libraries, storage disks, and tape drives, and place those devices
online or offline. The Tools > Storage Manager > Storage
Components screen is divided into three sections corresponding to
libraries, storage disks and tape drives.
To access the Tools > Storage Manager > Storage Components
screen, choose Storage Components from the Tools > Storage
Manager menu.
Figure 40 Storage Components Screen
Setting Devices Online and Offline
The process for setting devices online or offline is identical regardless of device type. Select the library, storage disk or tape drive you want to place online or offline. You can select multiple devices in each category, or select all available devices in each category by clicking All. After you are satisfied with your selections, click either Online to place selected devices online, or Offline to take selected devices offline.
Additional Options for Tape Drives
There are four additional options available for tape drives:
• Dismount Delay: This option enables you to specify the time, in seconds, that a tape drive remains idle before the media in that drive is dismounted. Select the tape drives for which you want the delay, enter the desired time interval at the Dismount Delay field, and then click Dismount Delay.
• Enable Compression: Compression is a feature supported by some
tape drives which maximizes the amount of available storage space.
To enable compression, select the tape drives for which you want to
enable compression and then click Enable Compression.
• Disable Compression: If compression was previously enabled and
you want to disable it, select the tape drives for which you want to
disable compression and then click Disable Compression.
• Clean: This option allows you to request that a drive be cleaned.
Before choosing this option, make sure the tape drive is loaded with
a cleaning cartridge. When you are ready to proceed, click Clean.
Drive Pool
Drive pools are groups of tape drives allocated for various administrator-
defined storage tasks, and enable you to delimit storage processes
based on data type, performance, security, location, or all of these
variables. Drive pools can reside in a single tape library or span multiple
tape libraries.
Viewing Drive Pool Information
Follow this procedure to view drive pool information.
1 Choose Drive Pool from the Tools menu. The Drive Pool screen appears.
2 Select the drive pool whose information you want to see, and then
click View.
3 The following information appears:
• Serial Number: The serial numbers of all tape drives in the drive
pool
• Drive Alias: The corresponding alias number for each drive
• Media Type: The type of tape drive media for each drive (e.g.,
LTO)
• Library: The name of the library to which each drive belongs
• Pool Name: The name of the drive pool to which each drive
belongs
4 When you are finished viewing drive pool information, click Done.
1 If you have not already done so, choose Drive Pool from the Tools
menu.
2 Click New to add a new drive pool. The Drive Pool > New screen
appears.
6 After a message informs you that the drive pool was successfully
created, click OK to continue.
1 If you have not already done so, choose Drive Pool from the Tools
menu.
2 Select the drive pool you want to modify, and then click Edit.
3 Select or deselect available drives for the drive pool. (You cannot
change the drive pool name.)
4 Click Apply.
5 When the confirmation message appears, click Yes to proceed or No
to abort.
6 After a message informs you that the drive pool was successfully
modified, click OK to continue.
Deleting a Drive Pool
Follow this procedure to delete a drive pool. Before you begin, you must first remove all drives in the pool you want to delete.
1 If you have not already done so, choose Drive Pool from the Tools
menu.
2 Select the drive pool you want to delete, and then click Delete.
3 When a confirmation message appears, click Yes to proceed with
the deletion or No to abort.
4 After a message informs you that the drive pool was successfully
deleted, click OK.
Media Actions
The Tools menu's Media Actions option enables you to perform various
actions on the storage media in your library.
To access the Tools > Storage Manager > Media Actions screen,
choose Media Actions from the Tools > Storage Manager menu.
Viewing Media Information
After you choose the Media Actions option, the following information about all of the storage media appears:
• Media ID: The unique identifier for the media.
• Library: The name of the library in which the media resides.
• Media Type and Class: The media type and class of media. (For
example, LTO, F0_LTO_DATA)
• Policy Class: The name of the policy class, if any, associated with the media.
Filtering Media
Most Media Actions screens contain a filtering feature that allows you to restrict the available media to those whose media ID contains the string you specify. Follow these steps to filter media:
1 At the Media ID Filter field, enter the string you want all available
media IDs to include.
2 Click Set Filter.
3 Click Refresh to update the list of available media. Only media
whose IDs contain the string you entered will be shown.
4 To reset the filter string, click Clear Filter. If desired, repeat steps 1 -
3 to use a new filter string.
Performing Media Actions
At the top of the screen is a dropdown list of actions you can perform for selected media. Select the media for which you want to perform the action, and then choose one of these options from the Available Actions list:
Mount Media
Select this option to mount the storage media.
1 After you select this option, select from the Library dropdown list
the library containing the media you want to mount.
2 Select the media to mount.
3 At the Mount Media Parameters > Drive field, select the drive on
which to mount the media.
4 Click Apply.
Dismount Media
Select this option to dismount previously mounted media.
1 After you select this option, a list of mounted media appears.
2 Select the media you want to dismount, and then click Apply.
3 When the confirmation message appears, click Yes to dismount the
media, or No to abort.
Move Media
Select this option to move media from one library to another.
1 After you select this option, select from the Library dropdown list
the library containing the media you want to move.
2 Select one or more media to move, or click All to select all media.
3 At the Move Media Parameters > Destination Library field, select
the destination library to which you want to move the selected
media.
4 Click Apply.
5 When the confirmation message appears, click Yes to move the
selected media, or No to abort.
Remove Media
Select this option to remove media from the StorNext Storage Manager.
Only media with no active files can be selected for removal.
The media is removed from the system and is physically ejected from the
library.
1 After you select this option, select from the Library dropdown list
the library containing the media you want to remove.
2 Select one or more media to remove, or click All to select all media.
3 Click Apply.
4 When the confirmation message appears, click Yes to remove the
selected media, or No to abort.
Purge Media
Select this option to purge media from the StorNext Storage Manager.
All files are removed from the selected media, and then the media is
removed from the StorNext Storage Manager and is physically ejected
from the library.
1 After you select this option, select from the Library dropdown list
the library containing the media you want to purge.
2 Select one or more media to purge, or click All to select all media.
3 Click Apply.
4 When the confirmation message appears, click Yes to purge the
selected media, or No to abort.
Reclassify Media
Select this option to change the media type classification for selected
media.
1 After you select this option, select from the Media Class dropdown
list the current media class designation you want to change.
2 Select one or more media to reclassify, or click All to select all
media.
3 At the Reclassify Media Parameters > Destination Media Class
field, select the new media type designation for the selected media.
Select one of these options:
• DATA: This media class means that media are candidates for
read/write operations. Most media residing in the library have
this classification unless they are full.
• ADDBLANK: This is the default class with which media are
associated when they are added to StorNext MSM. (Running the
fsmedin command pulls media from this class and changes the
classification to DATA.)
• IMPORT: Before running the fsmedin command on TSM-
exported media, the classification should be changed to
IMPORT.
• CHECKIN: This classification is used for re-entering media
which have been checked out. Media must be reclassified with
Transcribe Media
Transcribe (copy) the contents of one media type to another media type,
or reclaim (defragment) media. During the transcription or reclamation
process, StorNext uses two drives to transcribe one media to another
media, file by file.
Media Attributes
Select this option to view the attributes currently assigned to your
media, or to change attributes.
1 If desired, filter the displayed list of media by selecting one or more
of the following media attribute filters: Suspect, Marked, Full,
Unavailable, or Write Protected. The list refreshes each time you
select a media attribute filter.
"Suspect" means the media might not be physically sound, and
could be in a potentially damaged or unusable condition.
"Marked" means the media should be made inaccessible.
"Full" means the media has reached capacity and should not be
available for further writing.
"Unavailable" means the media is not available for writing or
reading.
"Write Protected" means the media is protected against further
writing and cannot be overwritten or have data added.
2 Select from the list one or more media whose attributes you want to
change, or click All to select all media.
3 At the Media Attributes Parameters > New Media State field,
select the new attribute you want to apply to the selected media.
4 Click Apply.
5 When the confirmation message appears, click Yes to apply the new media state to the selected media, or No to abort.
6 Repeat steps 3 - 5 to apply additional attributes to the selected
media.
1 Select one or more media you want to clean, or click All to select all
media.
2 At the Clean Media by Media ID Parameters > End Time field,
enter the time when you want the cleaning process to stop. (You
can also use the displayed default end time.)
3 Use the format yyyy:MM:dd:HH:mm:ss when entering an end time (for example, 2010:03:22:18:30:00).
4 Click Apply.
5 When the confirmation message appears, click Yes to begin
cleaning media, or No to abort.
Software Requests
The Software Requests menu option enables you to view software
requests currently in process, and to cancel requests.
On the Tools > Software Requests screen you can view the following
information about pending and currently running software requests:
• ID: The software request’s identifier
• Type: The type of software request currently running
• State: The current state of the request
• Time: The time the software request was initiated
• Priority: The priority assigned to the software request
1 Choose Software Requests from the Tools > Storage Manager
menu. The Software Requests Screen appears.
Scheduler
StorNext events are tasks that are scheduled to run automatically based
on a specified schedule. The following events can be scheduled:
• Clean Info: This scheduled background operation removes knowledge of media from StorNext.
• Clean Versions: This scheduled event cleans old inactive versions of
files.
• Full Backup: By default, a full backup is run once a week to back up
the entire database, configuration files, and the file system
metadata dump file.
• Health Check: By default, health checks are set up to run every day
of the week, starting at 7:00 a.m.
• Partial Backup: By default, a partial backup is run on all days of the
week the full backup is not run. Partial backups include database
journals, configuration files, and file system journal files.
• Rebuild Policy: This scheduled event rebuilds the internal candidate
lists (for storing, truncation, and relocation) by scanning the file
system for files that need to be stored.
The StorNext Scheduler does not dynamically update when dates and
times are changed greatly from the current setting. You must reboot the
system for the Scheduler to pick up the changes.
Each of these events initially has a default schedule, but you can
configure the schedules to suit your system needs. To change the
schedule, see Modifying an Existing Schedule.
Viewing a Schedule
The procedure for viewing an event's existing schedule is the same regardless of the event type.
1 Choose Scheduler from the Tools > Storage Manager menu. A list of currently scheduled events appears.
2 Select the event you want to view, and then click View.
3 When you are finished viewing event details, click Done.
4 Repeat steps 2 - 3 to view additional events.
3 At the Name field, enter the name you want to assign to the new
schedule.
4 Select one of the following schedulable event types from the
Feature dropdown list:
• Clean Versions
• Clean Info
• Rebuild Policy
• Partial Backup
• Full Backup
• Health Check
5 At the Period field, select the execution interval for the new
schedule: Daily, Weekly or Monthly. You can also select multiple
days by holding down the Control key as you click the day.
6 At the Run Time field, specify when you want the schedule to start.
Enter the hour, minute, and a.m. or p.m.
7 At the Start Window field, specify the window in which you want
the StorNext Scheduler to start the event. The Scheduler attempts to
begin the event within the specified Start Window time (e.g., 30
minutes). If the event cannot begin at that time, the Scheduler tries
again during the next cycle.
8 Click Apply to save the new schedule, or Cancel to exit without
saving.
9 When a message informs you that the new schedule was
successfully created, click OK to continue.
Editing an Existing Schedule
Follow this procedure to edit an existing schedule. The procedure for modifying an existing schedule is the same regardless of the event type.
1 If you have not already done so, choose Scheduler from the Tools > Storage Manager menu.
2 Select the schedule you want to modify, and then click Edit.
3 Change the schedule Period interval, Run Time, or Start Window
as desired. You cannot change the schedule name or select a
different feature (schedule type).
4 Click Apply to save your changes, or Cancel to exit without saving.
5 When the confirmation message appears, click Yes to proceed or No
to abort.
6 When a message informs you that the schedule was successfully modified, click OK to continue.
Deleting an Existing Schedule
Follow this procedure to delete an existing schedule. The procedure for deleting an existing schedule for an event is the same regardless of the event type. Each event type has a default schedule. You must have at least one schedule, so you will not be allowed to delete a solitary schedule.
1 If you have not already done so, choose Scheduler from the Tools >
Storage Manager menu.
2 Select the schedule you want to delete, and then click Delete.
3 When the confirmation message appears, click Yes to proceed or No
to abort.
4 When a message informs you that the schedule was successfully deleted, click OK to continue.
Distributed Data Mover Overview
Quantum developed the Distributed Data Mover (DDM) feature to enhance the data movement scalability of its StorNext Storage Manager software. With this feature, data movement operations are distributed to client machines from the metadata controller, which can improve the overall throughput of data movement to archive tiers of storage.
Previously, the data mover process, fs_fmover, ran only on the
metadata controller (MDC), allowing up to one fs_fmover process per
tape drive or storage disk (SDisk) stream to run at one time.
Note: The DDM feature supports only storage disks on StorNext file
systems, not on NFS.
Legend:
• fs_fcopyman: Manages copy requests and invokes a mover proxy
when copy resources have become available
• fs_fmover: The process that performs copy operations, either on
the MDC or a client
• fs_fmoverp: The proxy for a mover that always runs on the MDC.
This proxy kicks off and waits for an fs_fmover process, and then
makes any needed database updates etc. when the mover
completes.
Feature Highlights
The Distributed Data Mover feature provides the following benefits:
• Concurrent utilization of shared StorNext tape and disk tier
resources by multiple Distributed Data Mover hosts
• Scalable data streaming
• Flexible, centralized configuration of data movement
• Dynamic Distributed Data Mover host reconfiguration
• Support for StorNext File System storage disks (SDisks)
• Works on HA systems without additional configuration
Limitations
Quantum does not currently support using multiple paths to tape
drives. Also, VTL does not support SCSI-3 persistent reservations.
Installing the DDM You must install the snfs-mover rpm on each client you want to act as a
Feature on Clients distributed data mover. Follow these installation steps for each client:
1 Log in as root.
2 Obtain the .tar archive from the metadata controller.
3 Extract the contents of the .tar archive by running the command
tar -xvf sn_dsm_linuxRedHat50AS_x86_64_client.tar.
4 Install the rpms in the .tar archive.
• For a new client installation, run either the command rpm -ivh
*.rpm or rpm -Uvh *.rpm
• For a client upgrade, run the command rpm -Uvh *.rpm
Accessing Distributed To enter DDM settings and manage hosts and devices, choose
Data Mover Distributed Data Mover from the Tools > Storage Manager menu.
The Setup > Distributed Data Movers screen appears. This screen
shows any previously configured DDM-enabled hosts, managed file
systems, tape drives and storage disks, as well as the current status:
Enabled, Not Configured, Not Enabled or Internally Disabled.
Enabling DDM The DDM screen’s Use Distributed Movers field allows you to enable or
disable the DDM feature, or to enable a mode called “Threshold.”
When DDM is disabled, data moving responsibilities rest solely on the
primary server. When DDM is enabled, data moving responsibilities are
distributed among the hosts you specify as described in Managing DDM
Hosts on page 107.
When you select the Threshold option, the local host (i.e., the metadata
controller) is given preference over the remote clients.
Managing DDM Hosts The Distributed Data Mover screen enables you to add and configure a
new host for DDM, or change settings for a previously configured host.
You can also delete a host from the list of DDM-enabled hosts.
Adding a Host
1 If you have not already done so, choose Distributed Data Mover
from the Tools > Storage Manager menu.
2 Click New. Fields appear where you can enter host information.
3 At the Host field, enter the name or IP address of the host you are
adding.
Note: If you use the IP address instead of the host name, you must
continue using the IP address; you cannot switch between
using IP addresses and host names. For example, you cannot
configure a host using the IP address and then configure a
device on that host using the host name.
6 To add your selections to the new host, click Apply. (To exit without
saving, click Cancel. To remain on the screen but clear your entries,
click Reset.)
7 When the confirmation message appears, click Yes to proceed or No
to abort.
8 After a message informs you that your changes were successfully
saved, click OK to continue.
Deleting a Host
1 If you have not already done so, choose Distributed Data Mover
from the Tools > Storage Manager menu.
2 From the Hosts list, select the host you want to delete.
3 Click Delete to exclude the host from DDM operation. (Clicking
Delete does not actually delete the host from your system, but
rather excludes it from the list of DDM hosts.)
4 When the confirmation message appears, click Yes to proceed or No
to abort.
5 After a message informs you that the host was successfully deleted,
click OK to continue.
Host Priority When all hosts are chosen, no special preference is given to the local
host. Following are the operating characteristics when all hosts are
chosen:
• Internally within the StorNext Copy Manager (fs_fcopyman) there
will be a list of hosts to use (that is, the local host and the remote
clients). At this time there is no way to specify the order in which
hosts appear in the list.
• For each host in the list, the following information is tracked:
• The number of movers currently running on the host
• The time of the last assignment
• Whether the host is currently disabled
• When it is time to choose a host, a single pass through the host list
is made. Hosts are picked based on the following criteria:
1 The first valid host in the list will be selected. A host is
considered valid if:
• It is currently enabled
• The devices needed for the copy operation are also currently
enabled on the host
2 The next host will be skipped if not valid.
3 The next host in the list will be skipped if the number of running
movers on that host is greater than the current selection.
4 The next host will replace the current selection if the number of
running movers on that host is less than the current selection.
5 The next host will replace the current selection if the number of
movers is equal to the current selection AND the time of last
assignment is older than the current selection’s.
Criteria 2-5 are repeated for the remaining hosts in the list. Upon
completion, the host with the fewest running movers is selected. If
there is a tie, the host that has been used the least recently is chosen. (A
host that was not the least recently used may be chosen if its movers
completed quickly compared to the others.)
Note: If a host has fewer drives than the other hosts (for example,
two drives versus ten), the host with fewer drives will almost
always be chosen for operations on those drives because it is
likely to have the fewest running movers.
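The following sketch illustrates the selection pass described above. It is a simplified illustration only, not the actual fs_fcopyman implementation; the host attributes and function names are hypothetical.

from dataclasses import dataclass

@dataclass
class Host:
    name: str
    enabled: bool
    enabled_devices: set      # devices currently enabled on this host
    running_movers: int       # number of movers currently running on the host
    last_assignment: float    # time of the last mover assignment

def choose_host(hosts, needed_devices):
    # Single pass through the host list, as described above.
    selection = None
    for host in hosts:
        # A host is valid only if it is enabled and the devices needed for
        # the copy operation are also enabled on it; invalid hosts are skipped.
        if not host.enabled or not needed_devices <= host.enabled_devices:
            continue
        if selection is None:
            selection = host                  # first valid host in the list
        elif host.running_movers < selection.running_movers:
            selection = host                  # fewer running movers wins
        elif (host.running_movers == selection.running_movers
              and host.last_assignment < selection.last_assignment):
            selection = host                  # tie: least recently assigned wins
    return selection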
Distributed Data Mover A DDM Report which shows current configuration information and
Reporting activity is available from the Reports menu. For more information about
the DDM report, see The Distributed Data Mover Report on page 210.
Replication Overview
This section provides some background information that will help you
understand how replication is configured and how processing occurs.
Replication StorNext Replication makes a copy of a source directory and sends the
Configuration Overview information to one or more target directories. The target directories
may be on other host machines, or may be on the same host as the
source directory.
Replication behavior is defined by a Replication/Deduplication Policy.
(The other type of StorNext policy is a Storage Manager Policy, which
governs how StorNext Storage Manager works).
Here are some important facts about StorNext Replication/Deduplication
policies.
• A replication/deduplication policy exists on only one SNFS file
system. For example, a policy in a file system called /stornext/
sn1 can be used only to replicate directories in that file system. A
separate policy would be needed to replicate directories from file
system /stornext/sn2.
• If a replication/deduplication policy will be used in any file system
on a machine, you must configure a blockpool for that machine.
The blockpool for a machine contains data (called blocklets) if the
Deduplication feature is used, but the blockpool must be configured
for replication use even if you do not use deduplication.
• A policy may be applied to one directory or more than one directory
in the file system.
• A single policy can define behavior for replication sources and
targets, as well as for deduplication. This single policy can also
define the directories affected by the policy.
• However, it is often convenient to configure a policy that does
primarily one thing. For example, you could create a policy that
controls replication source behavior. Such a policy might be called a
"replication source policy" or a "source policy," even though the
policy could be configured to define other actions.
1 Data Movement Stage: In this stage StorNext moves file data
from the source directory to the target host. The entire file is
copied whenever a file is created or modified.
When data movement is in progress or even after it has just
completed, the replicated files may not be visible yet in the target
file system’s directories. Replicated files become visible in stage 2.
2 File System Namespace Realization Stage: In this stage StorNext
enumerates all the files in the source directory and recreates the file
name and subdirectory structure (the namespace) on the target file
system. Unlike in the Data Movement Stage, this stage happens only
at scheduled times, or when namespace realization is manually
initiated by the administrator.
The following illustration shows in simple terms how replicated data
flows from one replication source to one replication target.
Files Excluded From Certain files may not be included in the replication process for various
Replication reasons. For example, a file that is open for read-only would be
replicated, but a file that is open for write (including all of the various
varieties of “write”) would not be replicated.
To determine which specific files were not included in the replication
process, see the Replication/Deduplication Completion Report, which
is accessible from the Replication/Deduplication Policy Summary
Report. For more information about Replication reports, see
Replication Deduplication Reports on page 211.
There are several situations in which a file may be excluded from the
replication process.
Namespace Realization Namespace refers to the directory structure which contains replicated
data. Replicated data is always transferred separately from namespace
data (although some small file data is transferred along with the
namespace).
Namespace realization refers to the process in which the replicated
directory structure (the namespace) appears on the replication target.
Because file data and namespace data are transferred separately, in some
situations it might take longer for replicated data to finish
transferring than for the namespace realization to complete. This is
especially likely to happen if there is a backlog of file data waiting to be
transferred at the time namespace realization is scheduled to run or is
manually initiated.
Note: Once you specify the file system on which the blockpool
resides, you cannot later choose a different blockpool file
system. Use care when specifying the blockpool file system.
Blackout Period A Blackout is a period during which replication does not occur. You may
schedule replication blackouts for periods in which you do not want
replication data transfers to occur on a busy network. For example, an
administrator may create a blackout during periods when WAN usage
and traffic are heaviest. In this situation, replication might run after
hours, when WAN usage and traffic are much lower.
Replication Target A replication target directory is the location to which replicated data is
Directory sent. The replication target may be a directory on a separate host
machine, or it may be a directory on the source host machine.
Regardless of where the target directory resides, it is very important that
you use the replication target directory only for replicated data. Also, do
not allow users to modify the files in the replication target directories.
Replication Schedule You can specify a replication schedule to define when the file system
namespace realization should occur for an outbound replication
schedule. For example, you might specify that you want namespace
realization to occur at 6am and 6pm every day.
If you do not specify a replication schedule, you must manually run the
replication policy whenever you want the realization to occur.
Replication Copies Replication Copies is the number of copies of replicated data saved on
the target. StorNext currently supports 1 to 16 replication copies per
target. The number of replication copies is entered or modified in
replication policies.
Bandwidth Throttling Bandwidth Throttling refers to limiting the receive rate and transmit rate
for replicated data (Replication Stage 1). This StorNext feature allows
network administrators to specify (in bytes per second) a ceiling for
incoming and outgoing replicated data. When bandwidth throttling is
enabled, replicated data will not be transmitted or received at a rate
higher than the specified maximum. Bandwidth throttling is a useful
tool for controlling network traffic.
Virtual IP (vIP) Virtual IP or vIP is similar to an alias machine name. StorNext uses virtual
IP addresses to communicate with machines rather than using the
physical machine name. Virtual IPs are required in HA (high availability)
environments, and are also used for multilink NICs.
Your network administrator can provide you with the virtual IP
addresses and virtual netmasks you need for your system.
Scenario 1: Simplest In this simple replication scenario, the host machine host1 contains a
Replication StorNext file system called /stornext/fs1/. Host machine host2 has
a StorNext file system called /stornext/fs2.
In this scenario we can replicate directory /stornext/fs1/video on
host1 to file system /stornext/fs2 on host2. Replicated files will
appear in the directory /stornext/fs2/video, which is the default
location on host2.
All of these files share the same data extents in the file system. An even
greater space saving can be realized by enabling deduplication for
replication, but using this feature is outside of the scope of the current
scenario discussion.
Additional Replication Here are some other possible replication combinations StorNext
Possibilities supports:
• Replication of a directory from one source host to multiple target
hosts and/or multiple target file systems.
Example
In this non-supported configuration, Machine host1, file system fs1,
directory video1 replicates to Machine host2, file system fs2,
directory video2, while at the same time Machine host2, file system
fs2, directory video2 replicates to Machine host1, file system fs1,
directory video1.
Figure 57 Non-Supported
Replication From Source to
Target
Configuring Replication
This section describes how to configure simple one-to-one replication
from one source directory on one file system to one target file system.
The source and target StorNext server computers can be the same
machine, standalone servers, or High Availability (HA) redundant
servers. When replication-target file systems are on an HA Cluster, it is
best to convert the cluster to HA before configuring replication source
policies that point to them. This allows the use of the virtual IP (vIP),
which is required for HA configurations.
Additional configuration options for StorNext features such as HA or
Replication with Storage Manager are also covered.
Before you begin configuring, make sure you have the Replication and/
or Deduplication licenses required for these features. If you are using an
HA configuration, basic StorNext single-server or HA Clusters should
already be set up. (For more information, see Chapter 3, The
Configuration Wizard).
These instructions assume you are using the StorNext Configuration
Wizard and have already completed the first three steps: Welcome,
Licenses, and Name Servers.
Step 1: Create Source After you complete the first three Configuration Wizard steps, the first
and Target File Systems replication step is to create file systems: the blockpool file system(s), and
the source and target file systems you plan to use.
1 If you have not already done so, launch the StorNext Configuration
Wizard and proceed through Welcome, License and Name Servers
steps to the File System step.
2 The Setup > File System screen appears.
3 On the Setup > File System screen, click New. The Setup > File
System > New Screen appears.
4 At the File System Name field enter the name of a file system to be
used as a replication source. A default mount-point path
automatically appears but you can change this mount point if you
wish.
5 Choose the Replication/Deduplication option. A warning message
alerts you that "A blockpool has not been created." Disregard this
message for now because you will create the blockpool file system
in Step 2: Setting up the Blockpool.
6 Select a set of LUNs for the file system, and then click Assign.
7 Click Configure. StorNext creates the file system configuration and
displays the Finish tab.
8 Click Apply to save the new file system. (For more information
about creating file systems, see Step 4: File System on page 29.)
2 Configure another file system for the Blockpool that has neither
Data Migration nor Replication/Deduplication enabled.
Step 2: Setting up the In this step you will set up the blockpool on the blockpool file system
Blockpool you just created in the previous step.
1 Choose the StorNext Configuration Wizard’s Storage Destinations
task. The Setup > Storage Destinations screen appears.
There are four tabs on this screen: Library, Storage Disk,
Replication Targets, and Deduplication. When configuring
replication we are concerned with the Replication Targets and
Deduplication tabs. (The deduplication infrastructure is used to
handle the transfer of file data for the replication feature, so it must
be configured even when the deduplication feature is not used.)
2 Click the Deduplication tab. The Setup > Storage Destinations >
Deduplication Screen appears.
3 At the Blockpool Host File System field (the only field on this
tab), select from the dropdown list the file system to use for the
blockpool. (This is the file system you created in the previous
step.)
Step 3: Creating In this step you will specify the actual targets to which you want
Replication Targets replicated data sent. (Namespace realization will also occur on these
targets.)
1 Click the Replication Targets tab. The Setup > Storage
Destinations > Replication Targets Screen appears.
2 Click Add Host.
4 Click Scan Host to populate the Mount Point box with appropriate
file systems which are configured for replication/deduplication.
5 Select the file system you created for use as the target in Step 1:
Create Source and Target File Systems, and then click Add.
6 Click Apply. At this point you should see your file system listed as a
replication target.
Step 4: Create a The next step in configuring replication is to create a replication storage
Replication Storage policy. This policy contains the replication "rules" specific to your
Policy replication source and target file systems. You must create a replication
policy for the source directory and enable inbound replication for the
target file system.
Figure 64 Outbound
Replication Tab > Replication
Schedule
5 Under the heading Select Available Targets, select the target file
system on the target server.
6 Create a schedule by making a selection in every column. If you
select none of the schedule columns, this creates an unscheduled
policy that must be run manually. The schedule shown in Figure 64
will run at midnight every day.
7 Click Continue to complete the schedule and target selections.
8 Click Apply to finish creating the policy with the options from all of
the tabs.
9 After a message informs you that the policy was created
successfully, click OK.
2 When the Setup > Storage Policy > Edit > target screen appears,
click the Inbound Replication tab.
Note: If you do not turn on replication, the process will fail and
you will receive an error message saying, “Replication
disabled on target.” It is VERY IMPORTANT that you enable
replication by setting Inbound Replication to On.
4 Click Apply to finish editing the policy with your selected options.
Configuration Steps The preceding four configuration steps accomplished the following:
Summary • Created a source replication policy and associated a source directory
with it
• Selected a target file system on a target host machine and left the
target directory unspecified, which uses the directory name of the
source
• Set a replication schedule that runs every day at midnight
• Enabled inbound replication in the target policy
• Enabled outbound replication in the source policy
The contents of the source directory (additions and deletions) will now
be replicated to the target directory every night. You can test this by
running the policy manually at any time on the Setup > Storage Policy
screen. (Select the policy you want to test and then click Run.)
Optional HA and When the High Availability (HA) feature is used with replication, a virtual
Multilink Configuration IP (vIP) address must be configured to allow a replication source to use a
single IP address to access whichever server is currently performing the
Primary function in the target HA cluster.
The vIP is automatically moved to the correct server as part of the
failover process for the HaShared file system. (See Virtual IP (vIP) on
page 120 for an expanded definition.)
It is easiest to set up the vIP during the initial HA conversion. The vIP
configuration items appear automatically at this time if a license exists
for replication. It is not necessary to have a replication policy
configured.
The IP address used for the vIP must be statically allocated and routable
to the physical network interface of both servers in the HA cluster.
Please request this IP address and netmask information from your
network administrator before starting the HA conversion.
Note: This step describes only the tasks necessary for configuring
replication on an HA system. For general instructions about
configuring HA, see Converting to HA on page 189.
2 At the Shared File System field, select from the dropdown list the
file system that will be dedicated to StorNext HA internal functions.
3 At the MDC Address field, select from the dropdown list the
primary system’s IP address for use in communicating between HA
MDCs.
4 Since this HA system runs a blockpool, you must configure a Virtual
IP Address (vIP). Under the heading Virtual Network IP Address
Configuration, check Enable and then enter the vIP (virtual IP)
Address and vIP Netmask provided by your network administrator.
5 Click Convert to convert the primary node to HA.
6 When the confirmation message appears, click Yes to proceed or No
to exit without converting.
7 When a message informs you that the operation was completed
successfully, click OK. The configuration items for the Secondary
System will be added to the page.
Configuring Multilink
Virtual IPs are also used in an HA environment if the multilink feature is
configured. A virtual IP address is required for each NIC card you use for
replication.
1 Choose Replication/Deduplication > Replication Bandwidth from
the Tools menu. The Tools > Replication > Bandwidth screen
appears.
Replication Reports There are two reports for replication: Policy Activity and Policy
Summary.
• The Policy Activity report shows replication performance statistics.
• The Policy Summary report shows replication-related information
for each policy.
Both of these reports also show information related to deduplication.
Access these replication reports by choosing Replication/Deduplication
from the Reports menu.
For more information about replication reports, see Replication
Deduplication Reports on page 211.
Replication The Administration option available under the Tools > Replication/
Administration Deduplication menu allows you to view the current replication
process, or pause, resume, or stop replication.
After you choose Administration from the Tools > Replication/
Deduplication menu, the Tools > Replication/Deduplication >
Administration screen appears.
StorNext Jobs At any time you can view currently running StorNext jobs, including
replication. The Reports > Jobs screen shows the job ID and type of
job, the start and end times, and the current status.
To view jobs, choose Jobs from the Reports menu. The Reports > Jobs
report appears.
For more information about StorNext jobs, see The Jobs Report on
page 205.
Troubleshooting Replication
The Troubleshooting appendix in this guide contains simple
troubleshooting procedures related to replication. For more
information, see Troubleshooting Replication on page 351.
For issues not covered in that section of the appendix, contact
Quantum Technical Support.
Figure 72 Deduplication
With deduplication enabled, deduplicated segments of file data (blobs)
are transferred during replication Stage 1, which is data movement. If a
blob is shared by more than one file, less data is transferred than when
replication occurs without deduplication.
Replicated data moves from the source machine's blockpool to the
target machine's blockpool. If the source and target machine are the
same, then no data needs to move for replication Stage 1.
When the replication namespace realization occurs in replication Stage
2, the replicated files appear in the target directory as truncated files.
The blob tags needed to reconstitute the file are replicated along with
other file metadata during Stage 2. When replication is complete, an
application can access the replicated file and data will be retrieved from
the blockpool as needed.
Setting Up Deduplication
This section describes the steps necessary to configure data
deduplication. The easiest way to configure your system for
deduplication is to use the StorNext Configuration Wizard, but you can
also use the Setup menu's options to accomplish the same tasks.
Complete these tasks to set up and enable deduplication:
• Step 1: Enable replication/deduplication when you create (or edit) a
source file system.
• Step 2: Specify the file system to use for the blockpool (this is done
only once per machine.)
• Step 3: Create (or edit) a replication/deduplication storage policy
with deduplication enabled on the Deduplication tab.
Step 1: Creating a Create a file system as you normally would, or edit an existing file
Deduplication-Enabled system.
File System 1 In the Configuration Wizard, choose the File System task.
(Alternatively, choose File System from the Setup menu.)
2 On the Options tab, enable replication by selecting Replication/
Deduplication.
3 Continue creating the file system as you normally would. (If you are
editing an existing file system, click Apply to save your changes.) For
more information about creating a file system, see Step 4: File
System on page 29.
Step 2: Specifying the To use deduplication you must specify the file system on which the
Blockpool blockpool resides. If you have already enabled replication and a
blockpool already exists, you can skip this step.
The process for specifying a blockpool for deduplication is identical to
specifying a blockpool for replication. For more information, see Step 2:
Setting up the Blockpool on page 129 in the Configuring Replication
section.
Step 3: Creating a To enable deduplication you must either create a new replication/
Deduplication-Enabled deduplication storage policy or edit an existing policy.
Storage Policy 1 Choose the StorNext Configuration Wizard’s Storage Policy task.
The Setup > Storage Policy Screen appears.
2 If you are creating a new policy, click New. The Storage Policy >
New Screen appears. (See Figure 61.)
If you are editing an existing replication policy, select the policy you
want to edit and then click Edit. Skip to Step 5.
3 Enter the following fields:
• Policy Name: The name of the new policy you are creating
• Policy Type: choose Replication/Deduplication to create a
deduplication storage policy.
4 Click Configure. The Replication/Deduplication Policy screen
appears.
Figure 73 Replication/
Deduplication Policy Screen
Deduplication Reports There are two reports for deduplication: Policy Activity and Policy
Summary.
• The Policy Activity report shows deduplication performance
statistics.
• The Policy Summary report shows deduplication-related
information for each policy.
Both of these reports also show information related to replication.
Access these deduplication reports by choosing Replication/
Deduplication from the Reports menu.
For more information about deduplication reports, see Replication
Deduplication Reports on page 211.
• Storage Manager
• Storage Components: View current status for libraries, storage
disks, and tape drives; place one or more of these components
online or offline
• Drive Pool: Add, modify, or delete drive pools
• Media Actions: Remove media from a library or move media
from one library to another
• Library Operator: Enter or eject media from the Library
Operator Interface
• Software Requests: View or cancel pending software requests
• Scheduler: Schedule file system events including Clean Info,
Clean Versions, Full Backup, Partial Backup, and Rebuild Policy
• Alternate Retrieval Location: Specify a remote retrieval
location to use in situations where files stored on tape or a
storage disk cannot be accessed.
• Distributed Data Mover (DDM): Distribute data movement
across several machines rather than relying solely on the primary server.
• Replication and Deduplication
• Administration: View the current replication process, or pause,
resume, or stop replication
• Replication Targets: Add a host or directory for data
replication, or edit existing replication targets
• Replication Bandwidth: Configure replication bandwidth limits
and multilink
• HA
• Convert: Convert to a high availability configuration
• Manage: Manage HA system parameters
User Access
The Tools Menu's User Access option allows you to add new StorNext
users and modify permissions for existing users. User Access is also
where you change the admin's password.
Adding a New User Follow this procedure to add a new StorNext user.
1 Choose User Access from the Tools menu. The User Access screen
appears. All existing users and the admin are shown.
3 In the User Name field, type the name the new user will enter at the
User ID field when he or she logs on to StorNext.
4 In the Password field, type the password the new user will enter
when logging on to StorNext.
5 Select all the roles you want the new user to have (for example,
Download Client Software or Manage Tickets).
6 When you are satisfied with the permissions you have assigned,
click Apply to save your changes. (To exit without saving, click
Cancel.)
7 When a message informs you that the new user was successfully
added, click OK.
2 When you are finished viewing user profile information, click Back
to return to the User Access screen.
Note: You cannot delete the admin. You can only change the
admin's password as described below.
3 When a message informs you that the user was successfully
deleted, click OK.
Changing the Admin Follow this procedure to change the admin’s password.
Password 1 If you have not already done so, choose User Access from the Tools
menu. The User Access screen appears. (See Figure 74.) All existing
users and the admin are shown.
2 Select the admin, and then click Edit. The User Access > [admin
name] screen appears.
3 In the Password field, type the new password for the admin.
4 When the confirmation message appears, click Yes to proceed, or
No to return to the Setup > [admin name] screen without saving.
5 When a message informs you that the admin’s password was
successfully modified, click OK.
Client Download
The StorNext client software lets you mount and work with StorNext file
systems.
System Control
The System Control screen enables you to tell at a glance whether
StorNext File System and StorNext Storage Manager are currently
started. In the case of Storage Manager, you can also see which
individual components are currently started or stopped. From this
screen you can start or stop File System and Storage Manager, and also
specify whether you want StorNext to start automatically whenever your
system is rebooted.
To access the System Control screen, choose System Control from the
Tools menu. The Tools > System Control screen appears.
Starting or Stopping Most StorNext operations require that the StorNext File System be
StorNext File System started, although there may be times when you need to stop the File
System.
Click Start to start the File System, or Stop to stop the File System.
Refreshing System When there is a change in system status, sometimes there is a delay in
Status updating the status. Click Refresh to immediately update the GUI
system status.
Specifying Boot If you would like StorNext to automatically start File System and Storage
Options Manager whenever your system starts, select the option Automatically
start StorNext software at boot time? and then click Apply.
• Truncate Files
• Move Files
• Modify File Attributes
• View File Information
Store Files Choose this option to store files by policy or custom parameters.
1 Choose File and Directory Actions from the Tools menu. The Tools
> File and Directory Actions screen appears.
2 Select the file you want to store. If necessary, click Browse and then
click All Managed Directories to view a list of the managed
directories. Select the directory containing the files to be stored.
Mark the files of interest and then click Continue to select them.
3 To store the selected file according to policy, at the Store
Parameters field, select By Policy.
a Click Apply.
b When the confirmation message appears, click Yes to store the
file, or No to abort.
Change File Version Choose this option to change the file version to a new version.
1 If you have not already done so, choose File and Directory Actions
from the Tools menu. The Tools > File and Directory Actions
screen appears. (See Figure 82.)
2 Choose Change File Version from the Available Actions dropdown
menu.
Modify File Attributes Choose this option to modify attributes for the selected file.
1 If you have not already done so, choose File and Directory Actions
from the Tools menu. The Tools > File and Directory Actions
screen appears. (See Figure 82.)
2 Choose Modify File Attributes from the Available Actions
dropdown menu.
View File Information Choose this option to view detailed information about the selected file.
1 If you have not already done so, choose File and Directory Actions
from the Tools menu. The Tools > File and Directory Actions
screen appears. (See Figure 82.)
3 Select the files whose attributes you want to view. If necessary, click
Browse to navigate to the file location and then select the file.
4 Click File Info to view information.
5 Click Done when you are finished viewing file information.
File System
The Tools > File System menu contains options that enable you to
perform the following file system-related tasks:
• Label Disks: Apply EFI or VTOC label names for disk devices in your
StorNext libraries
• Check File System: Run a check on StorNext file systems prior to
expanding or migrating the file system
Storage Manager
The Tools > Storage Manager menu contains options that enable you
to perform the following Storage Manager-related tasks:
• Storage Components: View your system's libraries, storage disks,
and tape drives, and place those devices online or offline
• Drive Pool: View, add, edit, or delete drive pools (groups of tape
drives allocated for various administrator-defined storage tasks)
• Media Actions: Perform various actions on the storage media in
your library
• Library Operator Interface: The StorNext Library Operator Interface
allows you to perform media-related actions remotely from the
library
• Software Requests: View current software requests in progress or
cancel a request
• Scheduler: Schedule tasks to run automatically based on a specified
schedule
• Alternate Retrieval Location: Specify a remote retrieval location to
use in situations where files stored on tape or a storage disk cannot be accessed.
• Distributed Data Mover: Distribute data movement across
several machines rather than relying solely on the primary server.
These tasks are described in Chapter 5, Storage Manager Tasks
HA
The Tools > HA menu options enable you to perform the following HA-
related tasks:
• Convert: Convert a shared file system to a high availability
configuration
• Manage: View the current status of the file systems on your HA
system and perform various HA-related functions such as starting or
stopping nodes on the HA cluster
These tasks are described in Chapter 7, Tools Menu Functions
7 Select one or more tests to run by clicking the desired check. (There
is no need to hold down the Control key while clicking.) To deselect
a test, click it again.
8 Click Run Selected to run the tests you selected. Or, to run all tests,
click Run All.
Viewing the Health After a test has been run successfully (as indicated by “Success” in the
Check Results Status column), you can view test results.
1 To view results for one or more tests, select the desired tests and
then click View Selected.
2 To view results for all successfully completed tests, click View All.
3 When you are finished viewing, click Done.
Regardless of which View option you choose, test results are shown for
the last successful tests completed regardless of the date on which they
ran. For example, if you select two tests and then click View Selected,
there might be an interval as long as a week or more between the finish
dates for the two tests.
Viewing Health Check You can also view a history (up to five runs) of each health check test
Histories that has been run. This can be useful for comparing results over a time
span.
1 Select the tests whose history you want to view.
2 Click History. The Service > Health Check > View History screen
appears.
Running Capture State creates a log file named using the format
“snapshot-machinehostname-YYYYMMDDHHMMSS.tar.gz”.
This file contains a summary report that is produced by executing the
pse_snapshot command on all component config/filelist files.
If desired, you can download or delete a previously captured file.
Deleting a Previous Follow this procedure to delete an unwanted Capture State file.
System State Capture 1 If you have not already done so, choose Capture State from the
Service menu. The Service > Capture State screen appears. (See
Figure 95 on page 181.) All previously captured snapshots are
shown.
2 Select the file you want to delete, and then click Delete.
3 When a confirmation screen prompts you to confirm that you want
to delete the file, click Yes to continue or No to abort.
4 After the status screen informs you that the file was successfully
deleted, click OK.
2 On the Service > Admin Alerts screen you can do any of the
following:
• View a specific alert by scrolling to the right of the screen (if the
alert is longer than can be displayed on one screen)
• Refresh (update) the list by clicking the Refresh button
• Delete an individual alert by selecting the desired alert and then
clicking the Delete button
• Delete all alerts by clicking the Delete All button
Viewing Ticket 1 From the StorNext home page, choose Tickets from the Service
Information menu. The Service > Tickets screen appears.
Editing Ticket Follow this procedure to add comments or notes to the ticket in the
Information Analysis field:
1 Select the desired ticket and then click Edit. The Service > Tickets>
Edit Ticket > [number] screen appears.
Closing Tickets When you no longer need to retain ticket information, you can close
(delete) selected tickets or all tickets by following this procedure:
1 To close a specific ticket, select the desired ticket and then click
Close.
2 To delete all tickets, click Close All.
HA Overview
The StorNext HA feature is a special StorNext configuration with
improved availability and reliability. The configuration consists of two
similar servers, shared disks and possibly tape libraries. StorNext is
installed on both servers. One of the servers is dedicated as the initial
primary server and the other the initial standby server.
StorNext File System and Storage Manager run on the primary server.
The standby server runs StorNext File System and special HA supporting
software.
The StorNext failover mechanism allows the StorNext services to be
automatically transferred from the current active primary server to the
standby server in the event of the primary server failure. The roles of the
servers are reversed after a failover event. Only one of the two servers is
allowed to control and update StorNext metadata and databases at any
given time.
Failover Failover is the process of passing control of a file system from an FSM on
one MDC to a standby FSM on a redundant MDC. When that FSM is for
the HaShared file system, Primary status transfers to the redundant
MDC along with all the processing that occurs only on the Primary MDC.
This includes all the HaManaged FSMs, the Storage Manager processes,
and the blockpool server. When an FSM does not relinquish control
cleanly, an HA Reset can occur to protect against possible corruption of
file system metadata and Storage Manager databases. (See Primary
Node and Secondary Node.)
Primary Node The primary node is the main server in your configuration. Processing
occurs on this server until system failure makes it necessary to switch to
another server. Also known as the local node. The primary status is
transient and dynamic, not fixed to a specific machine.
Secondary Node The secondary node is the redundant or secondary server in your
configuration. If your primary server fails or shuts down, processing
automatically moves to this secondary server so there is no interruption
in processing. Like primary status, the secondary status is transient and
dynamic, not fixed to a specific machine. Also known as the peer node.
Virtual IP (vIP) Virtual IP or vIP is a fixed IP address that is automatically associated with
the Primary MDC to provide a static IP address for replication and
deduplication access to the target server in an HA cluster, and for access
to the blockpool.
Following are some general requirements for vIP addresses as they apply
to HA:
1 The vIP should be static (currently StorNext supports only static IP
for HA).
2 The NIC should have a physical IP address assigned.
3 The vIP should be a real and unique IP address.
4 The vIP should be reachable by other nodes, and you should also be
able to reach the other node from the vIP address. For this reason,
Quantum recommends that the vIP address be on the same subnet
as the physical IP address of the same NIC.
When the NIC is also involved in multilink communication, the following
additional requirement applies:
1 The grouping address (taking the first configured maskbits of the IP
address) of the physical and vIP addresses on the same NIC should
be the same, and unique on the node.
Your local Network Administrator should provide a reserved IP address
and netmask for this purpose.
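As an informal check of the grouping-address requirement above, the following sketch uses Python's standard ipaddress module to confirm that a physical address and a vIP configured with the same maskbits reduce to the same network. The addresses shown are placeholders only; substitute your own values.

import ipaddress

physical = ipaddress.ip_interface("192.168.10.5/24")    # physical IP on the NIC
vip      = ipaddress.ip_interface("192.168.10.200/24")  # vIP on the same NIC

# Both interfaces should reduce to the same network, e.g. 192.168.10.0/24.
print(physical.network, vip.network)
print(physical.network == vip.network)   # True -> grouping addresses match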
Virtual Netmask This is a 32-bit mask used to divide a virtual IP address into subnets and
specify the network’s available hosts.
Pre-Conversion Steps Before converting to HA, you should perform the following steps:
1 Identify two servers, which must have similar hardware running the
same version of Linux, and have identical LAN and SAN connectivity.
For example, on multiple Ethernet port connections, both systems
must be connected to the same Ethernet ports (eth0 on System A
and System B going to LAN1, eth1 on System A and System B going to
LAN2, etc.).
2 Synchronize the clocks on both systems.
3 Install StorNext on both servers.
4 Enter StorNext license information on both server nodes.
5 Launch StorNext on one server.
6 Configure an unmanaged file system for use as the HA shared file
system. (For more information about creating a file system, see Step
4: File System.)
HA and Distributed LAN On a StorNext HA system using the StorNext Distributed LAN Client/
Clients Server (DLC) feature:
Converting to HA
This section describes the configuration steps necessary to convert a
node to a primary HA server. Converting to HA consists of selecting your
dedicated unmanaged StorNext file system for use as the controlling
shared file system, and then instructing StorNext to convert the node to
HA.
Following are some other things you should be aware of concerning the
HA conversion process:
• The conversion process converts one node at a time. The second
node should be converted as soon as possible after the first node.
• StorNext operating files will be moved to the HaShared file system,
and this move cannot easily be reversed.
• Following conversion, the Primary server is identified by the vIP for
Replication/Deduplication.
• Replication/Deduplication policies must be changed to use the vIP:
• The global policy for each file system must use it as the “Address
for Replication and Deduplication”
2 At the Shared File System field, select the shared file system you
want to convert to HA.
3 At the MDC Address field, select one IP address to be placed in the
ha_peer file for use in administrative operations between the
MDCs in the HA Cluster.
4 If your HA cluster also runs the blockpool, select Enable and then
enter the virtual IP address and virtual netmask. (Ask your network
administrator for the vIP address and netmask.)
5 Click Convert to convert the primary node to HA.
6 Enter the IP address of the secondary system on the same LAN, and
then click Scan. The licenses will be populated for that secondary
system.
7 Click Convert to convert the secondary system.
Managing HA
The StorNext Manage HA screen is used to monitor the current statuses
of your primary and secondary servers.
The screen includes Enter Config Mode and Exit Config Mode buttons
to place the HA Cluster in a state that allows the Primary MDC to restart
CVFS and individual FSMs without incurring an HA Reset, failover of any
file systems, or transfer of Primary status to the peer MDC. This is
required for making certain types of configuration changes through the
GUI.
Follow these steps to lock the HA cluster and enter Config mode, and
subsequently to exit Config mode:
1 Choose HA > Manage from the Tools menu. The Manage HA
screen appears.
Troubleshooting HA
The Troubleshooting appendix in this guide contains simple
troubleshooting procedures pertaining to HA. For more information, see
Troubleshooting.
Report Navigation If a log or report spans more than one screen, navigation controls at the
Controls bottom of the screen allow you to select a page by number, or to view
one of these pages.
StorNext Logs
Report menu options enable you to access and view any of the
following types of logs:
• StorNext Logs: Logs about each configured file system
• File Manager Logs: Logs that track storage errors and other events
of the Storage Manager
• Library Manager Logs: Logs that track library events and status
• Server Logs: Logs that record system messages
• StorNext Web Server Logs: Various logs related to the web server
• StorNext Database Logs: Logs that track changes to the internal
database
Use the following procedure to access the StorNext log files. The process
is the same regardless of the type of log you are viewing.
1 Select Logs from the Reports menu. The Reports > Logs screen
appears.
2 On the left side of the screen, select the type of log you wish to
view.
3 If desired, select a file system different from the default one shown
beneath the log categories.
The log you selected automatically appears. (If the log does not
appear, click Refresh.) If the log spans more than one screen, use
the navigation controls at the bottom of the screen as described in
Report Navigation Controls on page 200.
StorNext Reports
The procedure for accessing StorNext reports is identical regardless of
the report name.
The LAN Client The LAN Client Performance Report provides information about
Performance Report distributed LAN clients, including read and write speed.
Use the following procedure to run the LAN Client Performance Report.
1 Choose LAN Client Performance from the Reports menu. The
Reports > LAN Client Performance report appears.
The Clients Report The Clients Report provides statistics for StorNext clients, including the
number of StorNext SAN clients and distributed LAN clients, and client
performance.
Use the following procedure to run the Clients Report.
1 Choose Clients from the Reports menu. The Reports > Clients
report appears.
The File System Report The File System Report provides a list of parameters and statistics about
configured StorNext file systems.
Use the following procedure to run the File System Report.
1 Choose File System from the Reports menu. The Reports > File
System report appears.
The Jobs Report The Jobs Report provides information about previously run jobs on your
file systems. Jobs include all actions performed for file systems, such as
make, stop, start, check, and so on. Use the navigation controls at the
bottom of the screen if there are multiple screens of jobs.
Use the following procedure to run the Jobs Report.
1 Choose Jobs from the Reports menu. The Reports > Jobs report
appears.
Filter Options
The Status Filter at the bottom of the screen allows you to refine the
displayed list of jobs according to Success, Failure, Warning, Working,
Unknown, or All. Choose one of these criteria to restrict the displayed
list of jobs to your selection. After you select a Status Filter option, click
Refresh to resort and view the jobs list with your selected criteria.
The Type Filter works either together or separately from the Status
Filter. The Type Filter allows you to refine the displayed list of jobs
according to a specific job action (for example, Run Store Policy,
Media Remove, File System Scan, or Store Files For New Storage).
The SAN Devices Report The SAN Devices Report shows a list of details for all currently
configured devices attached to your SAN.
Use the following procedure to run the SAN Devices Report.
1 Choose SAN Devices from the Reports menu. The Reports > SAN
Devices report appears.
The Distributed Data The Distributed Data Mover Report shows a list of details pertaining to
Mover Report the Distributed Data Mover feature.
Use the following procedure to run the Distributed Data Mover Report.
1 Choose Distributed Data Mover from the Reports menu. The
Distributed Data Mover Report appears.
Policy Activity Report Use the following procedure to run the Replication/ Deduplication Policy
Report.
1 Choose Replication/ Deduplication > Policy Activity from the
Reports menu. The Reports > Replication/ Deduplication >
Policy Activity report appears.
Policy Summary Report Use the following procedure to run the Replication/ Deduplication Policy
Summary Report.
1 Choose Replication/ Deduplication > Policy Summary from the
Reports menu. The Reports > Replication/ Deduplication >
Policy Summary report appears.
• Last Completed: The date and time the last replication was
finished.
• Network
• Sent: The amount of replicated data sent from the source.
• Received: The amount of replicated data received on the target.
2 To update report information, click Refresh.
3 To view details for a particular policy, select the desired policy and
then click Details.
Note: The amount of reserved space is usually less than 280MB per
client. Reserved space is calculated as 110% of the buffer cache
size of each particular client. For example, a client with a 64MB
buffer cache is allocated 70MB of reserved space by the MDC.
If the client closes all files that are open for write, the 70MB of
space is no longer accounted for. It is important to remember
that reserved space is per stripe group.
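The arithmetic in the note can be summarized as a simple rule of thumb. The helper below is an illustrative sketch only, not a StorNext tool.

def reserved_space_mb(buffer_cache_mb):
    # Reserved space is roughly 110% of the client's buffer cache size,
    # and it is accounted for separately on each stripe group.
    return 1.10 * buffer_cache_mb

print(reserved_space_mb(64))   # 70.4 -> about 70MB, matching the example above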
Windows Configuration Beginning with StorNext 4.0, the configuration file for Windows is now
File Format in XML format. The Windows configuration file is now identified by a
.cfgx extension rather than .cfg for UNIX systems.
There are some differences between the XML and .cfg formats.
Specifically, the Reserved Space parameter is called reservedSpace in
the XML format, and its value must be either true or false.
Distributed LAN Server Due to significant demands placed on the network, the following
and Client Network network issues can occur when using Distributed LAN Servers and
Tuning clients:
• Configuring Dual NICs. If the Distributed LAN Server has two network
interface cards (NICs), each card must have a different address and
reside on a different subnet. In addition, to take advantage of a
second NIC in a Distributed LAN Server, the Distributed LAN Clients
must also have a second connected network interface.
• Dropped Packets. Some Ethernet switches may be unable to
accommodate the increased throughput demands required by the
Distributed LAN Server and client feature, and will drop packets.
This causes TCP retransmissions, resulting in a significant
performance loss. On Linux, this can be observed as an increase in
the Segments Retransmitted count in netstat -s output during
Distributed LAN Client write operations and Distributed LAN Server
read operations.
Note: On Linux, use ping and the cvadmin latency test tools to
identify network connectivity or reliability problems. In
addition, use the netperf tool to identify bandwidth
limitations or problems.
Distributed LAN Server The minimum amount of memory required for a Distributed LAN Server
Memory Tuning depends on the configuration.
• Windows. For a Windows Distributed LAN Server, use the following
formula:
Required memory = 1GB +
(# of file systems served
* # of NICs per Distributed LAN Client
* # of Distributed LAN Clients
* transfer buffer count
* transfer buffer size)
For example, suppose a Windows Distributed LAN Server is serving
four file systems to 64 clients each using two NICs for data traffic.
Also assume the server uses the defaults of sixteen transfer buffers
and 256K per buffer. (On Windows, you can view and adjust the
transfer buffer settings using the Client Configuration tool’s
Distributed LAN tab.) Given this configuration, here is the result:
Required memory = 1GB + (4 * 2 * 64 * 16 * 256K) = 3GB
If not all clients mount all of the file systems, the memory
requirement is reduced accordingly. For example, suppose in the
previous example that half of the 64 LAN clients mount three of the
four file systems, and the other half of the LAN clients mount the
remaining file system. Given this configuration, here is the result:
Required memory = 1GB + (3 * 2 * 32 * 16 * 256K) + (1 * 2 * 32 *
16 * 256K) = 1GB + 768MB + 256MB = 2GB
The calculation also changes when the number of NICs used for
data traffic varies across clients. For example, in the previous
example if the clients that mount only one file system each use
three NICs for data instead of two, here is the result:
Required memory = 1GB + (3 * 2 * 32 * 16 * 256K) + (1 * 3 * 32 *
16 * 256K) = 1GB + 768MB + 384MB = 2176MB
• Linux. For a Linux Distributed LAN Server, use the following formula:
Required memory = 1GB +
(# of file systems served
* # of NICs on the Distributed LAN Server used for
Distributed LAN traffic
* server buffer count
* transfer buffer size)
For example, consider a Linux Distributed LAN Server that has two
NICs used for Distributed LAN traffic, serves four file systems, and
uses the default eight server buffers and 256K per buffer. (See the
dpserver and sndpscfg man pages for information about viewing
and modifying Distributed LAN buffer settings on Linux.) For this
case:
Required memory = 1GB + (4 * 2 * 8 * 256K) = 1040MB
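The formulas above can also be expressed as simple helper functions. The sketch below is illustrative only (the function names are not part of StorNext); buffer sizes are given in KB and results in MB. For mixed configurations, sum one term per client group, as in the second Windows example above.

def windows_dlc_server_memory_mb(file_systems, nics_per_client, clients,
                                 transfer_buffers=16, buffer_kb=256):
    # 1GB base + (file systems * NICs per client * clients
    #             * transfer buffer count * transfer buffer size)
    return 1024 + (file_systems * nics_per_client * clients
                   * transfer_buffers * buffer_kb) // 1024

def linux_dlc_server_memory_mb(file_systems, server_nics,
                               server_buffers=8, buffer_kb=256):
    # 1GB base + (file systems * server NICs used for Distributed LAN traffic
    #             * server buffer count * transfer buffer size)
    return 1024 + (file_systems * server_nics * server_buffers * buffer_kb) // 1024

print(windows_dlc_server_memory_mb(4, 2, 64))   # 3072MB = 3GB
print(linux_dlc_server_memory_mb(4, 2))         # 1040MB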
Configuring LDAP
This section describes how to configure the StorNext LDAP functionality
and describes related features in the Windows configuration utilities.
Using LDAP StorNext 2.7 introduced support for Lightweight Directory Access Protocol, or
LDAP (RFC 2307). This feature allows customers to use Active Directory/
LDAP for mapping Windows User IDs (SIDs) to UNIX User ID/Group IDs.
UNIX File and Directory When a file or directory is created on Windows, the UNIX modes are
Modes controlled by the following file system configuration parameters:
UnixDirectoryCreationModeOnWindowsDefault 0755
UnixFileCreationModeOnWindowsDefault 0644
StorNext allows one set of values for all users of each file system.
LDAP Refresh Timeout Due to the implementation of the Windows Active Directory user
mappings, services for UNIX can take up to 10 minutes to be
propagated to StorNext clients.
by the system until the data structure is destroyed. Some locks that are
part of structures are seldom used, and exist for rare conditions. If the
lock is not used, the memory/event for that structure will never be
allocated.
Some data structures are not destroyed during the life of the FSM. These
include in-memory inodes, buffers, and others.
When the system starts, handle use is minimal. After the FSM has been
up for a while, the handle count increases as the inode and buffer cache
are used. After a while, the system stabilizes at some number of
handles. This occurs after all inodes and buffers have been used.
The maximum number of used handles can be reduced by shrinking the
inode and/or buffer cache. However, changing these variables could
significantly reduce system performance.
For FsBlockSize, the optimal settings for both performance and space
utilization are in the range of 16K to 64K.
Settings greater than 64K are not recommended because performance
will be adversely impacted due to inefficient metadata I/O operations.
Values less than 16K are not recommended in most scenarios because
startup and failover time may be adversely impacted.
Note: This is particularly true for slow CPU clock speed metadata
servers such as Sparc. However, values greater than 16K can
severely consume metadata space in cases where the file-to-
directory ratio is low (e.g., less than 100 to 1).
For metadata disk size, you must have a minimum of 25 GB, with more
space allocated depending on the number of files per directory and the
size of your file system.
The following table shows suggested FsBlockSize (FSB) settings and
metadata disk space based on the average number of files per directory
and file system size. The amount of disk space listed for metadata is in
addition to the 25 GB minimum amount. Use this table to determine the
setting for your configuration.
(Table columns: Average No. of Files Per Directory; File System Size: Less Than 10TB; File System Size: 10TB or Larger.)
JournalSize Setting
The optimal settings for JournalSize are in the range between 16M
and 64M, depending on the FsBlockSize. Avoid values greater than
64M due to potentially severe impacts on startup and failover times.
Values at the higher end of the 16M-64M range may improve
performance of metadata operations in some cases, although at the
cost of slower startup and failover time. New file systems must have a
journal size of at least 1024 times the fsBlockSize.
Note: In the Windows XML format configuration file, the journal size
parameter is called journalSize. Regardless of the difference
in parameter names (journalSize and JournalSize) used
in the Windows and UNIX configuration files, the requirements
are identical for Windows and UNIX systems.
FsBlockSize JournalSize
16KB 16MB
64KB 64MB
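As a reference, here is a minimal sketch of how a matched pair of these settings might appear among the global settings of a UNIX-format file system configuration file, using the same Name Value form as the ReservedSpace example later in this section; the exact value syntax should be confirmed against the cvfs_config(4) man page:
FsBlockSize 16K
JournalSize 16M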
Operating System / Affected Component    Description
All UNIX and Linux The swapon command does not work on StorNext file systems. The Linux/
Unix swapon command is used to specify devices on which paging and
swapping take place. If swapon is run on a StorNext file system, the
command fails with an invalid argument error.
Solaris The Solaris Security Toolkit, formerly known as JASS, causes the following
two issues:
• It disables RPC by renaming the RPC startup script, disrupting the
StorNext interprocess communication. To fix the communication
problem, rename the RPC startup script in /etc/init.d from rpc.<illegal
extension> to rpc.
• It turns on IPSec, causing numerous warning messages in the system log
file. Either disable IPSec by removing the IPSec startup file in /etc/init.d or
contact Sun Technical Support to find out how to reconfigure IPSec to
ignore local loopback connections.
Windows Windows Services for UNIX (SFU) supports only NTFS for NFS exports.
Because of this limitation, a Windows system cannot act as an NFS server
for StorNext File System.
Windows Virus-checking software can severely degrade the performance of any file
system, including SNFS. If you have anti-virus software running on a
Windows Server 2003 or Windows XP machine, Quantum recommends
configuring the software so that it does NOT check SNFS.
As of StorNext release 3.5 the Authentication tab has been removed from
the Windows Configuration utility. (For several previous StorNext releases a
message warned that this tab would be removed in an upcoming release:
“WARNING: Active Directory will be the only mapping method supported in
a future release. This dialog will be deprecated.”)
Windows If you are using the StorNext client software with Windows Server 2003,
Windows XP, or Windows Vista, turn off the Recycle Bin in the StorNext file
systems mapped on the Windows machine, so the file systems will work
properly.
You must disable the Recycle Bin for the directory on which a StorNext file
system is mounted. You must also be sure to disable the Recycle Bin on
directories you have remapped. For example, if you mount a file system on
E: (and disable the Recycle Bin for that directory) and then remap the file
system to F:, you must then disable the Recycle Bin on the F: directory.
For Windows 2003 or Windows XP:
1 On the Windows client machine, right-click the Recycle Bin icon
on the desktop and then click Properties.
2 Click Global.
3 Click Configure drives independently.
4 Click the Local Disk tab that corresponds to the mapped file
system.
5 Select the Do not move files to the Recycle Bin. Remove files
immediately when deleted check box.
6 Click Apply, and then click OK.
For Windows Vista:
1 On the Windows client machine, right-click the Recycle Bin icon
on the desktop and then click Properties.
2 Click the General tab.
3 Click the mapped drive that corresponds to the StorNext mapped file
system.
4 Select the Do not move files to the Recycle Bin. Remove files
immediately when deleted check box.
5 Click Apply, and then click OK.
All To avoid parser errors, do not use “up” or “down” when naming items in
the configuration file. This applies especially to naming affinities or any
other string-type keyword. (This restriction does not apply to the Windows
XML format configuration file.)
All Be aware of the following limitations regarding file systems and stripe
groups:
• The maximum number of disks per file system is 512
• The maximum number of disks per data stripe group is 128
• The maximum number of stripe groups per file system is 256
• The maximum number of tape drives is 256
The Move Stripe Group Data feature (part of Dynamic Resource Allocation)
does not support moving sparse files. Sparse files are files that lack on-disk
allocations for some of the data within the data range indicated by the size
of the file.
A stripe group is defragmented as part of the data moving process. Because
defragmenting a sparse file would make it unsparse (and increase disk
usage), snfsdefrag skips sparse files when defragmenting. Therefore, all
existing sparse files remain on the original stripe group after moving of
other files is complete.
All If you have configured custom mount options (other than rw and diskproxy) in the /etc/fstab file, and you subsequently add or remove the disk proxy settings using the StorNext GUI, any custom mount options will be lost.
(Settings are added or removed in the StorNext GUI by navigating to the
SNFS home page and then choosing Filesystems > Modify from the Config
menu.)
In StorNext 3.0, the default buffer cache settings have been modified.
Previously, all reads/writes that were 64K or smaller went through the buffer
cache while larger I/O requests went direct. In StorNext 3.0, reads/writes that
are 1MB or smaller go through the buffer cache, while larger I/O requests
go direct.
The new buffer cache settings may change the I/O behavior of some
applications. For example, on managed servers, I/O to and from tape now
goes through the buffer cache. To revert to the settings used in previous
releases, change the following mount options on StorNext clients:
auto_dma_read_length=65537
auto_dma_write_length=65537
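For example, a StorNext entry in /etc/fstab carrying these options might look like the following sketch; the file system name snfs1 and the mount point are placeholders, and any other mount options you already use should be kept:
snfs1  /stornext/snfs1  cvfs  rw,auto_dma_read_length=65537,auto_dma_write_length=65537  0  0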
As a result of log rolling changes in StorNext 3.0, logs are now rolled every 6
hours. For each log, 28 instances (7 days of logs) are retained. Log instances
are retained in the same directory as the original log.
All log files which are rolled are affected by this change, including TSM logs
(tac_00, bp_*.log, hist_01, etc.), MSM logs (tac_00, hist_01, etc.),
and any other components configured for rolling. The <component>/config/
filelist file contains roll_log entries that determine which files are rolled
(where <component> is /usr/adic/TSM, /usr/adic/MSM/, etc.).
The StorNext Library Space Used Report (accessible from the StorNext home
page by choosing Library Space from the Reports menu) shows the
amount of nearline space used.
The nearline space amount does not include dead space, but does include
the following:
• All used space on all media in all libraries except vaults
• All space used by files that were put on a storage disk or deduplicated
storage disk
All As of SNFS 2.7, a change was made to the way that the Reserved Extents
performance feature affects free space reporting. In the previous release,
SNFS would reserve a certain amount of disk space which would cause
applications to receive an out of space error before the disk capacity
reached 100%.
In the current release, this reserved space is treated as allocated space. This
allows applications to perform allocations until the file system is nearly full.
NOTE: Due to allocation rounding, applications may still receive a
premature out of space error, but only when there are just a few
megabytes of space remaining. In the worst case, the error will be returned
when the reported remaining space is:
(InodeExpandMax * #-of-data-stripe-groups)
One side effect of this change is that after creating a new file system, df
will show that space has been used, even though no user data has been
allocated.
The amount of reserved space varies according to client use but does not go
below a “floor” of a few gigabytes per data stripe group. The amount of
reserved space at any time can be seen using the cvadmin command,
selecting the file system, and using show long.
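As a sketch of that procedure, with snfs1 as a placeholder file system name, start cvadmin and enter the following at its prompt:
/usr/cvfs/bin/cvadmin
select snfs1
show long
The reserved space value is listed among the file system attributes in the show long output.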
While not recommended, the Reserved Extents feature can be disabled by
applying the following setting to the Globals section of the FSM
configuration file:
ReservedSpace No
This will cause the file system to not reserve space for buffered I/O, thereby
reducing buffer cache performance and possibly causing severe
fragmentation.
For more information, see The Reserved Space Parameter on page 219 and
the cvfs_config(4) man page.
complete until the copy to each of the three target file systems has been completely made.
If a replication source policy specified 10 for the "Number of copies to
keep on target" and specified 3 target file systems, you would eventually
have 30 replication directories: 3 after the first replication, 6 after the
second replication, etc.
Context 3: Storage Manager number of copies. Storage Manager stores
1 through 4 copies of a file. The number of copies is configured under the
Steering tab when editing or creating a Storage Manager storage policy.
(Actually, 4 is the default maximum number of copies. The number can
be larger than 4.) Each of these copies is a copy of the same file
contents.
Context 4: Storage Manager version. Versions refer to changed
contents of the same file. By default, Storage Manager keeps ten
versions of a file. Unlike Storage Manager copies, Storage Manager
versions refers to different file contents. If there is a file called "log" that
changes every day, and Storage Manager stores "log" every day, after
ten days there would be ten versions of "log". The fsversion
command is used to examine and change the Storage Manager versions
of a file.
Context 5: Storage Manager File Recovery. When a file is removed from
a Storage Manager relation point, the previous copies stored by Storage
Manager are still on media and in the SM database. These previous
versions may be recovered using the fsrecover command. There is no
limit to the number of SM instances which can be recovered in this
manner. Eventually the administrator may use the fsclean command
to clean up older versions of SM media. After running fsclean, files
that used to reside on the media can no longer be recovered with the
fsrecover command.
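As a sketch of that recovery workflow, with /stornext/snfs1/photos/log as a hypothetical managed file (the invocations below are assumptions; confirm the exact options in the fsversion and fsrecover man pages before use):
fsversion /stornext/snfs1/photos/log
fsrecover /stornext/snfs1/photos/log
The first command examines the Storage Manager versions of the file; the second recovers a removed instance from the copies still on media.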
Number of Replication Copies
When a source directory is replicated to a target there can be from 1 through 16 replicated target directories that reflect replications of the source at different times. The number of copies is specified by the "Copies to Keep on Target" parameter on the Inbound Replication tab or Outbound Replication tab. You enter parameters on these tabs when configuring a snpolicyd storage policy.
The "Copies to Keep on Target" selection allows values of 1 through 16,
and also a special case called in-place. We will not discuss the in-place
selection in this section.
First, let’s consider the case where "Copies to Keep on Target" is 2. Each
time a replication occurs a new target directory is created. This target
directory might have the same name as the previous target directory,
but it is actually a new directory. The new directory reflects files added,
deleted, and changed since the previous replication.
It is important to understand that in this example the target is a new
directory. This has implications that might not be immediately obvious.
For one thing, it means we cannot use the target directory in exactly the
same way as we might use the source directory. Following is an
explanation and examples.
Isolating a Replication Target Directory
To isolate a replication target directory, use the snpolicy command’s -exportrepcopy option. This operation is available only from the command line, not from the StorNext GUI.
First, use the -listrepcopies option on the target node to determine
the association between the target copy number and the target
directory to use. The -listrepcopies output provides the "key" value
for the policy used to implement this replication. For example, if the
target file system is /snfs/rep, use the command:
/usr/cvfs/bin/snpolicy -listrepcopies=/snfs/rep
Here is the relevant part of the command output:
source://snfs/[email protected]:/project?key=402 ->
target://snfs/rep@node2:?key=406
0 -> /snfs/rep/project
1 -> /snfs/rep/project.1
2 -> /snfs/rep/project.2
3 -> /snfs/rep/project.3
The copy number appears in the left column, and the realization
directory for that copy number is shown after the "->".
There are two "keys" shown in the sample output. The first key (402) is
the key value for the source policy controlling the replication. The
second key value (406) controls the replication realization of the target.
Let's say you want to copy files back from /snfs/rep/project.2. To
isolate /snfs/rep/project.2 you would use this command:
/usr/cvfs/bin/snpolicy -exportrepcopy=/snfs/rep/ -key=406 -copy=2 -path /snfs/rep/project_temp
This command renames the directory /snfs/rep/project.2 to /
snfs/rep/project_temp and prevents the policy daemon from
affecting this directory, in case replications for this target policy become
activated again during the recovery process.
The -path argument is optional: you can do only the exportrepcopy
operation and use the directory name /snfs/rep/project.2 when
recovering replicated files.
The point of this is that using the -exportrepcopy option allows you
to use the directory without having to worry about it changing, or files
disappearing as you do your work.
Once a directory has been isolated in this manner, it can then be
transformed into a replication source directory for rereplication to
another file system and/or machine.
Final Recommendation For Target Directories
You should not change the contents of a replication target directory. It should be treated as a "read-only" replica, even though StorNext does not enforce a read-only restriction.
If you change a file in a replication target directory you may be
changing the file contents in other target directories due to the "hard-
link" usage in replication. Furthermore, if you change or add files in a
directory, that directory may disappear due to subsequent replications.
(Using exportrepcopy avoids this second issue.)
Configurable via the StorNext GUI?
Storage Manager: Yes. Select the Storage Policy menu’s Storage Manager option.
Replication/Deduplication (snpolicyd): Yes. Select the Storage Policy menu’s Replication / Deduplication option.
Configurable via the command line?
Storage Manager: Yes. Use fs commands such as fsaddclass and fsmodclass.
Replication/Deduplication (snpolicyd): Yes. Use the snpolicy command.
Where are policy internals stored?
Storage Manager: In the Storage Manager database. One database per machine.
Replication/Deduplication (snpolicyd): In the managed file system, in a private directory.
Is the policy used across file systems?
Storage Manager: Yes. One policy can be used in multiple directories and multiple file systems.
Replication/Deduplication (snpolicyd): No. Policies apply to one file system, but can be applied to multiple directories.
How are truncated files retrieved?
Storage Manager: The entire file must be retrieved.
Replication/Deduplication (snpolicyd): Only portions of the file containing needed regions may be retrieved.
Previous file versions recoverable?
Storage Manager: Yes. Recover previous tape versions with the fsrecover command. Up to 10 tape versions.
Replication/Deduplication (snpolicyd): Yes. Previous replicated copies can be kept in previous replication directories. Up to 16.
Example
You create an snpolicyd policy with the StorNext GUI or with the
snpolicy command. The snpolicy command is in directory /usr/
cvfs/bin. Command line configuration must be done by the Linux
root user.
Suppose you create directory /stornext/snfs1/photos in file
system /stornext/snfs1 on machine host1. You then use the
StorNext GUI to create a replication policy named photo_rep to
replicate this directory to file system /stornext/backup on machine
host2. (As in the previous example, the policy was configured to keep
two copies on the target.)
Now use the snpolicy command to see more internal details about
the policy called photo_rep.
Use this command:
/usr/cvfs/bin/snpolicy -dumppol /stornext/snfs1/photos
The command's output looks like this:
inherit=photo_rep
key=1720399
root=/stornext/snfs1/photos
dedup=off
dedup_filter=off
max_seg_size=1G
max_seg_age=5m
dedup_age=1m
dedup_min_size=4K
dedup_seg_size=1G
dedup_min_round=8M
dedup_max_round=256M
dedup_bfst="localhost"
fencepost_gap=16M
trunc=off
trunc_age=365d
trunc_low_water=0
trunc_high_water=0
rep_output=true
rep_report=true
rep_target="target://stornext/backup@host2:"
rep_copies=2
There is a lot of output, most of which we don’t have to consider now.
Some of the important values are:
• inherit=photo_rep: This means the policy controlling this
directory receives its parameters from the policy named
photo_rep. Remember, when you create a policy you give it a
name, and the policy name belongs to the file system. There could
be a different policy named photo_rep in a different file system,
and there would be no connection between the two photo_rep
policies.
• rep_output=true: This means the policy is a source of replication.
• rep_copies=2: This means you want to keep two copies
(instances) of the replicated directory on the target file system.
• rep_target="target://stornext/backup@host2:": This
tells you the replication target directory is a directory in file system
/stornext/backup on machine host2. But which directory name
will be used in that file system? Since you did not specify anything
else, the source directory name will be used. In this case the source
directory name in the source file system is photos, so the target
directory names will be /stornext/backup/photos and
/stornext/backup/photos.1.
• dedup=off: This means the files in this directory are not
deduplicated before being replicated. Deduplication and replication
are discussed in another section.
One comment about a field not in the command output. Since there is
no line for rep_input=true, this means this directory is not a
replication target directory. This is not surprising. While it is true that a
replication target can also be a replication source, that is an advanced
case not covered here.
Digression
Following is some additional detail which you may want to skip the first
time you read this section.
Below is output from the command ls -l /stornext/backup/
.rep_private:
total 144
drwx------ 19 root root 2057 Jan 26 10:12
00047DA110919C87
drwx------ 3 root root 2054 Jan 26 10:12 config
drwx------ 3 root root 2056 Jan 25 14:11 oldest
drwx------ 3 root root 2116 Jan 26 10:13 pending
drwx------ 3 root root 2132 Jan 26 10:13 queued
drwx------ 2 root root 2048 Jan 21 16:56 source_state
drwx------ 3 root root 2048 Jan 20 17:13 target
drwx------ 2 root root 2116 Jan 26 10:13 target_state
Second Replication
The "standard" case - when we have replicated files once - is that the link
count for the target file will be two.
Now let's say that we add file4 and file5 to the source directory and
replicate again. After the second replication, target directory
/stornext/backup/photos contains the following:
total 6864
-rwxr-xr-x 3 testuser root 1388936 Jan 26 10:11 file1
-rw-r--r-- 3 testuser root 1430896 Jan 26 10:11 file2
-rw-r--r-- 3 testuser root 1397888 Jan 26 10:12 file3
-rwxr-xr-x 2 testuser root 1388994 Jan 26 11:02 file4
-rwxr-xr-x 2 testuser root 1388965 Jan 26 11:03 file5
Target directory /stornext/backup/photos.1 contains the previous
replication:
total 4144
-rwxr-xr-x 3 testuser root 1388936 Jan 26 10:11 file1
-rw-r--r-- 3 testuser root 1430896 Jan 26 10:11 file2
-rw-r--r-- 3 testuser root 1397888 Jan 26 10:12 file3
Notice that file1, file2, and file3 each have a link count of 3. One
link (name) is in directory photos, another link is in directory
photos.1, and the third is the snpolicyd "internal" link in the
.rep_private directory. The two new files, file4 and file5, appear
only in the new directory and in the .rep_private directory. They
have a link count of 2.
Since file1, file2, and file3 are really the same file in directories
photos and photos.1, no extra disk storage is needed for these files
when replicating again. In general, when you use replication with more
than one copy retained on the target, no additional storage is needed
for unchanged files. If a file is changed, both the old and the new
version are retained, so additional storage is needed in this case.
(Unless deduplication is also used, which is discussed later.)
Now let's make two changes. Say we remove file4 in the source
directory and modify file2. After the next replication, target directory
photos contains the following:
total 5200
-rwxr-xr-x 3 testuser root 1388936 Jan 26 10:11 file1
-rw-r--r-- 2 testuser root 1123155 Jan 26 11:20 file2
the file system mount point as the source directory name is relative to
the source file system mount point.
Examples
Suppose you apply a replication policy to directory /stornext/
snfs1/a/b/c/d/photos in file system /stornext/snfs1, and
replicate to file system /stornext/backup. The default target
replication directory name would be /stornext/backup/a/b/c/d/
photos, and previous replication directories would be /stornext/
backup/a/b/c/d/photos.1, etc.
There are other options that can be specified on either the source policy
or on the target policy. Since we have been concentrating on the source
policy, following are examples of changes there.
When creating or editing a policy, specify the alternative path names in
the area of the screen labeled Pathname on Target on the Outbound
Replication tab. When you click the Override label, a field appears
where you can type some text. Some hints appear above that field,
showing special entry values such as %P and %D.
In all of the following examples, assume that the replication source
directory is /stornext/snfs1/photos/ocean, that is, directory photos/ocean relative to the source file system /stornext/snfs1. For this
example we will replicate to file system /stornext/backup. We know
that if we do not override the "Pathname on Target" value, the
replication target directory name will be /stornext/backup/
photos/ocean.
• If you enter a string without any of the "%" formatting characters,
the replication directory will be the name you specify. For example,
if you specify open/sesame for Pathname on Target, the replication
directory would be /stornext/backup/open/sesame.
• %P means source pathname relative to the source file system. For
example, if you specify open/sesame/%P for Pathname on Target,
the replication directory would be /stornext/backup/open/
sesame/photos/ocean.
• %D means today’s date. %T means the replication time. For
example, if you specify %D/%T/%P for Pathname on Target, the
replication directory would be /stornext/backup/2010-02-02/
16_30_22/photos/ocean (on February 2, 2010).
Deduplication Overview
Here is the view from 100,000 feet. When StorNext deduplication is
enabled, a file is examined and logically split into data segments called
BLOBs (binary large objects). Each BLOB has a 128-bit BLOB tag. A file
can be reconstructed from the list of BLOBs that make up a file. The data
for each BLOB is stored in the blockpool for a machine. We can use the
command snpolicy -report file_pathname to see the list of
BLOB tags for a deduplicated file.
When a deduplicated file is replicated, the BLOBs are replicated from the
blockpool on the source machine to the blockpool on the target
machine. If the source file system and the target file system are both
hosted on the same machine, no data movement is needed. If the same
BLOB tag occurs several times (in one file or in many files) only one copy
of the data BLOB exists in the blockpool. During replication that one
copy must be copied to the target blockpool only once.
This is why deduplicated replication can be more efficient than non-
deduplicated replication. With non-deduplicated replication, any
change in a file requires that the entire file be recopied from the source
to the target. And, if the data is mostly the same in several files (or
Enabling Deduplication
When creating or editing a policy through the StorNext GUI, select the Deduplication tab and make sure deduplication is enabled (On). If you use the snpolicy -dumppol option, you will see dedup=on in the output when the policy has deduplication enabled.
Deduplication Modification Time
Note that in the "snpolicy -dumppol" output shown earlier we also saw dedup_age=1m. This means the file may be deduplicated after it has not changed for at least one minute. If a file is being written, its file modification time (mtime) will be updated as the file is being written. Deduplication age specifies how far in the past the modification time must be before a file can be considered for deduplication.
There are no blocks in any of the three files, although each file retains its
correct size.
(As an exercise, in the previous "ls -l" and "ls -ls" examples, what
does the line that says "total some_number" tell us?)
When an application or command accesses any of the data in a
truncated file, StorNext retrieves the data it needs from the blockpool.
This may be the entire file for a small file. For a larger file, a portion of
the file would be retrieved: a portion at least large enough to contain
the file region required. If you read the entire file, the entire file will be
retrieved.
Truncation provides the mechanism by which file system storage space
may be reduced. When a file is truncated it takes no space in its file
system, but space for its BLOBs is required in the blockpool. If we receive
deduplication benefit (that is, if the same BLOB data occurs in more
than one place), then we have less space used in the blockpool than
would be in the original file system.
Enabling Deduplication and Truncation
In order to enable truncation, both deduplication and truncation must be enabled in the storage policy. The StorNext GUI contains tabs for both deduplication and truncation which allow you to enable deduplication and truncation respectively.
Before a file is truncated it must pass a "Minimum Idle Time Before
Truncation" test. If this minimum age is ten minutes, then ten minutes
must elapse after the last file modification or file read before truncation
can occur. The default value for the minimum idle time is 365 days.
In the output from "snpolicy -dumppol" the parameters we have
been discussing are displayed like this:
trunc=on
trunc_age=365d
Storage Manager Truncation
Storage Manager also truncates files. Storage Manager truncation is similar to but not identical with the deduplication-based truncation we have been discussing. Storage Manager truncation will be discussed again when we consider deduplication / replication with Storage Manager.
Replicating into a Storage Manager Relation Point
To replicate into a relation point, specify a target directory underneath a Storage Manager relation point. Do this with the parameter "Pathname on Target" in the StorNext GUI, or with rep_realize=… when configuring a policy with the snpolicy command.
Example
Suppose we are replicating to file system /stornext/backups on a
target machine, and /stornext/backups/sm1 is a Storage Manager
relation point in that file system.
Some possible choices for "Pathname on Target" would be:
• sm1/%P
• sm1/mystuff
• sm1/%H/%P
You shouldn’t specify something like /stornext/backups/sm1/
mystuff because "Pathname on Target" is relative to the target file
system mount point, which in this case is /stornext/backups.
If "Copies to Keep on Target" is more than 1, the rules discussed earlier
determine the names for the directories in subsequent replications.
Example
If we replicate the source directory named photos into a relation point
using the "Pathname on Target" sm1/%P, we end up with directories like
/stornext/backups/sm1/photos, /stornext/backups/sm1/
photos.1 and so on for the replicated directories when we are keeping
more than one copy on the target.
The directories photos and photos.1 are in the SM relation point.
Let's say we have the two directories photos and photos.1 with the
contents that we discussed earlier.
Target directory /stornext/backups/sm1/photos contains the
following:
-rwxr-xr-x 3 testuser root 1388936 Jan 26 10:11 file1
-rw-r--r-- 2 testuser root 1123155 Jan 27 11:20 file2
-rw-r--r-- 3 testuser root 1397888 Jan 26 10:12 file3
-rwxr-xr-x 3 testuser root 1388965 Jan 26 11:03 file5
Target directory /stornext/backups/sm1/photos.1 contains the
following:
-rwxr-xr-x 3 testuser root 1388936 Jan 26 10:11 file1
-rw-r--r-- 1 testuser root 1430896 Jan 26 10:11 file2
-rw-r--r-- 3 testuser root 1397888 Jan 26 10:12 file3
-rwxr-xr-x 2 testuser root 1388994 Jan 26 11:02 file4
-rwxr-xr-x 3 testuser root 1388965 Jan 26 11:03 file5
Question: Will Storage Manager store all the files in photos after the
most recent replication? The answer is no. In this example, file2 is a
file that was modified since the previous replication. Thus file2 is the
only file that will be stored by Storage Manager after the most recent
replication.
When replication occurs we create store candidates for the new or
changed files that were included in the most recent replication within a
relation point. In this example, only file2 will be a store candidate
after the latest replication. You can use the showc command to see the
new Storage Manager store candidates after a replication.
Note: Even if you created a store candidate for every file in the
replicated target directory, only the new or changed files would
be stored by SM. This is because the other files are links to files
that have already been stored by Storage Manager, or at least
files that were already on the Storage Manager store
candidates list.
Example
In the earlier example we saw this output (for a truncated file) after
running "ls -ls":
0 -rwxr-xr-x 3 testuser root 1388936 Jan 26 10:11 file1
For an untruncated file, the "ls -ls" output might look something like
this:
1360 -rwxr-xr-x 3 testuser root 1388936 Jan 26 10:11 file1
The 1360 blocks in this file are enough to contain a file of size 1388936
(since 1360 * 1024 = 1392640). However, we might also see a blocks
value that was non-zero but not enough to contain the entire file size.
This might indicate the following:
• A sparse file (this will not be discussed here)
• A file with a stub left on the disk
• A file that had been partially retrieved
Example
Suppose you have a 100 GB file that is truncated. If a process reads a
few bytes (at the front or even in the middle of the file), several
megabytes of file data are retrieved from the blockpool and the process
continues. There is no need for the entire file to be retrieved. If more of
the file is read, even larger chunks of the file are retrieved.
You can see the snpolicyd state of a file by using the command
"snpolicy -report".
Example
Running the command snpolicy -report /stornext/sn1/dd1/
kcm2 gives us output similar to this:
/stornext/sn1/dd1/kcm2
Deduplication Without Replication: SM can truncate when deduplication has happened; snpolicyd can truncate after deduplication.
The following sections summarize some of the facts above (and add
some more information) in a "usage case" or “scenario” format.
The StorNext High Availability (HA) feature allows you to configure and
operate a redundant server that can quickly assume control of the
StorNext file systems and management data in the event of certain
software, hardware and network failures on the primary server.
This appendix contains the following topics which provide an in-depth
look at HA systems and operation:
• HA Overview
• HA Internals: HAmon Timers and the ARB Protocol
• Configuration and Conversion to HA
• Managing HA in the StorNext GUI
• HA Operation
• HA Resets
• HA Tracing and Log Files
• FSM failover in HA Environments
• Replacing an HA System
HA Overview
The primary advantage of an HA system is file system availability,
because an HA configuration has redundant servers. During operation,
if one server fails, failover occurs automatically and operations are
resumed on its peer server.
At any point in time, only one of the two servers is allowed to control
and update StorNext metadata and databases. The HA feature enforces
this rule by monitoring for conditions that might allow conflicts of
control that could lead to data corruption.
Before this so-called Split Brain Scenario would occur, the failing server
is reset at the hardware level, which causes it to immediately relinquish
all control. The redundant server is able to take control without any risk
of split-brain data corruption. The HA feature provides this protection
without requiring special hardware, and HA resets occur only when
necessary according to HA protection rules.
Arbitration block (ARB) updates by the controlling server for a file
system provide the most basic level of communication between the HA
servers. If updates stop, the controlling server must relinquish control
within a fixed amount of time. The server is reset automatically if control
has not been released within that time limit.
Starting after the last-observed update of the ARB, the redundant server
can assume control safely by waiting the prescribed amount of time. In
addition, the ARB has a protocol that ensures that only one server takes
control, and the updates of the ARB are the method of keeping control.
So, the ARB method of control and the HA method of ensuring release
of control combine to protect file system metadata from uncontrolled
updates.
Management data protection builds on the same basic HA mechanism
through the functions of the special shared file system, which contains
all the management data needing protection. To avoid an HA reset
when relinquishing control, the shared file system must be unmounted
within the fixed-time window after the last update of the ARB.
Management data is protected against control conflicts because it
cannot be accessed after the file system is unmounted. When the file
system is not unmounted within the time window, the automatic HA
reset relinquishes all control immediately.
updated. If the brand is not being updated or if the usurping FSM has
more votes than the current controlling FSM has connections, the
usurper writes its own brand in the ARB. The FSM then watches the
brand for a period of time to see if another FSM overwrites it. The
currently active FSM being usurped, if any, will exit if it reads a brand
other than its own (checked before write). If the brand stays, the FSM
begins a thread to maintain the brand on a regular period, and then the
FSM continues the process of activation.
At this point the usurping FSM has not modified any metadata other
than the ARB. This is where the HAmon timer interval has its effect. The
FSM waits until the interval period plus a small delta expires. The period
began when the FSM branded the ARB. The FSM continues to maintain
the brand during the delay so that another FSM cannot usurp before
activation has completed. The connection count in the ARB is set to a
very high value to block a competing usurpation during the activation
process.
When an FSM stops, it attempts to quiesce metadata writes. When
successful, it includes an indicator in its final ARB brand that tells the
next activating FSM that the file system stopped safely so the wait for
the HA timer interval can be skipped.
staff. It affects all the monitored FSMs and could add a significant delay
to the activation process. Quantum Software Engineering would like to
be notified of any long-term need for a non-default timer interval.
For very long HAmon interval values, there are likely to be re-elections
while an activating FSM waits for the time to pass before completing
activation. An additional usurpation attempt would fail because the ARB
brand is being maintained and the connection count is set to a value
that blocks additional usurpation attempts.
The optional configuration of this feature is in the following file:
<cvfs root>/config/ha_smith_interval
The information at the start of the file is as follows:
ha_smith_interval=<integer>
The file is read once when StorNext starts. The integer value for the
HAmon timer interval is expressed in seconds. The value can range from
3 to 1000, and the default is 5 seconds. The timer must be set identically
on both servers. This rule is checked on a server that has standby FSMs
when a server that has active FSMs communicates its timer value. When
there is a discrepancy, all the FSMs on the receiving end of that
communication are stopped and prevented from starting until StorNext
has been restarted. This status can be observed with the cvadmin tool in
the output of its FSMlist command.
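For example, to use a 10-second interval (a value chosen only for illustration), you could create the file on both servers with the documented one-line format, assuming the default <cvfs root> of /usr/cvfs, and then restart StorNext so the value is read:
echo 'ha_smith_interval=10' > /usr/cvfs/config/ha_smith_interval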
In almost all cases of misconfigured timers, the mistake will be obvious
shortly after starting the HA cluster’s second server. The first server to
start StorNext will activate all of its FSMs. The second server should have
only standby FSMs. Once the second server detects the error, all of its
FSMs will stop. After this, there will be no standby FSMs, so the cluster
is protected against split-brain scenario. In the event that a server with
active FSMs resets for any reason, that server will have to reboot and
restart StorNext to provide started FSMs to serve the file systems.
peer FSMPM on the other server to ask which of these standby FSMs are
not being activated. Implicit in the response is a promise not to activate
the FSMs for two seconds. When the response is received within one
second, the first FSMPM resets the timers for those FSMs for which
usurpation is not in progress. Obviously, both server computers must be
up and running StorNext for this to function.
This can postpone the impending HA reset for a while, but an election
could occur if this goes on too long. It is important to quickly
investigate the root cause of SAN or LUN delays and then engineer them
out of the system as soon as possible.
Primary and Secondary Server Status
Databases and management data for StorNext Storage Manager or the Linux GUI must also be protected against split-brain scenario corruption. Protection is accomplished by tying the startup of processes that modify this data with the activation of the shared file system.
Activating the shared file system leads to setting a Primary status in the
local FSMPM, which is read and displayed by the snhamgr command.
Primary status and the implicit Secondary status of the peer server are
distinct from the Active and Standby status of the individual FSMs on
the servers.
Unmanaged file systems can be active on either server. When an HA
Cluster has no managed file systems and no shared file system, neither
server computer has Primary status; they are equals.
File System Types
HA is turned on by default for all StorNext distributions, but has no effect unless FSMs request to be monitored. File system monitoring is controlled by a file-system configuration item named HaFsType. Each file system is one of three types: HaUnmanaged, HaManaged or HaShared. The HaFsType value is read by FSMs to direct them to set up appropriate HAmon behaviors, and it is read by the FSMPM to control how it starts FSMs.
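As a minimal sketch, assuming the same Name Value form used by other global parameters in this guide (for example, ReservedSpace No), the parameter appears as a single line in the file system configuration, such as:
HaFsType HaUnmanaged
In the Windows XML configuration format the parameter spelling and placement may differ; consult the file system configuration documentation for the exact syntax.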
HaUnmanaged
Each unmanaged-file-system FSM starts an instance of the HAmon timer
for itself when it first brands the ARB. Before it changes any metadata,
an activating FSM waits for the timer interval plus a small amount of
time to elapse. The interval for a usurping FSM begins with the last time
the FSM reads new data in the ARB from a previously active FSM.
Unmanaged FSMs can be active on either server in the HA Cluster. They
can be usurped and fail over without a system reset if they exit before
the timer expires. The timer interval for an active FSM restarts with each
update of the ARB.
HaManaged
Managed-file-system FSMs do not start HAmon timers, and they do not
wait the HAmon interval when usurping. The FSMPMs only start
Managed FSMs on the Primary server, so there is no risk of split-brain
scenario. In the event that a Managed FSM exits without having been
stopped by the FSMPM, it is automatically restarted after a ten-second
delay and activated. The cvadmin tool's FSMlist command displays the
blocked FSMs on non-Primary servers. There can be zero or more
HaManaged file systems configured.
HaShared
The shared file system is an unmanaged StorNext file system that plays a
controlling role in protecting shared resources. It has the same HA
behavior as other unmanaged FSMs, but it also sets a flag that triggers
an HA reset when the cvfsioctl device is closed. This happens when the
process exits for any reason. However, if the shared file system has been
unmounted from the active server before the FSM exits, the reset-on-
close flag gets turned off. This allows ordinary shutdown of CVFS and
Linux without resetting the server.
When the HaShared FSM finishes activation, it sets the Primary status in
its FSMPM process.
Protected shared data resides on the shared file system. Since only one
FSM can activate at one time, the Primary status is able to limit the
starting of management processes to a single server, which protects the
data against split-brain scenario.
The starting of HaManaged FSMs is also tied to Primary status, which
guarantees collocation of the managed file-system FSMs and the
management processes. The GUI's data is also shared, and the GUI must
be able to manipulate configuration and operational data, which
requires that it be collocated with the management processes.
The ha_peer and fsnameservers File
StorNext HA server software uses peer-to-peer communication between servers and needs to know the peer's IP address. The fsnameservers configuration file is not a good source for the address because some installations configure the nameservers outside of the metadata servers. Instead, the following file provides that information:
<cvfs root>/config/ha_peer
Following are the uses of the peer IP address:
• Negotiating timer resets
• Comparing the HAmon timer value between servers
• HA Manager communications (only on StorNext Storage Manager
for Linux)
It is very important to have correct information in the ha_peer file, but
it is not a requirement that the peer be available for communication.
Basic HA functionality operates correctly without IP communication
between peers. The file's contents can be changed without restarting
StorNext. The new value will be read and used moments after it has
changed.
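For example, if the peer MDC's address were 10.0.0.2 (a placeholder), and assuming the file holds just the peer address on a single line with <cvfs root> at the default /usr/cvfs, the file could be populated like this:
echo 10.0.0.2 > /usr/cvfs/config/ha_peer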
Here are some other points to consider about fsnameservers:
• For best practice, the fsnameservers file should contain IP
addresses, not names.
• All the addresses in the file must be reachable by all members of the
StorNext cluster. That is, servers, clients and distributed LAN clients.
• All members of the cluster should have the same nameservers
configuration.
• Both or neither of an HA Cluster’s MDCs must be included so that a
coordinator is always available when either server is running.
• Multiple StorNext Clusters can share coordinators, but every file
system name configured on any of the clusters must be unique
across all of the clusters.
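As a sketch, an HA cluster whose MDCs use the placeholder addresses 10.0.0.1 and 10.0.0.2 would list both, one address per line, in /usr/cvfs/config/fsnameservers on every member of the cluster:
10.0.0.1
10.0.0.2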
HA Manager
The HA Manager subsystem collects and reports the operating status of an HA cluster and uses that to control operations. It is part of a Storage Manager installation that has been converted to HA with the cnvt2ha.sh script. For manually-configured HA clusters where the cnvt2ha.sh script has not been run, the command-line interface (HA CLI) reports a default state that allows non-HA and File System Only HA configurations to operate.
The HA Manager supports non-default HA Cluster functionality such as
suspending HA monitoring during administrative tasks. It attempts to
communicate with its peer at every decision point, so it is mostly
stateless and functions correctly regardless of what transpires between
decision points. Following every command, the snhamgr command line
interface reports the modes and statuses of both servers in the cluster,
which provide necessary information for the StorNext control scripts.
• single-locked
• config-peerdown
• config-locked
• locked-*
The following states are prohibited and prevented from occurring by the
HA Manager, unless there is improper tampering. For example, the last
state listed below (peerdown-*), is the case when a node that is
designated as peerdown begins communicating with its peer. If any of
these is discovered by the HA Manager, it will take action to move the
cluster to a valid state, which may trigger an HA reset.
• single-default
• single-single
• single-config
• config-default
• config-single
• config-config
• peerdown-*
HA Manager Components
The following files and processes are some of the components of the HA
Manager Subsystem:
• snhamgr_daemon: If the cnvt2ha.sh script has been run, this
daemon is started after system boot and before StorNext, and
immediately attempts to communicate with its peer. It is stopped
after StorNext when Linux is shutting down. Otherwise, it should
always be running. A watcher process attempts to restart it when it
stops abnormally. Its status can be checked with 'service snhamgr
status'. It can be restarted with 'service snhamgr start' or 'service
snhamgr restart' if it is malfunctioning.
• snhamgr: CLI that communicates with the daemon to deliver
commands and report status, or to report a default status when the
cnvt2ha.sh script has not been run. This is the interface for
StorNext control scripts to regulate component starts.
HA Manager Operation
In addition to the setting of modes, there are some commands provided
by the HA Manager to simplify and automate the operation of an HA
Cluster.
1 status: Report cluster status. All commands report cluster status
on completion. This command does nothing else unless an invalid
cluster state is detected, in which case it will take steps to correct
the cluster state.
2 stop: Transition the non-Primary server to locked mode, which will
stop StorNext if it is running. Then transition the Primary server to
config mode, which will turn off HA monitoring. Stop StorNext on
the Primary server. Finally, transition both servers to default mode.
3 start: If either MDC is in config or single mode, transition it to
default mode (CVFS stops when transitioning from config mode). If
either MDC is in locked mode, transition it to default mode. If the
remote MDC is in peerdown mode, then run peerup. If the local
MDC is stopped, run 'service cvfs start'. If the remote MDC is
The existence of that file enables the running of the snhamgr service,
which starts the HA Manager daemon.
Before the conversion script is run on the secondary, the following file
must be copied from the Primary:
/usr/cvfs/config/fsnameservers
The arguments to the conversion command for the secondary server are
as follows:
/usr/adic/util/cnvt2ha.sh secondary <sharedfs name>
<peer IP address>
This gives the redundant peer server enough information to access the
shared file system as a client. It then copies the mirrored configuration
files into its own configuration directory and sets up the ha_peer file.
The database and management-components configuration files are
rerouted to the /usr/adic/HAM/shared shared file system mount
point. Finally, the .SNSM_ha_configured touch file is created, and
StorNext is restarted.
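For example, if the shared file system were named snfs_shared and the primary MDC's IP address were 10.0.0.1 (both placeholder values), the secondary conversion command would be:
/usr/adic/util/cnvt2ha.sh secondary snfs_shared 10.0.0.1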
SyncHA process
Before the shared file system is up, some configuration files must be available. These are initially “mirrored” to the secondary server by the cnvt2ha.sh script, and then maintained across the two server computers by the syncHA process, which is run once per minute from cron. On the Primary, the command stats the mirrored files to see what has changed, and copies these out to the /usr/adic/HAM/shared/mirror folder. On the secondary server, the files that have changed are copied in. The list of mirrored files is defined in the /usr/cvfs/config/filelist and /usr/adic/gui/config/filelist tables as follows.
In the /usr/cvfs/config directory:
• license.dat
• fsmlist
• fsnameservers
• fsroutes
• fsports
• *.cfg
• *.cfgx
• *.opt
• nss_cctl.xml
• snpolicyd.conf
• blockpool_settings.txt
• blockpool_root
• blockpool_config.tmpl
• blockpool_config.txt
• bp_settings
In the /usr/adic/gui directory:
• database/*
• config/host_port.conf
The Manage option also enables you to perform the following HA-
related actions:
Enter Config Mode: Sets the peer (secondary) node to locked mode and sets the local (primary) node to config mode for administration purposes. The locked mode stops CVFS on the peer, and is designed for automated short-duration stops of the secondary server to make configuration changes and other modifications. This allows the HA Manager to prevent HA resets while making configuration changes or stopping the primary server.
Note: In the event that TCP communication to the secondary server is lost for any reason, the primary server assumes the secondary server is in default mode and transitions the local server out of config mode. For this reason, the locked mode is not appropriate to use for extended secondary-server outages, activities that might include reboots of the secondary server, etc. Best practice is to use Peerdown mode when a server is turned off for an extended period, or to simply keep the primary server in default mode while the secondary server is brought in and out of service in short durations.
1 Click Enter Config Mode.
2 When the confirmation message appears, click Yes to proceed or No to abort.
3 Click OK when a message informs you that the cluster was locked.
Exit Config Mode: Starts both nodes of the HA cluster in default mode.
1 Click Exit Config Mode.
2 When the confirmation message appears, click Yes to proceed or No to abort.
3 Click OK when a message informs you that the cluster was unlocked.
HA Operation
Most of the information in this section is in regard to GUI-supported
configurations of StorNext on Linux servers; that is, those installations
having an HaShared FSM. There is very little difference for File System-
Windows and Linux SNFS Installations Without the HaShared File System
HA monitoring is turned on by default when FSM configurations include the HaFsType configuration parameter. There is no need to disable HA in almost all cases. The only mechanism for turning it off is to remove the configuration parameter, but this should be done only after the redundant server has been turned off.
Linux SNMS and SNFS Installations with the HaShared File System
The HaShared file system is required for SNMS and GUI-supported installations. The shared file system holds operational information for those components, which must be protected against split-brain corruption. The additional complexity this entails is simplified and automated by the HA Manager Subsystem.
The cvfs script (indirectly) starts the DSM_control script, which starts the
FSMPM, waits for it, and then repeatedly attempts to mount all of the
cvfs type file systems. The FSMPM reads the FSM configuration files and
the fsmlist file. It starts the HaShared and HaUnmanaged FSMs in the
fsmlist, but delays starting the HaManaged FSMs. The sub state of the
delayed FSMs can be displayed with the fsmlist command in the
command, which stops the cluster using the stop command, then starts
the local server to become Primary, followed by starting the Secondary
server:
snhamgr start
StorNext HA also has the ability to stop a Primary server while it is in
default mode without incurring an HA reset in most cases. It does this as
follows:
1 Stop Storage Manager processes, including the database
2 Unmount all CVFS file systems on the local server other than the
HaShared file system
3 Stop all FSMs on the local server other than the HaShared FSM
4 Unmount the HaShared file system
5 Stop the FSMPM
6 Stop the HaShared FSM
FSMs are elected and activate on the peer server as they are stopped on
the local server.
An HA reset can occur if step 4 fails. (That is, if the HaShared file system
cannot be unmounted for any reason.) This is the method for protecting
Storage Manager management data against split-brain-scenario
corruption. All of the at-risk data is contained on the shared file system,
so the unmount operation ensures that the local server cannot modify
the data.
• /etc/fstab
Non-production Operation
There is a method for starting the SNFS file systems without starting the
Storage Manager management components in the rare case that this is
needed. The following two commands accomplish the same goal:
• adic_control startonly snfs
• DSM_control startonly
HA Resets
After a suspected HA Reset, the first place to look is the /usr/cvfs/
debug/smithlog file, which contains one-line time-stamped
descriptions of probable causes for the reset.
There are three methods for producing an HA Reset:
1 Expiration of an HA Monitor timer
2 Exit of the active HaShared FSM while the shared file system is
mounted on the active MDC
3 Invocation of the 'snhamgr force smith' command by a script or
manually by an administrator
HA Resets of the First Kind
The first method of an HA Reset is explained by the following description of the FSM monitoring algorithm (patent pending). The terms usurp and usurpation refer to the process of taking control of a file system, either with or without contention. It involves the branding of the arbitration block on the metadata disk to take control, and then the timed rebranding of the block to maintain control. The HA Monitor algorithm places an upper bound on the timing of the ARB branding protocol to prevent two FSMs from simultaneously attempting to control the metadata, even for an instant.
• When an activating HaUnmanaged or HaShared FSM usurps the
ARB, create a five-second timer that resets the computer if it expires
• Wait five seconds plus a small delta before completing usurpation
• Immediately after every ARB Brand update, reset the timer
• Delete the timer when the FSM exits
When there is a SAN, LUN, or FSM process failure that delays updates of
the ARB, the HA Monitor timer can run out. When it is less than one
HA Resets of the Second Kind
The second method of HA Reset can occur on shutdown of CVFS if there is an unkillable process or delayed process exit under the HaShared file system mount point. This will keep the file system from being unmounted. The smithlog entry indicates when this has happened, but does not identify the process.
HA Resets of the Third Kind
The third method of HA Reset is the most common. It occurs when the snactivated script for the HaShared FSM experiences an error during startup. The current implementation invokes the 'snhamgr force smith' command to allow the peer MDC an opportunity to start up StorNext if it can. A similar strategy was used in previous releases. In this release, the failure to start will cause the /usr/cvfs/install/.ha_idle_failed_startup touch file to be created, and this will prevent startup of CVFS on this MDC until the file is erased with the 'snhamgr clear' command.
Using HA Manager Modes
The snhamgr rules for mode pairings are easier to understand by following a BAAB strategy for transitioning into and out of config or single mode. In this strategy, B stands for the redundant node, and A stands for the node to be placed into config or single mode. Enter the desired cluster state by transitioning B's mode first, then A's. Reverse this when exiting the cluster state by transitioning A's mode, then B's.
For the configuration-session example, place B in locked mode, then
place A in config mode to start a configuration session. At the end of
the session, place A in default mode, then place B in default mode.
For the single-server cluster example, shut down Linux and power off B,
then designate it peerdown with the 'snhamgr peerdown' command on
A, then place A in single mode. At the end of the session, place A in
default mode, then designate B as up with the 'snhamgr peerup'
command on A, then power on B.
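A command-level sketch of the configuration-session example follows. The mode-setting syntax shown here (snhamgr mode=<value>) is an assumption and may differ in your release, so verify it against the snhamgr man page before use:
snhamgr mode=locked     (run on B to lock the redundant node)
snhamgr mode=config     (run on A, then perform the configuration work)
snhamgr mode=default    (run on A when the session is finished)
snhamgr mode=default    (run on B to return it to default mode)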
Failover Timing
The following illustration shows approximate timings of the FSM failover in an HA cluster. The numbers in the notes correspond to the numbers in the illustration.
In this description, both MDCs in an HA Cluster are fully started and the
Secondary MDC is ready to assume the Primary role if needed. At time
T0, an HA Reset of the Primary occurs.
0 Not shown in this diagram are the state transitions of the peer MDC
when it incurs an HA Reset. The HA Reset is not directly tied to the
failover of control to a standby FSM, but rather the detection of a
loss of services triggers failovers. The HA Reset may come before or
after the loss of services, or not at all. It is only important to know
that by the end of state 8, no FSM on the peer MDC is controlling
the arbitration block (ARB). The HA Reset mechanism guarantees
that to be true.
The example failures shown here (System Crash, Fabric Outage,
Network Outage) can result in a failover. Typically, the loss of
heartbeat from the peer MDC's FSMPM is the first indication that an
HA Reset has occurred.
1 Triggering Event: The loss of heartbeat is detected and triggers an
election at approximate time T3.5 seconds. Note that failover of a
single unmanaged file system could also be forced with the cvadmin
command without causing an HA Reset.
2 Vote Initiation: A quorum-vote election is started where the clients
of the file system identify the best-connected MDC having a standby
FSM for the file system.
3 Connectivity Tests: Each live client runs a connectivity test
sequence to each server. Connections are tested in less than .5
seconds per server, when successful, and can be repeated up to four
times (two seconds) when unsuccessful. At completion of the
election, the time is approximately T5.5.
4 Election and Start Activation: The election is completed, and an
activation message is sent to one server’s standby FSM.
5 Delayed Activation: When a server has active FSMs, its FSMPM
process sends a request to the FSMPM of its peer server to ask if the
corresponding Standby FSMs are being activated. If not, the local
FSMPM can reset the HA timer of that file system's active FSM,
which reduces the chance of an unnecessary HA Reset. When the
peer FSMPM gives permission, it is constrained from activating the
standby FSM for two seconds. Step 5 is for that delay of up to two
seconds. The delay completes at approximately T6.5.
6 Usurpation Attempts: To prevent false takeovers, the ARB is polled
to determine whether another FSM is active and must be "usurped".
Usurpation is averted if the activating FSM detects activity in the
ARB and its vote count does not exceed the active FSM's client-
connection count. A typical successful poll after an HA Reset lasts
two seconds. When the previously active FSM exits gracefully, the
usurpation takes one second.
The activating FSM then performs a sequence of I/Os to "brand" the
arbitration block to signal takeover to the peer FSM. An active FSM
is required to exit when it sees that its brand has been overwritten.
These operations take two seconds. The HAmon timer is started at
this point if the HaFsType is HaShared or HaUnmanaged. This step
completes at approximately T9.5.
7 FSM Restart: After five failed attempts to usurp control, an
activating FSM exits. The fsmpm restarts a standby FSM ten seconds
later.
8 Wait for HA Delay: When an active FSM is configured for HA
Monitoring (HaShared or HaUnmanaged), and the ARB brand is not
maintained for more than the HA Timer Interval (five seconds by
default), the HA Monitor resets the computer.
The following table presents common timing estimates for failover of all
file systems following an HA Reset of the Primary server. Actual
performance will vary according to: differences in configurations; file
system activities in progress at the time of failover; CPU, SAN and LAN
loads, latency and health; and the nature of the conditions that caused
the failover. The optimal estimates are for a forced failover at the
command line of a single unmanaged file system without an HA Reset.
Replacing an HA System
This section describes how to replace an HA server. Before beginning
this procedure make sure you have obtained the proper licenses
required for the new HA MDC.
Pre-Conversion Steps
1 If both HA MDCs are currently up and running, make sure the
system you want to replace is designated as the secondary MDC.
This can be accomplished by running “service cvfs stop” on
the designated machine.
2 Run a manual backup to tape from the StorNext GUI.
3 Make sure all store/retrieve requests have finished.
4 If you are using the Distributed Data Mover (DDM) feature, note the
value of the DISTRIBUTED_MOVING parameter (either All or
Threshold) in /usr/adic/TSM/config/fs_sysparm (or
fs_sysparm_override).
Use a text editor to set the DISTRIBUTED_MOVING value to None.
Use the adic_control restart TSM command to put this
change into effect.
5 Unmount all file systems from all clients, and then stop the SNFS
processes on each client machine. (On the Linux platform, do this by
running service cvfs stop).
6 Uninstall StorNext from the secondary server, but retain the log files.
Do this by running the command install.stornext -remove.
7 Power down the uninstalled secondary server.
Conversion Steps
1 Set the primary node to “Config” mode and the peer node to
“Peerdown” mode by running the following commands:
snhamgr peerdown
snhamgr mode=config
Post-Conversion Steps
1 After the conversion is complete, check the snhamgr status on both
MDCs. Run the cvadmin command to verify that all file systems are
listed correctly.
2 Perform a system backup by running the snbackup command. This
process may take substantial time depending on the number of
managed files in the system.
3 Start and mount StorNext file systems on the clients, and then verify
that all clients have full access.
Enabling WS-API
In order to perform any command on the remote server (except for the
getSNAPIVersion call), the correct password must be specified with
the call.
The server verifies this password against the one stored in the
/usr/adic/.snapipassword file on the server. Make sure this file is the
same on all metadata controllers (MDCs) you will be accessing. This is
especially important if you are using virtual IP (vIP) addresses.
WS-API APIs
This section provides descriptions and syntax for the APIs included with
WS-API. Examples of each API are also provided.
The doCancel API
Given a requestID (which can be retrieved by running the getSMQueue
API), running the doCancel API aborts an operation in progress.
Running this API is equivalent to running the fscancel command.
Syntax
public string[] doCancel(string password, string requestID);
Example
String[] result = snapiClient.doCancel(password, requestID);
The doMediaMove API
Use the doMediaMove API to move media from one archive to another.
Running this API is equivalent to running the vsmove command.
Syntax
public string[] doMediaMove(string password, string[]
mediaIDs, string archiveName, bool interactive, bool
interactiveSpecified, string remoteHost);
Example
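Following the pattern of the other examples, a call might look like this
(the variable names are illustrative):
String[] result = snapiClient.doMediaMove(password, mediaIDs,
archiveName, interactiveBox.Checked, true, remoteHost);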
The doRetrieve API
Use the doRetrieve API to retrieve the data in files that have been
stored and then truncated. Running this API is equivalent to running the
fsretrieve command.
Syntax
public string[] doRetrieve(string password, string[] files,
bool updateATime, bool updateATimeSpecified, string copy,
string newFileName, string startByte, string endByte, string
directory);
Example
String[] result = snapiClient.doRetrieve(password, fileList,
updateATimeCheckBox.Checked, true, copy, newFileName,
startByte, endByte, directory);
The doStore API
Use the doStore API to store files as specified by their respective
policies.
Running this API is equivalent to running the fsstore command.
Syntax
public string[] doStore(string password, string[] files,
string mediaType, string copies, string retention, string
drivePool, string minSize, string runTime);
Example
String[] result = snapiClient.doStore(password, fileList,
mediaType, numCopies, retention, drivePool, minSize,
runTime);
The doTruncate API
Use the doTruncate API to truncate files that have been stored, thus
freeing their allocated space on disk for reuse. Running this API is
equivalent to running the fstruncate command.
Syntax
public string[] doTruncate(string password, string[] files);
Example
String[] result = snapiClient.doTruncate(password,
fileList);
The getDriveReport API
Use the getDriveReport API to generate a report about the state of
all storage subsystem drive components. Running this API is equivalent
to running the fsstate command.
Syntax
public string[] getDriveReport(string password, string
componentAlias);
Example
String[] result = snapiClient.getDriveReport(password,
componentAlias);
The getFileLocation API
Use the getFileLocation API to generate a report about files known
to TSM. Running this API is equivalent to running the fsfileinfo
command.
Syntax
public string[] getFileLocation(string password, string[]
files, bool checksum, bool checksumSpecified);
Example
String[] result = snapiClient.getFileLocation(password,
fileList, checksumBox.Checked, true);
The getMediaInfo API
Use the getMediaInfo API to produce a list of media in a data and/or
storage area. Running this API is equivalent to running the fsmedlist
command.
Syntax
public string[] getMediaInfo(string password, bool
scratchPoolOnly);
Example
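Following the pattern of the other examples, a call might look like this
(the variable name is illustrative):
String[] result = snapiClient.getMediaInfo(password,
scratchPoolOnlyBox.Checked);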
The getMediaReport API
Use the getMediaReport API to generate a report on media based on
its current status. Running this API is equivalent to running the
fsmedinfo and vsmedqry commands.
Syntax
public string[] getMediaReport(string password, string[]
mediaIDs, bool longReport, bool longReportSpecified);
Example
String[] result = snapiClient.getMediaReport(password,
mediaList, longReportBox.Checked, true);
The getSMQueue API
Use the getSMQueue API to produce a list of currently executing
Storage Manager commands (retrieves, stores, etc.). Running this API is
equivalent to running the fsqueue command.
Syntax
public string[] getSMQueue(string password);
Example
String[] result = snapiClient.getSMQueue(password);
The getSNAPIVersion API
Call the getSNAPIVersion API to obtain the version number of the
currently running WS-API implementation.
Syntax
public string[] getSNAPIVersion(string @in);
This function takes one string as an argument, but that string is
currently ignored and can be blank, and returns a string which is the
version number of Web Services SNAPI running on the server.
Example
String[] result = snapiClient.getSNAPIVersion("");
The setMediaMoveInfo API
Use the setMediaMoveInfo API to complete a media move, letting
TSM know whether it was successful or not. Running this API is
equivalent to running the mmconsoleinfo and mmportinfo
commands.
Syntax
public string[] setMediaMoveInfo(string password, string
archiveName, bool success, bool successSpecified, string[]
operations);
Example
String[] result = snapiClient.setMediaMoveInfo(password,
archiveNameBox.Text, successBox.Checked, true, operations);
Truncation Overview
Truncation operations fall into two categories. The first category is the
truncation that is performed as part of the normal StorNext processing.
The second category is the “space management” truncation policies
that are run only when the disk usage reaches certain key points.
For each file system defined on the MDC, there must be an entry in the
/usr/adic/TSM/config/filesystems file.
There are five variables specified for each file system:
1 Low-water mark (default value is 75%)
2 High-water mark (default value is 85%)
3 Min-Use mark (default value is 75%)
4 Min-Use enable (default is true)
5 Truncation enable (default is true)
Normal Truncation
These truncations are performed as part of the normal processing done
by StorNext.
Immediate Truncation
This refers to truncation performed immediately after all copies of a file
are stored to media. This is enabled on a policy class basis and can be
enabled with this command:
fsmodclass -c <classname> -f i
The default is that a stored file is not truncated immediately; instead it
becomes a truncation candidate and is dealt with through normal
truncation processing.
Immediate Truncation can also be enabled on a file-by-file basis by using
the fschfiat command: fschfiat -t i filename...
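For example, using the syntax shown above with an illustrative policy
class name (video) and an illustrative file path:
fsmodclass -c video -f i
fschfiat -t i /stornext/snfs1/video/clip001.dpx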
Daily Truncation
The fs_tierman TSM daemon kicks off policy-based truncations each
day after midnight.
In this case the call is:
fspolicy -t -c <class> -m <class-trunc-min-time> -z 1
This processes each defined policy class within StorNext until all policy
classes have been completed. After the fspolicy has been run against
all policy classes, the daemon waits until the next day to run them
again.
Space Management
The two main space management cycles are described below. They will
continue to run as long as one of the conditions for a particular cycle is
met. Both the LOSPACE and "Emergency Space" conditions are handled
by the fs_space TSM daemon.
LOSPACE Cycle
This cycle is activated when the disk usage of one or more file systems
exceeds the percentage full defined by the high-water value. When
reached, LOSPACE policies are executed in an attempt to reach the low-
water mark on each affected file system.
By default, the policies are executed once in this order on all affected file
systems:
• relocation policy
• truncation policy
The high-water and low-water values are displayed by the GUI File
System Monitor and can be modified via the StorNext GUI. By default,
these values are set to 85% and 75%, respectively.
In contrast to the Emergency policies described in the next section, the
LOSPACE policies do not ignore MINTRUNCTIME and MINRELOCTIME.
Only files that are truly eligible for relocation and truncation are
affected.
First, the relocation policy is executed and it continues until there are no
more relocation candidates available at which time it terminates.
The call made to perform the LOSPACE relocation is:
fspolicy -r -y <mountpoint>
If the file system usage still exceeds the high-water mark, the truncation
policy is executed and it truncates all candidates until no further
candidates are available, at which time it terminates.
The call made to perform the LOSPACE truncation is:
fspolicy -t -y <mountpoint> -z <mintruncsize>
At this time the LOSPACE Space Cycle is complete for this file system. All
other affected file systems are then processed in the same manner, first
by running the relocation policy and then the truncation policy, if
needed.
After all file systems have been processed, if any of them still exceed the
high-water mark, a new LOSPACE cycle is started after a one-minute
wait.
Thus, the low-water percentage may or may not be reached on any
given file system. It depends solely on whether there are enough
candidates available for relocation and/or truncation for that file system.
Emergency Cycle
Emergency policies are executed when either of the following
conditions is met for a file system:
1 When a file system encounters the NOSPACE event, i.e. a file write
has failed because of lack of space.
2 When the file system usage is greater than 99%.
By default, the policies are executed once in this order:
1 emergency truncation policy
2 emergency relocation policy
3 emergency store policy
The emergency truncation policy finds up to the 3000 largest files that
can be truncated, ignoring MINTRUNCTIME, and performs the
truncation. This is executed once each time the NOSPACE condition is
reached.
The call made to perform this emergency truncation is:
fspolicy -t -y <mountpoint> -e
If the file system usage has not dropped below 100% after the
emergency truncation, the emergency relocation policy is now run.
When the emergency relocation policy is run, it finds all files that can be
relocated, ignoring MINRELOCTIME, and performs the relocation. As
with the emergency truncation policy, this is executed once each time
the EMERGENCY condition is reached.
The call made to perform the emergency relocation is:
fspolicy -r -y <mountpoint> -e
If the file system usage is still not below 100% after the emergency
relocation, an emergency store policy on the file system is performed.
An emergency store means that the request is placed first in the queue,
and that any files in the file system which can be stored will be stored
regardless of policy. As with the other emergency policies, it is run only
once.
The call made to perform the emergency store is:
fspolicy -s -y <mountpoint> -e
At this point the Emergency Space Cycle is complete.
Disabling Truncation
There are two ways to disable truncation: by using truncation feature
locking, and by running commands that disable truncation.
Truncation Feature Locking
Truncation operations can be locked, i.e. prevented from running, by
using the fsschedlock command.
The feature name for each truncation operation is:
• mintime: Daily truncation
• lospace: LoSpace Cycle
Disable Truncation Commands
Truncation can be disabled for files by using one of the following
commands:
fschfiat -t e <filename>
Common Problems
This section describes some common truncation problems and how to
address them.
Files Are Not Truncated as Expected
Even if the truncation mintime requirement is met, files may not be
truncated. Files are truncated only to keep the file system below the
low-water mark (by default 75% of the capacity). When the daily
truncation policies run, the oldest files are truncated first in an effort to
bring the file system usage below the low-water mark. Thus, files may
remain on disk even though their truncation mintime requirement has
been met if the disk space is not required.
You can use the StorNext GUI to adjust the low-water and high-water
marks on each file system if more free disk space is desired. A temporary
option to free up disk space is to run the following command:
fspolicy -t -c <policyclass> -o <goal>
The goal argument is a disk usage percentage. For example, specifying
"-o 65" will truncate files until the disk usage either reaches 65% or
there are no more valid truncation candidates, i.e. the mintime
requirement has been satisfied.
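For example, for an illustrative policy class named video, the command
would be:
fspolicy -t -c video -o 65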
"Old" Files Not Truncation uses the file access time to determine if the truncation
Truncating According to mintime requirement has been satisfied. If any application changes the
Policy Class access time of a file, the aging of the file restarts.
An example of this is where image files are listed using the thumbnail
mode on an Apple Macintosh. This causes the OS to read the file to
present this thumbnail, and the access time of the file gets updated to
the current time. This in turn results in StorNext determining this file has
not satisfied the truncation mintime requirement.
Emergency truncation ignores mintime, so the files could still be
truncated if the file system fills up. However, the best solution is to
modify the way files are accessed so as to not update the access time.
In the above example, this would mean not using the thumbnail view.
Small Files Not Truncating
If a policy class has been configured to use stub files, files with a size
that is less than or equal to the stub size will not get truncated.
StorNext Security
There are two predominant security models in modern file systems:
POSIX and Access Control Lists (ACLs). ACLs are actually “Lists”
composed of Access Control Entries. These lists may be quite simple or
quite complicated, depending on the user's requirements.
The POSIX model is the older and less flexible of the two, having just
three main security groups: “User,” “Group” and “Other,” and three
operation categories: “Read,” “Write,” and “Execute”. For a directory,
“Execute” translates to the ability to change into that directory, while
“Read” and “Write” control directory listings and file creation and
deletion.
POSIX permissions are kept in the file's inode information and are read
from the file system on Unix/Linux systems by calls to stat().
In order to know what kind of restriction to place on a file or directory,
the OS first has to be able to track users and groups so that they can
later be matched against the associated information stored in files and
directories. On
Windows, all users have two unique Security Identifiers (SIDs): one for
their user identification and one for the groups they belong to. On Unix/
Linux and Mac OS X, every user has a User IDentifier (UID) and that user
is assigned to a group which has its own Group IDentifier (GID).
This is the model that's built into StorNext and used by all StorNext
clients on all operating systems unless it's overridden by the use of ACLs.
ACLs are currently supported only on Windows and Mac OS X. ACLs give
fine-grained control over file access and do things POSIX permissions
can't, such as allow for writes to a file while not allowing the file to be
deleted. ACLs also offer the benefit of “inheritance”, which allows a
directory to specify the default set of ACLs for all files created inside of
it.
ACLs are kept in the Extended Attributes for a file, which is an internal
data structure attached to the file's first inode that contains additional
information associated with the file. Only operating systems that know
to ask for the extended information with the proper key will understand
these ACLs. Currently, only Mac OS X and Windows know to use this
information.
The StorNext File System implements both the Unix POSIX model, and
on its Windows clients it implements the Windows Security Reference
Model (SRM) to a level compatible with Microsoft's NTFS file system.
Quantum attempts to marry the two models in a very simplistic way to
allow a common user to bridge file objects between Unix and Windows.
StorNext does not implement any of the Unix ACLs models or the NFSv4
ACLs model.
ACLs on Windows
Each mapped drive, file, or folder on Windows contains a Windows
Security Descriptor. This descriptor contains the owner, primary group,
DACLs, and SACLs. Windows uses the Security Descriptor to control
access to each object. Windows Administrators and Users typically use
Windows Explorer to view, change, and create ACLs on files. This is done
in Explorer by first selecting the file or folder, displaying its properties,
and then clicking on the Security tab.
Each file/folder can have zero or more ACLs that specify how a user or
group can access or not access the file or folder. The possible controls in
each ACE are:
Folders                               Files
Full control (all of the following)   Full control (all of the following)
Delete                                Delete
Each Item can be selected as: Allow, Deny, or not selected. If Full Control
is selected as Allow or Deny, all the other attributes are set to either
Allow or Deny.
In addition, each ACE on Windows is indicated to apply as follows:
• Folder
• This folder only
• This folder, subfolders, and files
• This folder and subfolders
• This folder and files
• Subfolder and files only
• Subfolder only
• Files only
• File
• This object only
An individual object can also be set to disallow or allow inheritable ACLs
from a parent, parent's parent, etc.
A folder can be created and it can be marked such that all of its ACLs
will pass to any children. This process is called propagation. Individual
ACLs on a folder can be propagated as indicated in the above list. Files
and sub-folders of a folder can have all or some of the “inherited” ACLs
removed.
The propagation/inheritance information is contained in the Windows
Security Descriptor. Users and administrators on Windows platforms use
this capability extensively.
ACEs are ordered in an ACL. Explicit ACEs come first. An explicit ACE is
one that is not inherited. Explicit ACEs which deny come before explicit
ACEs which allow. Inherited ACEs are ordered such that the closer the
parent, the sooner they appear. Each level of inherited ACEs contains
deny before allow.
All file and folder access is determined by matching a user and group to
the DACL of the object being accessed. The SACL is not used to perform
the access check. The ACEs in the DACL are compared in order with the
accessing user and group for the requesting access mode. If a “deny
ACE” matches, access is denied. If an “allow ACE” matches all requested
access bits, access is allowed. It is possible to have a “deny ACE”
inherited after an “allow ACE” which will not take effect. This can
happen because explicit ACEs take precedence as do inherited ACEs
from a closer parent. See the examples in the Microsoft document “How
Security Descriptors and Access Control Lists Work.”
There is an “everyone ACL” that can be added to objects such that all
users are granted certain privileges if they do not match any other ACE.
When a Windows user creates a file on SNFS, the Security Descriptor
(SD) is kept as an attribute of the file object. The SD contains a primary
SID, a group SID and a list of discrete ACLs (also known as the DACL).
The SNFS file object also contains the Unix UID, GID and permissions
fields. By default, SNFS inserts the user identifier “nobody” into both the
UID and GID containers. Then it modifies the Unix mode (permissions)
container based on the following two rules (a brief illustration follows
the list).
1 If the file object contains the Windows access control entry (ACE) for
the everyone SID (which equals S-1-1-0, equivalent to “world” or
“all others” on Unix), then it will apply specific permissions using
the following guidelines. If the object is a container object
(directory) and the FILE_LIST_DIRECTORY access bit is set, mode
O+R (4) is set, else it is clear.
a If the object is a container object and the FILE_TRAVERSE
access bit is set, mode O+X (1) is set; otherwise it is clear.
b If the object is a container object and the DELETE bit is set,
mode O+W (2) is set; otherwise it is clear.
c If the object is a file and the FILE_READ_DATA bit is set, mode
O+R (4) is set; otherwise it is clear.
d If the object is a file and the FILE_WRITE_DATA bit is set, mode
O+W (2) is set; otherwise it is clear.
e If the object is a file and the FILE_EXECUTE bit is set, mode
O+X (1) is set; otherwise it is clear.
2 If there is no everyone ACE, the Unix permissions for the file object
will be NONE (---------).
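As a brief illustration of these rules for a regular file created on
Windows (the specific ACE contents are hypothetical):
• If the file's Everyone ACE grants FILE_READ_DATA and
FILE_WRITE_DATA but not FILE_EXECUTE, the Other bits of the Unix
mode become rw- (O+R and O+W set, O+X clear).
• If the file has no Everyone ACE at all, the Unix permissions are NONE
(---------).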
For an existing file, when a user changes the Security Descriptor on a
file or directory, the change can affect the Posix Permissions as well:
If the owner of the file or directory is not being changed, then SNFS
checks for a missing DACL or Everyone ACE within the DACL.
If there is no DACL, set the new mode to allow the owner to read/write/
execute.
If there is a DACL, scan the ACEs for the “Everyone” SID, either in the
Allow or Deny state:
1 Check the ACE mask to see if READ_DATA/WRITE_DATA/
EXECUTE_FILE is set, and adjust the Other mode of the Posix
permissions accordingly.
2 The User and Group mode bits are left untouched.
3 The Unix*CreationModeOnWindows configuration options are
ignored for the Everyone SID.
If the owner is changing:
1 Map the SID owner to Unix User/Group ownership via Active
Directory, and store this for later application.
• If the SID does not have a UID associated with it, map the UID to
the value of the MDC's configuration option,
UnixNobodyUidOnWindows.
• If the SID does not have a GID associated with it, map the GID
to the value of the MDC's configuration option,
UnixNobodyGidOnWindows.
2 Convert the mode bits for the Group and User - apply the
Unix*CreationModeOnWindows config option masks to these.
3 Apply the Everyone bits per step 1.2 above - again note that the
Everyone ACE conversion to Posix Permissions ignores the
Unix*CreationModeOnWindows configuration options
4 Check to see if the DOSMODE READONLY flag is set, and mask out
the User/Group/Owner write bits if it is.
5 If the UID is different from what is currently stored, change it (it is
possible to have multiple SIDs mapped to the same UID)
6 If the GID is different from what is currently stored, change it (it is
possible to have multiple SIDs mapped to the same GID)
Note: The Standard Posix Permissions Other bits get set via the
Everyone ACE regardless of the
UnixFileCreationModeOnWindows and
UnixDirectoryCreationModeOnWindows settings.
ACLs on Mac OS X
ACLs were introduced with Mac OS X 10.4 (Tiger). This ACL
implementation is very close to the Windows ACLs implementation.
The chmod(1) and ls(1) commands have been modified to handle ACLs.
There is also a library API for applications, acl(3), that allows programs
to operate on ACLs.
For a detailed description of Mac OS X ACLs, see “Security Overview:
Permissions” on Apple's web site and click on ACLs.
ACLs take precedence over regular UNIX permissions. If no ACE match is
found for a user's requested access, UNIX permissions are checked.
Therefore, a user may not match any ACE but still have access if UNIX
permissions allow.
Each ACE on Mac OS X has the same 13 possible permission bits as a
Windows ACE:
[Table: permission bit names for directories and files, Windows vs. Mac OS X]
“Central Control”
With StorNext 4.0, there is now support for cluster-wide central control
to restrict the behavior of SNFS cluster nodes (fsm server, file system
client and cvadmin client) from a central place. A central control file,
nss_cctl.xml, is used to specify the desired controls on the cluster
nodes. This file resides under /usr/cvfs/config on an nss
coordinator server.
This control file is in XML format and has a hierarchical structure. The top
level element is “snfsControl”. It contains the control element
“securityControl” for certain file systems. If you have different
controls for different file systems, each file system should have its own
control definition. A special virtual file system “#SNFS_ALL#” is used as
the default control for file systems not defined in this control file. It is
also used to define the cvadmin related controls on clients.
Controls
Currently seven controls are supported. Each control has this format:
<control value="true|false"/>
The “value” can be either “true” or “false”. The control is one of the
following controls:
mountReadOnly
Controls whether the client should mount the given file system as read
only. Value “true” means the file system is mounted as read only. Value
“false” means the file system is mounted as read/write. If this control
is not specified, the default is read/write.
mountDlanClient
Controls whether the client can mount the given file system via proxy
client. Value “true” means the file system is allowed to mount via proxy
client. Value “false” means the file system is not allowed to mount via
proxy client. The default is “mount via proxy client not allowed”.
takeOwnership
Controls whether users on a Windows client are allowed to take
ownership of files or directories of the file system. Value “true” means
Windows clients are allowed to take ownership of files or directories.
Value “false” means Windows clients are not allowed to take ownership
of files or directories. The default is that “take ownership is not
allowed”.
snfsAdmin
Controls whether cvadmin running on a host is allowed to have super
admin privilege to run privileged commands such as starting or
stopping a file system. Value “true” means the host is allowed to run
privileged commands. Value “false” means the host is not allowed to
run privileged commands. If this control is not specified, the default is
that super admin privilege is not honored.
snfsAdminConnect
Controls whether cvadmin running on a client is allowed to connect to
another fsm host via “-H” option. Value “true” means the client is
allowed to connect to another fsm host. Value “false” means the client
is not allowed to connect to another fsm host. The default is that “-H” is
not allowed.
exec
Controls whether binary files on the file system are allowed to be
executed. Value “true” means their execution is allowed. Value “false”
means their execution is not allowed. The default is that their execution
is allowed.
suid
Controls whether the set-user-identifier bit is allowed to take effect.
Value “true” means the set-user-identifier bit is honored. Value “false”
means the set-user-identifier bit is not honored. The default is that the
suid bit is honored.
  <controlEntry>
    <client type="netgrp">
      <network value="192.168.1.0"/>
      <maskbits value="24"/>
    </client>
    <controls>
      <takeOwnership value="true"/>
      <mountReadOnly value="true"/>
    </controls>
  </controlEntry>
</securityControl>
<securityControl fileSystem="#SNFS_ALL#">
  <controlEntry>
    <client type="host">
      <hostName value="linux_ludev"/>
    </client>
    <controls>
      <snfsAdmin value="true"/>
      <snfsAdminConnect value="true"/>
    </controls>
  </controlEntry>
</securityControl>
</snfsControl>
Config (.cfg) File Options
The StorNext config file has the following options that relate directly or
indirectly to security or permissions:
• GlobalSuperUser
• Quotas
• UnixDirectoryCreationModeOnWindows
• UnixFileCreationModeOnWindows
• UnixIdFabricationOnWindows
• UnixNobodyGidOnWindows
• UnixNobodyUidOnWindows
• WindowsSecurity
GlobalSuperUser defines whether or not global super user (root)
privileges are allowed on the file system. It allows the administrator to
decide if any
user with super-user privileges may use those privileges on the file
system. When this variable is set to “Yes”, any super-user has global
access rights on the file system. This may be equated to the maproot=0
directive in NFS. When the GlobalSuperUser variable is set to “No”, a
super-user may modify files only where he has access rights as a normal
user. This value may be modified for existing file systems.
Quotas has an indirect relationship with security in that it requires a
Windows Security Descriptor (SD) to track the owner of a file to
correctly maintain their quota allotment. Currently quotas in StorNext
File System-only systems work correctly in either all-Windows or all-non-
Windows environments. This is because of the way quotas are tracked;
when the meta-data server is deciding how an allocation should be
charged, it uses either the SD, if one exists, or the UID/GID.
Files created on Windows with WindowsSecurity ON always have an
SD. Files created on non-Windows never have an SD. If a file that was
created and allocated on a non-Windows platform is simply viewed on
Windows, it gets assigned an SD as described above. At that point the
quota will be wrong. Subsequent allocations of that file will be charged
to the SD and not the UID/GID.
To fix this problem, the UID/GID “space” and SD “space” must be
consolidated into one “space”.
UnixDirectoryCreationModeOnWindows controls which initial
permissions directories have. Typically this is set to 755, but might be set
to 700 to prevent access by anyone other than the owner on Unix
systems, and on Windows require the use of ACLs to allow the directory
to be accessed by anyone other than the owner.
UnixFileCreationModeOnWindows controls which initial
permissions files have. Typically this is set to 644, but might be set to
600 to prevent access by anyone other than the owner on Unix systems,
and on Windows require the use of ACLs to allow the file to be accessed
by anyone other than the owner.
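As a hedged sketch only (the exact keyword set and defaults can vary by
release), these options appear as simple keyword-value pairs in the
Globals section of a file system .cfg file, for example:
GlobalSuperUser Yes
WindowsSecurity Yes
UnixDirectoryCreationModeOnWindows 755
UnixFileCreationModeOnWindows 644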
Note: This procedure is only for StorNext file systems that do not
have the Tertiary Storage Manager (TSM) component installed.
1 Unmount the file system from all the client systems using it.
2 Stop the file system in cvadmin.
3 Run cvfsck with the following parameters:
cvfsck -j file_system_name
cvfsck -n file_system_name
where file_system_name is the actual name of your file system.
Make sure that cvfsck says that the file system is clean.
4 Do one of the following:
• If cvfsck detects no file system errors, go to the next step.
• If cvfsck detects file system errors, run it in a "fix" mode:
cvfsck file_system_name
5 Rename the file_system_name.cfg file and edit the fsmlist file to
reflect the new file system name.
By default, these files reside in the /usr/cvfs/config directory
on UNIX systems and in the C:\SNFS\config folder on Windows
systems.
6 Run cvfsck in interactive mode to remake the ICB by typing cvfsck
(without double quotation marks). You will be asked which file
system to check.
Question: Does StorNext support dynamic file system growth? How can
I grow a file system?
Answer: StorNext does not support dynamic growth. You can grow a
file system by adding stripe groups, but this is a manual process.
The process takes only a few minutes, with the exception of running the
cvfsck command, which can take a long time depending on the size
and number of files in the existing file system.
Use this procedure to grow a file system.
1 Unmount all clients.
2 On the Metadata Controller (MDC) go into cvadmin and stop the
active file system.
3 Run a cvfsck command in active mode.
4 Label new disks.
5 Create a new Stripe Group at the bottom of the existing Stripe
Group Section in your file_system_name.cfg file (a sketch of such an
entry follows the caution below).
Caution: Make sure you put the new stripe group at the bottom
of the configuration file. Putting the new stripe group
elsewhere in the file can cause either data loss or data
corruption.
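As an illustrative sketch only (the stripe group name, disk labels, and
keyword values are placeholders and must match your configuration and
release), a new stripe group entry added at the bottom of the file might
look similar to:
[StripeGroup NewDataGroup]
Status Up
Read Enabled
Write Enabled
StripeBreadth 64
MetaData No
Journal No
Exclusive No
Node NewDisk1 0
Node NewDisk2 1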
Question: How much data is reserved for StorNext disk labels, and what
is the process for recovering damaged labels?
Answer: StorNext reserves the first 1 MB of the disk for the label.
• For VTOC disk labels, the critical area of the label is the first 1,536
bytes (three 512-byte sectors).
VTOC is the only label type used by StorNext Version 2.6 and earlier,
and is the default type used for LUNs with less than 2GB sectors by
StorNext Version 2.7.
• For EFI disk labels, the critical area of the label varies with the disk
sector size:
Question: Umount hangs or fails for StorNext File Systems even though
the fuser shows nothing. What’s going on?
Answer: If a process opens a UNIX domain socket in a StorNext File
System and does not close it, umount hangs or fails even though fuser
does not show anyone using the file system.
Use the "lsof -U" command to show the UNIX domain socket. The
process can be killed with the socket open.
Answer: In this situation you might receive the following error message
after the write fails and the replaced drive has been placed off-line:
fs_resource[6609]: E1004(4)<00000>:{3}: Drive 2 SN:1310012345 has
invalid device path and is being taken off-line.
The message indicates that the device configuration was not updated
and the replacement tape drive has a different serial number. Compare
the serial numbers of the configured tape drives against which tape
drives are seen by the operating system.
In this example, the system has two tape drives configured. Run the
fsconfig command and check the command output. It should look
similar to the following:
Component ID: V0,1
-------------------------------------------------------------------------------
Device pathname: /dev/sg6
Compression: On
User Alias: scsi_archive1_dr1
Component Type: DRIVE
Device serial #: 1310999999
Drive Type: LTO
Drive ID: 1
Compare the results with the output of the 'fs_scsi -p' command.
ADICA0C012345_LLA | Scalar i500 | medium changer    | /dev/sg5
1310999999        | ULTRIUM-TD4 | sequential access | /dev/sg6 <--- scsi_archive_dr1
1310888888        | ULTRIUM-TD4 | sequential access | /dev/sg4 <--- device path and serial number is not known.
A new tape drive has a new serial number. If a drive is replaced, the
special device file (/dev/sg2) is no longer valid.
To resolve the issue, follow these steps:
• Delete the original removed tape drive using the StorNext GUI. (See
the StorNext User Guide for more information about this
procedure.)
• Add the replacement tape drive using the StorNext GUI. (Again,
consult the StorNext User Guide for more information about this
procedure.)
If the issue persists, contact Quantum Technical Support. Refer to the
Worldwide Service and Support page for more information.
Question: I’ve discovered that StorNext cannot see all disks when
running Red Hat Linux. What should I do?
Answer: StorNext File System cannot see all the drives you want to
configure in Red Hat Linux. When Linux is installed, it defaults to only 40
disk devices when it boots up.
To address this limitation, modify the CONFIG_SD_EXTRA_DEVS setting
in your Linux config file (or use xconfig under the SCSI Support tab).
Then, rebuild the kernel and reboot the system.
If you require assistance rebuilding a Linux kernel, contact the Linux
support resources for your installation.
Question: What does the 'heartbeat lost' message from a Solaris client
mean?
Answer: On a Solaris client, you may see the following error:
fsmpm[3866]: [ID 702911 daemon.warning] NSS: Name
Server 'StorNext hostname' (xxx.xxx.xxx.xxx) heartbeat
lost, unable to send message.
In StorNext, the metadata controller and clients use an Ethernet
network to exchange file system metadata. The fsmpm is a portmapper
daemon residing on each StorNext File System client and server
computer. Its purpose is to register an RPC identifier to the system's
portmap daemon. The fsmpm publishes a well-known port where the
file system (fsm) daemons register their file system name and port
access number. All clients then talk to their local fsmpm to discover
access information for their associated service.
Because of the importance of maintaining this connection, a heartbeat
is performed over the metadata network, so if this connection is lost, a
message is sent indicating a network communication problem to the
fsnameserver (xxx.xxx.xxx.xxx).
Portmapper messages are logged in the nssdbg.out log file located in
/usr/cvfs/debug.
System administrators should monitor the log files to make sure that
connectivity is maintained.
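For example, you can watch portmapper activity as it is logged by
running:
tail -f /usr/cvfs/debug/nssdbg.out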
Question: Why does StorNext fail to write to an LTO-4 tape drive and
mark media as suspect on my Red Hat 5 and SuSE 10 system?
Answer: StorNext Storage Manager fails to write to a tape drive and
marks the medium as 'suspect'.
Note: This is applicable only to Red Hat RHEL 5 and SuSE SLES 10
operating systems and StorNext 3.1.x (not to 3.5.0).
Substitute the settings shown below, or add them to the startup script
after the shell declaration (#!/bin/sh) and the initial comments.
if echo RedHat50AS_26x86 | egrep "RedHat5|SuSE10" > /dev/null; then
    echo 1 > /proc/scsi/sg/allow_dio
    echo 524288 > /proc/scsi/sg/def_reserved_size
    echo 1 > /sys/module/sg/parameters/allow_dio
    echo 524288 > /sys/module/sg/parameters/def_reserved_size
fi
If the issue persists after making the above changes, contact Quantum
Technical Support from the Worldwide Service and Support page.
Question: What conditions trigger the voting process for StorNext file
system failover?
Answer: Either a StorNext File System client or a Node Status Service
(NSS) coordinator (the systems listed in the fsnameservers file) can
initiate a vote.
An SNFS client triggers a vote when its TCP connection to a File System
Manager (FSM) is disconnected. In many failure scenarios this loss of
TCP connectivity is immediate, so it is often the primary initiator of a
vote.
On Windows systems, StorNext provides a configuration option called
Fast Failover that triggers a vote as a result of a 3 second FSM heartbeat
loss. Occasionally, this is necessary because TCP disconnects can be
delayed. There is also an NSS heartbeat between members and
coordinators every half second. The NSS coordinator triggers a vote if
the NSS heartbeat is absent for an FSM server for three seconds.
Because the client triggers usually occur first, the coordinator trigger is
not commonly seen.
Question: Why does the Primary MDC keep running without the
HaShared file system failing over and without an HA Reset when I pull
its only Ethernet cable? The HA Cluster appears to be hung.
MDC 1:
Hostname Shasta
10.35.1.110
MDC 2:
Hostname Tahoe
10.35.1.12
Tahoe:/usr/cvfs/config # cvadmin
StorNext Administrator
Enter command(s)
For command help, enter "help" or "?".
List FSS
File System Services (* indicates service is in control of FS):
 1>*HAFS[0] located on tahoe:50139 (pid 13326)
Answer: The failover and HA Reset did not occur because the HaShared
FSM on Shasta continues to be active, and this was detected in the ARB
block through the SAN by the FSM on Tahoe.
Here's why. When the LAN connection is lost on Shasta, its active
HaShared FSM continues to have one client: the Shasta MDC itself. On
Tahoe, an election is held when the LAN heartbeats from Shasta's HAFS
FSM stop, and Tahoe's FSM gets one vote from the client on Tahoe. The
Tahoe FSM is told to activate, but cannot usurp the ARB with a 1-to-1
tie. So, it tries five times, then exits, and a new FSM is started in its
place. You can observe this by running the cvadmin command and
watching the FSM's PID change every 20 seconds or so.
After the reboot, CVFS will restart and attempt to elect its HaShared
FSM because it is not getting heartbeats from its peer. However, these
activation attempts fail to cause a second reset because the HaShared
FSM never has enough votes to have a successful usurpation. (You can
watch it repeatedly fail to usurp if you can get on the console and run
the cvadmin command).
But what about the HaManaged Reno3 FSM? HaManaged FSMs are not
started until the HaShared FSM activates and puts the MDC in Primary
status. You can observe these blocked HaManaged FSMs with the
cvadmin 'fsmlist' command, which displays the local FSMPM's internal
FSM and process table. A remote FSMPM's table can also be viewed with
'fsmlist on <MDC name or address>'.
restarted in the read, write, restart-timer ARB branding loop. After five
seconds the timers would expire and reset the MDC. However, there is a
second method for resetting the timers that uses the LAN.
So why does the reset occur after 30-40 seconds? After this delay, the
HBA returns errors to the FSM, and the FSM quits. When the HaShared
FSM quits with the file system mounted locally, an HA Reset occurs to
protect databases for the Storage Manager etc.
StorNext supports multiple ACSLS servers, but only one library on each
server.
Troubleshooting Other Issues
This section contains troubleshooting suggestions for general StorNext
issues and other issues that do not fall into another category.
Question: How can I find the Product Serial Number?
Answer: The serial number for StorNext Storage Manager is physically
located on the front side of the media kit box. In addition, the
administrator initially responsible for the software may have received
the serial number through email when either he or she requested license
information.
Both StorNext Storage Manager and StorNext File System have serial
numbers in the format S/N SN000123 (for example, SN02728).
Answer: Check the error logs for possible causes of backup failure.
Errors in logs might look similar to this:
• Error while executing select statement; code: 1069, message:
HY000 ERR# 1069 : [Relex][Linter ODBC Driver] SQLExecDirect:
native error #1069, STMT: select count(*).
• 2008-10-25-23:00:24: ERR: /usr/adic/DSM/bin/snmetadump error
at line 1490 of snmetadump.c: System call failed (1): File exists
(17): Failed to acquire snmetadump lock.
• ERR: snmetadump returned exit status 2.
To remedy the situation, remove all the lock files under the
/usr/adic/TSM/internal/locks/ directory.