Setting Up Storage For Unstructured Data
- A storage system holds data, such as unstructured content for records, backups, raw
XML files, ingestion logs, and so on.
- Existing storage systems are displayed in a table on the Storage tab.
1. In the IA Web App, on the Storage tab, click +. The Create Storage System page is
displayed.
2. Select a Storage Type.
3. Proceed with the steps in one of the following sections, depending on the type of
storage being used:
• Archive Center
• Core Archive
4. Click the Test Connection button to ensure the connection works. See the table
above for which storage types support this feature.
The following procedure is only applicable for file storage systems (local file systems
and PowerScale).
1. In the IA Web App, edit the existing defaultFileSystemRoot under Administration
> Storage > defaultFileSystemRoot > Folder path and set it to the desired new path (for
example, from the data\root directory to the c:\backup\data\root directory).
2. Manually copy the contents of the source folder (the entire directory structure
under the file system root path) to the target location (for example, from \data\root
to c:\backup\data\root).
3. Verify that the corresponding application's space and stores point to the new target
location.
4. Verify that the retrieval and ingestion operations have not been impacted by the
change.
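Step 2 above amounts to mirroring the directory tree under the old root into the new one. The following is a minimal sketch of that copy using the Python standard library; the function name and the example paths are illustrative, not part of InfoArchive:

```python
import shutil
from pathlib import Path

def copy_file_system_root(source_root, target_root):
    """Copy the entire directory structure under the old file system
    root to the new location, preserving the layout exactly.
    The paths passed in are illustrative; substitute your actual
    source and target root directories."""
    # dirs_exist_ok=True allows copying into a pre-created target folder
    shutil.copytree(Path(source_root), Path(target_root), dirs_exist_ok=True)
```

After the copy completes, point the application's spaces and stores at the new root and verify retrieval and ingestion as described in steps 3 and 4.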
1.1.3 Amazon S3 storage
- Because path-style URLs are being deprecated, InfoArchive now supports only
virtual-hosted-style URLs for Amazon S3 storage.
- In a virtual-hosted-style URL, the bucket name is part of the domain name in the URL.
- This section illustrates how to install and ingest content into Amazon S3 Storage
(hereafter referred to as S3).
- To use Amazon S3, download the S3 SDK for Java. Refer to the latest set of release
notes to learn what versions of S3 SDK are supported.
- Previously, when a user applied retention to packages in Amazon S3 stores via the IA
Web App, retention was applied only at the InfoArchive level. Even though retention was
applied to the packages, those packages could still be deleted via the Amazon S3 console.
- If you do want to use S3 retention, enable the Object Lock feature while creating the Amazon S3
bucket. If the bucket was created without the Object Lock feature enabled, it cannot be
enabled afterward.
- When applying retention to any object, provide a date in the future. Once retention is applied
with this date, the object cannot be deleted until the retention period has expired.
Refer to the Amazon documentation
(https://docs.aws.amazon.com/AmazonS3/latest/dev/object-lock.html) to learn more
about the Object Lock feature.
- When creating a bucket for Amazon S3 in the IA Web App, you have the option to enable the
Amazon S3 Object Lock feature via the Push retention at the bucket level checkbox.
- If you create a bucket and do not enable the Push retention at the bucket level checkbox,
the bucket will be created without the Object Lock feature. If the box is selected, however,
the Amazon S3 bucket is created with the Object Lock feature enabled.
- Once the Amazon S3 bucket has been created, you still have the option Do not push
retention at the hardware level for the store.
- Suppose the bucket has the capability to support retention at the hardware-level (the Object
Lock feature has been enabled), but you do not want to apply S3 retention to a particular
store. Then, on the Amazon S3 store creation screen, enable the Do not push retention at
the hardware level checkbox. Then, even if any retention policy is applied to packages or
objects, retention is only applied at the InfoArchive-level and not at Amazon S3-level.
However, if you decide to also apply retention at the hardware level (S3 retention), ensure
that the Do not push retention at the hardware level checkbox is not checked on the store-
level screen.
- If you have not enabled the object retention feature at the bucket level, the Do not push
retention at the hardware level checkbox on the Store screen is checked and you cannot
change it.
- The default Amazon S3 bucket location is US East (N. Virginia). Refer to the Amazon
S3 documentation (https://docs.aws.amazon.com/AmazonS3/latest/dev/
UsingBucket.html) to learn more about buckets.
- The following storage classes can be selected at the Amazon S3 bucket level:
• Standard
• Standard-IA
• Intelligent-Tiering
• One Zone-IA
- You can override the storage class on the Amazon S3 store creation screen (store
level). If you have defined any particular storage class at the Amazon S3 store-level, it
will take priority over the storage class defined at the Amazon S3 bucket-level. The
objects will then be saved in the storage class defined at the Amazon S3 store-level.
Refer to the Amazon S3 documentation
(https://aws.amazon.com/s3/storageclasses/) to learn more about Amazon S3
storage classes.
- While creating a transition for an Amazon S3 lifecycle management rule, there are
five storage classes to choose from:
• Standard-IA
• Intelligent-Tiering
• One Zone-IA
• Glacier
• Glacier Deep Archive
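Outside the IA Web App, such a transition corresponds to a standard S3 lifecycle configuration. The fragment below is a generic sketch of that JSON, not InfoArchive-specific; the rule ID, prefix, and day counts are hypothetical:

```json
{
  "Rules": [
    {
      "ID": "archive-old-objects",
      "Status": "Enabled",
      "Filter": { "Prefix": "aip/" },
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" },
        { "Days": 365, "StorageClass": "GLACIER" }
      ]
    }
  ]
}
```

Each transition names a destination storage class and the object age (in days) at which objects move to it.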
- Glacier and Glacier Deep Archive are offline storage classes. If you select either one
of these as the destination storage class, additional fields are displayed in the IA
Web App, including:
• Restoration Rules
• S3 Duration
- If the object is transferred to the Glacier or Glacier Deep Archive storage classes,
those objects cannot be downloaded directly. Instead, a restoration procedure needs
to be initiated. The restoration configurations require Restoration Rules and S3
Duration, which are entered during Amazon S3 store creation.
- For the Restoration Rules, if the destination storage class in the To dropdown is
Glacier, there are three values to choose from:
• Expedited (1-5 minutes)
• Standard (3-5 hours)
• Bulk (5-12 hours)
- For the Restoration Rules, if the destination storage class in the To dropdown is
Glacier Deep Archive, there are two values to choose from:
• Standard (within 12 hours)
• Bulk (within 48 hours)
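The available restore tiers per offline storage class can be summarized in a small lookup table. This sketch is illustrative; the tier names and durations come from Amazon's public documentation, and the helper function is hypothetical:

```python
# Restore tiers per offline storage class, with the retrieval
# durations Amazon documents for each tier.
RESTORE_TIERS = {
    "GLACIER": {
        "Expedited": "1-5 minutes",
        "Standard": "3-5 hours",
        "Bulk": "5-12 hours",
    },
    "DEEP_ARCHIVE": {
        "Standard": "within 12 hours",
        "Bulk": "within 48 hours",
    },
}

def restore_options(storage_class):
    """Return the restore tiers available for an offline storage class."""
    return sorted(RESTORE_TIERS[storage_class])
```

Note that the Expedited tier is not available for Glacier Deep Archive, which is why the Deep Archive dropdown offers only two values.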
- Download free tools such as CloudBerry to connect to Amazon S3 and verify the
existence of the buckets.
- Navigate inside the bucket and verify that the AIP and BLOB objects are stored in the
bucket.
- The Administrator can use a proxy server setup to connect to Amazon Web Services
(AWS) S3 for production deployment.
- In case a proxy is required, the following demonstrates how to configure InfoArchive
to use Amazon S3 storage with a proxy:
• When you add or edit the storage system, ensure the following information is entered:
a. Check the Enable Proxy box.
b. Enter a Proxy URL for the proxy server.
c. Enter an Access Key.
d. Enter a Secret Key.
e. If other users will access the proxy server, enter a Proxy User Name if desired.
f. If desired, enter a Proxy User Password.
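When a proxy user name and password are configured, they are typically combined with the host and port into a single proxy URL. The sketch below illustrates that composition; the host, port, and credentials are hypothetical, and percent-encoding keeps special characters in the credentials from breaking the URL:

```python
from urllib.parse import quote

def proxy_url(host, port, user=None, password=None):
    """Build a proxy URL of the form http://user:password@host:port.
    Credentials are included only when a proxy user is configured."""
    if user:
        cred = quote(user, safe="")
        if password:
            cred += ":" + quote(password, safe="")
        return f"http://{cred}@{host}:{port}"
    return f"http://{host}:{port}"
```

For example, a proxy at proxy.example.com on port 8080 without credentials yields http://proxy.example.com:8080.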