Setting Up Storage For Unstructured Data

1. The document discusses setting up storage for unstructured data using InfoArchive, specifically adding and configuring Amazon S3 storage. It covers adding a new storage system, changing an in-use file storage path, and provides details on configuring Amazon S3 storage, including location, storage class, lifecycle management rules, and hardware-level retention policies.

2. Configuring Amazon S3 storage involves selecting a location and storage class for objects and allows defining lifecycle management rules to transition objects between classes after a certain number of days. Hardware-level retention policies can also be applied by enabling the S3 Object Lock feature when creating buckets.

3. Details are provided on supported storage classes.


1.1 Setting up storage for unstructured data


- Storage refers to a storage configuration object that contains a list of properties for the target storage configuration.
- Storage can be one of several types, such as file system, cloud storage (Amazon S3, Microsoft Azure Blob Storage), ECS, Centera, Archive Center, Core Archive, or custom storage.
- A storage system holds data, such as unstructured content for records, backups, raw XML files, ingestion logs, and so on.

1.1.1 Adding a storage system using IA Web App

- Existing storage systems are displayed in a table on the Storage tab.

To add a storage system using IA Web App:

1. In the IA Web App, on the Storage tab, click +. The Create Storage System page is
displayed.
2. Select a Storage Type.
3. Proceed with the steps in one of the following sections, depending on the type of
storage being used:

• Configuring the storage class for Microsoft Azure Blob Storage

• Configuring Amazon S3 storage

• Configuring AWS S3 with Amazon Glacier for an application

• Configuring ECS Storage for an application

• Configuring Centera storage for an application

• Configuring custom storage for an application

• Archive Center

• Core Archive

4. Click the Test Connection button to ensure the connection works. See the table
above for which storage types support this feature.

- When you click Test Connection, InfoArchive tries to establish a connection with the storage system and reports whether the connection succeeded.
- If the connection is not established, the error message indicates why the connection failed. Make the necessary changes to the fields indicated in the error message and click Test Connection again.
5. Click Create.
1.1.2 Changing an in-use file storage path using IA Web App

The following procedure is only applicable for file storage systems (local file systems
and PowerScale).

1. In the IA Web App, edit the existing defaultFileSystemRoot under Administration > Storage > defaultFileSystemRoot > Folder path to the desired new path (for example, from the data\root directory to the c:\backup\data\root directory).

2. Manually copy the contents of the source folder (the entire directory structure exactly as it exists under the file system root path) to the target location (for example, from \data\root to c:\backup\data\root).

3. Verify that the corresponding application's space and stores point to the new target
location.

4. Verify that the retrieval and ingestion operations have not been impacted by the change.
1.1.3 Amazon S3 storage

- Amazon S3 supports virtual hosted-style and path-style URLs to access a bucket.

- Path-style URLs, however, are being deprecated; InfoArchive therefore supports only virtual-hosted-style URLs for Amazon S3 storage.

- In a virtual-hosted-style URL, the bucket name is part of the domain name in the URL.

- The virtual hosted-style method requires the bucket name to be DNS-compliant.


Refer to Virtual Hosting of Buckets (https://docs.aws.amazon.com/AmazonS3/latest/
dev/VirtualHosting.html) for more information.

- This section illustrates how to install and ingest content into Amazon S3 Storage
(hereafter referred to as S3).

- To use Amazon S3, download the S3 SDK for Java. Refer to the latest set of release
notes to learn what versions of S3 SDK are supported.

1.1.3.1 Configuring hardware retention for Amazon S3


- Amazon S3 allows you to provide hardware-level retention to stored objects.

- Previously, when a user applied retention to packages in Amazon S3 stores via the IA Web App, retention was applied only at the InfoArchive level. Even though retention was applied to the packages, those packages could still be deleted via the Amazon S3 console.

- Currently, once retention is applied to any objects/packages at the InfoArchive level, retention is also applied at the Amazon S3 (hardware) level. Therefore, you can no longer delete those objects from the Amazon S3 console.

- The S3 retention policy is a retention mechanism applied at Amazon S3 (hardware level).

- If you do want to use S3 retention, enable the Object Lock feature while creating the Amazon S3 bucket. If a bucket was created without the Object Lock feature, it cannot be enabled afterwards.
- When applying retention to an object, provide a date in the future. Once retention is applied with this date, the object cannot be deleted until the retention period has expired. Refer to the Amazon documentation (https://docs.aws.amazon.com/AmazonS3/latest/dev/object-lock.html) to learn more about the Object Lock feature.

- When creating a bucket for Amazon S3 in the IA Web App, you have the option to enable the Amazon S3 Object Lock feature via the Push retention at the bucket level checkbox.

- If you create a bucket and do not enable the Push retention at the bucket level checkbox,
the bucket will be created without the Object Lock feature. If the box is selected, however,
the Amazon S3 bucket is created with the Object Lock feature enabled.

- Once the Amazon S3 bucket has been created, you still have the option Do not push
retention at the hardware level for the store.

- Suppose the bucket supports retention at the hardware level (the Object Lock feature has been enabled), but you do not want to apply S3 retention to a particular store. In that case, on the Amazon S3 store creation screen, enable the Do not push retention at the hardware level checkbox. Then, even if a retention policy is applied to packages or objects, retention is applied only at the InfoArchive level and not at the Amazon S3 level. However, if you decide to also apply retention at the hardware level (S3 retention), ensure that the Do not push retention at the hardware level checkbox is not checked on the store-level screen.

- If you have not enabled the Object Lock feature at the bucket level, the Do not push retention at the hardware level checkbox on the Store screen is checked and you cannot change it.

1.1.3.2 Configuring a location when creating a bucket


- You can configure the Location while creating an Amazon S3 bucket. Once a bucket is created in Amazon S3, its location cannot be changed. You can see the Location of the bucket on the View Bucket screen but cannot change it.

- The default Amazon S3 bucket location is US East (N. Virginia). Refer to the Amazon
S3 documentation (https://docs.aws.amazon.com/AmazonS3/latest/dev/
UsingBucket.html) to learn more about buckets.

1.1.3.3 Configuring the Storage Class


- Once an object is stored in Amazon S3 storage, the object is saved in a particular Storage Class. InfoArchive supports the following storage classes:

• Standard

• Standard-IA

• Intelligent-Tiering

• One Zone-IA

• Glacier Instant Retrieval (IR)


- The default storage class for an object is always Standard. While creating an Amazon
S3 bucket, you can define the Storage Class in which the objects are to be saved.
Once the bucket has been created, if you want to change the storage class defined at
the Amazon S3 bucket-level, change the access tier on the View Bucket screen.

- You can override the storage class on the Amazon S3 store creation screen (store level). If you have defined a particular storage class at the Amazon S3 store level, it takes priority over the storage class defined at the Amazon S3 bucket level. The objects will then be saved in the storage class defined at the Amazon S3 store level. Refer to the Amazon S3 documentation (https://aws.amazon.com/s3/storageclasses/) to learn more about Amazon S3 storage classes.

1.1.3.4 Configuring a lifecycle management rule for a store


- While creating an Amazon S3 store, you can also configure a lifecycle management
rule for the store in a bucket. In lifecycle management, objects can be transferred
from one storage class to another for automatic cost savings after a certain number
of days have passed (also known as Transition). As of this release, for the lifecycle
management rule, InfoArchive supports one transition. You can configure the Rule
Name, To (the destination storage class), and Age (days), which indicates the
number of days since the object was created, as a part of that rule.

- While creating a transition for an Amazon S3 lifecycle management rule, there are
five storage classes to choose from:

• Standard-IA

• Intelligent-Tiering

• One Zone-IA

• Glacier

• Glacier Deep Archive

- Glacier and Glacier Deep Archive are offline storage classes. If you select either one
of these as the destination storage class, three more fields are displayed in the IA
Web App:

• A checkbox Offline Content (a read-only field)

• Restoration Rules

• S3 Duration

- If the object is transferred to the Glacier or Glacier Deep Archive storage classes,
those objects cannot be downloaded directly. Instead, a restoration procedure needs
to be initiated. The restoration configurations require Restoration Rules and S3
Duration, which are entered during Amazon S3 store creation.

- For the Restoration Rules, if the To dropdown (destination storage class) is Glacier, there are three values to choose from:
• Standard (3-5 hours)

• Expedited (1-5 minutes)

• Bulk (5-12 hours)

- For the Restoration Rules, if the To dropdown (destination storage class) is Glacier Deep Archive, there are two values to choose from:

• Standard (within 12 hours)

• Bulk (within 48 hours)

1.1.3.5 Installing the Amazon S3 SDK (AWS SDK)

- To use Amazon S3, download the S3 SDK for Java. Refer to the latest set of release
notes to learn which versions of S3 SDK are supported.

1. Install InfoArchive distribution.


2. Download the S3 SDK for Java from http://aws.amazon.com/sdk-for-java. Before the next step, ensure that the InfoArchive server is not running.
3. Copy the jar file (aws-java-sdk-{version number}.jar) from the 'lib' folder of the downloaded S3 SDK into the external directory of the downloaded InfoArchive distribution (/infoarchive/lib/iaserver/external/).
4. Start the IA Server.
5. For testing purposes, it is possible to configure a sample application with AWS S3
storage. For example, you may install the PhoneCalls (package-based application) or
Tickets (table-based application) applications from OOTB samples and then, with the
help of IA Web App, reconfigure the storage to use Amazon S3.

1.1.3.6 Verification of Amazon S3 storage usage

- Download free tools such as CloudBerry to connect to Amazon S3 and verify the
existence of the buckets.
- Navigate inside the bucket and verify that the AIP and BLOB objects are stored in the bucket.

- Alternatively, use the AWS console to see the bucket information.


1.1.3.7 Configuring a proxy server to connect to Amazon

- The Administrator can use a proxy server setup to connect to Amazon Web Services
(AWS) S3 for production deployment.
- In case a proxy is required, the following demonstrates how to configure InfoArchive
to use Amazon S3 storage with a proxy:
• When you add or edit the storage system, ensure the following information is entered:
a. Check the Enable Proxy box.
b. Enter a Proxy URL for the proxy server.
c. Enter an Access Key.
d. Enter a Secret Key.
If other users will access the proxy server:
e. Enter a Proxy User Name.
f. Enter a Proxy User Password.
