Informatica Data Archive User Guide
6.4.4
January 2018
© Copyright Informatica LLC 2003, 2023
This software and documentation contain proprietary information of Informatica LLC and are provided under a license agreement containing restrictions on use and
disclosure and are also protected by copyright law. Reverse engineering of the software is prohibited. No part of this document may be reproduced or transmitted in any
form, by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica LLC. This Software may be protected by U.S. and/or
international Patents and other Patents Pending.
Use, duplication, or disclosure of the Software by the U.S. Government is subject to the restrictions set forth in the applicable software license agreement and as
provided in DFARS 227.7202-1(a) and 227.7702-3(a) (1995), DFARS 252.227-7013(c)(1)(ii) (OCT 1988), FAR 12.212(a) (1995), FAR 52.227-19, or FAR 52.227-14 (ALT III),
as applicable.
The information in this product or documentation is subject to change without notice. If you find any problems in this product or documentation, please report them to
us in writing.
Informatica, Informatica Platform, Informatica Data Services, PowerCenter, PowerCenterRT, PowerCenter Connect, PowerCenter Data Analyzer, PowerExchange,
PowerMart, Metadata Manager, Informatica Data Quality, Informatica Data Explorer, Informatica B2B Data Transformation, Informatica B2B Data Exchange, Informatica
On Demand, Informatica Identity Resolution, Informatica Application Information Lifecycle Management, Informatica Complex Event Processing, Ultra Messaging,
Informatica Master Data Management, and Live Data Map are trademarks or registered trademarks of Informatica LLC in the United States and in jurisdictions
throughout the world. All other company and product names may be trade names or trademarks of their respective owners.
Portions of this software and/or documentation are subject to copyright held by third parties, including without limitation: Copyright DataDirect Technologies. All rights
reserved. Copyright © Sun Microsystems. All rights reserved. Copyright © RSA Security Inc. All Rights Reserved. Copyright © Ordinal Technology Corp. All rights
reserved. Copyright © Aandacht c.v. All rights reserved. Copyright Genivia, Inc. All rights reserved. Copyright Isomorphic Software. All rights reserved. Copyright © Meta
Integration Technology, Inc. All rights reserved. Copyright © Intalio. All rights reserved. Copyright © Oracle. All rights reserved. Copyright © Adobe Systems Incorporated.
All rights reserved. Copyright © DataArt, Inc. All rights reserved. Copyright © ComponentSource. All rights reserved. Copyright © Microsoft Corporation. All rights
reserved. Copyright © Rogue Wave Software, Inc. All rights reserved. Copyright © Teradata Corporation. All rights reserved. Copyright © Yahoo! Inc. All rights reserved.
Copyright © Glyph & Cog, LLC. All rights reserved. Copyright © Thinkmap, Inc. All rights reserved. Copyright © Clearpace Software Limited. All rights reserved. Copyright
© Information Builders, Inc. All rights reserved. Copyright © OSS Nokalva, Inc. All rights reserved. Copyright Edifecs, Inc. All rights reserved. Copyright Cleo
Communications, Inc. All rights reserved. Copyright © International Organization for Standardization 1986. All rights reserved. Copyright © ej-technologies GmbH. All
rights reserved. Copyright © Jaspersoft Corporation. All rights reserved. Copyright © International Business Machines Corporation. All rights reserved. Copyright ©
yWorks GmbH. All rights reserved. Copyright © Lucent Technologies. All rights reserved. Copyright © University of Toronto. All rights reserved. Copyright © Daniel
Veillard. All rights reserved. Copyright © Unicode, Inc. Copyright IBM Corp. All rights reserved. Copyright © MicroQuill Software Publishing, Inc. All rights reserved.
Copyright © PassMark Software Pty Ltd. All rights reserved. Copyright © LogiXML, Inc. All rights reserved. Copyright © 2003-2010 Lorenzi Davide, All rights reserved.
Copyright © Red Hat, Inc. All rights reserved. Copyright © The Board of Trustees of the Leland Stanford Junior University. All rights reserved. Copyright © EMC
Corporation. All rights reserved. Copyright © Flexera Software. All rights reserved. Copyright © Jinfonet Software. All rights reserved. Copyright © Apple Inc. All rights
reserved. Copyright © Telerik Inc. All rights reserved. Copyright © BEA Systems. All rights reserved. Copyright © PDFlib GmbH. All rights reserved. Copyright ©
Orientation in Objects GmbH. All rights reserved. Copyright © Tanuki Software, Ltd. All rights reserved. Copyright © Ricebridge. All rights reserved. Copyright © Sencha,
Inc. All rights reserved. Copyright © Scalable Systems, Inc. All rights reserved. Copyright © jQWidgets. All rights reserved. Copyright © Tableau Software, Inc. All rights
reserved. Copyright © MaxMind, Inc. All Rights Reserved. Copyright © TMate Software s.r.o. All rights reserved. Copyright © MapR Technologies Inc. All rights reserved.
Copyright © Amazon Corporate LLC. All rights reserved. Copyright © Highsoft. All rights reserved. Copyright © Python Software Foundation. All rights reserved.
Copyright © BeOpen.com. All rights reserved. Copyright © CNRI. All rights reserved.
This product includes software developed by the Apache Software Foundation (http://www.apache.org/), and/or other software which is licensed under various
versions of the Apache License (the "License"). You may obtain a copy of these Licenses at http://www.apache.org/licenses/. Unless required by applicable law or
agreed to in writing, software distributed under these Licenses is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
or implied. See the Licenses for the specific language governing permissions and limitations under the Licenses.
This product includes software which was developed by Mozilla (http://www.mozilla.org/), software copyright The JBoss Group, LLC, all rights reserved; software
copyright © 1999-2006 by Bruno Lowagie and Paulo Soares and other software which is licensed under various versions of the GNU Lesser General Public License
Agreement, which may be found at http://www.gnu.org/licenses/lgpl.html. The materials are provided free of charge by Informatica, "as-is", without warranty of any
kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose.
The product includes ACE(TM) and TAO(TM) software copyrighted by Douglas C. Schmidt and his research group at Washington University, University of California,
Irvine, and Vanderbilt University, Copyright (©) 1993-2006, all rights reserved.
This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit (copyright The OpenSSL Project. All Rights Reserved) and
redistribution of this software is subject to terms available at http://www.openssl.org and http://www.openssl.org/source/license.html.
This product includes Curl software which is Copyright 1996-2013, Daniel Stenberg, <daniel@haxx.se>. All Rights Reserved. Permissions and limitations regarding this
software are subject to terms available at http://curl.haxx.se/docs/copyright.html. Permission to use, copy, modify, and distribute this software for any purpose with or
without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.
The product includes software copyright 2001-2005 (©) MetaStuff, Ltd. All Rights Reserved. Permissions and limitations regarding this software are subject to terms
available at http://www.dom4j.org/license.html.
The product includes software copyright © 2004-2007, The Dojo Foundation. All Rights Reserved. Permissions and limitations regarding this software are subject to
terms available at http://dojotoolkit.org/license.
This product includes ICU software which is copyright International Business Machines Corporation and others. All rights reserved. Permissions and limitations
regarding this software are subject to terms available at http://source.icu-project.org/repos/icu/icu/trunk/license.html.
This product includes software copyright © 1996-2006 Per Bothner. All rights reserved. Your right to use such materials is set forth in the license which may be found at
http://www.gnu.org/software/kawa/Software-License.html.
This product includes OSSP UUID software which is Copyright © 2002 Ralf S. Engelschall, Copyright © 2002 The OSSP Project Copyright © 2002 Cable & Wireless
Deutschland. Permissions and limitations regarding this software are subject to terms available at http://www.opensource.org/licenses/mit-license.php.
This product includes software developed by Boost (http://www.boost.org/) or under the Boost software license. Permissions and limitations regarding this software
are subject to terms available at http://www.boost.org/LICENSE_1_0.txt.
This product includes software copyright © 1997-2007 University of Cambridge. Permissions and limitations regarding this software are subject to terms available at
http://www.pcre.org/license.txt.
This product includes software copyright © 2007 The Eclipse Foundation. All Rights Reserved. Permissions and limitations regarding this software are subject to terms
available at http://www.eclipse.org/org/documents/epl-v10.php and at http://www.eclipse.org/org/documents/edl-v10.php.
This product includes software licensed under the terms at http://www.tcl.tk/software/tcltk/license.html, http://www.bosrup.com/web/overlib/?License, http://
www.stlport.org/doc/license.html, http://asm.ow2.org/license.html, http://www.cryptix.org/LICENSE.TXT, http://hsqldb.org/web/hsqlLicense.html, http://
httpunit.sourceforge.net/doc/license.html, http://jung.sourceforge.net/license.txt, http://www.gzip.org/zlib/zlib_license.html, http://www.openldap.org/software/
release/license.html, http://www.libssh2.org, http://slf4j.org/license.html, http://www.sente.ch/software/OpenSourceLicense.html, http://fusesource.com/downloads/
license-agreements/fuse-message-broker-v-5-3-license-agreement; http://antlr.org/license.html; http://aopalliance.sourceforge.net/; http://www.bouncycastle.org/
licence.html; http://www.jgraph.com/jgraphdownload.html; http://www.jcraft.com/jsch/LICENSE.txt; http://jotm.objectweb.org/bsd_license.html; http://www.w3.org/
Consortium/Legal/2002/copyright-software-20021231; http://www.slf4j.org/license.html; http://nanoxml.sourceforge.net/orig/copyright.html; http://www.json.org/
license.html; http://forge.ow2.org/projects/javaservice/, http://www.postgresql.org/about/licence.html, http://www.sqlite.org/copyright.html, http://www.tcl.tk/
software/tcltk/license.html, http://www.jaxen.org/faq.html, http://www.jdom.org/docs/faq.html, http://www.slf4j.org/license.html; http://www.iodbc.org/dataspace/
iodbc/wiki/iODBC/License; http://www.keplerproject.org/md5/license.html; http://www.toedter.com/en/jcalendar/license.html; http://www.edankert.com/bounce/
index.html; http://www.net-snmp.org/about/license.html; http://www.openmdx.org/#FAQ; http://www.php.net/license/3_01.txt; http://srp.stanford.edu/license.txt;
http://www.schneier.com/blowfish.html; http://www.jmock.org/license.html; http://xsom.java.net; http://benalman.com/about/license/; https://github.com/CreateJS/
EaselJS/blob/master/src/easeljs/display/Bitmap.js; http://www.h2database.com/html/license.html#summary; http://jsoncpp.sourceforge.net/LICENSE; http://
jdbc.postgresql.org/license.html; http://protobuf.googlecode.com/svn/trunk/src/google/protobuf/descriptor.proto; https://github.com/rantav/hector/blob/master/
LICENSE; http://web.mit.edu/Kerberos/krb5-current/doc/mitK5license.html; http://jibx.sourceforge.net/jibx-license.html; https://github.com/lyokato/libgeohash/blob/
master/LICENSE; https://github.com/hjiang/jsonxx/blob/master/LICENSE; https://code.google.com/p/lz4/; https://github.com/jedisct1/libsodium/blob/master/
LICENSE; http://one-jar.sourceforge.net/index.php?page=documents&file=license; https://github.com/EsotericSoftware/kryo/blob/master/license.txt; http://www.scala-
lang.org/license.html; https://github.com/tinkerpop/blueprints/blob/master/LICENSE.txt; http://gee.cs.oswego.edu/dl/classes/EDU/oswego/cs/dl/util/concurrent/
intro.html; https://aws.amazon.com/asl/; https://github.com/twbs/bootstrap/blob/master/LICENSE; https://sourceforge.net/p/xmlunit/code/HEAD/tree/trunk/
LICENSE.txt; https://github.com/documentcloud/underscore-contrib/blob/master/LICENSE, and https://github.com/apache/hbase/blob/master/LICENSE.txt.
This product includes software licensed under the Academic Free License (http://www.opensource.org/licenses/afl-3.0.php), the Common Development and
Distribution License (http://www.opensource.org/licenses/cddl1.php) the Common Public License (http://www.opensource.org/licenses/cpl1.0.php), the Sun Binary
Code License Agreement Supplemental License Terms, the BSD License (http:// www.opensource.org/licenses/bsd-license.php), the new BSD License (http://
opensource.org/licenses/BSD-3-Clause), the MIT License (http://www.opensource.org/licenses/mit-license.php), the Artistic License (http://www.opensource.org/
licenses/artistic-license-1.0) and the Initial Developer’s Public License Version 1.0 (http://www.firebirdsql.org/en/initial-developer-s-public-license-version-1-0/).
This product includes software copyright © 2003-2006 Joe Walnes, 2006-2007 XStream Committers. All rights reserved. Permissions and limitations regarding this
software are subject to terms available at http://xstream.codehaus.org/license.html. This product includes software developed by the Indiana University Extreme! Lab.
For further information please visit http://www.extreme.indiana.edu/.
This product includes software Copyright (c) 2013 Frank Balluffi and Markus Moeller. All rights reserved. Permissions and limitations regarding this software are subject
to terms of the MIT license.
DISCLAIMER: Informatica LLC provides this documentation "as is" without warranty of any kind, either express or implied, including, but not limited to, the implied
warranties of noninfringement, merchantability, or use for a particular purpose. Informatica LLC does not warrant that this software or documentation is error free. The
information provided in this software or documentation may include technical inaccuracies or typographical errors. The information in this software and documentation
is subject to change at any time without notice.
NOTICES
This Informatica product (the "Software") includes certain drivers (the "DataDirect Drivers") from DataDirect Technologies, an operating company of Progress Software
Corporation ("DataDirect") which are subject to the following terms and conditions:
1. THE DATADIRECT DRIVERS ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
2. IN NO EVENT WILL DATADIRECT OR ITS THIRD PARTY SUPPLIERS BE LIABLE TO THE END-USER CUSTOMER FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, CONSEQUENTIAL OR OTHER DAMAGES ARISING OUT OF THE USE OF THE ODBC DRIVERS, WHETHER OR NOT INFORMED OF THE POSSIBILITIES
OF DAMAGES IN ADVANCE. THESE LIMITATIONS APPLY TO ALL CAUSES OF ACTION, INCLUDING, WITHOUT LIMITATION, BREACH OF CONTRACT, BREACH
OF WARRANTY, NEGLIGENCE, STRICT LIABILITY, MISREPRESENTATION AND OTHER TORTS.
Chapter 1: Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Data Archive Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Features. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Benefits. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Data Archive Use Cases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Data Archive for Performance. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Data Archive for Compliance. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Data Archive for Retirement. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Check Indexes for Segmentation Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Clean Up After Merge Partitions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Clean Up After Segmentation Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Collect Data Growth Stats. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Compress Segments Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Copy Data Classification Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Create Archive Cycle Index. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Create Optimization Indexes Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Create Audit Snapshot Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Create Archive Folder. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Create History Index. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Create History Table. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Create Indexes on Data Vault. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Create Materialized Views Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Create Seamless Data Access Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Create Seamless Data Access Script Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Create Seamless Data Access Physical and Logical Script. . . . . . . . . . . . . . . . . . . . . . . . 35
Copy Application Version for Retirement. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Copy Source Metadata Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Define Staging Schema Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Delete Indexes on Data Vault. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Detach Segments from Database Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Disable Access Policy Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Drop History Segments Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Drop Interim Tables Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Enable Access Policy Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Encrypt Data in Data Vault Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Encryption Report Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Export Data Classification Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Export Informatica Data Vault Metadata Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Data Vault Loader Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Generate Explain Plan Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Get Table Row Count Per Segment Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
IBM DB2 Bind Package. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Import Data Classification. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Import Informatica Data Vault Metadata Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Load External Attachments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Merge Archived History Data While Segmenting Production Job. . . . . . . . . . . . . . . . . . . . . 47
Merge Partitions Into Single Partition Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
Migrate Data Archive Metadata Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
Move External Attachments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
Move From Default Segment to History Segments Job. . . . . . . . . . . . . . . . . . . . . . . . . . . 50
Move From History Segment to Default Segments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
Move Segment to New Storage Class. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
Purge Expired Records Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
Recreate Indexes Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
Reindex on Data Vault Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
Refresh Materialized Views Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
Refresh Schema for Salesforce Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
Refresh Selection Statements Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
Replace Merged Partitions with Original Partitions Job. . . . . . . . . . . . . . . . . . . . . . . . . . . 57
Replace Segmented Tables with Original Tables Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
Restore External Attachments from Archive Folder. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
Sync with LDAP Server Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Test Email Server Configuration Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
Test JDBC Connectivity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
Test Scheduler. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
Turn Segments Into Read-only Mode. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
Unpartition Tables Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
Scheduling Jobs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
Job Logs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
Delete From Source Step Log. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
Monitoring Jobs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
Pausing Jobs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Delete From Source Step . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Searching Jobs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
Quick Search. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
Advanced Search. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
Chapter 7: Salesforce Archiving. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Salesforce Archiving Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Salesforce Archiving Process. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
Step 1. Install the Salesforce Accelerator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
Step 2. Configure Salesforce Permissions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
Step 3. Import Salesforce Metadata and Configure Entities. . . . . . . . . . . . . . . . . . . . . . . . . . . 82
Step 4. Create a Salesforce Source Connection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
Step 5. Create the Salesforce Archive Project. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
Refresh Schema for Salesforce Standalone Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
Salesforce Limitations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
Accounting Document Header. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
Accounting Document Line Item. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
Business Partner. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
SAP Archives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
Troubleshooting SAP Application Retirement. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
Message Format. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
Frequently Asked Questions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
Step 1. Download the Reports and Catalog. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
Step 2. Edit the Catalog Connection Details. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
Step 3. Create a Report Folder and Publish. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
Step 4. Create a Report Folder in the Target File System . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
Step 5. Copy the Reports and Catalog from the Source to Target Systems. . . . . . . . . . . . . . . . 136
Step 6. Map the Folder to the Target File System. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
Step 7. Select the Migrated Target Connection in the Patient Archives. . . . . . . . . . . . . . . . . . . 139
Troubleshooting Retirement Archive Projects. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
Chapter 10: Integrated Validation for Archive and Retirement Projects. . . . . 141
Integrated Validation Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
Integrated Validation Process. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
Row Checksum. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
Column Checksum. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
Validation Review. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
Validation Review User Interface. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
Validation Report Elements. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
Enabling Integrated Validation for a Retirement Project. . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
Enabling Integrated Validation for an Archive Project. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
Reviewing the Validation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
Running the Validation Report. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
Integrated Validation Standalone Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
Running the Integrated Validation Standalone Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
Retention Policy Changes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
Record Selection for Retention Policy Changes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
Retention Period Definition for Retention Policy Changes. . . . . . . . . . . . . . . . . . . . . . . . 166
Update Retention Policy Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
Viewing Records Assigned to an Archive Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
Changing Retention Policies for Archived Records. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
Purge Expired Records Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
Purge Expired Records Job Parameters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
Running the Purge Expired Records Job . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
Retention Management Reports. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
Retention Management when Archiving to EMC Centera. . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
Chapter 14: Data Discovery Portal. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
Data Discovery Portal Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
Search Data Vault. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
Examples of Search Queries. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
Searching Across Applications in Data Vault. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
Search Within an Entity in Data Vault. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
Search Within an Entity Parameters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
Saved Criteria. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
Searching Within an Entity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
Search Within an Entity Results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
Search Within an Entity Export Results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
Browse Data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
Browse Data Search Parameters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
Searching with Browse Data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
Export Data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
Legal Hold. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
Legal Hold Groups and Assignments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
Creating Legal Hold Groups. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
Applying a Legal Hold. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
Deleting Legal Hold Groups. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
Tags. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
Adding Tags. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
Updating Tags. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
Removing Tags. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
Searching Data Vault Using Tags. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
Browsing Data Using Tags. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
Troubleshooting the Data Discovery Portal. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
Deleting a User or Access Role From a Report. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
Assigning Permissions for Multiple Reports. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
Deleting Permissions for Multiple Reports. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
Running and Exporting a Report. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
Deleting a Report. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
Copying Reports. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
Copying SAP Application Retirement Reports. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
Troubleshooting. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
Designer Application. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
General Journal by Account Report. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
Supplier Customer Totals by Account Report . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
General Journal Batch Report. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
GL Trial Balance Report. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
GL Chart of Accounts Report. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
Accounts Receivable Reports. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
Open Accounts Receivable Summary Report. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
Invoice Journal Report. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
AR Print Invoice Report. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
Sales Order Management Reports. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
Print Open Sales Order Report. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
Sales Ledger Report. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
Print Held Sales Order Report. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
Inventory Reports. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
Item Master Directory Report. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
Address Book Reports. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258
One Line Per Address Report. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258
Saving the Reports to the Archive Folder. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
Running a Report. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
Chapter 19: Smart Partitioning. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
Smart Partitioning Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
Smart Partitioning Process. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
Dimensions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
Segmentation Groups. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
Data Classifications. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
Segmentation Policies. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
Access Policies. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
Smart Partitioning Example. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
Creating a Segment Set. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
Appendix D: Glossary. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
Index. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334
Preface
The Informatica Data Archive User Guide is written for the Database Administrator (DBA) who performs key
tasks involving data backup, restore, and retrieval using the Informatica Data Archive user interface. This
guide assumes you have knowledge of your operating systems, relational database concepts, and the
database engines, flat files, or mainframe systems in your environment.
Informatica Resources
Informatica Network
Informatica Network hosts Informatica Global Customer Support, the Informatica Knowledge Base, and other
product resources. To access Informatica Network, visit https://network.informatica.com.
To access the Knowledge Base, visit https://kb.informatica.com. If you have questions, comments, or ideas
about the Knowledge Base, contact the Informatica Knowledge Base team at
[email protected].
Informatica Documentation
To get the latest documentation for your product, browse the Informatica Knowledge Base at
https://kb.informatica.com/_layouts/ProductDocumentation/Page/ProductDocumentSearch.aspx.
If you have questions, comments, or ideas about this documentation, contact the Informatica Documentation
team through email at [email protected].
Informatica Product Availability Matrixes
Product Availability Matrixes (PAMs) indicate the versions of operating systems, databases, and other types
of data sources and targets that a product release supports. If you are an Informatica Network member, you
can access PAMs at
https://network.informatica.com/community/informatica-network/product-availability-matrices.
Informatica Velocity
Informatica Velocity is a collection of tips and best practices developed by Informatica Professional
Services. Developed from the real-world experience of hundreds of data management projects, Informatica
Velocity represents the collective knowledge of our consultants who have worked with organizations from
around the world to plan, develop, deploy, and maintain successful data management solutions.
If you are an Informatica Network member, you can access Informatica Velocity resources at
http://velocity.informatica.com.
If you have questions, comments, or ideas about Informatica Velocity, contact Informatica Professional
Services at [email protected].
Informatica Marketplace
The Informatica Marketplace is a forum where you can find solutions that augment, extend, or enhance your
Informatica implementations. By leveraging any of the hundreds of solutions from Informatica developers
and partners, you can improve your productivity and speed up time to implementation on your projects. You
can access Informatica Marketplace at https://marketplace.informatica.com.
Informatica Global Customer Support
To find your local Informatica Global Customer Support telephone number, visit the Informatica website at
the following link:
http://www.informatica.com/us/services-and-training/support-services/global-support-centers.
If you are an Informatica Network member, you can use Online Support at http://network.informatica.com.
Chapter 1
Introduction
This chapter includes the following topics:
• Data Archive Overview, 19
• Data Archive Use Cases, 20
Features
• Enterprise archive engine for use across multiple databases and applications
• Streamlined flow for application retirement
• Analytics of data distribution patterns and trends
• Multiple archive formats to meet different business requirements
• Seamless application access to archived data
• Data Discovery search of offline archived or retired data
• Accessibility or access control of archived or retired data
• Retention policies for archived or retired data
• Purging of expired data based on retention policies and expiration date
• Catalog for retired applications to view retired application details
• Complete flexibility to accommodate application customizations and extensions
Benefits
• Improve production database performance
• Maintain complete application integrity
• Comply with data retention regulations
• Retire legacy applications
• High degree of compression reduces the application storage footprint
• Enable accessibility to archived data
• Immutability ensures data authenticity for compliance audits
• Reduce legal risk for data retention compliance
With Enterprise Data Manager, accelerators for Oracle E-Business Suite, PeopleSoft, and Siebel applications
are available to “jump start” your compliance implementation. You can also use Enterprise Data Manager for
further metadata customization.
Chapter 2
Port Number
Port number of the Data Archive web application.
Web Application Name
Name of the Data Archive web application, typically Informatica. Required if you do not use embedded
Tomcat. Omit this portion of the URL if you use embedded Tomcat.
When you launch the URL, the Data Archive login page appears. Enter your user name and password to log in.
You can log in again from Home > Login Again and log out from Home > Logout.
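For illustration only, assuming a host named archivehost and port 8080 (both placeholder values) and the
typical web application name, the login URL would look like the following:
http://archivehost:8080/informatica
If you use embedded Tomcat, omit the web application name, for example http://archivehost:8080.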
Update User Profile
View and edit your user profile from Home > User Profile. You can update your name, email address, and password.
You can also view the system-defined roles assigned to you.
Note: The Standard edition of Enterprise Data Management Suite allows the user to select only a single ERP,
whereas the Enterprise edition encompasses all supported platforms.
Organization of Menus
The functionality of Data Archive is accessible through a menu system that consists of seven top-level menus
and their respective submenus.
Note: Some of these menu options might not be available in the Standard edition of Data Archive. System-
defined roles restrict the menu paths that are available to users. For example, only users with the discovery
user role have access to the Data Discovery menu.
Predefined List of Values (LOVs)
Most tasks performed from the Data Archive user interface involve selecting a value from a list. Such lists
are abbreviated as LOVs, and their presence is marked by the LOV button, which is usually followed by an
asterisk (*) to indicate that a value is mandatory for the preceding text box.
Each list is generated internally by a query constructed from constraints specified on tables through the
Enterprise Data Manager.
Mandatory Fields
Throughout the application, a red asterisk (*) indicates mandatory information that is required for
subsequent processes.
Clicking Edit usually opens the Create/Edit page for the corresponding element in the section, while clicking
Delete prompts you to confirm before the element is deleted permanently.
Navigation Options
The breadcrumb link on each page indicates your current location in the application.
When the number of elements in a section exceeds ten rows (for example, roles in the Role Workbench),
navigation options appear so that you can go to the first, previous, next, and last page.
You can change the number of rows that are displayed per page.
The total number of elements (rows) in a section appears at the end of the page.
Additionally, a Collapse All button to the left of each row indicates that you can click it to view further detail.
Online Help
The Data Archive user interface includes a context-sensitive online help system that you can access by
clicking the Help button in the top right corner of the screen.
Scheduling Jobs
This chapter includes the following topics:
Standalone Jobs
A standalone job is a job that is not linked to a Data Archive project. Standalone jobs are run to accomplish a
specific task.
You can run standalone jobs from Data Archive. Or, you can use a JSP-API to run standalone jobs from
external applications.
Before you run this job, run the Get Table Row Count Per Segment job to estimate the size of the segments
that the smart partitioning process will create. Then run the Allocate Datafiles to Mount Points job before you
create a segmentation policy.
Provide the following information to run this job:
SourceRep
Source connection for the segmentation group. Choose from available source connections.
SegmentationGroupName
Name of the segmentation group that contains the datafiles you want to allocate. Choose from available
segmentation groups.
File Size(MB)
Increment by which the datafiles will automatically extend. Default is 1000 MB.
Max Size(MB)
Maximum size of the datafiles you want to allocate. Default is 30000 MB.
IndexCreationOption
Type of index that will be created for the segment tablespaces. Select global, local, or global for unique
indexes.
Archive Structured Digital Records Job
In the job step log on the Monitor Jobs page, you can view a row count report for the records that were
successfully loaded to Data Vault.
The date field in the digital record must be in the format yyyy-MM-dd-HH.mm.ss.SSS. For example:
2012-12-27-12.11.10.123.
After you run the job, assign a Data Vault access role to the entity. Then, assign users to the Data Vault
access role. Only users that have the same role assignment as the entity can use Browse Data to view the
structured digital records in Data Discovery.
Metadata File
The metadata file is an XML file that contains the details of the entity that the job creates. The file also
contains the structure of the digital records, such as the column and row separators, and the table columns
and datatypes.
The following example shows a sample template file for the structured metadata file:
<?xml version="1.0" encoding="UTF-8"?>
<ATTACHMENT_DESCRIPTOR>
<ENTITY>
<APP_NAME>CDR</APP_NAME>
<APP_VER>1.10</APP_VER>
<MODULE>CDR</MODULE>
<ENTITY_NAME>CDR</ENTITY_NAME>
<ENTITY_DESC>CDR Records</ENTITY_DESC>
You can also indicate whether or not a column is nullable. Use the following syntax: <COLUMN NULLABLE="N">
</COLUMN> or <COLUMN NULLABLE="Y"></COLUMN>.
For example:
<TABLE NAME="SAMPLE">
</TABLE>
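For illustration, the NULLABLE attribute is set on COLUMN elements nested inside the TABLE element, as in the
following sketch. The table name is hypothetical, and the column name and datatype details that a complete
metadata file contains are omitted here:
<TABLE NAME="SAMPLE">
<COLUMN NULLABLE="N"><!-- column definition details omitted; they depend on your metadata template --></COLUMN>
<COLUMN NULLABLE="Y"></COLUMN>
</TABLE>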
The following table describes the job parameters for the Archive Structured Digital Records job:
Metadata File
XML file that contains the details of the entity that the job creates and the structure of the digital records,
such as the column and row separators, and the table columns and datatypes.
Target Archive Store
Destination where you want to load the structured digital records. Choose the corresponding Data Vault
target connection. The list of values is filtered to only show Data Vault target connections.
Purge After Load
Determines whether the job deletes the attachments from the staging directory that is configured in the
Data Vault target connection.
- Yes. Deletes attachments from the directory.
- No. Keeps the attachments in the directory. You may want to select no if you plan to manually delete the
attachments after you run the standalone job.
The attach segments job does not move the .dmp and database files for the transportable table space. When
you run the job to reattach the segments, you must manually manage the required .dmp and database files
for the table space.
Provide the following information to run this job:
SourceRep
Source connection for the segmentation group. Choose from available source connections.
SegmentationGroupName
Name of the segmentation group that contains the segments you want to attach. Choose from available
segmentation groups.
SegmentSetName
Name of the segment set that you want to attach to the database.
DmpFileFolder
DmpFileName
Check Indexes for Segmentation Job
Table indexes can help optimize smart partitioning performance and significantly reduce smart partitioning
run time. Typically smart partitioning uses the index predefined by the application. If multiple potential
indexes exist in the application schema for the selected table, the Check Indexes for Segmentation job
determines which index to use. You must register the index after you add the table to the segmentation
group.
Before you run the Check Indexes for Segmentation job, you must configure a segmentation policy and
generate metadata for the segmentation group. Save the segmentation policy as a draft and run the job
before you run the segmentation policy.
SourceRep
Source connection for the segmentation group. Choose from available source connections.
SegmentationGroupName
Name of the segmentation group that you want to check indexes for. Choose from available segmentation
groups.
Related Topics:
• “Smart Partitioning Segmentation Policies Overview” on page 282
• “Segmentation Policy Process” on page 283
• “Creating a Segmentation Policy” on page 287
Clean Up After Merge Partitions
Only run the clean up job after you have merged all of the segments that you want to merge for a data
classification. After you run the clean up after merge partitions job, you cannot replace the merged partitions
with the original partitions.
SourceRep
Source connection for the segmentation group. Choose from available source connections.
DataClassificationName
Name of the data classification that contains the segments that you merged. Choose from available data
classifications.
Clean Up After Segmentation Job
SourceRep
Source connection for the segmentation group. Choose from available source connections.
SegmentationGroupName
Name of the segmentation group that you ran a segmentation policy on. Choose from available segmentation
groups.
DropManagedRenamedTables
Default is No.
Compress Segments Job
SourceRep
Source connection for the segmentation group. Choose from available source connections.
SegmentationGroupName
Name of the segmentation group containing the set of segments you want to compress. Choose from
available segmentation groups.
SegmentCompressOption
RemoveOldTablespaces
Default is yes.
SegmentSetName
Name of the segment set you want to compress. Choose from available segments sets.
LoggingOption
The option to log your changes. If you select logging, you can revert your changes.
Default is NOLOGGING.
ParallelDegree
Default is 4.
Copy Data Classification Job
From SourceRep
Source connection where the data classification you want to copy exists. Choose from available source
connections.
Data Classification
The data classification that you want to copy to a different source connection. Choose from available data
classifications.
To SourceRep
The source connection that you want to copy the data classification to. Choose from available source
connections.
Create Optimization Indexes Job
SourceRep
Source connection for the segmentation group. Choose from available source connections.
SegmentationGroupName
Name of the segmentation group that you want to create an optimization index for. Choose from available
segmentation groups.
Create Audit Snapshot Job
SourceRep
Source connection for the segmentation group. Choose from available source connections.
SegmentationGroupName
Name of the segmentation group that you want to audit. Choose from available segmentation groups.
Description
CountHistory
Default is no.
• Destination Repository. Archive folder you want to load the archive data to. Choose the archive folder
from the list of values. The target connection name appears before the archive folder name.
• Output location of generated script file. The output location of the generated script file.
• Source repository. Database with production data.
• Target repository. Database with archived data.
Create History Table
The Create History Table job creates table structures for all tables that will be involved in seamless data
access for a data source and data target. Run this job prior to running the Create Seamless Data Access job.
Note: If a second archive job execution moves additional tables from the same data source, seamless data access is not possible for the archived data unless you run the Create History Table job again.
• Output location of generated script file. The output location of the generated script file.
• Source repository. Database with production data.
• Target repository. Database with archived data.
If you add or remove columns from the search index, delete the search indexes first. Then run the Create
Indexes on Data Vault job to create the new indexes. Alternatively, you can run the Reindex on Data Vault
standalone job, which runs the Delete Indexes on Data Vault job and the Create Indexes on Data Vault job
back-to-back.
If the job fails while creating the indexes, some table records are indexed multiple times when you resume
the job. To remove the duplicate indexing, run the Delete Indexes on Data Vault job and select the connection
and table from the list of values. After the Delete Indexes on Data Vault job is complete, run the Create
Indexes on Data Vault job again with the same connection and table selected for the job.
Destination Repository
Entity
Optional. The name of the entity for the table with the columns you want to add or remove from the
search index.
Table
Optional. The name of the table with the columns you want to add or remove from the search index.
Verify that you have selected the columns for indexing in the Enterprise Data Manager. If a table does
not have any columns selected for indexing, the job enters a warning state.
Destination Repository
The archive folder in the Data Vault where you retired the application data. Click the list of values button and
select from the available folders.
Entity
View
The name of the view. If you selected a value for the entity parameter, the entity value overrides the view
parameter. This field is optional.
When you run the job, the job creates a script and stores the script in the location that you specified in the job
parameters. The job uses the following naming convention to create the script file:
SEAMLESS_ACCESS_<Job ID>.SQL
The job uses the parameters that you specify to create the script statements. The script can include
statements to create one of the following seamless access views:
If you want to create both the combined view and the query view, run the job once for each schema.
Create Seamless Data Access Script Job Parameters
Provide the following information to run the Seamless Data Access Script job:
Source Repository
Archive source connection name. Choose the source connection for the IBM DB2 database that stores
the source or production data.
Destination Repository
Archive target connection name. Choose the target connection for the IBM DB2 database that stores the
archived data.
Combined Schema Name
Schema in which the script creates the seamless access view of both the production and the archived data.
Note that the job creates statements for one schema only. Configure either the Combined Schema Name
parameter or the Query Schema Name parameter.
Query Schema Name
Schema in which the script creates the view for the archived data.
Note that the job creates statements for one schema only. Configure either the Combined Schema Name
parameter or the Query Schema Name parameter.
• Source
• Destination
The combined and query schemas may exist on the source database if a low percentage of the source
application data is archived and a high percentage of data remains on the source.
The combined and query schemas may exist on the target location if a high percentage of the source application data is archived and a low percentage of data remains on the source.
Generate Script
Determines if the job generates a script file. Always choose Yes to generate the script.
Database Link
Default is NONE. Do not remove the default value. If you remove the default value, then the job might fail.
Script Location
Location in which the job saves the script. Enter any location on the machine that hosts the ILM
application server.
SourceRep
The IBM DB2 for AS/400 source connection that stores the source or production data.
DestRep
The IBM DB2 for AS/400 target connection that stores the archived data.
Script Location
Location where you want to save the scripts that the job creates. This location must be accessible to the ILM
application server.
After the job completes successfully, download the script files from the Monitor Jobs page. To download the
script files, expand the Job ID and click the Download Physical and Logical Script link. Then extract the
downloaded file, which is named ScriptGeneration_xx.zip.
Within the extracted folder, the bin folder contains a connections properties file. Update the parameters to be
specific to your environment. The lib_name and file_name parameters should be eight characters or less.
After you edit the connections file, run the GenerateScriptForDb2AS400.bat file on a Microsoft Windows
operating system or the GenerateScriptForDb2AS400.sh file on a UNIX/Linux operating system. When the
shell script or batch file runs, it creates the seamless access scripts in the location that you specified in the
job parameters.
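For example, on a UNIX or Linux system, the sequence might look like the following sketch. The job ID "12" and the name of the connections properties file are placeholders; only the ScriptGeneration_xx.zip naming, the bin folder, and the GenerateScriptForDb2AS400.sh file name come from this guide.
# Extract the downloaded archive (12 is a placeholder job ID).
unzip ScriptGeneration_12.zip -d ScriptGeneration_12
cd ScriptGeneration_12/bin
# Edit the connections properties file for your environment before you continue.
# Keep the lib_name and file_name parameters to eight characters or less.
vi connections.properties
# Generate the seamless access scripts.
sh GenerateScriptForDb2AS400.sh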
For more information, see the Informatica Data Archive Administrator Guide.
When you run the Copy Application Version for Retirement job, the job creates a customer-defined application
version. The job copies all of the metadata to the customer-defined application version. The job copies the
table column and constraint information. After you run the job, customize the customer-defined application
version in the Enterprise Data Manager. You can create entities or copy pre-packaged entities and modify the
entities in the customer-defined application version. If you upgrade at a later time, the upgrade does not
change customer-defined application versions.
• Product Family Version. The pre-packaged application version that you want to copy.
• New Product Family Version Name. A name for the application version that the job creates.
Copy Source Metadata Job
The copy source metadata job copies the metadata from one ILM repository to another ILM repository on a
cloned source connection.
You might need to clone a production database and still want to manage the database with smart
partitioning. Or, you might want to clone a production database to subset the database. Run the copy source
metadata job if you want to use the same ILM repository information on the cloned database as the original
source connection.
Before you run the job, clone the production database and create a new source connection in the Data
Archive user interface. Then, run the define staging schema standalone job or select the new staging schema
from the Manage Segmentation page.
The staging schema of the source repository that you want to copy metadata from.
To Source Repository
The staging schema of the source repository that you want to copy metadata to.
Source connection for the segmentation group. Choose from available source connections.
DefinitionOption
The choice to define or update the staging schema. If you choose define staging schema, the ILM Engine
performs a check to determine if the staging schema has been configured previously. If you choose update
staging schema, the job updates the staging schema regardless of whether it has been previously configured.
Alternatively, you can run the Reindex on Data Vault standalone job, which runs the Delete Indexes on Data
Vault job and the Create Indexes on Data Vault job back-to-back.
To run the Delete Indexes on Data Vault job, a user must have one of the following system-defined roles:
• Administrator
• Retention Administrator
• Scheduler
Destination Repository
The Data Vault connection configured to the specific archive folder that you want to delete indexes on.
Choose either the specific connection or "all connections" to delete indexes on all available Data Vault
connections.
Entity
Table
If you run the detach segments from database job, you can no longer run the replace segmented tables with
original tables job. You cannot detach or reattach the default segment.
Source connection for the segmentation group. Choose from available source connections.
SegmentationGroupName
Name of the segmentation group that contains the segments you want to detach. Choose from available
segmentation groups.
SegmentSetName
Name of the segment set that you want to detach from the database.
DmpFileFolder
Directory where you want to save the detached segment data files.
DmpFileName
Source connection for the segmentation group. Choose from available source connections.
SegmentationGroupName
Name of the segmentation group that you want to disable the access policy on. Choose from available
segmentation groups.
Drop History Segments Job
The drop history segments job drops the history segments that the smart partitioning process creates when
you run a segmentation policy. If you want to create a subset of history segments for testing or performance
purposes, you must drop the original history segments.
Before you create a subset of the history segments, you must clone both the application database and the
ILM repository. You must also create a copy of the ILM engine and edit the conf.properties file to point to
the new ILM repository. Then you can create a segment set that contains the segments you want to drop and
run the drop history segments job.
Provide the following information to run this job:
SourceRep
Source connection for the segmentation group. Choose from available source connections.
SegmentationGroupName
Name of the segmentation group that contains the history segments you want to drop. Choose from available
segmentation groups.
SegmentSetName
Name of the segment set that contains the history segments you want to drop.
GlobalIndexTablespaceName
EstimatePercentage
Default is 10.
ParallelDegree
The degree of parallelism the ILM Engine uses to drop history segments.
Default is 4.
Source connection for the segmentation group. Choose from available source connections.
SegmentationGroupName
Name of the segmentation group that contains the interim tables you want to drop. Choose from available
segmentation groups.
Source connection for the segmentation group. Choose from available source connections.
Name of the segmentation group with the access policy you want to enable. Choose from available
segmentation groups.
DbPolicyType
Type of the row-level security policy on the application database. Choose from context sensitive, dynamic, or
shared context sensitive.
Typically when you archive or retire data to the Data Vault, you enable data encryption when you create the
archive or retirement project. If you archive or retire to the Data Vault without enabling data encryption, you
can use the encrypt data in Data Vault job to encrypt the data at a later time. You can also use the job to
rotate encryption keys for data that has already been encrypted, either during the initial project run or with
this standalone job.
Access Roles
To view and schedule the encrypt data in Data Vault job, a user must have both the encryption user role and a role that allows the user to schedule jobs, such as the scheduler role. The encryption user role is assigned to the Administrator role by default. For more information on system-defined roles, see the chapter "Security" in the Data Archive Administrator Guide.
Conf.properties File
You can configure the following properties in conf.properties:
informia.encryption.RegistrationThreadCount
Defines the number of registration threads used to encrypt the Data Vault data. Default value is 5.
informia.encryption.WorkerThreadCount
Defines the number of worker threads used to encrypt the Data Vault data. Default value is 10.
informia.encryption.SCTRegistrationType
Defines whether the SQL or admin registers the encrypted .SCT files in Data Vault. Valid property values are SQL and admin. Default is SQL.
informia.encryption.DeleteOldDataFiles
Specifies whether or not to delete the original unencrypted data files during the encryption job. Valid property values are Y and N. Default value is N.
If you set this property to N, the encrypt job unregisters but does not delete the original unencrypted .SCT files. The job also creates new, encrypted data files and registers those files. Because the original files are still present in their physical location, the total size of the .SCT files will double.
If you set this property to Y, the job deletes the original files during the encryption job.
There is no automated process to validate the encrypted data against the original data. You must manually validate the encrypted data against the original data. After you validate the data, refer to the steps in the "Restoring and Deleting Old .SCT files" section below for cleanup instructions.
Informatica recommends setting this property to N, so that you can manually validate the encrypted data.
informia.encryption.IDVDeBugMode
Specifies whether or not Data Vault runs the encryption job in debug mode. Valid property values are Y and N. Default value is N.
Debug mode provides more troubleshooting information if the encryption job fails.
For more information on the conf.properties file, see the Data Archive Administrator Guide.
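For reference, the defaults above correspond to entries such as the following in conf.properties. The property names and default values are taken from the list above; adjust the values for your environment.
informia.encryption.RegistrationThreadCount=5
informia.encryption.WorkerThreadCount=10
informia.encryption.SCTRegistrationType=SQL
informia.encryption.DeleteOldDataFiles=N
informia.encryption.IDVDeBugMode=N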
Job Properties
Archive Store
The Data Vault target connection that contains the data that you want to encrypt. The job encrypts all of the
tables in the archive folder that is defined in the target connection that you select. To encrypt the data on all
available Data Vault connections, you can also choose "All Connections."
Encryption Key
Type of encryption key used to encrypt the data. You can choose either the random key generator provided by
Informatica, or a third-party key generator. When you choose the third-party key generator, you must configure
the property "informia.encryptionkey.command" in the conf.properties file. For more information on
random and third-party encryption keys, see the chapters "Creating Retirement Archive Projects" and "Creating
Data Archive Projects" in the Data Archive User Guide.
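For example, a third-party key generator might be configured with a line like the following in conf.properties. The property name comes from this guide; the command path is purely a placeholder for your own key generation command.
informia.encryptionkey.command=/opt/keygen/generate_key.sh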
Rotate Keys
Select Yes or No. When you select yes, the job generates a new key used to encrypt the data. The data is
encrypted with the new key.
Parallel Jobs
The encrypt data in Data Vault job is a single-thread job. You cannot run two encrypt jobs in parallel on the
same Data Vault connection, nor can you run a job to encrypt all available connections and then start a
second job on any connection. You can run two encrypt jobs in parallel provided they are running on different
Data Vault connections. This design preserves the consistency of Data Vault metadata. When the encrypt
data in Data Vault job runs, it unregisters the original unencrypted .SCT files in the Data Vault and registers
the new encrypted files with the suffix "_ENC" appended to the file name. If you run the job to rotate
encryption keys, the original encrypted .SCT files are unregistered.
In rare circumstances you may have the same Data Vault archive folder available on two different Data Vault
connections. Because the underlying Data Vault database is the same, you cannot run parallel encryption
jobs on this archive folder even though it exists on different connections.
In addition to parallel encryption jobs, you cannot simultaneously run any job that updates Data Vault
metadata while an encryption job is running on the same connection. If you try to run any job that updates
Data Vault metadata while an encryption job is running on the same connection, or all connections, the
second job fails with an error. You can run a job that updates Data Vault metadata in parallel with an
encryption job only if the encryption job is running on a different connection and archive folder than the
second job.
The following list contains the specific jobs that you cannot run while an encryption job runs on the same
target connection:
• Archive crawler
• Move external attachments
• Update retention policy
• Delete expired records
• Purge expired records
• Add tagging
• Update tagging
• Remove tagging
• Apply legal hold
• Remove legal hold
• Load external attachments
• Restore external attachments from archive folder
• Audit log archive loader
• Archive structured digital records
• Restore file archive
• Create materialized views
• Refresh materialized views
• Release of information index record
The encryption job creates logs inside of the <ILM_HOME>/logs/ folder with the following directory structure:
<ILM_HOME>/logs/encryption_logs/<Job ID>/<Repository ID>/<Logs_entry_ID>. The <Logs_entry_ID>
directories contain a log for each table in the archive folder that was encrypted during the job. To find the
table associated with a particular entry ID, you can query the ILM repository table
"xa_encryption_job_status." The table contains the column ENTRY_ID for each table in the encryption job.
The table also has a STATUS column to indicate the encryption status of a particular table. The STATUS
column contains either C for complete, P for pending, or E for error.
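For example, to list the tables that are still pending or in error, you might run a query such as the following against the ILM repository. The table name and the ENTRY_ID and STATUS columns are documented above; no other columns are assumed.
-- Encryption entries that are not yet complete (C = complete, P = pending, E = error).
SELECT ENTRY_ID, STATUS
FROM xa_encryption_job_status
WHERE STATUS IN ('P', 'E');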
An encryption job status report is also available in the Data Archive user interface, on the Monitor Jobs page.
Expand the encryption job and then the job details to display the link that launches the PDF status report. The
report is dynamic, so you can launch the report and track the progress of the job as it runs.
Restore Process
The job generates restore scripts every time you run it. The scripts are stored in the following location:
<ILM_HOME>/webapp/file_archive/restore/. The job creates one folder for each table, with the folder
structure JOBID/repid/tableid.
If necessary, you can restore the previous .SCT files after the encryption job is complete. To restore the
previous .SCT files, run the following commands from the Plugin folder:
1. Go to <ILM_HOME>/webapp/file_archive.
2. Source the fasplugin.sh script:
. ./fasplugin.sh <ILM_HOME>/webapp/file_archive
3. From <ILM_HOME>/webapp/file_archive, run the ssaencrypt command:
ssaencrypt -G -F <ILM_HOME>/webapp/file_archive/restore/JOBID/repid/tableid/<Restore_YYYY_MM_DD_hh_mm_ss.adm> -P <PORT> -Q <AdminPort> -H <IDV Hostname> -u <IDV user/password>
The command will unregister the new encrypted .SCT files, register the old .SCT files, and delete the new
encrypted .SCT files.
Note: When the .SCT files are located on external storage, the files will not be deleted if they are under
retention.
The Encryption Report job does not require you to provide any information. You must have the Encryption
User role to access the encryption report.
Source connection where the data classification exists. Choose from available source connections.
Data Classification
Name of the data classification you want to export. Choose from available data classifications.
The directory location where you want to export the data classification XML file to.
Required. Name of the source Data Vault archive folder that you want to migrate.
Required. Full path of the location in the target Data Vault system that stores the SCT data files.
Required. Full path of the temporary directory on the source Data Archive system that stores the run-time files,
the log file, and the .tar file. This directory must be accessible to the source Data Archive system.
Optional for a local file system. Required when the -n directory is an external storage directory. Full path of
the location in the target external storage system that stores the SCT data files that you want to import to the
external storage system.
Required. If you select Yes, the job keeps the original SCT data file location.
Required. If you select yes, the SCT files are copied to the export file directory along with the run-time files, log file, and .tar file.
Export Materialized View Statements
Required. If you select yes, any materialized view statements on the source will be exported during the job run.
When you run the Import Informatica Data Vault Metadata job later in the process, the materialized views will
be recreated on the target.
Data Vault Loader Job
When you publish an archive project to archive data to the Data Vault, Data Archive moves the data to the
staging directory during the Copy to Destination step. You must run the Data Vault Loader standalone job to
complete the move to the Data Vault. The Data Vault Loader job executes the following tasks:
• Archive Job ID. Job ID generated after you publish an archive project. Locate the archive job ID from the
Monitor Job page.
Note: Data Vault supports a maximum of 4093 columns in a table. If the Data Vault Loader tries to load a
table with more than 4093 columns, the job fails.
You must create a segmentation group in the Enterprise Data Manager and configure a segmentation policy
in the Data Archive interface before you run the Generate Explain Plan job. Save the segmentation policy as a
draft and run the job before you run the segmentation policy.
If you enabled interim table processing to pre-process business rules for the segmentation group, you must
run the Pre-process Business Rules job before you run the Generate Explain Plan job.
SourceRep
Source connection for the segmentation group. Choose from available source connections.
SegmentationGroupName
Name of the segmentation group you want to generate the explain plan for. Choose from available
segmentation groups.
SelectionStatementType
OutFileFullName
Default is no.
If you enabled interim table processing to pre-process business rules for the segmentation group, you must
run the Pre-process Business Rules job before you run the Get Table Row Count Per Segment job.
Provide the following information to run this job:
SourceRep
Source connection for the segmentation group. Choose from available source connections.
SegmentationGroupName
Name of the segmentation group that you want to estimate segment row counts for. Choose from available
segmentation groups.
SelectionStatementType
• DB2 Host Address. Name or IP address of the server that hosts the IBM DB2 source.
• DB2 Port Number. Port number of the server that hosts the IBM DB2 source.
• DB2 Location / Database Name. Location or name of the database.
• User ID. User that logs in to the database and runs the bind process on the source. Choose a user that has
the required authorizations.
• User Password. Password for the user ID.
To successfully import a data classification, you must create identical dimensions in the Enterprise Data
Manager for both ILM repositories. The dimensions must have the same name, datatype, and dimension type.
Source connection that you want to import the data classification to. Choose from available source
connections.
The complete file path, including the XML name, of the data classification XML file that you want to import.
Import Informatica Data Vault Metadata Job
Run the Import Informatica Data Vault Metadata job as part of the application migration process. The job
imports metadata to the target Data Vault instance and registers the SCT files. For more information about
the application migration process, see the "Creating Retirement Archive Projects" chapter.
Required. The connection that links to the target Data Vault instance.
The Export Data Vault Metadata job ID (from the source Data Archive environment) and the Data Vault folder name that you want to migrate to the target Data Vault.
Required. The directory of the .tar file exported by the Export Informatica Data Vault Metadata job. This
directory must be accessible to the target Data Archive instance.
The job creates an application module in EDM, called EXTERNAL_ATTACHMENTS. Under the application
module, the job creates an entity and includes the AM_ATTACHMENTS table. You specify the entity in the job
parameters. The entity allows you to use Browse Data to view the external attachments in Data Discovery
without a stylesheet.
After you run the job, assign a Data Vault access role to the entity. Then, assign users to the Data Vault
access role. Only users that have the same role assignment as the entity can use Browse Data to view the
attachments in Data Discovery. A stylesheet is still required to view attachments in the App View.
You can run the job for attachments that you did not archive yet or for attachments that you archived in a
previous release. You may want to run the job for attachments that you archived in a previous release to use
Browse Data to view the attachments in Data Discovery.
If the external attachments you want to load have special characters in the file name, you must configure the
LANG and LC_ALL Java environment variables on the operating system where the ILM application server
runs. Set the LANG and LC_ALL variables to the locale appropriate to the special characters. For example, if
the file names contain United States English characters, set the variables to en_US.utf-8. Then restart the
ILM application server and resubmit the standalone job.
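For example, on a Linux system where the file names contain United States English characters, you might set the variables as follows before you restart the ILM application server. The export syntax is standard shell; the locale value comes from the example above.
# Set the locale so that special characters in attachment file names are handled correctly.
export LANG=en_US.utf-8
export LC_ALL=en_US.utf-8
# Restart the ILM application server, then resubmit the standalone job.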
The Load External Attachments job includes the following parameters:
Attachment Entity Name
Entity name that the job creates. You can change the default to any name. If the entity already exists, the job does not create another entity.
Default is AM_ATTACHMENT_ENTITY.
Target Archive Store
Data Vault destination where you want to load the external attachments. Choose the corresponding archive target connection.
For upgrades, choose the target destination where you archived the attachments to.
Purge After Load
Determines whether the job deletes the attachments from the directory that you provide as the source directory.
- Yes. Deletes attachments from the directory.
- No. Keeps the attachments in the directory. You may want to select no if you plan to manually delete the attachments after you run the standalone job.
For example, a previously archived table may contain accounts receivable transactions from 2008 and 2009.
You want to merge this data with a segmentation group you have created that contains accounts receivable
transactions from 2010, 2011, and 2012, so that you can create segments for each of the five years. Run this
job to combine the archived data with the accounts receivable segmentation group and create segments for
each year of transactions.
The merge archived history data job applies the business rules that you configured to the production data.
Run smart partitioning on the history data separately before you run the job.
If you choose not to merge all of the archived history data into its corresponding segments in the production
database, the Audit Compare Snapshots job returns a discrepancy that reflects the missing rows in the row
count report.
SourceRep
Source connection for the segmentation group. Choose from available source connections.
SegmentationGroupName
Name of the segmentation group that you want to merge with archived data. Choose from available
segmentation groups.
LoggingOption
Option to log your changes. If you select logging, you can revert your changes.
Default is no.
AddPartColumnToNonUniqueIndex
Default is no.
IndexCreationOption
Type of index the smart partitioning process creates for the merged and segmented tables.
Default is global.
BitmapIndexOption
Type of bitmap index the smart partitioning process creates for the merged and segmented tables.
GlobalIndexTablespaceName
Name of the tablespace that you want the global index to exist in. This field is optional.
CompileInvalidObjects
Default is no.
ReSegmentFlag
Default is no.
CheckAccessPolicyFlag
Default is yes.
ParallelDegree
Degree of parallelism used when the smart partitioning process creates segments.
Default is 4.
DropTempTablesBeforeIndexRebuild
Drops the temporary tables before rebuilding the index after segmentation.
Default is no.
DropTempSelTables
Default is yes.
ArchiveRunIdView
ArchiveRunIdColumn
SourceRep
Source connection for the segmentation group. Choose from available source connections.
SegmentationGroupName
Name of the segmentation group that contains the segments that you want to merge. Choose from available
segmentation groups.
SegmentSetName
LoggingOption
The option to log your changes. If you select logging, you can revert your changes.
Default is NOLOGGING.
ParallelDegree
FinalMergeJob
Parameter that designates whether or not the job is the final merge job that you want to run for a segmentation group. If you have multiple segment sets that you want to merge in a segmentation group, select "No" unless the job merges the final segment set in that group. Select "Yes" if the job is the last merge job that you want to run for the segmentation group.
MergedPartitionTablespaceName
Leave empty to use the existing tablespace. You can enter a new tablespace for the merged segment, but you
must create the tablespace before running the job.
Required. The product family version/application version that you want to migrate.
Required. If you select no, the job migrates only the retirement entity-related roles. If you select yes, the job
migrates users related to the retirement entity-related roles, in addition to other roles related to the users.
Migrate Export Migration Status Table Only
Required. If you select yes, the job migrates only the export migration status table.
• You enabled Move Attachments in Synchronous Mode on the Create or Edit an Archive Source page.
• The entities support external attachments.
• Add on URL. URL to the Data Vault Service for external attachments.
• Archive Job ID. Job ID generated after you publish an archive project. Locate the archive job ID from the
Monitor Job page.
If you enabled interim table processing to pre-process business rules for the segmentation group, you must
run the Pre-process Business Rules job before you run the Move from Default Segment to History Segments
job.
Source connection for the segmentation group. Choose from available source connections.
SegmentationGroupName
Name of the segmentation group that contains the default and history segments that you want to move
transactions between. Choose from available segmentation groups.
SegmentSetName
Name of the segment set that you want to move from default to history.
BatchSize
Default is 100000.
RefreshSelectionStatements
Refreshes the metadata constructed as base queries for the partitioning process. If you have updated table
relationships for the segmentation group in the Enterprise Data Manager, select Yes to refresh the base
queries.
Default is no.
CheckAccessPolicyFlag
Default is yes.
AnalyzeTables
Calls the Oracle API to collect database statistics at the end of the job run.
Default is no.
DropTempSelTables
Drops the temporary selection tables after the transactions are moved.
Select yes.
If you enabled interim table processing to pre-process business rules for the segmentation group, you must
run the Pre-process Business Rules job before you run the Move from History Segment to Default Segments
job.
Source connection for the segmentation group. Choose from available source connections.
SegmentationGroupName
Name of the segmentation group that contains the history segments you want to move data from. Choose
from available segmentation groups.
SegmentSetName
Name of the segment set that you want to move from history to default.
BatchSize
Default is 100000.
RefreshSelectionStatements
Refreshes the metadata constructed as base queries for the partitioning process. If you have updated table
relationships for the segmentation group in the Enterprise Data Manager, select Yes to refresh the base
queries.
Default is no.
CheckAccessPolicyFlag
Default is yes.
AnalyzeTables
Calls the Oracle API to collect database statistics at the end of the job run.
Default is no.
DropTempSelTables
Drops the temporary selection tables after the transactions are moved.
Default is yes.
Move Segment to New Storage Class
The move segment to new storage class job moves the segments in a segmentation group from one storage
classification to another.
When you move a segment to a new storage classification, some data files might remain in the original
storage classification tablespace. You must remove these data files manually.
If the segment and new storage classification are located on an Oracle 12c database, the job moves the
segment without taking the tablespace offline.
SourceRep
Source connection for the segmentation group. Choose from available source connections.
TablespaceName
Name of the tablespace that contains the new storage classification. Choose from available tablespaces.
StorageClassName
Name of the storage classification that you want to move the segments to. Choose from available storage
classifications.
Run the Purge Expired Records job to perform one of the following tasks:
• Generate the Retention Expiration report. The report shows the number of records that are eligible for
purge in each table. When you schedule the purge expired records job, you can configure the job to
generate the retention expiration report, but not purge the expired records.
• Generate the Retention Expiration report and purge the expired records. When you schedule the purge
expired records job, you can configure the job to pause after the report is generated. You can review the
expiration report. Then, you can resume the job to purge the eligible records.
When you run the Purge Expired Records job, by default, the job searches all entities in the Data Vault archive
folder for records that are eligible for purge. To narrow the scope of the search, you can specify a single
entity. The Purge Expired Records job searches for records in the specified entity alone, which potentially
decreases the amount of time in the search stage of the purge process.
To determine the number of rows in each table that are eligible for purge, generate the detailed or summary
version of the Retention Expiration report. To generate the report, select a date for Data Archive to base the
report on. If you select a past date or the current date, Data Archive generates a list of tables with the number
of records that are eligible for purge on that date. You can pause the job to review the report and then
schedule the job to purge the records. If you resume the job and continue to the delete step, the job deletes
all expired records up to the purge expiration date that you used to generate the report. If you provide a future
date, Data Archive generates a list of tables with the number of records that will be eligible for purge by the
future date. The job stops after it generates the report.
Retention Expiration Detail Report
Lists the tables in the archive folder or, if you specified an entity, the entity. Shows the total number of rows in each table, the number of records with an expired retention policy, the number of records on legal hold, and the name of the legal hold group. The report lists tables by retention policy.
Retention Expiration Summary Report
Lists the tables in the archive folder or, if you specified an entity, the entity. Shows the total number of rows in each table, the number of records with an expired retention policy, the number of records on legal hold, and the name of the legal hold group. The report does not categorize the list by retention policy.
The reports are created with the Arial Unicode MS font. To generate the reports, you must have the font file
ARIALUNI.TTF saved in the <Data Archive installation>\webapp\WEB-INF\classes directory.
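For example, on a UNIX or Linux host you might copy the font file into place with a command such as the following. The source path is a placeholder, and the destination is the directory named above, shown here with forward slashes for a UNIX path.
cp /path/to/ARIALUNI.TTF "<Data Archive installation>/webapp/WEB-INF/classes/"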
To purge records, you must enable the purge step through the Purge Deleted Records parameter. You must
also provide the name of the person who authorized the purge.
Note: Before you purge records, use the Search Within an Entity in Data Vault search option to review the
records that the job will purge. Records that have an expired retention policy and are not on legal hold are
eligible for purge.
When you run the Purge Expired Records job to purge records, Data Archive reorganizes the database
partitions in the Data Vault, exports the data that it retains from each partition, and rebuilds the partitions.
Based on the number of records, this process can increase the amount of time it takes for the Purge Expired
Records job to run. After the Purge Expired Records job completes, you can no longer access or restore the
records that the job purged.
Note: If you purge records, the Purge Expired Records job requires staging space in the Data Vault that is
equal to the size of the largest table in the archive folder, or, if you specified an entity, the entity.
Archive Store
Required. The name of the archive folder in the Data Vault that contains the records that you want to delete. Select a folder from the list.
Purge Expiry Date
Required. The date that Data Archive uses to generate a list of records that are or will be eligible for delete. Select a past, the current, or a future date.
If you select a past date or the current date, Data Archive generates a report with a list of all records eligible for delete on the selected date. You can pause the job to review the report and then schedule the job to purge the records. If you resume the job and continue to the delete step, the job deletes all expired records up to the selected date.
If you select a future date, Data Archive generates a report with a list of records that will be eligible for delete by the selected date. However, Data Archive does not give you the option to delete the records.
Report Type
Required. The type of report to generate when the job starts. Select one of the following options:
- Detail. Generates the Retention Expiration Detail report.
- None. Does not generate a report.
- Summary. Generates the Retention Expiration Summary report.
Pause After Report
Required. Determines whether the job pauses after Data Archive creates the report. If you pause the job, you must resume it to delete the eligible records. Select Yes or No from the list.
Entity
Optional. The name of the entity with related or unrelated tables in the Data Vault archive folder. To narrow the scope of the search, select a single entity.
Purge Deleted Records
Required. Determines whether to delete the eligible records from the Data Vault. Select Yes or No from the list.
Purge Approved By
Required if you select a past date or the current date for Purge Expiry Date. The name of the person who authorized the purge.
Source connection for the segmentation group. Choose from available source connections.
SegmentationGroupName
Name of the segmentation group that you want to re-create indexes for. Choose from available segmentation
groups.
AddPartColumnToNonUniqueIndex
Default is no.
AnalyzeTables
Default is no.
IndexCreationOption
Default is global.
GlobalIndexTablespaceName
CompileInvalidObjects
Default is no.
Default is 4.
RecreateAlreadySegmentedIndexes
If you upgrade to Oracle E-Business Suite Release 12 from a previous version, some of the segmented local
indexes might become unsegmented. When you run the recreate indexes standalone job, the job re-creates
and segments the indexes that became unsegmented during the upgrade. You have the option of choosing
whether or not you want to re-create the indexes that did not become unsegmented during the upgrade.
To re-create the indexes that are still segmented when you run the recreate indexes job, select Yes from the
list of values for the parameter.
If you do not want to re-create the already segmented indexes, select No from the list of values for the
parameter.
Default is yes.
Do not set this parameter to yes if you set the IndexCreationOption parameter to global.
When you run the Reindex on Data Vault job, you specify the Data Vault connection and archive folder that
you want to reindex. You can also choose to reindex all of the available Data Vault connections. You cannot
run the reindex job on a folder if no indexes exist for the selected archive folder.
You might need to run the Reindex on Data Vault job if you update the legal hold on a row, or perform some
operation that updates the row ID. When the row ID is updated in Data Vault, you can no longer see the details
for the record in the Details pane when you perform a keyword search on the Data Vault. In this scenario, run
the Reindex on Data Vault job to reindex the data so that you can view the record details in the search results.
If you upgrade from a previous version of Data Archive, Informatica recommends running the Reindex on
Data Vault job on all connections.
Destination Repository
The Data Vault connection configured to the specific archive folder that you want to re-index on. Choose
either the specific connection or "all connections" to reindex on all available Data Vault connections.
Archive folder in the Data Vault where you retired the application data. Click the list of values button and
select from the available folders.
Schema
Entity
Name of the entity that contains the view or views you want to refresh. This field is optional.
Determines whether to refresh all of the materialized views. If you do not know the entity or name of the view
you need to refresh, click the list of values button and select Yes. If you select No, you must provide the entity
or the name of the view you want to refresh.
View
Name of view you want to refresh. Click the list of values button and select the view. This field is optional.
Repository Name
Use the refresh selection statements job if you modify a segmentation group parameter after the group has
been created. For example, you may want to add a new index or query hint to a segmentation group and
validate the selection statement before you run the segmentation policy.
SourceRep
Source connection for the segmentation group. Choose from available source connections.
SegmentationGroupName
Name of the segmentation group that you want to update selection statements for. Choose from available
segmentation groups.
SelectionStatementType
Type of selection statement used in the segmentation group. Choose from base, ongoing, or history.
• Base. Choose the base type if you plan to use the split segment method of periodic segment creation for
this segmentation group.
• Ongoing. Choose the ongoing type if you plan to use the new default or new non-default method of
periodic segment creation for this segmentation group.
• History. Choose the history type if you plan to merge the segmentation group data with history data.
You cannot replace the merged partitions with the original partitions if you have already run the clean up after
merge partitions job for a segment set.
SourceRep
Source connection for the segmentation group. Choose from available source connections.
SegmentationGroupName
Name of the segmentation group that contains the merged segments that you want to replace with the
originals. Choose from available segmentation groups.
SegmentSetName
Name of the segment set that contains the merged segments that you want to replace.
DropBackupTables
Default is No.
If you run the replace segmented tables with original tables job and then run a segmentation policy on the tables a second time, the segmentation process will fail if any of the tablespaces were marked as read-only in the first segmentation cycle. Ensure that the tablespaces are in read/write mode before you run the segmentation policy again.
Source connection for the segmentation group. Choose from available source connections.
SegmentationGroupName
Name of the segmentation group that contains the segmented tables you want to replace. Choose from
available segmentation groups.
AddPartColumnToNonUniqueIndex
Default is no.
Provide the following information to run this job:
Job ID
Job ID of the Archive Structured Digital Records job. Locate the job ID from the Monitor Job page.
Specifies whether the job will restore the external attachments when an attachment with the same name
already exists in the source directory.
If you select Yes, the job restores the external attachments from Data Vault even if an attachment or
attachments with the same name exists in the source directory.
If you select No, the job will error out if any attachment selected for restore has the same name as an
attachment already in the source directory.
Default is No.
Specifies whether the job purges the attachments from the Data Vault after the job restores them.
Specifies whether the job pauses after the step "Restore External Attachments from Archive Folder" completes
successfully.
If you select Yes, the job pauses after the "Restore External Attachments from Archive Folder" step, so that
you can validate that all of the attachments have been restored to the source directory.
If you select No, the job proceeds to the "Purge External Attachments from Archive Folder" step without
pausing after the restore step.
Default is No.
If you enable LDAP authentication, you must create and maintain users in the LDAP directory service and use
the job to create user accounts in Data Archive. Run the job once for each group base that you want to
synchronize.
When you run the job, the job uses the LDAP properties that are configured in the conf.properties file to
connect to the LDAP directory service. If you specify the group base and the group filter in the job
parameters, the job finds all of the users within the group and any nested groups. The job compares the
users to users in Data Archive. If a user is in the group, but not in Data Archive, then the job creates a user
account in Data Archive.
If you enabled role assignment synchronization, the job checks the security groups that the user is assigned
to, including nested groups. The job matches the security group names to the names of the system-defined
or Data Vault access role names. If the names are the same, the job adds the role to the user account in Data
Archive. Data Archive automatically synchronizes any subsequent changes to security group assignments
when users log in to Data Archive.
After the job creates users in Data Archive, any additional changes to users in the LDAP directory service, such as changes to email addresses or role assignments, are automatically synchronized when users log in to Data Archive.
The Sync with LDAP Server job includes the following parameters:
LDAP System
• Active Directory
• Sun LDAP
The IP address or DNS name of the machine that hosts the LDAP directory service.
The port on the machine where the LDAP directory service runs.
User
User that logs in to the LDAP directory service. You can use the administrator user. Or, you can use any
user that has privileges to access and read all of the LDAP directories and privileges to complete basic
filtering.
Password
Search Base
The search base where the LDAP definition starts before running the filter.
User Filter
A simple or complex filter that enables Data Archive to identify individual users in the LDAP security
group.
• objectClass=inetOrgPerson
• objectClass=Person
• objectClass=* where * indicates that all entries in the LDAP security group should be treated as
individual users.
Group Base
Optional. Sets the base entry in the LDAP tree where you can select which groups you want to use to
filter users from the user filter.
If you do not specify a group base, then the job synchronizes all users in the LDAP directory service.
For example, OU=Application Access,OU=Groups,DC=mycompany,DC=com.
Group Filter
Optional. Determines which groups are selected. After the user filter returns the result set to the
application, those users are compared to users in the selected groups only. Then, only true matches are
added to Data Archive.
When you run the Test Email Server Configuration job, the job uses the mail server properties defined in the
system profile to connect to the mail server. If the connection is successful, the job sends an email to the
email address that you specify in the job parameters. Data Archive sends the email from the mail server user
address defined in the system properties. If the connection is not successful, you can find an error message
in the job log stating that the mail server connection failed.
Email recipient security is determined by the mail server policies defined by your organization. You can enter
any email address that the mail server supports as a recipient. For example, if company policy allows you to
send emails only within the organization, you can enter any email address within the organization. You
cannot send email to addresses at external mail servers.
• Connection Retention Time. The length of time the Test JDBC Connectivity job maintains a connection to
the repository.
• Repository Name. Source or target database.
Test Scheduler
The Test Scheduler job checks whether the inbuilt scheduler is up and running. You can run this job at any time.
Source connection for the segmentation group. Choose from available source connections.
Name of the segmentation group that contains the history segments you want to drop. Choose from available
segmentation groups.
SourceSegmentSetName
SourceRep
Source connection for the segmentation group. Choose from available source connections.
SegmentationGroupName
Name of the segmentation group that contains the tables you want to unpartition. Choose from available
segmentation groups.
DataTablespaceName
Name of the tablespace where you want the job to create the unpartitioned tables. Enter the name of an
existing tablespace.
IndexTablespaceName
Name of the tablespace where you want the job to create the unpartitioned index. Enter the name of an
existing tablespace.
LoggingOption
The option to log your changes. If you select logging, you can revert your changes.
Default is NOLOGGING.
Scheduling Jobs
To schedule a standalone job, select the type of job you want to run and enter the relevant parameters.
8. Click Schedule.
Job Logs
Data Archive creates job logs when you run a job. The log summarizes the actions the job performs and
includes any warning or error messages.
You access the job log when you monitor the job status. For archive and retirement jobs, the jobs include a
log for each step of the archive cycle. Expand the step to view the log. When you expand the step, you can
view a summary of the job log for that step. The summary includes the last few lines of the job log. You can
open the full log in a separate window or download a PDF of the log.
For Oracle sources, the job log includes a real-time status of record deletion. You can use the real-time status
to monitor the job progress. For example, you can determine the amount of time that the job needs to complete the record deletion.
The real-time status is available when the following conditions are true:
The log includes the number of parallel threads that the job creates and the total number of rows that the job
needs to delete for each table. The log includes a real-time status of record deletion for each thread. The job
writes a message to the log after each time a thread commits rows to the database. The log displays the
thread number that issued the commit and the number of rows the thread deleted in the current commit. The
log also displays the total number of rows the thread deleted in all commits and the number of rows that the
thread must delete in total. The log shows when each thread starts, pauses, resumes, and completes.
Monitoring Jobs
Current jobs can be viewed from Jobs > Monitor Jobs and future jobs can be viewed from Jobs > Manage
Jobs.
The Monitor Jobs page lists jobs with their ID, status, user, items to execute, start date and time, and
completion date and time.
• Completed
• Error
• Not Executed
To display logging information for a particular job, click the job name wherever an arrow button appears to the left of it. The last few lines of the logging information are displayed inline. To change the number of lines displayed, select a value from the "view lines" combo box. Click View Full Screen to display the entire log in a new window. View Log, Summary Report, or Simulation Report display the relevant reports as PDF files.
Options are also available to resume and terminate a job. To terminate a job, click Terminate Job, and to resume a job, click Resume Job. Jobs can be resumed if they are interrupted either by the user or for external reasons such as a power failure. When a job that corresponds to a Data Archive project errors out, terminate that job before you schedule the same project again.
Pausing Jobs
You can pause archive jobs during specific steps of the archive job. For example, you can pause the job at
the delete from source step for Oracle sources. To pause the job from a specific step, open the status for the
step. When you resume the job, the job starts at the point where it paused.
For SAP retirement projects, Data Archive extracts data from special SAP tables as packages. When you
resume the job, the job resumes the data extraction process, starting from the point after the last package
was successfully extracted.
You can pause and resume an archive job during the delete from source step when the following conditions
are true:
You may want to pause a job during the delete from source step if you delete from a production system and
you only have a limited amount of time that you can access the system. If the job does not complete within
the access time, you can pause the job and resume it the next time you have access.
When the job pauses depends on the delete commit interval and the degree of parallelism configuration for
the archive source. The pause occurs at the number of threads multiplied by the commit interval. When you
pause the job, each thread completes the number of rows in the current commit and then pauses. The job
pauses after all threads complete the current number of rows in the commit interval.
For example, if you configure the source to use a delete commit interval of 30,000 and two delete parallel
threads, then the job pauses after the job deletes 60,000 rows. The job writes a message to the log that the
threads are paused.
You can pause the job when you monitor the job status. When you expand the delete from source step, you
can click on a link to pause the job. You can resume the job from the step level or at the job level. When you
resume the job, the job continues at the point from which you paused the job. The log indicates when the
threads pause and resume.
Searching Jobs
To search for jobs, you can use the quick search or the advanced search facility.
Quick Search
The generated results in quick search are filtered for the currently logged-in user.
My Previous Error / Paused Job
Last Job that raised an Error or was Paused.
All My Yesterday’s Jobs
All Jobs that were scheduled for the previous day.
All my Error Jobs
All Jobs that were terminated due to an Error.
All my Paused Jobs
All Jobs that were Paused by the currently logged-in user.
Search For
A drop down list is available here, to search Jobs by their Job ID, Job Name, or Project name.
Advanced Search
Advanced Search is available for the following parameters, each of which is accompanied by a drop down list
containing “Equals” and “Does not Equal”, to include or exclude values specified for each parameter.
Project/Program
When you search for a project, you can also specify a stage in the data archive job execution, for example, Generate Candidates.
Chapter 5
All repositories are displayed in a list, with information such as the database size (in GB), and the date it was
last analyzed.
After the DGA_DATA_COLLECTION job is complete, the dashboard reflects an entry for the respective
database as shown in the following figure.
Click the Repository link to view the graphical representation of Modules, Tablespaces, and Modules Trend.
Note: To view information about a repository in the dashboard, you must run the DGA_DATA_COLLECTION
standalone program from the Schedule Job page (against that repository).
Scheduling a Job for DGA_DATA_COLLECTION
To run DGA_DATA_COLLECTION:
5. Select the Source Repository by clicking the LOV button and choosing from the list of values.
6. In the Schedule section, specify the schedule time as Immediate.
7. Set up notification information, such as the email ID for messages on completion of certain events
(Completed, Terminated, and Error).
8. Click Schedule to run the DGA_DATA_COLLECTION job.
• Source and target repository for the data. If you archive to a database, the source database and the target
database must be the same type.
• The project action, which specifies whether to retain the data in the source database after archiving.
The following table describes the project actions:
Action Description
Archive Only Data is extracted from the source repository and loaded into the target repository.
Purge Only Data is extracted from the source repository and removed from the source repository.
Archive and Purge Data is extracted from the source repository, loaded into the target repository, and
removed from the source repository.
• Specification of entities from the source data and imposition of relevant retention policies defined earlier.
• Configuration of different phases in the archival process and reporting information.
To view archive projects, select Workbench > Manage Archive Projects. To create an archive project, click
New Archive Project in the Manage Archive Projects page.
Specifying Project Basics
As the first step in creating an archive project, specify the information described in the following tables.
General Information
Field Description
Action Mandatory. The Data Archive action: Archive Only, Purge Only, or Archive and Purge.
Source
Field Description
Connection name Source connection that you want to archive from. This field is mandatory.
Analyze Interim Oracle sources only. Determines when the interim table is analyzed for table structure
and data insertion.
- After Insert. Analyzer runs after the interim table is populated.
- After Insert and Update. Analyzer runs after the interim table is populated and
updated.
- After Update. Analyzer runs after the interim table is updated.
- None. Analyzer does not run at all.
Default is After Insert and Update. To optimize performance, use the default value.
Delete Commit Interval Number of rows per thread that the archive job deletes before the job issues a commit
to the database.
Default is 30,000. Performance is typically optimized within the range of 10,000 to
30,000. Performance may be impacted if you change the default value. The value you
configure depends on several variables such as the number of rows to delete and the
table sizes.
Use the following rules and guidelines when you configure the commit interval:
- If you configure an Oracle source connection to use Oracle parallel DML for deletion,
the archive job creates a staging table that contains the row IDs for each commit
interval. The archive job truncates the table and reinserts rows for each commit
interval. Populating the staging table multiple times can decrease performance. To
increase performance, configure the commit interval to either zero or a very high
amount, such as 1,000,000, to generate a single commit.
- If you pause the archive job when the job deletes from Oracle sources, the commit
interval affects the amount of time that the job takes to pause. The job pauses after
all threads commit the active transaction.
- As the commit interval decreases, the performance may decrease. The amount of
messages that the job writes to the log increases. The amount of time that the
archive job takes to pause decreases.
- As the commit interval increases, performance may increase. The amount of
messages that the job writes to the log decreases. The amount of time that the
archive job takes to pause increases.
Insert Commit Interval Number of database insert transactions after which a commit point should be created.
Commit interval is applicable when the archive job uses JDBC for data movement.
If your source data contains the Decfloat data type, set the value to 1.
Delete Degree of Parallelism Number of concurrent threads that are available for parallel deletion. The number of
rows to delete for each table is equally distributed across the threads.
Default is 1.
Insert Degree of Parallelism Number of concurrent threads for parallel insertion.
When the Oracle degree of parallelism for insert is set to a value greater than 1, the
Java parallel insert of tables is not used.
Reference Data Store Connection Required when you use the history database as the source and you only want to purge
from the history database.
Connection that contains the original source reference data for the records that you
archived to the history database. The source includes reference data that may not
exist in the history database. The archive job may need to access the reference data
when the job purges from the history database.
Target
When you create an archive project, you choose the target connection and configure how the job archives
data to the target. You can archive to the Data Vault or to another database.
Archive to the Data Vault to reduce storage requirements and to access the archived data from reporting
tools, such as the Data Discovery portal. Archive to another database to provide seamless data access to the
archived data from the original source application.
Property Description
Connection Name List of target connections. The target connection includes details on how the archive
job connects to the target database and how the archive job processes data for the
target.
Choose the Data Vault target that you want to archive data to. This field is mandatory.
Include Reference Data Select this option if you want the archive project to include transactional and reference
data from ERP tables.
Reference Data Store Connection Required when you use a history database as the source.
Connection that contains the original source reference data for the records that you
archived to the history database. The source includes reference data that may not
exist in the history database. The archive job may need to access the reference data
when the job archives or purges from the history database.
Enable Data Encryption Select this option to enable data encryption on the compressed Data Vault files during
load. If you select this option, you must also choose to use a random key generator
provided by Informatica or your choice of a third-party key generator to create an
encryption key. Data Archive stores the encrypted key in the ILM repository as a hidden
table attribute. When you run the archive definition, the key is passed to Data Vault as
a job parameter and is not stored in Data Vault or any log file. The encrypted key is
unique to the archive definition and is generated only once for a definition. If you run
an archive definition more than once, the same encryption key is used.
If you have enabled data encryption for both the selected target connection and the
archive definition, Data Archive uses the archive definition encryption details. If you do
not configure data encryption at the definition level, then the job uses the details
provided at the target connection level.
Use Random Key Generator Option to use a random key generator provided by Informatica when data encryption is
enabled. When you select this option, the encryption key is generated by a random key
generator provided by Informatica (javax.crypto.KeyGenerator).
Use Third Party Option to use a third-party key generator when data encryption is enabled. If you select
this option, you must configure the property "informia.encryptionkey.command" in the
conf.properties file. Provide the command to run the third-party key generator.
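For example, the property might be set as follows in the conf.properties file. This is a minimal sketch; the script path is hypothetical, and the output that Data Archive expects from the command depends on your key generator, so confirm the exact contract in the Data Archive Administrator Guide.
informia.encryptionkey.command=/opt/ilm/scripts/generate_encryption_key.sh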
Property Description
Connection name List of target connections. The target connection includes details on how the archive
job connects to the target database and how the archive job processes data for the
target.
Choose the database target that you want to archive data to.
Seamless Access Required Select this option if you want to archive to a database repository at the same time as
you archive to the Data Vault server.
Allows seamless access to both production and archived data.
Seamless Access Data Repository Repository that stores the views that are required for enabling seamless data access.
Commit Interval (For Insert) Number of database insert transactions after which the archive job creates a commit
point.
The available fields depend on the type of archive cycle you select. After you configure the target, you can
continue to configure the project details or save the project.
Click Save Draft to save a draft of the project. You can edit the project and complete the rest of the
configuration at a later time.
Note: If your source database is Oracle, you can archive tables that contain up to 998 columns.
Click Add Entity. A pop-up window displays the available entities in the source database for selection.
Select the desired entity and click Select. A new entity is added to the Data Archive definition, based on the
Data Archive action that you selected.
Candidate Generation Report Type Determines the report that the project generates during the candidate generation step.
- Summary: The report lists counts of purgeable and non-purgeable records for each entity, and
counts of non-purgeable records by exception type. A record marked for archiving is considered
purgeable. An exception occurs when purging a record causes a business rule validation to fail.
- Detail: In addition to the summary report content, if records belong to related entities, this report
contains information about each exception.
- Summary Exclude Non Exceptions: Similar to Detail, but the report excludes detailed information
for records where an exception did not occur.
- None: No report is generated.
Enabled Determines whether the entity is included in archiving. Disable to exclude the entity from archiving.
Default is enabled.
Policy When the target is the Data Vault, displays a list of retention policies that you can select for the
entity. You can select any retention policy that has general retention. You can select a retention
policy with column level retention if the table and column exist within the entity. The retention policy
list appends the table and column name to the names of retention policies with column level
associations. If you do not select a retention policy, the records in the entity do not expire.
Role Data Vault access role for the archive project if the target is the Data Vault. The access role
determines who can view data from Data Discovery searches.
If you assign an access role to an archive or retirement project, then access is restricted by the
access role that is assigned to the project. Data Discovery does not enforce access from the access
roles that are assigned to the entity within the project. Only users that have the Data Discovery role
and the access role that is assigned to the project can access the archive data.
If you do not assign an access role to an archive or retirement project, then access is restricted by
the access role that is assigned to the entity in the project. Only users that have the Data Discovery
role and the access role that is assigned to the entity can access the archive data.
Operators and Values Depending on the list of values that were specified for columns in tables belonging to the entity, a list
of operators is displayed, whose values must be defined.
Internally, during Data Archive job execution, data is not directly copied from data source to data target.
Instead, a staging schema and interim (temporary) tables are used to ensure that archivable data and
associated table structures are sufficiently validated during Archive and Purge.
• Generate Candidates. Generates interim tables based on entities and constraints specified in the previous
step.
• Build Staging. Builds the table structure in the staging schema for archivable data.
• Copy to Staging. Copies the archivable rows from source to the staging tablespace.
This report gives information on the affected rows for all selected stages in the process. Row counts are
logged for the Generate Candidates and Copy to Staging steps of the Data Archive job execution.
Note: Whenever a row count report is requested for the Generate Candidates stage, a simulation report is
also generated, with information on the table size, average row size, and disk savings for that particular Data
Archive job execution.
Running Scripts
You can also specify JavaScripts or procedures to run before or after a particular stage under the Run Before
and Run After columns.
Notification Emails
If you select a check box under the Notify column, a notification email with relevant status information is
sent to the user when the Data Archive process is aborted due to an error or termination event.
Note: If the Use Staging option is disabled when you add a data source, the Build Staging and Copy to
Staging stages are not included in the Data Archive job execution.
When you click Publish & Schedule, the project is saved and the Schedule Job page is displayed.
When you click either Publish or Save Draft, the project is listed in the Manage Data Archive Projects page.
Publishing indicates that the project is ready for scheduling; saving a draft means that modifications are
still required.
When you publish an archive project to archive data to the Data Vault, Data Archive moves the data to the
staging directory during the Copy to Destination step. You must run the Data Vault Loader standalone job to
complete the move to the Data Vault. The Data Vault Loader job executes the following tasks:
Note: Data Vault supports a maximum of 4093 columns in a table. If the Data Vault Loader tries to load a
table with more than 4093 columns, the job fails.
You need the archive project ID to run the Data Vault Loader job. To get the archive project ID, click
Workbench > Manage Archive Projects. The archive project ID appears in parentheses next to the archive
project name.
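For example, an archive project might appear as Orders Archive (145), where 145 is the archive project ID. The project name and ID shown here are illustrative.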
Verify that you enabled the row count report for the Copy to Destination step in the archive project.
Verify that the tables in each schema have different names. If two tables in different schemas have the same
name, the job will fail.
Performance
For retirement projects, the Data Vault Loader process can run for days.
When you build retirement entities, do not use one entity for all of the source tables. If an entity includes a
large number of tables, the Data Vault Loader may have memory issues. Create one entity for every 200-300
tables in the source application.
You can use a wizard to automatically generate entities. Or, you can manually create entities if you want the
entities to have special groupings, such as by function or size.
When you create entities with smaller numbers of tables, you receive the following benefits:
The archive job fails due to known limitations of the Oracle JDBC drivers. Or, the archive job succeeds
but produces unexpected results. The following are example scenarios that occur due to Oracle JDBC
driver limitations:
To resolve the errors, use a different Oracle JDBC driver version. You can find additional Oracle JDBC
drivers in the following directory:
You may receive an out of memory error if several tables include LOB datatypes.
• Lower the JDBC fetch size in the source connection properties. Use a value between 100 and 500.
• Reduce the number of Java threads by lowering the informia.maxActiveAMThreads property in the
conf.properties file. A general guideline is to use 2-3 threads per core. See the conf.properties sketch after this list.
To monitor system resource usage while the job runs, you can use operating system commands such as the following:
- iostat -m 1 (disk I/O statistics in MB per second, refreshed every second)
- top (CPU and memory usage by process)
- ps -fu `whoami` -H (the current user's processes in a hierarchical view)
• Use multiple web tiers to distribute the extraction.
To use multiple web tiers, perform the following steps:
1. Create a copy of the web tier.
2. Change the port in the conf.properties file.
3. Start the web tier.
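As an illustration of the thread guideline mentioned earlier, on a 4-core host you might set the property in the conf.properties file as follows. The value is an assumption based on the 2-3 threads per core guideline; tune it for your environment.
informia.maxActiveAMThreads=8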
When I use multiple web tiers, I receive an error that the archive job did not find any archive folders.
Run the Data Vault Loader and Update Retention jobs on the same web tier where you ran the Create
Archive Folder job.
The archive job fails because the process did not complete within the maximum amount of time.
You may receive an error that the process did not complete if the archive job uses IBM DB2 Connect
utilities for data movement. When the archive job uses IBM DB2 Connect utilities to load data, the job
The archive job fails at the Copy to Staging step with the following error: Abnormal end unit of work condition occurred.
The archive job fails if the source database is IBM DB2 and the data contains the Decfloat data type.
The archive job fails at the Build Staging or Validate Destination step.
If the source database is Oracle and a table contains more than 998 columns, the archive job fails.
This issue occurs because of an Oracle database limitation on the number of columns. Oracle supports a
maximum of 1,000 columns in a table, and the archive job adds two extra columns during execution.
If a table contains more than 998 columns on the source database, the archive job fails because it
cannot add any more columns.
Salesforce Archiving
This chapter includes the following topics:
The Salesforce accelerator is licensed separately from Data Archive and you must install the accelerator
after you install or upgrade Data Archive. After you install the accelerator, you might have to configure the
startApplimation.bat/startApplimation.sh file or the internet browser to connect to Salesforce, if the ILM
application server or the Enterprise Data Manager client is running behind a proxy server. For more
information about proxy server settings, see the chapter "Starting Data Archive" in the Data Archive
Administrator Guide. You must also configure certain permissions in the Salesforce profiles, users, and
objects in order to archive or purge the data.
You can then import metadata from Salesforce through the Enterprise Data Manager and if necessary create
custom Salesforce entities for archiving. The accelerator includes two standard entities, the Task and Event
entities, that contain standard Salesforce tables that you might want to archive.
When you have configured the Salesforce entities that you want to archive, create a source connection to
Salesforce in Data Archive and a target connection where the archived data will reside. Then you can
configure the archive job and define the parameters for each entity before scheduling the job to run. When
you define the parameters for the entities, you designate the Salesforce data to be archived. For example, you
can archive data that was created before a certain date or created by a certain Salesforce user.
After the data is archived, you can use features like legal hold, tagging, and retention on the archived data.
You can also search for the data in the Data Discovery Portal if it is archived to the Data Vault, or run reports
on the data with the Data Visualization feature.
If you are an Informatica Cloud customer, you can view the archived Data Vault data from the Salesforce user
interface. For more information about viewing data archived in the Data Vault through the Salesforce user
interface, see the H2L "Viewing Data Vault Data in Salesforce with Informatica Cloud."
For more information about installing the Salesforce accelerator, see the chapter "Salesforce Accelerator
Installation" in the Data Archive Installation Guide.
For more information on configuring Salesforce permissions, see the "Salesforce Archiving Administrator
Tasks" in the Data Archive Administrator Guide.
By default, the Salesforce application and an application version called "Sales" are visible in the Enterprise
Data Manager application hierarchy. You can use the Sales application version to import metadata from Salesforce.
Before you can configure entities, you must connect to Salesforce to import metadata from Salesforce, just
as you would for any ERP or custom application. You enter Salesforce connection details, such as the
Salesforce username, password, security token, and host URL. You also provide the name and (optionally) the
location of a database created by the JDBC driver for internal use. Next, you select the schema from which to
import the metadata. There are special considerations for extracting and importing metadata from child
tables in Salesforce entities, due to circular relationships between the tables that make automatic entity
creation impossible. For more information about importing Salesforce metadata and configuring entities, see
the "Salesforce Accelerator" chapter of the Enterprise Data Manager Guide.
When you create a Salesforce connection in Data Archive, you configure connection details and decide
whether or not to include records that are archived in Salesforce.
Salesforce has an internal archive functionality that archives eligible records from Salesforce objects. You
can choose to include these records archived by Salesforce in your archive job by selecting the Include
Salesforce Archived Records check box in the source connection. If you do not select this check box when
you configure the source connection, any records that have been archived by Salesforce are not archived with
the rest of the data.
This feature is enabled by default for both task and event objects in Salesforce. If you want to archive or
purge the Task and Event entities, you must select this check box in the source connection.
When you select the Include Salesforce Archived Records check box in the source connection, the driver gets
the records that are archived by Salesforce in addition to the soft-deleted records. If you use the same
connection to archive data from any custom entities, it is your responsibility to filter out the soft-deleted
records by appending "IsDeleted=false" to the entity's select, insert, and delete queries.
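For example, a select query for a hypothetical custom object might include a filter such as the following. The object and field names are illustrative; match the actual query format that the Enterprise Data Manager defines for the entity.
SELECT Id, Name FROM MyCustomObject__c WHERE IsDeleted = false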
For more information about configuring a Salesforce source connection, see the "Source Connections"
chapter in the Data Archive Administrator Guide.
The process of creating a Salesforce archive project is the same as the process for archiving other
applications. You name the project; select whether you want to archive, purge, or archive and purge the data;
and select the Salesforce source connection that you created as well as a target connection.
When you identify the entities that you want to include in the archive project, you provide values for the entity
parameters to identify the data that you want to archive. For example, you can archive all the data created
before a certain date, or all of the data created by a certain Salesforce user. For more information about the
entity parameters, see the "Salesforce Accelerator" chapter in the Enterprise Data Manager Guide.
When you configure the source connection initially, all of the configuration files are created in the local driver
database. If any metadata in Salesforce has changed, run the Refresh Schema for Salesforce standalone job.
For example, after you create a Salesforce connection, a new Salesforce object might be added into the
Salesforce organization. To add the metadata for the new object into the local database, run the Refresh
Schema for Salesforce job.
For more information on configuring the Refresh Schema for Salesforce standalone job, see the "Scheduling
Jobs" chapter of the Data Archive User Guide.
Salesforce Limitations
Archiving from Salesforce has the following known limitations:
1. Salesforce sets the "queryable" attribute to "false" for some of the objects. This means that external
applications cannot query the object, so the Enterprise Data Manager cannot import metadata from
these objects. For more information about importing metadata from Salesforce, see the chapter
"Salesforce Accelerator" in the Enterprise Data Manager Guide.
2. Salesforce sets the "deletable" attribute to "false" for some of the objects. This means that external
applications cannot delete the records in the object, so Data Archive cannot delete records from these
objects.
The FeedPollChoice and FeedPollVote tables that are a part of the standard accelerator Task and Event
entities have this limitation. The Task or Event entity does not delete the records from these tables.
However, because of the cascade delete functionality in Salesforce, Salesforce deletes the records in
these tables when Data Archive deletes the records in the parent table.
For more information about the FeedPollChoice and FeedPollVote tables, see the chapter "Salesforce
Accelerator" in the Enterprise Data Manager Guide.
3. Salesforce does not allow triggers on objects to be updated through an API. Because of this, Data
Archive cannot activate or deactivate a trigger.
Workaround: You can use the Force.com Migration Tool, the Salesforce user interface, or the Force.com
IDE to activate or deactivate a trigger.
4. The "Bookmark" information of a Feed Item in Salesforce cannot be archived, because bookmark
information is not exposed in any Salesforce object.
Chapter 8
In SAP applications, you can use the Archive Development Kit (ADK) to archive data. When you archive within
SAP, the system creates ADK files and stores the files in an external file system. When you retire an SAP
application, you also retire data stored in ADK files.
After you retire the application, you can use the Data Validation Option to validate the retired data. You can
use the Data Discovery portal or other third-party query tools to access the retired data.
1. Install and apply the SAP transports. For more information, see the "SAP Application Chapter" of the
Informatica Data Archive Administrator Guide.
2. Install and configure the SAP Java Connector. For more information, see the "SAP Application Chapter"
of the Informatica Data Archive Administrator Guide.
3. Copy the sapjco3.jar file to the webapp\web-inf\lib directory in Data Archive.
4. If the SAP application is installed on a Microsoft SQL Server database, enable the following property in
the conf.properties file by setting the value to "Y": informia.sqlServerVarBinaryAsVarchar=Y
5. Restart Data Archive.
6. Create a customer-defined application version under the SAP application in the Enterprise Data Manager.
7. Import metadata from the SAP application to the customer-defined application version.
8. Optionally, run the SAP smart retirement standalone job. For more information, see the topic "SAP Smart
Retirement" in this chapter.
9. Create retirement entities.
Use the generate retirement entity wizard in the Enterprise Data Manager to automatically generate the
retirement entities. When you create the entities, you can specify a prefix or suffix that will help you
identify the entities as retirement entities. If you ran the SAP smart retirement job, select "Group by SAP
Module" when you configure the retirement entities.
10. Create a source connection.
For the connection type, choose the database in which the SAP application is installed. Configure the
SAP application login properties. You must correctly enter the SAP host name so that the SAP datatypes
are correctly mapped to Data Vault datatypes during the retirement process.
11. Create a target connection to the Data Vault.
12. Run the Create Archive Folder job to create archive folders in the Data Vault.
13. If the SAP application is installed on a Microsoft SQL Server database, enable the following property in
the conf.properties file by setting the value to "Y": informia.sqlServerVarBinaryAsVarchar=Y
14. Create and run a retirement project.
When you create the retirement project, add the retirement entities and the pre-packaged SAP entities.
Add the attachment link entities as the last entities to process. Optionally, add the Load External
Attachments job to the project if you want to move attachments to the Data Vault.
15. Create constraints for the imported source metadata in the ILM repository.
Constraints are required for Data Discovery portal searches. You can create constraints manually or use
one of the table relationship discovery tools to help identify relationships in the source database.
The SAP application retirement accelerator includes constraints for some tables in the SAP ERP system.
You may need to define additional constraints for tables that are not included in the application
accelerator.
16. Copy the entities from the pre-packaged SAP application version to the customer-defined application
version.
17. Create entities for Data Discovery portal searches.
Use the multiple entity creation wizard to automatically create entities based on the defined constraints.
For business objects that have attachments, add the attachment tables to the corresponding entities.
The SAP application retirement accelerator includes entities for some business objects in the SAP ERP
system. You may need to create entities for business objects that are not included in the application
accelerator.
18. Create stylesheets for Data Discovery portal searches.
Stylesheets are required to view attachments and to view the XML transformation of SAP encoded data.
19. To access the SAP Archives, verify that your login user is assigned the SAP portal user role.
When you run the SAP smart retirement standalone job, you provide the name of a source connection from
the list of available connections. Based on the source connection properties, the job identifies the SAP
application version as it appears in the Enterprise Data Manager. The job then updates the ILM repository
with metadata from the SAP source connection. This metadata includes the type of table and whether or not
the table contains data.
Note: The SAP smart retirement job updates the metadata for a single source connection imported in the
Enterprise Data Manager. Do not export or import the pre-packaged accelerator entities from one
environment to another, for example from a development environment to a production environment.
If you run the SAP smart retirement job and then generate the retirement entities in the Enterprise Data
Manager, the resulting entities are grouped together by a common naming convention. The naming
convention identifies the type of tables and whether or not the tables contain data.
When you configure the retirement entities, you can specify a prefix or suffix that will be appended to the
entities when the Enterprise Data Manager creates them. The prefix or suffix can help you to identify the
entity as a retirement entity. After you generate the retirement entities, the entities appear in the Enterprise
Data Manager appended with the specified prefix or suffix. For example, the entity "V2_1CDOC_194" is a
change documents entity that has the prefix "V2," which you specified when you created the entities.
The Enterprise Data Manager also groups the entities by the SAP module. The entity name reflects the
module name as an abbreviation. For example, the entity "SD_001" is in the Sales and Distribution module,
while the entity "FI_04" is in the Finance module. When you configure the entities, you specify the maximum
number of tables in each entity. The Enterprise Data Manager creates the required number of entities and
names them according to the module. For example, if you have 1,000 Sales and Distribution tables and you
specify a maximum of 200 tables per entity, the Enterprise Data Manager creates five different SD entities.
Naming Convention Description
You can use this information to decide whether or not you want to retire the entity when you create the
retirement project. If you want to reduce the number of tables in the retirement project, you can exclude
certain types of tables, such as workflow tables, or tables that do not contain any data, as designated by the
"1NODT" in the naming convention.
The Enterprise Data Manager groups special tables in their own entities. Special tables are identified by the
suffix "_SAP."
Retirement Job
The retirement job uses a combination of methods to access data from SAP applications. The job accesses
data directly from the SAP database for normal physical tables. The job uses the SAP Java Connector to log
in to the SAP application to access data in transparent HR and STXL tables, ADK files, and attachments.
When you run a retirement job, the job uses the source connection configured in the retirement project to
connect to the source database and to log in to the SAP system. The job extracts data for the following
objects:
Physical Tables
Physical tables store data in a readable format from the database layer. The retirement job extracts data
directly from the database and creates BCP files. The job does not log in to the SAP application.
Transparent HR and STXL Tables
Transparent HR PCL1-PCL5 and STXL tables store data that is only readable in the SAP application layer.
The retirement job logs in to the SAP system. The job cannot read the data from the database because
the data is encoded in an SAP format.
The job uses the source connection properties to connect to and log in to the SAP application. The job
uses an ABAP import command to read the SAP encoded data. The job transforms the SAP encoded
data into XML format and creates BCP files. The XML transformation is required to read the data after
the data is retired. The XML transformation occurs during the data extraction.
ADK Files
ADK files store archived SAP data. The data is only readable in the SAP application layer. The retirement
job uses the source connection properties to connect to and log in to the SAP application. The job calls a
function module from the Archive Development Kit. The function module reads and extracts the archived
ADK data and moves the data to the BCP staging area as compressed BCP files.
Attachments
Attachments store data that is only readable in the SAP application layer. The retirement job uses the
SAP Java Connector to log in to the SAP application and to call a function module. The function module
reads, decodes, and extracts the attachments that are stored in the database. The job moves the
attachments to the attachment staging area that you specify in the source connection. Depending on
your configuration, the job may also move attachments that are stored in an external file system.
When the retirement job creates BCP files, the job compresses and moves the files to a BCP staging area. For
physical tables, the job stores the BCP files in the staging directory that is configured in the Data Vault target
connection. For transparent HR and STXL tables, ADK files, and attachments, the job uses an SAP function
module to generate the BCP files. The SAP function module stores the BCP files in the staging directory that
is configured in the source connection.
The job uses the BCP file separator properties in the conf.properties file to determine the row and column
separators. The job uses the Data Vault target connection to determine the maximum amount of rows that
the job inserts into the BCP files. After the job creates the BCP files, the Data Vault Loader moves the files
from the BCP staging area to the Data Vault.
Attachment Retirement
You can archive CRM, ERP, SCM, SRM, and GOS (Generic Object Services) attachments. The attachments can
exist in the SAP database and in an external file system. Data Archive retires any objects that are attached to
the business object, such as notes, URLs and files.
By default, the retirement job archives all attachments that are stored in the SAP database and in an external
file system or storage system. If attachments are in a storage system, then SAP ArchiveLink must be available.
The job creates a folder in the staging location for each business object ID that includes attachments. The
folder stores all attachments for the business object ID. You specify the staging location in the source
connection. The job uses the following naming convention to create the folders:
<SAP client number>_<business object ID>
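For example, for SAP client 800 and business object ID 0000123456, the job would create a folder named 800_0000123456. The client number and business object ID in this example are illustrative.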
The job downloads all attachments for the business objects to the corresponding business object ID folder.
The job downloads the attachments in the original file format and uses a gzip utility to compress the
attachments. The attachments are stored in .gz format.
For attachments that are stored in the SAP database, the retirement job uses an ABAP import command to
decode the attachments before the job downloads the attachments. The SAP database stores attachments in
an encoded SAP format that are only readable in the SAP application layer. The job uses the SAP application
layer to decode the attachments so you can read the attachments after you retire the SAP application.
For attachments that are stored in a file system, the job downloads the attachments from the external file
system or storage to the staging location. Optionally, you can configure the attachment entity to keep
attachments in the original location.
After the job creates the folder for the business object ID, the job appends the business object number as a
prefix to the attachment name. The prefix ensures that the attachment has a unique identifier and can be
associated with the correct business object.
The job uses the following naming convention for the downloaded attachment names:
<business object number>_<attachment name>.<file type>.gz
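For example, an attachment named invoice.pdf for business object number 0000123456 would be downloaded as 0000123456_invoice.pdf.gz. The values in this example are illustrative.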
After you run the retirement job, you choose the final storage destination for the attachments. You can keep
the attachments in the staging file system, move the attachments to another file system, or move the
attachments to the Data Vault.
Attachment Storage
Attachment storage options depend on the location of the attachments in the source application. You can
keep the attachments in the original location or move the attachments to a different location. It is mandatory
to move attachments that are stored in the SAP database to a different storage destination. It is optional to
move attachments that are stored in an external file system.
You can keep attachments in the original storage for attachments that are stored in an external file
system. By default, the retirement job moves all attachments to the file system that is configured in the
source connection.
To keep the attachments in the original file system, configure the run procedure parameter for the
attachment entity.
By default, the retirement job moves attachments that are stored in the SAP database and in an external
file system to a staging file system. The staging file system is configured in the source connection.
After the retirement job moves attachments to the staging file system, you can keep the attachments in
this location or you can move the attachments to another file system, such as Enterprise Vault, or move
the attachments to the Data Vault. Store attachments in a file system to optimize attachment
performance when users view attachments.
Data Vault
After the retirement job archives attachments to the staging file system, you can move the attachments
to the Data Vault.
Note: The retirement job compresses the attachments. The Data Vault Service does not add significant
compression. In addition, users may experience slow performance when viewing attachments. To
optimize attachment performance, store attachments in a file system instead.
To move attachments to the Data Vault, run the Load External Attachments job after the retirement job
completes. The Load External Attachments job moves the attachments from the staging file system to
the Data Vault.
Attachment Viewing
After you archive the attachments, you can view the attachments in the Data Discovery portal or with a third-
party reporting tool. The ILM repository link table, ZINFA_ATTCH_LINK, includes information on the
downloaded attachments. The table includes information to link business objects to the corresponding
attachments. The link table is included in the pre-packaged SAP entities for business objects that have
attachments.
To view attachments in the Data Discovery portal, you must configure entities for discovery. For any business
object that may have attachments, you must add the attachment tables to the corresponding entity. If you
store the attachments in a file system, add the ZINFA_ATTCH_LINK table to the entities. If you store the
attachments in the Data Vault, add the ZINFA_ATTCH_LINK and AM_ATTACHMENTS tables to the entities.
To view attachments from the Data Discovery portal, you must create a stylesheet. In the stylesheet, build a
hyperlink to the attachment location. Use the ZINFA_ATTCH_LINK table to build the path and file name. When
users perform searches, business object attachments appear as hyperlinks. When users click a hyperlink, the
default browser opens the attachment. The attachments are compressed in gzip format. The client from
which the user runs the Data Discovery portal searches must have a program that can open gzip files, such
as 7zip.
If you use the Reports and Dashboards window to access the reports, you must first copy the report
templates to the folder that corresponds to the archive folder in the Data Vault target connection. The archive
folder is equivalent to a Data Vault database. When you copy the reports, the system updates the report
schemas based on the target connection configuration. Then you can run the reports.
If you use the SAP Archives to access the reports, all of the reports installed by the accelerator are copied to
the corresponding SAP archive folder the first time that you launch the SAP Archives. If you plan to access
the reports through the SAP Archives, you do not need to manually copy the reports to the archive folder,
though you have the option to do so.
When you install the SAP application retirement accelerator, the installer publishes the reports to the data
visualization server with the following properties:
Property Value
Related Topics:
• “Copying Reports” on page 219
Customer List
The customer list report contains the following tables:
KNA1 Transparent
KNB1 Transparent
T001 Transparent
KNA1 Transparent
KNB1 Transparent
T001 Transparent
BSID Transparent
KNA1 Transparent
T001 Transparent
KNB1 Transparent
KNC1 Transparent
KNA1 Transparent
T001 Transparent
KNA1 Transparent
KNB1 Transparent
T001 Transparent
Vendor List
The vendor list report contains the following tables:
LFA1 Transparent
LFB1 Transparent
T001 Transparent
LFA1 Transparent
LFB1 Transparent
BSIK Transparent
LFA1 Transparent
T001 Transparent
LFC1 Transparent
LFA1 Transparent
LFA1 Transparent
LFB1 Transparent
T001 Transparent
Asset List
The asset list report contains the following tables:
ANLC Transparent
T001 Transparent
ANLA Transparent
ANLP Transparent
ANLA Transparent
T001 Transparent
Asset Depreciation
The asset depreciation report contains the following tables:
ANLB Transparent
ANLC Transparent
T001 Transparent
ANLA Transparent
Asset Transactions
The asset transactions report contains the following tables:
ANLA Transparent
ANLZ Transparent
T001 Transparent
ANLA Transparent
ANLZ Transparent
T001 Transparent
SKA1 Transparent
SKB1 Transparent
T001 Transparent
BSIS Transparent
SKA1 Transparent
T001 Transparent
GLT0 Transparent
T001 Transparent
FAGLFLEXT Transparent
T001 Transparent
FAGLFLEXT Transparent
FAGLFLEXP Transparent
FAGLFLEXA Transparent
T001 Transparent
SKAT Transparent
BKPF Transparent
T001 Transparent
BSEG Special
BSET Special
SKAT Transparent
BKPF Transparent
T001 Transparent
BSEG Special
T001 Transparent
Business Partner
The business partner report contains the following tables:
BUT000 Transparent
BUT020 Transparent
BUT0BK Transparent
ADRC Transparent
STXH Transparent
ZINFA_STXL Special
SAP Archives
If you have installed the SAP retirement accelerator, you can access the data visualization reports through
the SAP Archives.
The SAP Archives contain all of the reports installed by the SAP application retirement accelerator. After you
install the accelerator and launch the SAP Archives for the first time, Data Archive copies the report
templates from the default folder to the archive folder where the application is retired.
If you have retired the SAP application to one archive folder, you can view all of the available reports when
you launch the SAP Archives. If you have retired the SAP tables to more than one archive folder, you are given
the option to select an archive folder when you launch the SAP Archives.
Message Format
Messages that are related to SAP application retirement convey information about a task that completed
successfully, an error that occurred, a warning, or status updates.
Each message starts with a message ID. The message ID is a unique mix of numbers and letters that identifies
the type of message, the component, and the message number.
1. Message type
2. Component code
3. Message number
Message type
• E - Error
Component code
The second, third, and fourth characters in the message ID represent the component code of the
component that triggered the message.
Message number
The last two numbers in the message ID represent the message number.
When I run the retirement job, the job immediately completes with an error.
Perform the following steps to verify the SAP Java Connector installation:
If the installation is not correct, perform the following steps to resolve the error:
When I run the retirement job, I receive an error that the maximum value was exceeded for no results from the SAP
system.
For every package of rows that the job processes, the job waits for a response from the SAP system. The job
uses the response time configured in the conf.properties file and the package size specified in the job
parameters. If the SAP system does not respond within the configured maximum response time, the job fails.
The SAP system may not respond for one or more of the following reasons:
• The maximum amount of SAP memory was exceeded. The SAP system terminated the process. The SAP
system may run out of memory because the package size that is configured in the job is too high.
• The SAP system takes a long time to read the data from the database because of the configured package
size.
• The network latency is high.
To resolve the error, in the SAP system, use transaction SM50 to verify if the process is still running. If the
process is still running, then restart the job in Data Archive.
If the process is not running, perform one or more of the following tasks:
• Use SAP transactions SM37, ST22, and SM21 to analyze the logs.
• Decrease the package size in the job parameters.
• Increase the informia.MaxNoResultsIteration parameter in the conf.properties file.
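For example, the parameter can be raised in the conf.properties file as follows. The value shown is illustrative; choose a value that suits the response times of your SAP system.
informia.MaxNoResultsIteration=60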
The job log indicates that the Data Vault Loader received an exception due to an early end of buffer and the load was
aborted.
The error may occur because the SAP application is hosted on Windows.
Creating Retirement Archive Projects Overview
A retirement archive project contains the following information:
Comments A section where you can add additional information related to this particular retirement.
Add More Attributes button Adds any additional attributes that you want to include in the application details
to support the application retirement process.
Use Data Encryption Select this option to enable data encryption on the compressed Data Vault files
during load. If you select this option, you must also choose to use a random key
generator provided by Informatica or your choice of a third-party key generator to
create an encryption key. Data Archive stores the encrypted key in the ILM
repository as a hidden table attribute in case the job fails and must be resumed.
When you run the retirement job, the key is passed to Data Vault as a job
parameter and is not stored in Data Vault or any log file. If the retirement job is
successful, the key is deleted from the ILM repository. The encrypted key is
unique to the retirement definition and is generated only once for a definition.
If you have enabled data encryption for both the selected target connection and
the retirement definition, Data Archive uses the retirement definition encryption
details. If you do not configure data encryption at the definition level, then the
job uses the details provided at the target connection level.
Use Random Key Generator Option to use a random key generator provided by Informatica when data
encryption is enabled. When you select this option, the encryption key is
generated by a random key generator provided by Informatica
(javax.crypto.KeyGenerator).
Use Third Party Option to use a third-party key generator when data encryption is enabled. If you
select this option, you must configure the property
"informia.encryptionkey.command" in the conf.properties file. Provide the
command to run the third-party key generator.
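The random key generator option is based on the standard JDK javax.crypto.KeyGenerator API named in the table above. The following minimal Java sketch shows how that API produces a random key. It is illustrative only, does not reproduce Data Archive's internal implementation, and the AES algorithm and 256-bit key size are assumptions.

import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.util.Base64;

public class EncryptionKeySketch {
    public static void main(String[] args) throws Exception {
        // Assumption: AES with a 256-bit key. Data Archive's actual algorithm
        // and key length may differ.
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(256);
        SecretKey key = generator.generateKey();
        // Encode the raw key bytes so they can be displayed or passed along.
        System.out.println(Base64.getEncoder().encodeToString(key.getEncoded()));
    }
}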
Clicking the Next button advances to the section on defining source and target repositories. The Save Draft
button saves the project for future modifications and scheduling.
Define a connection to the application that you want to retire. Select the application version. Then, create or
choose a source connection.
Important:
• If the source connection is a Microsoft SQL Server, clear the Compile ILM Functions check box on the
Create or Edit an Archive Source page before you start a retirement job. This ensures that the staging
database user has read-only access and will not be able to modify the source application.
• Clear the Use Staging check box on the Create or Edit an Archive Source page before you start a
retirement job.
Adding Entities
Add the entities from the application that you want to retire.
Field Description
Policy Retention policy that determines the retention period for records in the entity. You can create a
retention policy or select a policy. You can select any retention policy that has general retention. If
you do not create or select a retention policy, the records in the entity do not expire.
Name Name of the retention policy. Required if you create a retention policy.
Retention Period The retention period in years or months. Data Archive uses the retention period to determine the
expiration date for records in the entity. If you select or create a policy with general retention, the
expiration date equals the retirement archive job date plus the retention period.
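For example, if you select a general retention policy with a 7-year retention period and the retirement archive job runs on January 15, 2018, the records in the entity expire on January 15, 2025. The policy and dates in this example are illustrative.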
The following table describes the fields in the Access Roles group:
Field Description
Access Roles Data Vault access role for the retirement archive project. You can create an access role or select a
role. The access role determines who can view data from Data Discovery searches.
When you click the Publish and Schedule button, the project is saved and the Schedule Job page appears.
When you click the Publish button or the Save Draft button, Data Archive lists the project in the Manage
Retirement Projects page. Publishing the project indicates that the project is ready for scheduling. Saving a
draft means that modifications are still required.
Clicking the Next button opens the Review and Approve page. The Save Draft button saves the project for
future modifications and scheduling.
Retired Object Name Application module that is being retired, as selected by the user.
Click the Application Info tab to view the detailed information of the selected application and click the Source
& Target tab to view the configured Source and Target details. Detailed information is displayed by clicking
each of the three sections.
1. Generate Candidates. Generates interim tables based on the entities and constraints specified in the
previous step.
2. Validate Destination. Validates the table structure and data in the target repository to generate DML for
modifying the table structure and/or adding rows of data.
3. Copy to Destination. Copies data to the data destination.
4. Purge Staging. Deletes interim tables from the staging schema.
After the retirement project completes successfully, you cannot edit or re-run the project.
Running Scripts
You can also specify JavaScripts or procedures to run before or after a particular stage under the Run Before
and Run After columns.
Notification Emails
If you select a check box under the Notify column, a notification email with relevant status information is
sent to the user when the archive process is aborted due to an error or termination event.
When you click either Publish or Save Draft, the project is listed in the Manage Retirement Projects page.
Publishing indicates that the project is ready for scheduling; saving a draft means that modifications are
still required.
Application Migration
Use the application migration process to migrate an application that has been retired and validated from a
non-production environment (referred to as a source environment in this document) to a production
environment (referred to as a target environment).
After you have retired an application on a non-production environment, typically you would repeat the entire
process, including application retirement, data validation, report creation, compliance, and sign-off, on the
target (production) environment. If you have completed the entire process on a pre-production environment
and the retired data is identical to what you must retire again on the production instance, the application
migration process prevents you from having to repeat the entire process. Instead, you can migrate the
relevant data from pre-production onto the production instance.
This saves time and effort as opposed to duplicating the pre-production work, as long as you have retired the
entire application in the source instance. During the migration process, relevant application data and
metadata is directly migrated from the source to target environment.
• The application version/product family version that you want to migrate does not already exist in the
target Data Archive and Data Vault environments.
• The archive folder does not exist in the target Data Vault environment.
• The source and target environments must be on the same operating system.
• Both the source and target environments must have the same versions of Data Vault and Data Archive.
• The staging directory and data directory in the target Data Vault connection must be accessible (both read
and write permissions) to both the target Data Vault and Data Archive systems.
1. Add the migration administrator role to the user performing the migration. The default administrator
user, AMADMIN, is assigned the role by default.
2. Run the Export Informatica Data Vault Metadata standalone job.
3. Create a target connection to the ILM repository of the target Data Archive environment.
4. Run the Migrate Data Archive Metadata standalone job.
1. Add the migration administrator role to the user performing the migration. The default administrator
user, AMADMIN, is assigned the role by default.
2. Update the connection details for the target Data Vault.
3. Move the SCT data files to the export file directory specified in the Export Informatica Data Vault
Metadata standalone job. Verify that the .tar file is accessible to the target Data Archive instance.
4. If necessary, update the ssa.ini file.
5. Run the Import Informatica Data Vault Metadata standalone job.
6. Recreate the Data Vault indexes.
For more information about system-defined roles, see the "Security" chapter of the Data Archive Administrator
Guide.
Parameter Description
Source IDV Folder Name Required. Name of the source Data Vault archive folder that you want to migrate.
Target File Archive Data Directory Required. Full path of the location in the target Data Vault system that stores the SCT data
files. If the target file archive data directory is not the same directory that stores the SCT
data files in the source system, provide the path to a different directory in this parameter.
If the target file archive data directory is the same directory that stores the SCT data files
in the source system, provide the path to that directory and select "Yes" for the Target to
Use Same Storage as Source parameter.
Export File Directory Required. Full path of the temporary directory on the source Data Archive system that
stores the run-time files, the log file, and the .tar file. This directory must be accessible to
the source Data Archive system.
Staging Directory for Target External Storage Optional for a local file system. Required when the target SCT data file directory is an
external storage directory. Full path of the location in the target external storage system
that stores the SCT data files that you want to import to the external storage system.
Target to Use Same Storage as Source Required. If you select yes, the job keeps the original SCT data file location between the
source and target environments. If the target file archive data directory is the same
directory that stores the SCT data files in the source system, you must select "Yes" for the
Target to Use Same Storage as Source parameter. If the target file archive data directory
is not the same as the source directory, select "No" in the Target to Use Same Storage as
Source parameter and verify that you have given the directory path in the Target File
Archive Data Directory parameter.
Copy Data Files to Export File Directory Required. If you select yes, the SCT files are copied to the export file directory along with
the run-time files, log file, and .tar file.
Export Materialized View Statements Required. If you select yes, any materialized view statements on the source will be
exported during the job run. When you run the Import Informatica Data Vault Metadata job
later in the process, the materialized views will be created on the target.
Target Metadata Repository
Required. The target ILM repository connection created in the previous step.
Product Family Version
Required. The product family version/application version that you want to migrate.
Copy Users and Roles
Required. If you select No, the job migrates only the retirement entity-related roles. If you select Yes, the job migrates users related to the retirement entity-related roles, in addition to other roles related to the users.
Migrate Export Migration Status Table Only
Required. If you select Yes, the job migrates only the export migration status table. If you select No, the job migrates all of the corresponding meta tables.
1. From the export file directory that you specified in the Export Informatica Data Vault Metadata
standalone job, open the idvmeta_sctlist.txt file.
2. Follow the instructions in the file to move the SCT data files to the target Data Vault environment. The
file contains a list of the SCT data files that you must copy to the target file archive data directory.
3. Move the .tar file in the export file directory to a folder accessible to the target Data Archive instance.
1. If you chose to use an external storage system for the SCT data files, update the ssa.ini file in both the
file_archive and installation folders. Follow the format of the ssa.ini file in the source environment. For
example:
[HCP_CONNECTION IDV_FILEOUT4]
HCP_AUTH_TOKEN = hcp-ns-auth=dHTuMnVzZXI=:489825A0951C3CF1F22E27B61CEE6143
2. If you exported materialized view statements from the source Data Vault system and want to migrate
them to the target system, add the MVIEWDIR parameter in both the [QUERY] and [SERVER] sections of
the ssa.ini file of the target Data Vault. For example:
[QUERY]
...
MVIEWDIR=/data/mviewdir
[SERVER]
...
MVIEWDIR=/data/mviewdir
3. After you add the entries, restart Data Vault.
Target IDV Connection
Required. The connection that links to the target Data Vault instance.
Export IDV Metadata Job ID and IDV Folder Name
The Export Data Vault Metadata job ID (from the source Data Archive environment) and the Data Vault folder name that you want to migrate to the target Data Vault.
TAR Archive File Directory
Required. The directory of the .tar file exported by the Export Informatica Data Vault Metadata job. This directory must be accessible from the target Data Archive instance.
Note: If the target file archive data directory in the Export Informatica Data Vault Metadata job is an
external storage directory, and you have manually transferred the SCT data files to the directory defined
as the "staging directory for target external storage" in the export job, the import job automatically
transfers the SCT data files to the target external storage.
There are two types of resources (reports and catalogs) that you can migrate:
Published Resources
Published resources are managed entirely in the JReport Server and support all of the actions available
on the server. You can delete published resources from Data Archive and also the Jinfonet user
interface. Custom reports created with Data Visualization Designer or from the Data Archive user
interface are examples of published resources.
Real Path Resources
Real path resources are managed in both the server and the operating system. In the server, when you access a resource node that is linked to a real path, the local resources in the real path are loaded to the node and
displayed together with other server resources, including the published resources, in the node. You
cannot delete the real path resources directly from the Data Archive or Jinfonet user interface, but you
can remove the resources from the local disk. SAP reports and reports within the Application Retirement
for Healthcare accelerator are examples of real path resources.
Depending on whether the reports you want to migrate are real path resources or published resources,
complete the following tasks to migrate the reports:
1. In JReport Designer, select File > Publish and Download > Download from Server > Download Report
from Server.
The Connect to JReport Server window appears.
2. Enter the connection details to connect to the JReport server and click Connect.
3. Select the directory that contains the reports that you want to download. You can only select one
directory at a time.
4. Click OK.
5. Select the check box next to the catalog and each report that you want to download for migration.
6. As a best practice, save the reports and catalog to a directory in the Data Archive installation with the
same name as the directory that you are downloading from.
For example, if the directory that contains the reports you want to download is called "REPORT_FOLDER,"
append "REPORT_FOLDER" to the Download Resource To path.
11. Verify that the resources you downloaded exist in the path that you provided as the Download Resource
To location.
When you create a report from the Reports and Dashboards menu in Data Archive, the system creates a
catalog name automatically. The system-created catalog naming convention is
<TARGET_REPOSITORY_ID><HYPHEN><TARGET_CONNECTION_PFVID>.
For consistency, rename the catalog to the target connection repository ID and the target connection
application version/product family version ID. For example, a target repository ID of 2 and a product family
version ID of 100 produce the catalog name 2-100. You can copy the sequence ID from the target ILM
repository database.
When you create reports from JReport Designer, you can choose the catalog name. In that case, you can
keep the same catalog name in the target.
1. In JReport Designer, click File > Publish and Download > Publish to Server > Publish Report to Server.
The Connect to JReport Server window appears.
2. Enter the connection details for the target JReport server and click Connect.
1. From Data Archive, select Data Visualization > Reports and Dashboards.
The Reports and Dashboards window appears.
2. Select one of the reports within the folder that you migrated and then select Actions > Run Report.
If the migration was successful, the report appears in view mode.
1. In JReport Designer, select File > Publish and Download > Download from Server > Download Report
from Server.
The Connect to JReport Server window appears.
2. Enter the connection details to connect to the JReport server and click Connect.
3. Select the directory that contains the reports that you want to download. You can only select one
directory at a time.
1. In JReport Designer, click File > Publish and Download > Publish to Server > Publish Report to Server.
The Connect to JReport Server window appears.
2. Enter the connection details for the target JReport server and click Connect.
1. Navigate to <DATA_ARCHIVE>\webapp\visualization\ILM_HEALTHCARE.
2. Within ILM_HEALTHCARE, create a directory with the same name as the directory on the JReport server,
which is the target connection ARCHIVE_FOLDER_NAME.
4. Enter the path of the folder that you created for the Resource Real Path and select the "Enable
Resources from Real Paths" check box.
The JReport server recognizes the resources that are present in the file system and can display them in
the JReport user interface without publishing. Resources loaded through the real path mechanism contain
the description "report from real path," while resources that are published to the Jinfonet server have a
blank description field. You can use the description to identify whether resources are loaded from the file
system or managed by the JReport server.
1. Log in to Data Archive as a user who has permission to access the Patient Archives.
2. Click Data Visualization > Patient Archives.
The Select Application window appears. If the mapping was successful, you are able to view the
application folder that you created.
The archive job fails if the source database is IBM DB2 and the data contains the Decfloat data type.
4. Complete the remaining steps in the retirement project to retire the data.
You can enable integrated validation when you create an archive or retirement project, before you schedule it
to run. Before the job copies the tables to Data Vault, the integrated validation process uses an algorithm to
calculate a checksum value for each row and column in the table in the entity. After the job copies the tables
to Data Vault, the validation process calculates another checksum value for the rows and columns in Data
Vault. The process then compares the original checksum value of each row and column, to the checksum
value of each corresponding row and column in Data Vault.
When the comparison is complete, you can review the details of the validation. If a deviation occurs in the
number of rows, the value of a row, or the value of a column, between the original checksum value and the
archived or retired checksum value, you can review the details of the deviation. You can select a status for
each deviated table in the archive job or retirement project. You must provide a justification for the status
that you select.
After you review the validation, you can generate a validation report that includes the details of each
deviation. The validation report also contains a record of any status that you selected when you reviewed the
deviation, and any justifications that you entered.
You can enable integrated validation for live archive projects that have an Oracle source connection.
You can enable integrated validation for retirement projects that have the following types of source
connections:
• Oracle
• IBM DB2 for Linux, UNIX, and Windows
• IBM DB2 for z/OS
• Microsoft SQL Server
• IBM AS/400 (System i)
• Informatica PowerExchange
Integrated validation lengthens the retirement process by 40% to 60%.
Row Checksum
During the copy to destination step of the archive or retirement project, the validation process calculates a
checksum value for each row in every table in the entity.
The process writes the row checksum values to the BCP file, appending each value to the end of the
corresponding row. The checksum values are stored as a metadata column that is copied to the Data Vault
along with the table. When the Data Vault loader finishes, Data Vault calculates a second checksum value
for each row in the archived or retired tables.
The validation process then compares the checksum value written to the BCP file to the checksum value
calculated by Data Vault. When the checksum comparison is complete, review the validation for any
deviations that occur between the two checksum values for each row.
Column Checksum
During the copy to destination step, the validation process calculates a checksum value for each column in
every table in an entity.
The validation process stores the column checksum values in the ILM repository. When the Data Vault loader
is finished, Data Vault calculates another checksum value for each column and compares it to the checksum
value stored in the ILM repository. You can then review the validation for any deviations that occur between
the two checksum values for each column.
Validation Review
Review the validation process for details of any deviations.
When the checksum comparison for a live archive or retirement project is complete, you can review the
validation process for details of any deviations that occur. You can select a status for any deviated table and
provide a justification for the status. If you select rejected for a deviated table, the job remains in a paused
state. You can also place a deviated table on hold.
When you run a live archive project with integrated validation enabled, you may have to review any deviations
that occur before the job is complete. If deviations are found during the integrated validation step of a live
archive project, the integrated validation job step enters a "warning" state and Data Archive pauses the job
before the delete from source step or purge staging step. Before you can resume the job to continue to the
delete from source step or purge staging step, you must review the validation and select the accepted status
for any deviations that exist. If you try to resume the project without accepting the deviations, the job errors
out.
When you select a status of accepted or rejected for a deviated table, it does not change the data in either
the source database or the Data Vault. Statuses and justifications are a method of reviewing and
commenting on deviations that occur. You can also edit a status or justification to change it, for example
from rejected to accepted.
To review the validation, you must have the Operator role. To select a status for a deviated table, you must
have the Administrator role. For more information on system-defined roles, see the Informatica Data Archive
Administrator Guide.
Note: For any validated table, if a difference exists between the number of rows in the source and target, you
cannot review the table for deviations.
1. Pie chart that shows the number of tables in the retirement project that passed validation, in addition to
the number of tables that failed validation.
2. Pie chart that shows the specific actions that you took on the deviated tables. Tables that you have not
yet taken an action on appear as "pending review."
3. Filter box. Enter the entire filter or part of the filter value in the filter box above the appropriate column
heading, and then click the filter icon.
4. Filter icon and clear filter icon. After you specify filter criteria in the fields above the column names, click
the filter icon to display the deviated tables. To clear any filter settings, click the clear filter icon.
5. List of deviated tables. Select the check box on the left side of the row that contains the table you want
to review.
6. Review button. After you have selected a table to review, click the review button.
1. Validation details, such as the source connection name, target connection name, and table name.
2. Filter box. Enter the entire filter or part of the filter value in the filter box above the appropriate column
heading, and then click the filter icon.
3. Filter icon and clear filter icon. After you specify filter criteria in the fields above the column names, click
the filter icon to display the deviated tables. To clear any filter settings, click the clear filter icon.
4. List of deviated rows.
5. Radio buttons that assign a status to the table. Select Accept, Reject, or Hold for Clarifications.
For a deviated table, the Review window lists a maximum of the first 100 deviated rows in Data Vault and the
first 10 corresponding rows in the source database. For each deviated table, the Review window displays a
maximum of 10 columns, with a maximum of eight deviated columns and two non-deviated columns.
If all of the columns in a table are deviated, source connection rows are not displayed.
Summary of results
For a retirement project, the summary of results section displays the application name, retirement job ID,
source database name, and Data Vault archive folder name. For an archive project, the summary of
results section displays the job name, job ID, source database name, and Data Vault archive folder name.
The summary of results also displays the login name of the user who ran the archive or retirement
project, and the date and time that the user performed the validation. If the job contained any deviations,
the summary of results lists the review result as “failed.” If the job did not contain any deviated tables,
the review result appears as “passed.”
Summary of deviations
The summary of deviations section displays pie charts that illustrate the number of deviated tables in
the archive or retirement job and the type of status that you selected. The section also contains a list of
the deviated tables, along with details about the tables, for example the row count in the source
database and in Data Vault.
Detail deviations
The detail deviations section displays details about the deviations that occur in each table included in
the archive or retirement job. For example, the detail deviations section displays the number of row
deviations, number of column deviations, and names of the deviated columns on each table. The section
also displays the number of rows in both the source database and Data Vault, along with any row count
difference.
Appendix
The appendix section contains a list of the tables in the archive or retirement entity that the validation
process verified.
5. Complete the remainder of the steps required to run an archive project. For more information, see the
chapter "Creating Data Archive Projects" or the
Archiving to the Informatica Data Vault for Custom Applications H2L.
Note: In the Manage Execution window, certain job steps such as the "archive crawler" and "integrated
validation" job steps appear by default in the archive job steps when integrated validation is enabled. The
exact steps that appear depend on whether you have selected an archive job or an archive and purge job,
and whether or not you are using staging. You do not need to schedule any of these steps separately.
Data Archive runs them as part of the archive process when integrated validation is enabled. You cannot
run integrated validation as a standalone job for an archive cycle.
The menu that typically allows you to select the order of steps five and six is disabled. When integrated
validation is enabled, Data Archive runs step five first.
6. Schedule the job to run.
7. Click Jobs > Monitor Jobs to monitor the archive job status and review the validation after the process is
complete. If the validation process returns any deviations during an archive and purge job, the job
pauses and you must review and accept any deviations before you can resume the job and continue to
the delete from source job step. When you have reviewed and accepted the deviations and resumed the
job, the job remains in a "Warning" state even after successful completion. You can still use compliance
features, Data Discovery, and Data Visualization because the data has been successfully archived.
The Review window appears.
6. To select a status for the deviated table, select the Accept, Reject, or Hold for Clarification option.
7. Enter a justification for the action in the text box.
8. Click Apply.
Note: You can select more than one table at a time and click Review to select the same status for all of
the selected tables. However, if you select multiple tables you cannot review the deviations for each
table.
4. Click the View Validation Report link.
The validation report opens as a PDF file. You can print or save the validation report.
To run the integrated validation independent of the retirement flow, delete the integrated validation job step
from the retirement project flow before you schedule the retirement project to run. Before you run the job,
ensure that the source connection is online. Then, when you are ready to run the validation, run the integrated
validation standalone job.
You can run the integrated validation job only once on a retirement definition.
Retention Management
This chapter includes the following topics:
A retention policy is a set of rules that determine the retention period for records in an entity. The retention
period specifies how long the records must remain in the Data Vault before you can delete them. The
retention period can be definite or indefinite. Records with definite retention periods have expiration dates.
You can run the Purge Expired Records job to remove expired records from the Data Vault. Records with
indefinite retention periods do not expire. You cannot delete them from the Data Vault.
Your organization might require that you enforce different retention rules for different records. For example,
you might need to retain insurance policy records for five years after the insurance policy termination date. If
messages exist against the insurance policy, you must retain records for five years after the most recent
message date across the related tables.
Data Archive allows you to create and implement different retention rules for different records. You can
create retention policies to address all of these needs.
Retention Management Process
You can create and change retention policies if you have the retention administrator role. You can create
retention policies in the workbench and when you create a retirement archive project.
If you create retention policies in the workbench, you can use them in data archive projects that archive data
to the Data Vault. You can also use the policies in retirement archive projects. If you create retention policies
when you create a retirement archive project, Data Archive saves the policies so that you can use them in
other archive projects.
After you create retention policies, you can use them in archive projects. When you run the archive job, Data
Archive assigns the retention policies to entities and updates the expiration dates for records. You can
change the retention periods for archived records through Data Discovery. Change the retention period when
a corporate retention policy changes, or when you need to update the expiration date for a subset of records.
For example, an archive job sets the expiration date for records in the ORDERS table to five years after the
order date. If an order record originates from a particular customer, the record must expire five years after
the order date or the last shipment date, whichever is greatest.
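If you implement this kind of rule later with an expression-based retention policy, the expression might resemble the following sketch. The ORDER_DATE and LAST_SHIP_DATE column names are assumptions for illustration, both of the DATE datatype, and the sketch assumes that the Data Vault Service accepts the standard CASE expression:
CASE WHEN LAST_SHIP_DATE IS NOT NULL AND LAST_SHIP_DATE > ORDER_DATE THEN LAST_SHIP_DATE ELSE ORDER_DATE END
The expression returns the later of the two dates for each record, so the retention period is added to the last shipment date when one exists and to the order date otherwise.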
To use an entity in a retention policy, ensure that the following conditions are met:
To create and apply retention policies, you might complete the following tasks:
1. Create one or more retention policies in the workbench that include a retention period.
2. If you want to base the retention period for records on a column date or an expression, associate the
policies to entities.
3. Create an archive project and select the entities and retention policies. You can associate one retention
policy to each entity.
4. Run the archive job.
Data Archive assigns the retention policies to entities in the archive project. It sets the expiration date
for all records in the project.
5. Optionally, change the retention policy for specific archived records through Data Discovery and run the
Update Retention Policy job.
Data Archive updates the expiration date for the archived records.
6. Run the Purge Expired Records job.
Data Archive purges expired records from the Data Vault.
Note: If you have the retention viewer role, you cannot create, edit, or assign retention policies. You can
perform Data Discovery searches based on retention policy and retention expiration date, and you can view
the policy details and the table data. For more information on the differences between the retention
administrator and the retention viewer roles, see the Data Archive Administrator Guide.
For example, a large insurance organization acquires a small automobile insurance organization. The small
organization uses custom ERP applications to monitor policies and claims. The large organization must
retain records from the small organization according to the following rules:
• By default, the organization must retain records for five years after the insurance policy termination date.
• If messages exist against the insurance policy, the organization must retain records for five years after
the most recent message date in the policy, messages, or claims table.
• If the organization made a bodily injury payment, it must retain records for 10 years after the policy
termination date or the claim resolution date, whichever is greatest.
• If a record contains a message and a bodily injury payment, the organization must retain the record
according to the bodily injury rule.
The small organization stores policy information in the following tables in the AUTO_POLICIES entity:
POLICY Table
MESSAGES Table
CLAIMS Table
As a retention administrator for the merged organization, you must create and apply these rules. To create
and apply the retention management rules, you complete multiple tasks.
4. Click Save.
Data Archive saves retention policy "5-Year Policy." You can view the retention policy in the Manage
Retention Policies window.
Data Archive saves retention policy "10-Year Policy." You can view the retention policy in the Manage
Retention Policies window.
8. Select retention policy "5-Year Policy POLICY TERM_DATE" for the AUTO_POLICIES entity.
The wizard displays details about the retention policy. The details include the column and table on which
to base the retention period.
9. Click Next.
The wizard prompts you for approval information for the retirement project.
10. Optionally, enter approval information for the project, and then click Next.
The wizard displays the steps in the retirement archive job.
11. Click Publish.
Data Archive saves the retirement project. You can view the project in the Manage Retirement Projects
window.
Table      Column
POLICY     LAST_TRANS
MESSAGES   LAST_TRANS_DATE
CLAIMS     LAST_DATE
8. In the Comments field, enter comments about the retention policy changes.
9. To view the records affected by your changes, click View.
A list of records that the retention policy changes affect appears. Data Archive lists the current
expiration date for each record.
10. To submit the changes, click Submit.
The Schedule Job window appears.
11. Schedule the Update Retention Policy job to run immediately.
12. Click Schedule.
Data Archive runs the Update Retention Policy job. For records with messages, it sets the expiration date
to five years after the last transaction date across the POLICY, MESSAGES, and CLAIMS tables.
13. Repeat steps 1 through 4 so that you can implement the bodily injury rule.
The Modify Assigned Retention Policy window appears.
14. To select records with bodily injury payments, click Add Row and add a row for each of the following
conditions:
• CLAIM_TYPE Equals 2 AND
• CLAIM_AMT Greater Than 0
15. In the New Retention Policy list, select new retention policy "10-Year Policy."
Table      Column
POLICY     TERM_DATE
CLAIMS     RESOLVE_DATE
17. Repeat steps 8 through 12 to update the expiration date for records with bodily injury payments.
Data Archive sets the expiration date to 10 years after the policy termination date or the claim resolution
date, whichever is greatest.
Note: If a record has a message and a bodily injury payment, Data Archive sets the expiration date according
to the bodily injury rule. It uses the bodily injury rule because you ran the Update Retention Policy job for this
rule after you ran the Update Retention Policy job for the message rule.
Retention Policies
Retention policies consist of retention periods and optional entity associations that determine which date
Data Archive uses to calculate the expiration date for records. You can create different retention policies to
address different retention management needs.
Each retention policy must include a retention period in months or years. Data Archive uses the retention
period to calculate the expiration date for records to which the policy applies. A record expires on its
expiration date at 12:00:00 a.m. The time is local to the machine that hosts the Data Vault Service. The
retention period can be indefinite, which means that the records do not expire from the Data Vault.
After you create a retention policy, you can edit it or delete it. You can edit or delete a retention policy if the
retention policy is not assigned to an archived record or to an archive job that is running or completed. Data
Archive assigns retention policies to records when you run the archive job.
Entity Association
When you create a retention policy, you can associate it to one or more entities. Entity association identifies
the records to which the retention policy applies and specifies the rules that Data Archive uses to calculate
the expiration date for a record.
When you associate a retention policy to an entity, the policy applies to all records in the entity but not to
reference tables, unless you select the check box to update the retention for reference tables. For
example, you associate a 10-year retention policy with records in the EMPLOYEE entity so that employee
records expire 10 years after the employee termination date. Each employee record references the DEPT
table in another entity. The retention policy does not apply to records in the DEPT reference table.
For entities with unrelated tables, you can only use an absolute-date based retention policy.
Entities used in a retention policy must have a driving table with at least one physical or logical primary key
defined.
When you associate a retention policy to an entity, you can configure the following types of rules that Data
Archive uses to calculate expiration dates:
General Retention
Bases the retention period for records on the archive job date. The expiration date for each record in the
entity equals the retention period plus the archive job date. For example, records in an entity expire five
years after the archive job date.
Column Level Retention
Bases the retention period for records on a column in an entity table. The expiration date for each record
in the entity equals the retention period plus the date in a column. For example, records in the
EMPLOYEE entity expire 10 years after the date stored in the EMP.TERMINATION_DATE column.
Expression-Based Retention
Bases the retention period for records on a date value that an expression returns. The expiration date for
each record in the entity equals the retention period plus the date value from the expression. For
example, records in the CUSTOMER table expire five years after the last order date, which is stored as an
integer.
General Retention
Configure general retention when you want to base the retention period for records in an entity on the data
archive job date or on the retirement archive job date. The expiration date for each record in the entity equals
the retention period plus the archive job date.
When you configure general retention, you can select an indefinite retention period. When you select an
indefinite retention period, the records do not expire from the Data Vault.
When you configure general retention, all records in the entity to which you associate the policy have the
same expiration date. For example, you create a retention policy with a five year retention period. In a
retirement archive project, you select this policy for an entity. You run the archive job on January 1, 2011.
Data Archive sets the expiration date for the archived records in the entity to January 1, 2016.
You can configure general retention when you select entities in a data archive project or a retirement archive
project. Select the entity, and then select the retention policy.
Column Level Retention
When you configure column level retention, each record in an entity has a unique expiration date. For
example, you create a retention policy with a 10 year retention period and associate it to entity EMPLOYEE,
table EMP, and column TERM_DATE. In a data archive project, you select this policy for the EMPLOYEE entity.
Each record in the EMPLOYEE entity expires 10 years after the date in the EMP.TERM_DATE column.
You can configure column level retention in the workbench when you manage retention policies or when you
change retention policies for archived records in Data Discovery. You can also configure column level
retention when you select entities in a data archive project or a retirement archive project. Select the entity,
and then select the retention policy. The retention policy list appends the table and column name to the
names of retention policies with column level associations. You cannot configure column level retention for
retention policies with indefinite retention periods.
Expression-Based Retention
Configure expression-based retention when you want to base the retention period for records in an entity on
an expression date. The expiration date for each record in the entity equals the retention period plus the date
that the expression returns.
The expression must include statements that evaluate to a column date or to a single value of the DATE
datatype. It can include any SQL statement that the Data Vault Service supports. The expression can include
up to 4000 characters. If the expression does not return a date or if the expression syntax is not valid, the
Update Retention Policy job fails. To prevent the job from failing, include null-handling functions such as
IFNULL or COALESCE in your expressions so that a suitable default value is applied to records when the
expression evaluates to a null or empty value.
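For example, the following expression is a minimal sketch of this null-handling approach. It assumes a hypothetical TERM_DATE column that stores termination dates as integers in the ddmmyyyy format and substitutes a fixed far-future default date when the column is empty; adjust the column name, date format, and default value to match your data:
COALESCE(TO_DATE(TERM_DATE,'ddmmyyyy'), TO_DATE(31129999,'ddmmyyyy'))
Because the fallback value is itself a date, the expression always returns a valid DATE value and the Update Retention Policy job does not fail on records with an empty termination date.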
The Update Retention Policy job includes the expression in the select statement that the job generates. The
job selects the records from the Data Vault. It uses the date that the expression returns to update the
retention expiration date for the selected records. When you monitor the Update Retention Policy job status,
you can view the expression in the job parameters and in the job log. The job parameter displays up to 250
characters of the expression. The job log displays the full expression.
Use the TO_DATE function to evaluate columns that store dates as integers. The function converts the
integer datatype format to the date datatype format so that the Update Retention Policy job can
calculate the retention period based on the converted date. The TO_DATE function selects from the
entity driving table.
To select a column from another table in the entity, use the full SQL select statement. For example, you
want to set the retention policy to expire records 10 years after the employee termination date. The
entity driving table, EMPLOYEE, includes the TERM_DATE column which stores the termination date as
an integer, for example, 13052006. Create a retention policy, and set the retention period to 10 years.
Add the following expression to the retention policy:
TO_DATE(TERM_DATE,'ddmmyyyy')
When the Update Retention Policy job runs, the job converts 13052006 to 05-13-2006 and sets the
expiration date to 05-13-2016.
Add tags to archived records to capture dates or other values that might not be stored or available in the
Data Vault. Use an expression to evaluate the tag and set the retention period based on the tag value.
For example, you retired a source application that contained products and manufacturing dates. You
want to set the expiration date for product records to 10 years after the last manufacturing date. When
you retired the application, some products were still in production and did not have a last manufactured
date. To add a date to records that were archived without a last manufacturing date, you can add a tag
column with dates. Then, define a retention policy and use an expression to evaluate the date tag.
For more information on how to use a tag to set the retention period, see “Using a Tag Column to Set the
Retention Period ” on page 162.
Use an expression to evaluate table columns across all entities in the Data Vault archive folder. You can
use a simple SQL statement to evaluate data from one column, or you can use complex SQL statements
to evaluate data from multiple columns.
For example, you retired an application that contained car insurance policies. An insurance policy might
have related messages and claims. You want to set the expiration date for insurance policy records to
five years after the latest transaction date from the POLICY, MESSAGES, or CLAIMS tables. If the
insurance policy has a medical or property damage claim over $100,000, you want to set the expiration
date to 10 years.
Note: If you enter a complex SQL statement that evaluates column dates across high-volume tables, you
might be able to increase query performance by changing the retention policy for specific records.
Change the retention policy for records and then run the Update Retention Policy job instead of entering
an expression to evaluate date columns across tables. Data Archive can generate SQL queries which run
more quickly than user-entered queries that perform unions of large tables and contain complex
grouping.
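To make the cross-table case concrete, the following expression is a sketch only. It assumes that the POLICY, MESSAGES, and CLAIMS tables share a hypothetical POLICY_ID key, that the LAST_TRANS, LAST_TRANS_DATE, and LAST_DATE columns use the DATE datatype, and that the expression can reference the entity driving table (POLICY) to correlate the subquery. Verify the exact syntax that your Data Vault Service version accepts in a test environment before you run the Update Retention Policy job:
SELECT MAX(T.LAST_DATE)
FROM (SELECT POLICY_ID, LAST_TRANS AS LAST_DATE FROM POLICY P
      UNION ALL
      SELECT POLICY_ID, LAST_TRANS_DATE FROM MESSAGES M
      UNION ALL
      SELECT POLICY_ID, LAST_DATE FROM CLAIMS C) T
WHERE T.POLICY_ID = POLICY.POLICY_ID
The expression returns the latest transaction date across the three tables for each insurance policy, and the retention period is added to that date.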
You can configure expression-based retention in the workbench when you manage retention policies or when
you change retention policies for archived records in Data Discovery. The retention policy list in Data
Discovery appends "(Expression)" to the names of retention policies with expression-based associations. You
cannot configure expression-based retention for retention policies with indefinite retention periods.
The following task flow explains how to use a tag column to set an expression-based retention period.
1. Add a tag column to the table that holds the transaction key.
a. Specify the Date data type for the tag.
b. Enter a date for the tag value.
For more information about adding a tag, see “Adding Tags” on page 198.
2. Determine the database name of the tag.
a. Select Data Discovery > Browse Data.
b. Move all the ilm_metafield# columns from the Available Columns list to the Display Columns list.
c. Click Search.
d. Browse the results to determine the name of the tag column that contains the value you entered. For
example, the name of the tag column is ilm_metafield1.
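With the tag column identified, the retention policy expression might resemble the following sketch. The LAST_MFG_DATE column name and its ddmmyyyy integer format are assumptions for illustration; ilm_metafield1 is the tag column name determined in the previous step and already uses the Date data type:
COALESCE(TO_DATE(LAST_MFG_DATE,'ddmmyyyy'), ilm_metafield1)
For records that were archived without a last manufacturing date, the expression falls back to the date that you entered in the tag column.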
Policy name
Name of the retention policy. Enter a name of 1 through 30 characters. The name cannot contain the following special characters: , < >
Retention period
The retention period in months or years. Enter a retention period or enable the Retain indefinitely option.
Retain indefinitely
If enabled, the records to which the retention policy applies do not expire from the Data Vault. Enter a retention period or enable the Retain indefinitely option.
Records management ID
Optional string that you can enter to associate the retention policy with the policy ID from a records management system. You can use the records management ID to filter the list of retention policies in Data Discovery. You can enter any special character except the greater than character (>), the less than character (<), or the comma character (,).
Note: If you select column-level retention, you must select a table in the entity and a column that contains
date values. If you select expression-based retention, you must select a table in the entity and enter an
expression.
For example, you assign a five-year retention policy to automobile insurance policies in a retirement archive
job so that the records expire five years after the policy termination date. However, if messages exist against
the insurance policy, you must retain records for five years after the most recent message date in the policy,
claims, or client message table.
You change the assigned retention policy through Data Discovery. You cannot edit or delete retention policies
in the workbench if the policies are assigned to archived records or to archive jobs that are running or
completed.
Before you change a retention policy for any record, you might want to view the records assigned to the
archive job. When you view records, you can check the current expiration date for each record.
To change the retention policy for records, you must select the records you want to update and define the
new retention period. After you change the retention policy, schedule the Update Retention Policy job to
update the expiration date for the records in the Data Vault.
To select records to update, you must specify the conditions to which the new retention policy applies. Enter
each condition in the format "<Column><Operator><Value>." The Is Null and Not Null operators do not use
the <Value> field.
For example, to select insurance records in an entity that have messages, you might enter the following
condition:
You can specify multiple conditions and group them logically. For example, you want to apply a retention
policy to records that meet the following criteria:
Use the following guidelines when you specify the conditions for record selection:
For example, to select insurance records with policy numbers that begin with "AU-," enter the following
condition:
By default, Data Archive performs case-sensitive string comparisons. To ignore case, enable the Case
Insensitive option.
Use the In operator to specify that a condition is within a range of comma-separated values, for example,
CLAIM_CLASS IN BD1, BD2. Do not enclose the list of values in parentheses.
If you specify multiple conditions, you can group them with multiple levels of nesting, for example, "(A
AND (B OR C)) OR D." Use the Add Parenthesis and Remove Parenthesis buttons to add and remove
opening and closing parentheses. The number of opening parentheses must equal the number of closing
parentheses.
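For reference, conditions such as CLAIM_TYPE Equals 2 AND CLAIM_AMT Greater Than 0, combined with a condition that uses the In operator, translate into a WHERE clause along the lines of the following sketch. This is only an approximation; you can view the actual clause that Data Archive generates in the Monitor Job window after you submit the Update Retention Policy job:
WHERE (CLAIM_TYPE = 2 AND CLAIM_AMT > 0)
   OR CLAIM_CLASS IN ('BD1', 'BD2')
Although you do not type parentheses or quotation marks in the condition fields, the generated SQL presumably includes them where the syntax requires.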
To define the new retention period, select the new retention policy and enter rules that determine which date
Data Archive uses to calculate the expiration date for records. The expiration date for records equals the
retention period plus the date value that is determined by the rules you enter.
Relative Date Rule
Bases the retention period for records on a fixed date. For example, you select a retention policy that has
a 10 year retention period. You enter January 1, 2011, as the relative date. The new expiration date is
January 1, 2021.
Column Level Rule
Bases the retention period for records on a column date. For example, you select a retention policy that
has a five year retention period. You base the retention period on the CLAIM_DATE column. Records
expire five years after the claim date.
Expression-Based Rule
Bases the retention period for records on the date returned by an expression. For example, you select a
retention policy that has a five year retention period. You want to base the retention period on dates in
the CLAIM_DATE column, but the column contains integer values. You enter the expression
TO_DATE(CLAIM_DATE,'ddmmyyyy'). Records expire five years after the claim date.
For example, you have a five-year retention policy for insurance records that expires records five years after
the last transaction date. You want to select the last transaction date from the following three tables and
apply the new retention policy using that date:
• Policy.LAST_TRANS
• Messages.LAST_TRANS_DATE
• Claims.LAST_DATE
To select the last transaction date across tables, complete the following tasks:
Table      Column
Policy     LAST_TRANS
Messages   LAST_TRANS_DATE
Claims     LAST_DATE
The Update Retention Policy job selects records from the Data Vault based on the conditions you specify
when you change the retention policy. It uses the date value determined by the rules you enter to update the
retention expiration date for the selected records. If you change the retention policy for a record using an
expression-based date, the Update Retention Policy job includes the expression in the SELECT statement that
the job generates.
When you run the Update Retention Policy job, you can monitor the job status in the Monitor Job window. You
can view the WHERE clause that Data Archive uses to select the records to update. You can also view the old
and new retention policies. If you change the retention policy for a record using an expression-based date,
you can view the expression in the job parameters and in the job log. The job parameter displays up to 250
characters of the expression. The job log displays the full expression.
When you monitor the Update Retention Policy job status, you can also generate the Retention Modification
Report to display the old and new expiration date for each affected record. The report is created with the Arial
Unicode MS font. To generate the report, you must have the font file ARIALUNI.TTF saved in the <Data
Archive installation>\webapp\WEB-INF\classes directory. To view the report, you must have the retention
administrator role and permission to view the entities that contain the updated records.
To schedule the Update Retention Policy job, click Submit in the Modify Assigned Retention Policy window.
The Schedule Job window appears. You schedule the job from this window.
Run the Purge Expired Records job to perform one of the following tasks:
• Generate the Retention Expiration report. The report shows the number of records that are eligible for
purge in each table. When you schedule the purge expired records job, you can configure the job to
generate the retention expiration report, but not purge the expired records.
• Generate the Retention Expiration report and purge the expired records. When you schedule the purge
expired records job, you can configure the job to pause after the report is generated. You can review the
expiration report. Then, you can resume the job to purge the eligible records.
When you run the Purge Expired Records job, by default, the job searches all entities in the Data Vault archive
folder for records that are eligible for purge. To narrow the scope of the search, you can specify a single
entity. The Purge Expired Records job searches for records in the specified entity alone, which potentially
decreases the amount of time in the search stage of the purge process.
To determine the number of rows in each table that are eligible for purge, generate the detailed or summary
version of the Retention Expiration report. To generate the report, select a date for Data Archive to base the
report on. If you select a past date or the current date, Data Archive generates a list of tables with the number
of records that are eligible for purge on that date. You can pause the job to review the report and then
schedule the job to purge the records. If you resume the job and continue to the delete step, the job deletes
all expired records up to the purge expiration date that you used to generate the report. If you provide a future
date, Data Archive generates a list of tables with the number of records that will be eligible for purge by the
future date. The job stops after it generates the report.
Lists the tables in the archive folder or, if you specified an entity, the entity. Shows the total number of
rows in each table, the number of records with an expired retention policy, the number of records on
legal hold, and the name of the legal hold group. The report lists tables by retention policy.
Lists the tables in the archive folder or, if you specified an entity, the entity. Shows the total number of
rows in each table, the number of records with an expired retention policy, the number of records on
The reports are created with the Arial Unicode MS font. To generate the reports, you must have the font file
ARIALUNI.TTF saved in the <Data Archive installation>\webapp\WEB-INF\classes directory.
To purge records, you must enable the purge step through the Purge Deleted Records parameter. You must
also provide the name of the person who authorized the purge.
Note: Before you purge records, use the Search Within an Entity in Data Vault search option to review the
records that the job will purge. Records that have an expired retention policy and are not on legal hold are
eligible for purge.
When you run the Purge Expired Records job to purge records, Data Archive reorganizes the database
partitions in the Data Vault, exports the data that it retains from each partition, and rebuilds the partitions.
Based on the number of records, this process can increase the amount of time it takes for the Purge Expired
Records job to run. After the Purge Expired Records job completes, you can no longer access or restore the
records that the job purged.
Note: If you purge records, the Purge Expired Records job requires staging space in the Data Vault that is
equal to the size of the largest table in the archive folder, or, if you specified an entity, the entity.
Archive Store
Required. The name of the archive folder in the Data Vault that contains the records that you want to delete. Select a folder from the list.
Purge Expiry Date
Required. The date that Data Archive uses to generate a list of records that are or will be eligible for delete. Select a past, the current, or a future date. If you select a past date or the current date, Data Archive generates a report with a list of all records eligible for delete on the selected date. You can pause the job to review the report and then schedule the job to purge the records. If you resume the job and continue to the delete step, the job deletes all expired records up to the selected date. If you select a future date, Data Archive generates a report with a list of records that will be eligible for delete by the selected date. However, Data Archive does not give you the option to delete the records.
Report Type
Required. The type of report to generate when the job starts. Select one of the following options:
- Detail. Generates the Retention Expiration Detail report.
- None. Does not generate a report.
- Summary. Generates the Retention Expiration Summary report.
Pause After Report
Required. Determines whether the job pauses after Data Archive creates the report. If you pause the job, you must resume it to delete the eligible records. Select Yes or No from the list.
Entity
Optional. The name of the entity with related or unrelated tables in the Data Vault archive folder. To narrow the scope of the search, select a single entity.
Purge Deleted Records
Required. Determines whether to delete the eligible records from the Data Vault. Select Yes or No from the list.
Purge Approved By
Required if you select a past date or the current date for Purge Expiry Date. The name of the person who authorized the purge.
Note: The reports are created with the Arial Unicode MS font. To generate the reports, you must have the font
file ARIALUNI.TTF saved in the <Data Archive installation>\webapp\WEB-INF\classes directory.
The Retention Modification Report displays information about records affected by retention policy
changes. The report displays the old and new retention policies, the rules that determine the expiration
date for records, and the number of affected records. The report also lists each record affected by the
retention policy change.
Generate a Retention Modification report from the Monitor Jobs window when you run the Update
Retention Policy job.
The Retention Expiration Summary Report displays information about expired records. The report
displays the total number of expired records and the number of records on legal hold within a retention
policy. It also displays the number of expired records and the number of records on legal hold for each
entity.
Generate a Retention Expiration Summary report from the Monitor Jobs window when you run the Purge
Expired Records job.
If your records belong to related entities, the Retention Expiration Detail report lists expired records so
you can verify them before you purge expired records. The report displays the total number of expired
records and the number of records on legal hold within a retention policy. For each expired record, the
report shows either the original expiration date or the updated expiration date. It shows the updated
expiration date if you change the retention policy after Data Archive assigns the policy to records.
If your records belong to unrelated entities, the Retention Expiration Detail report displays the total
number of expired records and the number of records on legal hold within a retention policy. It does not
list individual records.
Generate a Retention Expiration Detail report from the Monitor Jobs window when you run the Purge
Expired Records job.
You might need to change the default retention period for an archive area for certain Centera versions, such
as Compliance Edition Plus. Some Centera versions do not allow you to delete a file if it is within an archive
area that has indefinite retention. You cannot delete the file even if all records are eligible for expiration.
You can choose immediate expiration or indefinite retention. Specify the retention period in days, months, or
years. The retention period determines when the files are eligible for deletion. The default retention period
applies to any archive area that you create after you configure the property. When you create an archive area,
the Data Vault Service uses the retention period that you configure in the properties file. The Data Vault
Service enforces the record level policy based on the entity retention policy that you configure in Data
Archive.
External Attachments
This chapter includes the following topics:
If you want to archive or retire encrypted external attachments, such as Siebel attachments, to the Data Vault,
you can use the Data Vault Service for External Attachments (FASA) to convert encrypted attachments from
the proprietary format. You can then access the decrypted external attachments through the Data Discovery
portal.
You can move external attachments to the target database synchronously or asynchronously. If you archive
external attachments synchronously, the ILM Engine archives the external attachments with the associated
data when you run an archive job. If you have already archived the associated data, you can move the
external attachments asynchronously from source to target with the Move External Attachments job.
You can archive either encrypted or unencrypted attachments to a database. When you archive encrypted
external attachments, FASA is not required because the files are stored in tables.
Before you run an archive job or a Move External Attachments job, configure the following properties in the
source connection to archive external attachments:
Source location of the external attachments. Enter the directory where the attachments are stored on the
production server. You must have read access on the directory.
Target Attachment location
Target location for the external attachments. Enter the folder where you want to archive the external
attachments. You must have write access on the directory.
Moves the attachments synchronously with the archive job. If you want to archive the external
attachments with the associated data when you run an archive job, select this checkbox.
Archive Job Id
Include the Job Id of the archive or restore job to move external attachments asynchronously.
To archive external attachments to the Data Vault, the attachment locations must be available to both the
ILM Engine and the Data Vault Service. If the external attachments are encrypted, such as Siebel
attachments, you must configure FASA to decrypt the attachments. Before you run the archive or retirement
job, configure the source and target connections.
If your environment is configured for live archiving, you must also create and configure an interim table in the
Enterprise Data Manager for the attachments.
If you plan to retire all of the external attachments in the directory, you do not need to configure an interim
table in the EDM. If you want to retire only some of the external attachments in the directory, you must
configure an interim table in the EDM and add a procedure to the Run Procedure in Generate Candidates step
of the archive job.
If you configured the source connection to move the attachments synchronously with the archive or
retirement job, run the Load External Attachments job after the archive job finishes. The Load External
Attachments job loads the attachments to the Data Vault.
If you did not move the attachments synchronously with the archive job, run the Move External Attachments
standalone job after you run the archive job. The Move External Attachments job moves the attachments to
the interim attachment table. You can then load the attachments to the Data Vault with the Load External
Attachments job.
After you archive the external attachments, you can view the attachments in the Data Discovery Portal. To
view the archived external attachments, search the Data Vault for the entity that contains the attachments.
You cannot search the contents of the external attachments.
Create an interim table with additional columns to archive external attachments. Then modify the entity steps
to move the attachments. The archive job moves the attachments during the Generate Candidates step.
The source location of the attachments. If the attachments are not encrypted, this directory must be
accessible to both the ILM Engine and the Data Vault Service.
The target connection used to archive the application data. The Target Archive Store parameter ensures
that the attachments are archived to the same Data Vault archive folder as the application data.
If you want the job to delete the attachments you are loading to the Data Vault, select Yes. The job
deletes the contents of the directory that you specified in the Directory parameter.
FASA URL.
Archive Job Id
The Move External Attachments job moves the attachments from the source directory to the interim
directory in the attachment entity.
3. To move the attachments to the Data Vault, run a Load External Attachments job.
You can restore data from the Data Vault or a database archive. You can restore data from the Data Vault to
an empty or existing production database or to a database archive. You can restore data from a database
archive to an empty or existing database, including the original source.
You can also restore external attachments that have been archived to a database archive or the Data Vault.
Data Archive restores external attachments as part of a restore job from either a database or the Data Vault.
By default, when you restore data from a database archive or the Data Vault, Data Archive deletes the data
from the archive location. If you restore a cycle or transaction, you can choose to keep the restored data at
the archive location.
Restore Methods
You can run a transaction or cycle restore from either the Data Vault or a database archive. You can only
perform a full restore from a database archive.
Full restore
A full restore job restores the entire contents of a database archive. You might perform a full restore if
you archived an entire database that must be returned to production. For optimal performance, conduct
a full restore to an empty database. Performance is slow if you run a full restore to the original database.
Data Archive deletes the restored data from the database archive as part of the full restore job.
Cycle restore
A cycle restore job restores an archive cycle. You might restore an archive cycle if a cycle was archived
in error, or if you must change numerous transactions in a cycle. To restore a cycle you must know the
cycle ID. A cycle restore can also restore a specific table. To restore a specific table, use the Enterprise
Data Manager to create an entity containing the table you want to restore.
By default, Data Archive deletes the restored data from the database archive or the Data Vault as part of
the cycle restore job. If you do not want the restore job to delete the data from the archive location, you
can specify that the restore job skip the delete step.
Transaction restore
A transaction restore job restores a single transaction, such as a purchase order or invoice. To restore a
transaction, you must search for it by the entity name and selection criteria. If your search returns
multiple transactions, you can select multiple transactions to restore at the same time.
By default, Data Archive deletes the restored data from the database archive or the Data Vault as part of
the transaction restore job. If you do not want the restore job to delete the data from the archive
location, you can specify that the restore job skip the delete step.
You cannot restore archive-only cycles from the Data Vault. Only archive and purge cycles can be restored
from the Data Vault.
When you restore from the Data Vault, determine the type of target for the restore job:
If you restore to an empty database, you must first export the metadata associated with the records to be
restored and run the metadata DDL on the database.
1. In the Data Discovery Portal, search for the records that you want to restore.
2. Export the metadata associated with the records.
3. Run the metadata DDL on the target database.
4. Run the transaction or cycle restore job from the Data Vault to the target database.
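The metadata DDL that you export in the preceding steps typically consists of CREATE TABLE statements for the tables in the entity. The following is a simplified, hypothetical example; the actual DDL contains the table definitions for your own entity:
CREATE TABLE ORDERS
(
  ORDER_ID     NUMBER(10)   NOT NULL,
  CUSTOMER_ID  NUMBER(10),
  ORDER_DATE   DATE,
  ORDER_TOTAL  NUMBER(12,2)
);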
If the production database schema has been updated, you must synchronize schemas between the existing
database and the Data Vault. You can then run either a cycle or transaction restore.
Note: When you restore archived data and the records exist in the target database, the restore job will fail if
key target constraints are enabled. If target constraints are disabled, the operation might result in duplicate
records.
Verify that the target connection exists and that you have access to the connection.
1. In the Data Discovery Portal, search for the records that you want to restore.
2. Run the restore job from the Data Vault to the target database.
Schema Synchronization
When you restore data from the Data Vault to an existing database, you might need to synchronize schema
differences between the Data Vault and the existing database.
To synchronize schemas, you can generate metadata in the Enterprise Data Manager and run the Data Vault
Loader to update the Data Vault schema, or you can use a staging database.
Update the Data Vault schema with the Data Vault Loader when changes to the target table are additive,
including extended field sizes or new columns with a null default value. New columns with a non-null default
value require manually running an ALTER TABLE SQL command. If you removed columns or reduced the size
of columns on the target table, you must synchronize through a staging database.
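For example, if a new column with a non-null default value was added to the target table, you might run an ALTER TABLE statement similar to the following. The table name, column name, and default value are placeholders; adjust the statement to match your schema and SQL dialect:
ALTER TABLE ORDERS ADD ORDER_CHANNEL VARCHAR(20) DEFAULT 'WEB' NOT NULL;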
Verify that the target connection exists and that you have access to the connection.
1. Define and run an archive job that generates metadata for the new schema, but does not archive any
records.
2. From the Schedule Jobs page, run a Data Vault Loader job to load the metadata and update the schema.
3. Verify that the job was successful on the Monitor Jobs page.
4. Run the restore job from the Data Vault to the target database.
Verify that the target connection exists and that you have access to the connection.
1. In the Data Discovery Portal, search for the records that you want to restore.
2. Export the metadata associated with the records.
3. Run the metadata DDL on the staging database.
4. Run the restore job from the Data Vault to the staging database.
5. Alter the table in staging to synchronize with the target database schema.
6. Using a database link (dblink) and SQL scripts, manually restore the records from the staging database to
the target database.
Verify that a target connection exists and that you have access to the connection.
1. Generate Candidates. Generates a row count report for the data being restored.
2. Validate Destination. Validates the destination by checking table structures.
3. Copy to Destination. Restores the data from the Data Vault to the target. You have the option of
generating a row count report after this step.
4. Delete from Archive. Deletes the restored data from the Data Vault if it is not on legal hold. This step is
optional.
1. In the Enterprise Data Manager, create a new entity containing the specific table you want to restore.
2. Run a cycle restore and specify the entity you created.
When you restore data from the database archive, the history application user functions like the production
application user during an archive cycle. The history application user requires the SELECT ANY TABLE
privilege to run a restore job.
You can conduct a full, cycle, or transaction restore from a database archive. If you are restoring data from a
database archive that was previously restored from the Data Vault to a database archive, you can only run a
cycle restore. You can only run a cycle restore because the restore job ID is not populated in the interim
tables during the initial archive cycle.
The target database for a database archive restore job can be either an empty or existing production
database.
When you run a full restore, the system validates that the source and target connections point to different
databases. If you use the same database for the source and target connections, the source data may be
deleted.
Prerequisites
Verify that the target connection for the restore exists and that you have access to the connection.
If you are running a restore job from an IBM DB2 database back to another IBM DB2 database, an ILM
Administrator may have to complete prerequisite steps for you. Prerequisite steps are required when your
organization's policies do not allow external applications to create objects in IBM DB2 databases. The
prerequisite steps include running a script on the staging schema of the restore source connection.
Note: When you restore archived data and the records exist in the target database, the restore job will fail if
key target constraints are enabled. If target constraints are disabled, the operation might result in duplicate
records.
Schema Synchronization
When you restore from a database archive to an existing database, the Validate Destination step in the
restore cycle will synchronize schema differences when the changes are additive, such as extending field
sizes or adding new columns. If changes are not additive, you must create and run a SQL script on the
database archive to manually synchronize with the production database.
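For example, if a column was removed from the production table, a minimal SQL script on the database archive might drop the same column so that the schemas match. The schema, table, and column names below are placeholders:
ALTER TABLE ARCHIVE_SCHEMA.CUSTOMERS DROP COLUMN LEGACY_CODE;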
1. Generate Candidates. Generates interim tables and a row count report for the data being restored.
2. Build Staging. Builds the staging table structures for the data you want to restore.
3. Copy to Staging. Copies the data from the database archive to your staging tablespace. You have the
option of generating a row count report after this step.
1. In the Enterprise Data Manager, create a new entity containing the specific table you want to restore.
2. Run a cycle restore and specify the entity you created.
For encrypted attachments, when you run a restore job the ILM engine generates an encryption script in the
staging attachment directory. The engine invokes the Data Vault Service for External Attachments (FASA) to
create an encryption utility that reencrypts the attachments in the staging directory and then restores them to
the target directory.
The following list describes which components need access to the source, staging, and target attachment
directories:
• Encrypted: the source directory and the staging directory must be accessible to both FASA and the ILM
engine (shared mount). The target directory must be accessible to FASA.
• Regular/Encrypted: the source directory must be accessible to the ILM engine (mount point), the staging
directory must be accessible to the ILM engine (local), and the target directory must be accessible to the
ILM engine (mount point).
Use a keyword, term, or specific string to search for records across applications in Data Vault. Records
that match the keyword appear in the search results. A keyword or string search searches both
structured and unstructured data files, for example .docx, .pdf or .xml files. You can narrow the search
results by filtering by application, entity, and table. After you search the Data Vault, you can export the
search results to a .pdf file.
Use Search Within an Entity to search and examine application data from specific entities in the Data
Vault. The Data Discovery portal uses the search criteria that you specify and generates a SQL query to
access data from the Data Vault. After you search the Data Vault, you can export the search results to a
CSV or PDF file.
Browse Data
Use Browse Data to search and examine application data from specific tables in the Data Vault. The
Data Discovery portal uses the search criteria that you specify and generates a SQL query to access data
from the Data Vault. After you search the Data Vault, you can export the search results to a .csv or .pdf
file.
You can also use the Data Discovery portal to apply retention policies, legal holds, and tags to archived
records. If data-masking is enabled, then sensitive information is masked or withheld from the search results.
Search Data Vault
You can enter a keyword or set of words, add wildcards or boolean operators to search for records in Data
Vault.
Data Archive searches across applications in Data Vault for records that match your search criteria. You can
narrow your search results by filtering by application, entity, and table. Then export your results to a PDF file.
You access Search Data Vault from the Data Discovery menu.
Records that match your search query display on the Results page. The following figure shows the Results
page.
1. The Filters panel contains sections for Application, Entities, and Tables filters. Clear the appropriate filter
boxes to narrow your search results.
2. The Search Results panel displays records that match your search criteria. Click the record header to
view technical details. Click the record summary to view the record's details in the Details panel.
3. The Actions menu contains options to export results to a PDF file. You can export all results or results
from the current page.
4. The Details panel displays a record's column names and values.
The following list describes some of the search queries you can use:
Single Word
Enter a single word in the search field. For example, enter Anita. Records with the word Anita appear in
the results.
Enter a phrase surrounded by double quotes. For example, enter "exceeding quota". Records with the
term exceeding quota appear in the results.
Replace Characters
Use ? and * as wildcard characters to replace letters in a single word search. You cannot use wildcards if
your search query has more than one word. You cannot replace the first character of the word with a
wildcard character.
To replace a single character in a word, use ?. For example, enter te?t. Records with the word tent, test,
or text appear in the results.
To replace multiple characters in a word, use *. For example, enter te**. Records with the word teak,
teal, team, tear, teen, tent, tern, test, or text appear in the results.
Similar Spelling
Use the ~ character to search for a word with a similar spelling. For example, enter roam~. Records with
the word foam or roams appear in the results.
Range Searches
You can search for records with a range of values. Use a range search for numbers or words.
Enter the low and high values of the range separated by TO. The TO must be in uppercase.
• To include the low and high values in the results, surround the range with square brackets:
<column name>: [<low> TO <high>]
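For example, a hypothetical range query such as the following returns records whose order_total value is
between 100 and 500, inclusive. The column name is a placeholder:
order_total: [100 TO 500]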
AND Operator
To search for records that contain all the terms in the search query, use AND between terms. The AND
must be in uppercase. You can replace AND with the characters &&.
For example, enter "Santa Clara CA" AND Platinum. Records with the phrase Santa Clara CA and the
word Platinum appear in the results.
To specify that a term in the search query must exist in the results, enter the + character before the term.
For example, enter +semiconductor automobile. Records with the word semiconductor appear in the
results. These records may or may not contain the word automobile.
NOT Operator
To search for records that exclude a term in the search query, enter NOT before the term. You can
replace NOT with either the ! or - character.
For example, enter Member NOT Bronze. Records that contain the word Member but not the word Bronze
appear in the results.
When you search the Data Vault, you specify the archive folder and the entity that you want to search in. If the
archive folder includes multiple schemas, the query uses the mined schema in the Enterprise Data Manager
for the corresponding entity. The search only returns records from the corresponding schema and table
combination. If the archive folder only includes one schema, the query uses the dbo schema.
Use table columns to specify additional search criteria to limit the number of records in the search results.
The columns that you can search on depend on which columns are configured as searchable in the Data
Discovery portal search options for the entity.
You can add multiple search conditions. Use AND or OR statements to specify how the search handles the
conditions. You can also specify the maximum number of records you wish to see in the results. You have
the option to order the search results based on a specific column in the entity driving table and to sort them
in ascending or descending order.
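For example, a search within a hypothetical Orders entity that combines two conditions with AND and sorts the results by the driving table column ORDER_DATE might generate a SQL query similar to the following. The schema, table, and column names are placeholders, and the query that the Data Discovery portal generates may differ:
SELECT * FROM dbo.ORDERS WHERE ORDER_DATE >= '2010-01-01' AND STATUS = 'CLOSED' ORDER BY ORDER_DATE DESC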
You can save the search criteria that you configure. Saved criteria is useful when you configure multiple
search conditions and want to search using the same conditions in the future. You can edit and delete the
saved criteria, and you can designate whether the criteria is available for use by all users or only you.
Archive folder that contains the entity of the archived data that you want to search. The archive folder
limits the number of entities shown in the list of values.
The archive folder list of values shows archive folders that include at least one entity that you have
access to. You have access to an entity if your user is assigned to the same role that is assigned to the
entity.
Entity
Entity that contains the table that you want to view archived data from.
The entity list of values shows all entities that you have access to within the application version of the
related archive folder.
Saved Criteria
Search conditions that have been configured and saved from a previous search.
The list of values displays all of the sets of saved criteria that are accessible to you.
Maximum Number of Records in Results
The maximum number of records that you want to view in the search results.
This parameter does not affect the export functionality. When you export results, Data Archive exports all
records that meet the search criteria regardless of the value you specify in the Maximum Number of
Records in Results parameter.
Order By
Entity driving table column that you want to sort the order of the search results by.
The list of values contains the entity driving table columns that you can order the search results by.
Sort Order
The order by and sort order parameters apply to all of the search results returned across multiple pages.
When you configure the search criteria for an entity, you add search conditions to create a targeted search.
For example, you can add a condition to search on a date column for a specific date. You can add multiple
search conditions to a single search. You can also include the order by and sort order functions as part of the
saved criteria.
You can save the search conditions that you create to be used in future searches. When you save the search
conditions, you give the conditions a name that is unique within the entity. The name can have a maximum of
250 characters, though names that exceed 50 characters are displayed in the user interface with the first 45
characters followed by an ellipsis. Before you save the criteria, you also designate whether the search criteria
will be accessible to all users for searching, or if the criteria will only be visible and accessible to you.
Saved criteria can be edited or deleted only by the creator, who is the owner, and the Data Archive
Administrator. If the owner of the criteria designates that the criteria is accessible to all users, any user can
run a search using the criteria but cannot edit or delete the criteria. Users who do not own the criteria can
save the criteria under a different name and then edit it.
The number of saved criteria that you can create for a single entity is determined by a parameter in the
system profile. If you try to create new saved criteria and receive an error message that the criteria limit has
been exceeded for the entity, you can adjust the limit of allowed criteria in the system profile. To adjust the
maximum number of saved criteria that the system allows, click Administrator > System Profile > Data
Discovery Portal and enter a value for the Default Number of Saved Criteria per Entity parameter.
To delete saved criteria, select the entity and then the saved criteria name from the menu. Then click Delete
and OK in the confirmation window. Only the creator of the criteria and the Data Archive Administrator can
delete the saved criteria.
• Transaction View. The transaction view displays basic transaction details of the entity.
• Technical View. The technical view displays transaction details along with referential data details of the
entity.
The export file includes columns that are defined as display columns. The export file does not include the
legal hold, tagging, or effective retention policy columns. The export file does not contain the retention
expiration date column if you export selected records.
You can export the search results to one of the following file types:
• XML file
• PDF file
• Insert statements
• Delimited text file
Based on the file type that you download to, you can choose one of the following download options:
• Individual files. If multiple transactions are selected, each transaction has a separate file.
• Compressed file. All the selected transactions are downloaded in a compressed file.
You determine which records from the search results are included in the export file. Before you export, you
can select individual records, select all records on the current page, or select all records from all pages.
When you export the search results, the job exports the data and creates the .pdf file in the temporary
directory of the ILM application server. The job uses the following naming convention to create the file:
<table>_<number>.pdf
The job creates a hyperlink to the file that is stored in the temporary directory. Use the link to view or save
the .pdf file when you view the job status. By default, the files are stored in the temporary directory for one
day.
When you export to a delimited text file, specify the following parameters:
When you search the Data Vault by entity and select delimited text file as the export file type, the
separators include default values. The following table describes the default values for the separators:
Column Separator: , (comma)
Row Separator: \n (newline)
The values for the column and row separator parameters determine the type of delimited text file the
system exports to. If you use the default values for the column and row separator parameters, then the
system exports data to a .csv file. If you use values other than the default values, then the system
exports data to a .txt file.
The Put Values in Quotes parameter is for .csv export. Enable the parameter if the source data includes
new line characters. The parameter determines whether the system puts the source values in quotes
internally when the system creates the .csv file. The quotes ensure that source data with new line
characters is inserted into one column as compared to multiple columns.
For example, a record includes a Name column and an Address column.
The address column contains a new line character to separate the street and the apartment number. If
you enable the parameter, the exported file includes the address data in one column. If you disable the
parameter, the exported file splits the address data into two columns.
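For example, with the parameter enabled, a row with a multiline address might appear in the exported .csv file as follows. The values are illustrative only:
"Anita Perez","12 Main Street
Apt 4"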
Only use this parameter if you use the default values for the separators. If you enable this parameter and
use values other than the default values for the separators, then the system puts the values in quotes in
the exported data.
1. From the search results, select the rows that you want to export.
Tip: You can manually select the rows or use the column header checkbox to select all records on the
page.
2. Click Export Data.
The Export Transaction Data dialog box appears.
3. Select the export data file type.
Browse Data
Browse Data enables you to use the Data Discovery portal to search and examine application data from
specific tables in the Data Vault. The Data Discovery portal uses the search criteria that you specify and
generates a SQL query to access data from the Data Vault. After you search the Data Vault, you can export
the search results to a CSV or PDF file.
When you specify the search criteria, you choose which table columns to display in the search results. You
can display all records in the table or use a where clause to limit the search results. You can also specify
the sort order of the search results. Before you run the search, you can preview the SQL statement that the
Data Discovery portal uses to access the data.
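For example, a browse data search on a hypothetical CUSTOMERS table that displays two columns, limits the results with a where clause, and sorts the results might generate a SQL query similar to the following. The schema, table, and column names are placeholders:
SELECT CUSTOMER_ID, CUSTOMER_NAME FROM dbo.CUSTOMERS WHERE STATUS = 'GOLD' ORDER BY CUSTOMER_NAME ASC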
You can also use Browse Data to view external attachments for entities that do not have a stylesheet. You
can view external attachments after the Load External Attachments job loads the attachments to the Data
Vault. You cannot search the contents of the external attachments.
When you search for external attachments, enter the entity that the attachments are associated to. The
search results show all attachments from the AM_ATTACHMENTS table. The table includes all attachments
that you archived from the main directory and also from any subfolders in that main directory. The search
results show a hyperlink to open the attachment, the original attachment directory location, and the
attachment file name. To view the attachment, click the hyperlink in the corresponding ATTACHMENT_DATA
column. When you click the link, a dialog box appears that prompts you to either open or save the file.
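For example, to browse all archived attachments for an entity, you might run a query similar to the following against the AM_ATTACHMENTS table. The dbo schema is an assumption; use the schema that applies to your archive folder:
SELECT * FROM dbo.AM_ATTACHMENTS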
Archive folder that contains the entity of the archived data that you want to search.
The archive folder list of values shows archive folders that include at least one entity that you have
access to. You have access to an entity if your user is assigned to the same role that is assigned to the
entity.
You must select an archive folder to limit the number of tables shown in the list of values.
Entity
Entity that contains the table that you want to view archived data from.
The entity is optional. Use the entity to limit the number of tables shown in the list of values.
Schema
Schema that contains the table that you want to view. Required if the target connection for the archive
folder is enabled to maintain the source schema structure. You must enter the schema because the
archive folder may include multiple schemas with the same table name.
Table
Table that includes the archived data that you want to view.
For Browse Data, the table list of values shows all tables that you have access to within the application
version of the related archive folder.
No. of Rows
Export Data
You can export the search results to a .csv or a .pdf file. System-defined role assignments determine which
file type users can export data to. The exported file includes the columns that you selected as the display
columns for the browse data search. Note that the exported file does not include the legal hold and tagging
columns.
When you export the search results, you schedule the Browse Data Export job. The job exports the data and
creates a .pdf file in the temporary directory of the ILM application server. The job uses the following naming
convention to create the file:
<table>_<number>.pdf
The job creates a hyperlink to the file that is stored in the temporary directory. Use the hyperlink to view or
save the .pdf file when you view the job status.
1. After the search results appear, select CSV or PDF and click Export Data.
If you export data from a table that includes a large datatype object, a message appears. The message
indicates which columns contain the large object datatypes and asks if you want to continue to export
the data. If you continue, all of the columns except the columns with the large object datatypes are exported.
Note: User authorizations control whether you can choose the file type to export to. Depending on your
user role assignment, the system may automatically export to a .pdf file.
The Schedule Jobs screen appears.
2. Schedule the Browse Data Export job.
3. Access Jobs > Monitor Jobs.
The Monitor Jobs screen appears if you schedule the job to run immediately.
4. Once the job completes, expand the Job ID.
5. Expand the BROWSE_DATA_EXPORT_SERVICE job.
6. Click View Export Data.
The File Download dialog box appears.
7. Open or save the file.
Legal Hold
You can put records from related entities on legal hold to ensure that records cannot be deleted or purged
from the Data Vault. You can also put a database schema or an entire application on legal hold. A record,
schema, or application can include more than one legal hold.
Records from related entities, when put on legal hold, are still available in Data Vault search results. A legal
hold at the application or schema level does not depend on the relationships between tables.
You can apply a legal hold at the schema level to one or more schemas within an archive folder. A schema
might include an entity that contains tables that are also a part of a different schema. If you try to apply a
legal hold on a schema that contains tables that have a relationship to another schema, you receive a
warning message. The warning includes the name of the schema that contains the related tables.
You can continue to apply the legal hold on the selected schemas, or you can cancel to return to the selection
page and add the schema that contains the related tables. If you do not include the schema that contains the
related tables, orphan data is created in the related tables when you purge the tables that you placed on legal
hold.
A legal hold overrides any retention policy. You cannot purge records on legal hold. However, you can still
update the retention policy for records on legal hold. When you remove the legal hold, you can purge the
records that expired while they were flagged on legal hold. After you remove the legal hold, run the Purge
Expired Records job to delete the expired records from the Data Vault.
To apply a legal hold, you must have one of the following system-defined roles:
• Administrator
• Legal Hold User
When you access the menu, you see a results list of all legal hold groups. The results list includes the legal
hold group name and description. You can use the legal hold name or description to filter the list. From the
results list, you can select a legal hold group and apply the legal hold to records or an application as a whole
in the Data Vault. You can delete a legal hold group to remove the legal hold from records or an entire
application in the Data Vault.
Expand the legal hold group to view all records that are assigned to the legal hold group. The details section
includes the following information:
Archive Folder and Entities
Records that are on legal hold within the legal hold group are grouped by the archive folders and entities.
The archive folder is the target archive area that the entity was archived to. User privileges restrict the
entities shown.
Comments
Use the comments hyperlink to display the legal hold comments. The comments provide a long text
description to capture information related to the legal hold group, such as the reason why you put the
records on hold. You add comments when you apply a legal hold for the legal hold group.
View Icon
You can view the list of records that are on legal hold for each entity within the group. When you view the
records in an entity, the Data Discovery portal search results page appears. You can view each record
individually. Or, you can export the records to an XML file or a delimited text file. You must manually
select the records that you want to export. You can export up to 100 records at a time.
Print Icon
You can create a PDF report that lists all the records that are on legal hold for each entity within a legal
hold group. Each entity within the legal hold group has a print PDF icon. When you click the icon, a PDF
report opens in a new window. The PDF report displays the entity, the legal hold name, and all of the
columns defined in the search options for the entity. You can download or print the report.
To delete a legal hold group, you must have access to at least one entity that includes records that have the
legal hold group assignment. You have access to an entity if your user has the same Data Vault access role
that is assigned to the entity.
2. Click the delete icon from the corresponding legal hold group.
The Remove Legal Hold dialog box appears.
3. In the comments field, enter a description that you can use for audit purposes.
For example, enter why you are removing the legal hold. The comments appear in the job summary when
you monitor the Remove Legal Hold job.
4. Click Delete Legal Hold.
The Schedule Job screen appears.
5. Schedule when you want to run the Remove Legal Hold job.
When the job runs, the system removes the legal hold from all records that are associated to the legal
hold group and deletes the legal hold group.
Tags
You can tag a set of records in the Data Vault with a user-defined value and later retrieve the records based
on the tag name. You can also base a retention policy on a tag value. A single record in the Data Vault can be
part of more than one tag. A tag can be of date, numeric, or string datatype.
You can define a maximum of four date datatype tags, a maximum of four numeric datatype tags, and one
string datatype tag in the Data Vault. You can use a maximum of 4056 characters for the string datatype tag.
A single tag can have different tag values for different records. When you tag a record, you specify a value
based on the datatype of the tag. You can update the value of a tag and remove the tags.
Adding Tags
Use the Data Discovery portal to add tags to your records.
Updating Tags
Use the Data Discovery portal to update the tag value of a record.
Removing Tags
Use the Data Discovery portal to remove a tag from a record.
9. Select the appropriate schedule options and click Schedule.
The Monitor Jobs page appears.
After the Remove Tagging Records job completes, you can click the expand option to view log details.
3. Use the arrows to move the tag and any other column to the Display Columns list.
4. Optionally, specify the Where Clause and Order By to further filter and sort the search results.
5. Click Preview SQL to view the generated SQL query.
6. Click Search.
The records that match the search criteria appear. Tag columns are identified by a yellow icon in the
column heading.
A Click to View link appears if the data in a record has more than 25 characters or if the record has an
attached file.
The View button might become permanently disabled because of an enhanced security setting in the
web browser. To enable the button, close Data Archive, disable enhanced security configuration for the
web browser, and restart Data Archive.
For example, to disable enhanced security configuration for Microsoft Internet Explorer on Windows
Server 2008, perform the following steps:
1. Log in to the computer with a user account that is a member of the local Administrators group.
2. Click Start > Administrative Tools > Server Manager.
3. If the User Account Control dialog box appears, confirm that the action it displays is what you want,
and then click Continue.
4. Under Security Information, click Configure IE ESC.
5. Under Administrators, click Off.
6. Under Users, click Off.
7. Click OK.
8. Restart Microsoft Internet Explorer.
For more information about browser enhanced security configuration, see the browser documentation.
• All of the constraints are imported for the entity in the Enterprise Data Manager.
Tip: View the entity constraints in the Data Vault tab.
• The metadata source for the Data Discovery portal searches is set to the correct source. In the
conf.properties file, verify that the informia.dataDiscoveryMetadataQuery property is set to AMHOME.
Non-English characters do not display correctly from the Data Discovery portal or any other reporting tools.
Verify that the LANG environment variable is set to UTF-8 character encoding on the Data Vault Service.
For example, you can use en_US.UTF-8 if the encoding is installed on the operating system. To verify
what type of encoding the operating system has, run the following command:
locale -a
In addition, verify that all client tools, such as PuTTY and web browsers, use UTF-8 character encoding.
Technical view is not available for entities with binary or varbinary data types as primary, foreign, or
unique key constraints.
Data Visualization
This chapter includes the following topics:
Use data visualization to create, view, copy, run, and delete reports from data that is archived to the Data
Vault. You can select data from related or unrelated tables for a report. You can also create relationships
between tables and view a diagram of these relationships. You can then lay out, style, and format the report
with the design tools available on the data visualization interface.
When you access the Reports and Dashboards page, you can also view any pre-packaged reports that were
installed by an accelerator. Accelerator reports have the type "TEMPLATE_REPORT" in the Reports and
Dashboards window. You can perform only copy and grant operations on the TEMPLATE_REPORT type of
report. You must copy the accelerator reports to an archive folder that corresponds to the Data Vault
connection where the retired tables on which the reports are built are located. For example, when you copy
the reports from the Application Retirement for Healthcare accelerator, select an archive folder where the
healthcare-related tables are archived.
To create highly designed, pixel-perfect reports, install and use the Designer application on your desktop. The
Designer application contains an advanced design tool box. After you create a report in Designer, you can
publish it to the Data Archive server to view the report on the data visualization interface.
1. List of available reports. Click a row to select the report you plan to run. Click one or more checkboxes to
select reports you want to delete.
2. Filter box. Enter the entire or part of the value in the filter box above the appropriate column heading and
click the filter icon.
3. Actions drop-down menu. Contains options to run, create, edit, and copy reports. Also contains the
options to add permissions or revoke permissions for selected reports.
4. Filter icon. After you specify filter criteria in the fields above the column names, click the filter icon to
display the reports.
5. Clear filter icon. Click this icon to clear any filter settings.
Note: You can see the Data Visualization menu option if your product license includes data visualization.
1. Run a retirement job or an archive job so that data exists in the Data Vault.
2. Create a report by either entering an SQL query or selecting tables from the report wizard.
3. Optionally, create relationships between tables.
4. Design the report layout and style.
5. Run the report.
Report Creation
You can create reports with tables, cross tabs, and charts. Then add parameter filters, page controls, labels,
images, and videos to the report.
For example, you might create a tabular report that displays the customer ID, name, and last order date for all
retired customer records. You can add filters to allow the user to select customer-status levels. Or, you might
create a pie chart that shows the percentage of archived customer records that have the platinum, gold, or
silver status level.
You create a report by specifying the data you want in the report. You can specify the data for the report in
one of the following ways:
• Select the tables from the report-creation wizard. If required, create relationships between fields within
and between tables.
• Enter an SQL query with the schema and table details.
After you specify the data for a report, design, save, and run the report.
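For example, a minimal SQL query for the tabular customer report described earlier might look like the following. The schema, table, and column names are placeholders for the retired tables in your archive folder:
SELECT CUSTOMER_ID, CUSTOMER_NAME, LAST_ORDER_DATE FROM dbo.CUSTOMERS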
When a column name begins with a number, Data Archive adds the prefix INFA_ to the column name when
the column name appears in the report creation wizard.
For example, if the column name is 34_REFTABLE_IVARCHAR, then the column name appears as
INFA_34_REFTABLE_IVARCHAR in the report creation wizard.
When a column name contains special characters, Data Archive replaces the special characters with an
underscore when the column name appears in the report creation wizard.
For example, if the column name is Order$Rate, then the column name appears as Order_Rate in the report
creation wizard.
3. Select a target connection from the Archive Folder Name drop-down list.
The archive folder list displays archive folders with the same file access role that you have.
4. Select Table(s) and click Next.
The Add Tables page appears.
5. Select the tables containing the data that you want in the report. Optionally, filter and sort schema, table,
or data object name to find the tables. Click Next.
11. To remove a relationship, click the relationship arrow and click the Remove Join button.
12. Click OK.
The Add Tables page appears.
13. Click Add Constraints to view the updated relationship map. Click Close.
The Add Tables page appears.
14. Click Next.
The Create Report: Step 3 of 3 Step(s) window appears. This is the design phase of the report-creation
process.
3. Select a target connection from the Archive Folder Name drop-down list.
The archive folder list displays archive folders with the same file access role that you have.
4. Select SQL Query and click Next.
5. Enter a name for the SQL query in the Query Name field.
6. Enter a query and click Validate.
If you see a validation error, correct the SQL query and click Validate.
7. Click Next.
The Create Report: Step 3 of 3 Step(s) window appears. This is the design phase of the report-creation
process.
Parameter Filters
You can add parameter filters to a report to make it more interactive. Use a parameter filter to show records
with the same parameter value. To add parameter filters on a report, you must insert a prompt in the SQL
query you use to create the report.
Add a parameter prompt for each parameter you want to filter on. For example, you want to create a report
based on employee details that includes the manager and department ID. You want to give users the
flexibility to display a list of employees reporting to any given manager or department. To add filters for
manager and department IDs, you must use placeholder parameters in the SQL query for both the manager
and department ID variables.
• Integer
• Number
• String
• Decimal
If you do not specify a datatype, the parameter will be assigned the default datatype, string.
Note: Date, time, and datetime data types must be converted to string datatypes in the SQL query.
Convert data with a date, time, or datetime datatype to a string datatype in one of the following ways:
• Use the format @P_<Prompt>, where the <DataType> value is not specified. The parameter values are
treated as strings by default.
• Add the char() function to convert the column value to a string. The complete format is: char(<column
name>) = @P_<DataType>_<Prompt>
<Prompt>
Represents the label for the parameter filter. Do not use spaces or special characters in the <Prompt>
value.
Example
The SQL query to add parameter filters for manager and department IDs has the following format: select *
from employees where manager_id =@P_Integer_MgrId and Department_id =@P_Integer_DeptId
When you run a report, you can choose to filter data by manager ID and department ID. The following image
shows the filter lists for MgrId and DeptId:
You will also see the MgrId and DeptId filters in the Parameters section on the Report page.
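Similarly, to filter on a date column, convert the column to a string with the char() function. The following hypothetical query assumes an employees table with a hire_date column:
select * from employees where char(hire_date) = @P_String_HireDate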
LIKE Operators
For information about using the LIKE operator as a parameter filter, see the Informatica Data Vault SQL
Reference.
SELECT hp.party_name
,hc.cust_account_id
,hc.account_number customer_number
,amount_due_remaining amount_due_remaining
,aps.amount_due_original amount_due_original
,ra.trx_number
FROM apps.ra_customer_trx_all ra,
apps.ra_customer_trx_lines_all rl,
apps.ar_payment_schedules_all aps,
apps.ra_cust_trx_types_all rt,
apps.hz_cust_accounts hc,
apps.hz_parties hp,
apps.hz_cust_acct_sites_all hcasa_bill,
apps.hz_cust_site_uses_all hcsua_bill,
apps.hz_party_sites hps_bill,
apps.ra_cust_trx_line_gl_dist_all rct
WHERE ra.customer_trx_id = rl.customer_trx_id
AND ra.customer_trx_id = aps.customer_trx_id
AND ra.org_id = aps.org_id
AND rct.customer_trx_id = aps.customer_trx_id
AND rct.customer_trx_id = ra.customer_trx_id
Use the design wizard to choose the layout and style for your report. For more information on designing a
report, click the help button in the top right corner of the wizard to open the online help.
1. Select a predefined page template on the Page page of the wizard. To customize page size and
orientation, click Page Setup and enter specifications. Click Next.
The Layout page appears.
2. Select a layout for your report. You can customize the layout in the following ways:
• To split a cell, select the cell and click either the Horizontal Split or the Vertical Split button.
Report Permissions
Before you can run, delete, or copy a report, you must have the corresponding permission. If you have the
grant permission, you can also grant the ability to run, copy, or delete a report to other users or access roles.
The run, copy, delete, and grant permissions are separate permissions. Every report, whether created in Data
Visualization or imported from an accelerator, has permissions that can be granted or revoked for only that
report. Each permission can be granted to a specific user or to a Data Vault access role. Each user or access
role assigned to a report can have different permissions.
The run, delete, and copy permissions allow a user or access role to take those actions on an individual
report or multiple reports at one time. If your user or access role has the grant permission, you can also
assign the run, copy, and delete permissions to multiple users or access roles for a single report or multiple
reports.
You can add or delete permissions for one report at a time, or you can add or delete permissions for multiple
reports at one time.
By default, the Report Admin user has permission to run, copy, delete, and grant permissions for all reports. If
you do not have any permissions on a report, then the report is not visible to you in the Reports and
Dashboards window.
1. From the main menu, select Data Visualization > Reports and Dashboards.
The Reports and Dashboards window appears with a list of reports.
2. Select one or more reports to delete.
3. From the Actions drop-down list, select Delete Reports.
A confirmation message appears. Click Yes to delete the reports.
Copying Reports
If you have the copy permission, you can copy the reports in a folder to another folder.
To copy a report folder from one archive folder to another, the target archive folder must meet the following
requirements:
• The target archive folder must contain all the tables required for the report.
• The tables for the report must be in the same schema in the target archive folder.
1. From the main menu, select Data Visualization > Reports and Dashboards.
The Reports and Dashboards page appears with a list of reports within each folder.
2. Select a report folder that contains the reports that you want to copy.
3. Click Actions > Save As.
The Save As dialog box appears.
Before you copy reports based on a retired SAP application, perform the following tasks:
1. From the main menu, select Data Visualization > Reports and Dashboards.
The Reports and Dashboards window appears with a list of reports within each folder.
Name for the folder that you want to copy the reports to.
5. Click OK.
Data Archive copies the reports to the new folder.
6. From Data Archive, click Data Visualization > Reports and Dashboards.
7. Expand the folder that contains the copied SAP reports.
Troubleshooting
The following list describes some common issues you might encounter when you use Data Visualization.
You want to create a report but do not see the tables that you need in the selection list.
Contact your administrator and ensure that you have the same file access role as the data that you
require for the report.
You encountered a problem when creating or running a report and want to view the log file.
The name of the log file is applimation.log. To access the log file, go to Jobs > Job Log on the Data
Archive user interface.
When you click any button on the Data Visualization page, you see the error message, "HTTP Status 404."
You cannot access Data Visualization if you installed Data Visualization on one or more machines but
did not configure the machine to load it. Contact your Data Archive administrator.
You created a new user with the Report Designer role, but when you access Reports and Dashboards you receive an
HTTP error.
Data Archive updates new users according to the value of the user cache timeout parameter in the Data
Archive system profile. The default value of this parameter is five minutes. You can wait five minutes for
the user cache to refresh and then access the Reports and Dashboards page. If you do not want to wait
five minutes, or if a user assignment has not updated after five minutes, review the user cache timeout
parameter in the Data Archive system profile.
A user can run a report even if the user does not have the Data Archive access roles for the tables in the report.
Run the report as a user with a Report Designer role. When you run the report with the Report Designer
role, Data Archive activates the security levels.
Data in a report does not appear masked although the data appears masked in Data Discovery search results.
Run the report as a user with a Report Designer role. When you run the report with the Report Designer
role, Data Archive activates the security levels.
You cannot create a Data Visualization report for entity reference tables.
Data Visualization reports for entity reference tables are not supported.
You might not be able to copy a report folder from one archive folder to another because of one of the
following reasons:
• The target archive folder does not contain one or more tables required for the report.
• The tables required for the report are not in the same schema.
The following scenarios describe the outcome when you copy a report folder from one archive folder to
another:
Scenario 2
• Report tables in the source archive folder: UF2.MARA, UF2.MARC, UF2.BKPF
• Tables in the target archive folder: SAP.MARA, SAP.MARC
• Outcome: Error. Data Archive cannot copy the report folder because the target archive folder does not
contain the table UF2.BKPF.
Scenario 3
• Report tables in the source archive folder: UF2.MARA, UF2.MARC, UF2.BKPF
• Tables in the target archive folder: SAP.MARA, SAP.MARC, DBO.BKPF
• Outcome: Error. Data Archive cannot copy the report folder because the tables in the target archive folder
are spread across two different schemas, the SAP and the DBO schemas.
Designer Application
Data Visualization Designer is a stand-alone application that you can install on your desktop. Use the
Designer application to create pixel-perfect page and web reports and then publish these reports to the Data
Archive server. Reports on the Data Archive server appear on the data visualization interface and can be
viewed and executed by you or other users.
Reports created in the Designer application can be edited only in Designer. Reports created on the data
visualization interface can be edited on the interface or saved to a local machine and edited in Designer.
For information on how to use Designer, see the Data Archive Data Visualization Designer User Guide.
You can generate the reports in the Data Visualization area of Data Archive. Each report contains multiple
search parameters, so that you can search for specific data within the retired entities. For example, you want
to review payment activities related to a specific invoice for a certain supplier and supplier site. You can run
the Invoice History Details report and select a specific value for the supplier and supplier type.
When you install the accelerator, the installer imports metadata for the reports as an entity within the
appropriate application module. Each report entity contains an interim table and potentially multiple tables
and views. You can view the report tables, views, and their columns and constraints in the Enterprise Data
Manager. You can also use the report entity for Data Discovery.
Some reports also contain user-defined views. The user-defined views are required for some reports that
include tables that have existed in different schemas in different versions of Oracle E-Business Suite. Some
of the user-defined views are required because of the complexity of the queries that generate a single report.
Finally, some of the user-defined views are required because some of the queries that generate the reports
refer to package functions that Data Vault does not support.
You can run a script to create a user-defined view that is required for a report. Contact Informatica Global
Customer Service to acquire the scripts for the user-defined views in each report.
Prerequisites
Before you can run a Data Visualization report, you must first retire the Oracle E-Business Suite application.
To retire an Oracle E-Business Suite application, perform the following high-level tasks:
Report Parameters
The following table describes the report input parameters:
Tables
• HZ_CUSTOMER_PROFILES
• HZ_CUST_ACCOUNTS
• HZ_CUST_ACCT_SITES_ALL
• HZ_CUST_PROFILE_CLASSES
Views
• OE_LOOKUPS_115
• OE_PRICE_LISTS_115_VL
• OE_TRANSACTION_TYPES_VL
• ORG_FREIGHT_VL
• ORG_ORGANIZATION_DEFINITIONS
• AR_LOOKUPS
• RA_SALESREPS
User-Defined Views
• V_REPORT_CU_ADDRESS
• V_REPORT_CU_AD_CONTACTS
• V_REPORT_CU_AD_PHONE
• V_REPORT_CU_AD_BUSPURPOSE
• V_REPORT_CU_AD_CON_ROLE
• V_REPORT_CU_AD_CON_PHONE
• V_REPORT_CU_BANKACCOUNTS
• V_REPORT_CU_BUS_BANKACCOUNT
• V_REPORT_CU_BUS_PYMNT_MTHD
• V_REPORT_CU_CONTACTS
• V_REPORT_CU_CONTACT_PHONE
• V_REPORT_CU_CONTACT_ROLES
• V_REPORT_CU_PHONES
• V_REPORT_CU_PYMNT_MTHDS
• V_REPORT_CU_RELATIONS
• V_REPORT_GL_SETS_OF_BOOKS
Tables
• GL_CODE_COMBINATIONS
Views
• ORG_ORGANIZATION_DEFINITIONS
• FND_FLEX_VALUES_VL
• FND_ID_FLEX_SEGMENTS_VL
User-Defined Views
• V_REPORT_GL_SETS_OF_BOOKS
• V_REPORT_ADJMNT_REGISTER
Report Parameters
The following table describes the report input parameters:
Views
• ORG_ORGANIZATION_DEFINITIONS
• AR_LOOKUPS
User-Defined Views
• V_REPORT_GL_SETS_OF_BOOKS
• V_REPORT_AR_ACCTSETS
• V_REPORT_AR_FRTLINES
• V_REPORT_AR_LINES
• V_REPORT_AR_RELTRX
• V_REPORT_AR_REVACCTS
• V_REPORT_AR_SALESREPS
• V_REPORT_AR_TAXLINES
• V_REPORT_AR_TRNFLEX
• V_REPORT_TRANSACTIONS
• Supplier Details
• Supplier History Payment Details
• Invoice History Details
The report displays detailed information for each supplier, and optionally, supplier site, including the user
who created the supplier/site, creation date, pay group, payment terms, bank information, and other supplier
or site information. You can sort the report by suppliers in alphabetic order, by supplier number, by the user
who last updated the supplier record, or by the user who created the supplier record.
Tables
• AP_BANK_ACCOUNTS_ALL
• AP_BANK_ACCOUNT_USES_ALL
• AP_TERMS_TL
• FND_USER
• AP_TOLERANCE_TEMPLATES
Views
• ORG_ORGANIZATION_DEFINITIONS
• PO_LOOKUP_CODES
• AP_LOOKUP_CODES
• FND_LOOKUP_VALUES_VL
• FND_TERRITORIES_VL
User-Defined Views
• V_REPORT_PO_VENDORS
• V_REPORT_PO_VENDOR_SITES_ALL
• V_REPORT_GL_SETS_OF_BOOKS
• V_REPORT_PO_VENDOR_CONTACTS
You can submit this report by supplier or supplier type to review the payments that you made during a
specified time range. The report displays totals of the payments made to each supplier, each supplier site,
and all suppliers included in the report. If you choose to include invoice details, the report displays the invoice
number, date, invoice amount, and amount paid.
The Supplier Payment History Details report also displays the void payments for a supplier site. The report
does not include the amount of the void payment in the payment total for that supplier site. The report lists
supplier payments alphabetically by supplier and site. You can order the report by payment amount, payment
date, or payment number. The report displays payment amounts in the payment currency.
Report Parameters
The following table describes the report input parameters:
Tables
• AP_CHECKS_ALL
• AP_INVOICES_ALL
• AP_INVOICE_PAYMENTS_ALL
• FND_LOOKUP_VALUES
Views
• ORG_ORGANIZATION_DEFINITIONS
User-Defined Views
• V_REPORT_PO_VENDORS
• V_REPORT_PO_VENDOR_SITES_ALL
The report generates a detailed list of all payment activities that are related to a specific invoice, such as
gains, losses, and discounts. The report displays amounts in the payment currency.
Important: Payments must be accounted before the associated payment activities appear on the report.
Report Parameters
The following table describes the report input parameters:
• Supplier. Default value: <Any>. Required: No.
• Invoice Number To. Default value: none. Required: No.
• Invoice Date To. Default value: none. Required: No.
Report Views
The following views are included in the report:
Views
• ORG_ORGANIZATION_DEFINITIONS
User-Defined Views
• V_INVOICE_HIST_HDR
• V_INVOICE_HIST_DETAIL
• V_REPORT_PO_VENDORS
• V_REPORT_PO_VENDOR_SITES_ALL
• V_REPORT_GL_SETS_OF_BOOKS
Report Parameters
The following table describes the report input parameters:
• Name of Sender. Default value: none. Required: No.
• Title of Sender. Default value: none. Required: No.
• Phone of Sender. Default value: none. Required: No.
Tables
• PO_USAGES
• PO_LINE_LOCATIONS_ALL
• PO_LINES_ALL
• PO_HEADERS_ALL
• PO_DISTRIBUTIONS_ALL
• FND_LOOKUP_VALUES
• FND_DOCUMENT_CATEGORIES
• FND_DOCUMENTS_TL
• FND_DOCUMENTS_SHORT_TEXT
• FND_DOCUMENTS_LONG_TEXT
• FND_DOCUMENTS
• FND_ATTACHED_DOCUMENTS
• AP_INVOICES_ALL
• AP_INVOICE_DISTRIBUTIONS_ALL
User-Defined Views
• V_REPORT_PO_VENDORS
• V_REPORT_PO_VENDOR_SITES_ALL
Purchasing Reports
In the Purchasing module, you can run the following report:
The report displays the quantity that you ordered and received, so that you can monitor the status of your
purchase orders. You can also review the open purchase orders to determine how much you still have to
receive and how much your supplier has already billed to you.
Report Parameters
The following table describes the report input parameters:
To <Any> No
PO Number From - No
To - No
Status <Any> No
Tables
• FND_CURRENCIES
Views
• ORG_ORGANIZATION_DEFINITIONS
• PO_LOOKUP_CODES
• PER_PEOPLE_F
User-Defined Views
• V_PO_HDR_LINES
• V_PO_HDR_LKP_CODE
• V_REPORT_GL_SETS_OF_BOOKS
• V_REPORT_PO_VENDORS
• V_REPORT_PO_VENDOR_SITES_ALL
This report has three sections. The first section is for enabled detail accounts, followed by disabled
accounts, and then summary accounts. Each section is ordered by the balancing segment value. You can
specify a range of accounts to include in your report. You can also sort your accounts by another segment in
addition to your balancing segment.
Report Parameters
The following table describes the report input parameters:
Tables
• GL_CODE_COMBINATIONS
Views
• FND_FLEX_VALUES_VL
• FND_ID_FLEX_SEGMENTS_VL
• FND_ID_FLEX_STRUCTURES_VL
• GL_LOOKUPS
Use the report to review posted journal batches for a particular ledger, balancing segment value, currency,
and date range. The report sorts the information by journal batch within each journal entry category. In
addition, the report displays totals for each journal category and a grand total for each ledger and balancing
segment value combination. This report does not report on budget or encumbrance balances.
Tables
• GL_DAILY_CONVERSION_TYPES
• GL_JE_BATCHES
• GL_CODE_COMBINATIONS
• GL_JE_HEADERS
• GL_JE_LINES
• GL_PERIODS
• GL_PERIOD_STATUSES
Views
• FND_CURRENCIES_VL
• FND_FLEX_VALUES_VL
• FND_ID_FLEX_SEGMENTS_VL
• GL_JE_CATEGORIES_VL
• GL_JE_SOURCES_VL
User-Defined Views
• V_REPORT_GL_SETS_OF_BOOKS
Running a Report
You can run a report through the Data Visualization area of Data Archive. Each report has two versions. To
run the report, use the version with "Search Form" in the title, for example "Invoice History Details Report -
Search Form.cls."
To ensure that authorized users can run the report, review the user and role assignments for each report in
Data Archive.
You can generate the reports in the Data Visualization area of Data Archive. Each report contains multiple
search parameters, so that you can search for specific data within the retired entities. For example, you want
to review purchase orders with a particular order type and status code. You can run the Print Purchase Order
report and select a specific value for the order type and status code.
When you install the accelerator, the installer imports metadata for the reports as an entity within the
appropriate application module. Each report entity contains an interim table and multiple report-related
tables. You can view the report tables and the table columns and constraints in the Enterprise Data Manager.
You can also use the report entity for Data Discovery.
Prerequisites
Before you can run a Data Visualization report, you must first retire the JD Edwards Enterprise application.
The pre-packaged JD Edwards application version has two schemas: TESTDTA and TESTCTL. The
TESTCTL schema has one table called "F0005." The remaining tables are from TESTDTA.
a. Run the following update statements:
/* Identify your custom-defined product family version ID. */
SELECT * FROM AM_PRODUCT_FAMILY_VERSIONS;
/* Identify the schema IDs of the custom-defined product family version. */
SELECT * FROM AM_META_SCHEMAS WHERE PRODUCT_FAMILY_VERSION_ID = <pfv id>;
UPDATE AM_META_SCHEMAS SET META_SCHEMA_NAME = 'TESTCTL' WHERE META_SCHEMA_ID = <schema id of the F0005 table>;
UPDATE AM_META_SCHEMAS SET META_SCHEMA_NAME = 'TESTDTA' WHERE META_SCHEMA_ID = <schema id of the remaining tables>;
b. Create the metadata application modules to map to the JD Edwards retirement application modules.
Right-click on the application version and select New Application Module.
The Application Module Wizard window appears.
c. Enter the application module name to match the source name.
d. Repeat the naming process for all of the modules.
e. Right-click the product family version where the tables are imported and retired.
f. Select Copy Entities from Application Version.
The Copy Entities from Application Version window appears.
g. Select the application version from which you need to copy, typically JD Edwards Retirement 1.0.
h. Provide the prefix to add to the entity names during the copy. The Copy All Entities option is not
required.
i. Click OK.
j. After you submit the background job, you can view the job status from the Data Archive user
interface. After the job completes successfully, you can view the copied entities in your custom-
defined application version. You can use these entities for Data Discovery and retention.
k. Once the copy process is complete, revert the updates you made in step a.
In JD Edwards applications, dates are stored in Julian format, which is 5 or 6 digits. When you search on
columns with date values in Data Discovery, select the Julian-to-Gregorian date conversion in the
available search options.
To update a retention policy, use the following conversion from Julian to Gregorian as an expression:
(CASE
    WHEN length(char(dec(F43199.OLUPMJ))) = 5
        THEN date('1900-01-01') + int(left(char(int(F43199.OLUPMJ)),2)) years
             + int(right(char(int(F43199.OLUPMJ)),3)) - 1 days
    ELSE date('1900-01-01') + int(left(char(int(F43199.OLUPMJ)),3)) years
         + int(right(char(int(F43199.OLUPMJ)),3)) - 1 days
END)
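For example, a 6-digit Julian value such as 113001 corresponds to January 1, 1900 plus 113 years plus 0 days, that is, January 1, 2013. A 5-digit value such as 99032 corresponds to January 1, 1900 plus 99 years plus 31 days, that is, February 1, 1999.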
11. Before you run the reports, copy the report templates to the folder that corresponds to the Data Vault
connection. You can use these pre-packaged Data Visualization reports to access retired data in the
Data Vault. When you copy the reports, the system updates the report schemas based on the connection
details.
12. Optionally, validate the retired data.
13. After you retire and validate the application, use the Data Discovery Portal, Data Visualization reports, or
third-party query tools to access the retired data.
Report Parameters
The following table describes the report input parameters:
Supplier % Yes
Report Tables
The following tables are included in the report:
• F0005
• F0010
• F0101
• F0111
• F0411
• F0413
• F0414
Aging Credits 0 No
Order Type % No
Report Tables
The following tables are included in the report:
• F0005
• F0010
• F0101
• F0013
• F0115
• F0411
Report Parameters
The following table describes the report input parameters:
Document Type % No
Document Number - No
Report Tables
The following tables are included in the report:
• F0005
Report Parameters
The following table describes the report input parameters:
Document Type % No
Document Number - No
Batch Number - No
Report Tables
The following tables are included in the report:
• F0005
• F0010
• F0013
• F0111
• F0411
• F0901
• F0911
Report Parameters
The following table describes the report input parameters:
Order Number To - No
Order Type % No
Company % Yes
Supplier % No
Report Tables
The following tables are included in the report:
• F0005
• F0101
• F0010
• F0111
• F0116
• F4301
• F4311
Order Number To - No
Order Type % No
Company % Yes
Supplier % Yes
Report Tables
The following tables are included in the report:
• F0005
• F0101
• F0010
• F0006
• F4301
• F4316
• F4311
Report Parameters
The following table describes the report input parameters:
Order Number To - No
Order Type % No
Supplier % Yes
• F0005
• F0101
• F0010
• F0013
• F40205
• F4301
• F4311
• F4316
• F43199
Report Parameters
The following table describes the report input parameters:
Document Number - No
Account Number % No
Report Tables
The following tables are included in the report:
• F0005
• F0010
Report Parameters
The following table describes the report input parameters:
Account Number % No
Address Number % No
Company % Yes
Report Tables
The following tables are included in the report:
• F0006
• F0010
• F0013
• F0101
• F0111
• F0116
• F0901
• F0911
Report Parameters
The following table describes the report input parameters:
Batch Number - No
Batch Type % No
Document Number To - No
Report Tables
The following tables are included in the report:
• F0005
• F0901
• F0911
Report Parameters
The following table describes the report input parameters:
Period 1 Yes
Sub Ledger * No
Currency Code % No
Report Tables
The following tables are included in the report:
• F0012
• F0013
• F0902
• F0901
• F0008
Report Parameters
The following table describes the report input parameters:
Company % Yes
Report Tables
The following tables are included in the report:
• F0909
• F0005
• F0010
Age Credits 0 No
Company % Yes
Report Tables
The following tables are included in the report:
• F03B11
• F0010
• F0013
• F0101
Report Parameters
The following table describes the report input parameters:
Batch Type % No
Batch Number - No
Document Type % No
Document Number - No
Company % Yes
Report Tables
The following tables are included in the report:
• F0005
Report Parameters
The following table describes the report input parameters:
Print Currency - No
Report Tables
The following tables are included in the report:
• F0005
• F0010
• F0013
• F0101
• F0111
• F0116
• F03B11
Report Parameters
The following table describes the report input parameters:
Branch Number % No
Order Type % No
Company % Yes
Status Code % No
Report Tables
The following tables are included in the report:
• F0005
• F0006
• F0013
• F0101
• F4211
• F0010
Report Parameters
The following table describes the report input parameters:
Customer Number % No
Branch Number % No
Report Tables
The following tables are included in the report:
• F0006
• F0013
• F0101
• F4101
• F42199
• F0010
Report Parameters
The following table describes the report input parameters:
Hold Code % No
Report Tables
The following tables are included in the report:
• F0005
• F0101
• F4209
Inventory Reports
In the Inventory module, you can run the following report:
Item Number % No
Report Tables
The following tables are included in the report:
• F4101
• F0005
Report Parameters
The following table describes the report input parameters:
Address Number % No
Report Tables
The following tables are included in the report:
• F0005
• F0101
• F0111
• F0115
• F0116
Running a Report
You can run a report through the Data Visualization area of Data Archive. Each report has two versions. To
run the report, use the version with "Search Form" in the title, for example "Invoice Journal Report - Search
Form.cls."
To ensure that authorized users can run the report, review the user and role assignments for each report in
Data Archive.
You can generate the reports in the Data Visualization area of Data Archive. Each report contains multiple
search parameters, so that you can search for specific data within the retired entities. For example, you want
to review purchase orders with a particular order type and status code. You can run the Print Purchase Order
report and select a specific value for the order type and status code.
When you install the accelerator, the installer imports metadata for the reports as an entity within the
appropriate application module. Each report entity contains an interim table and multiple report-related
tables. You can view the report tables and the table columns and constraints in the Enterprise Data Manager.
You can also use the report entity for Data Discovery.
Prerequisites
Before you can run a Data Visualization report, you must first retire the Oracle PeopleSoft application.
• AP Payment History by Payment Method
• AP Posted Voucher Listing
Report Parameters
The following table describes the report input parameters:
Supplier ID <Any> No
Report Tables
The report contains the following tables:
• PS_BUS_UNIT_TBL_AP
• PS_BUS_UNIT_TBL_FS
• PS_INSTALLATION_FS
• PS_PYMNT_VCHR_XREF
• PS_PYMT_TRMS_HDR
• PS_SEC_BU_CLS
• PS_SEC_BU_OPR
• PS_SETID_TBL
• PS_SET_CNTRL_REC
• PS_VCHR_VNDR_INFO
• PS_VENDOR
• PS_VOUCHER
Report Tables
The following tables are included in the report:
• PSXLATDEFN
• PSXLATITEM
• PS_BANK_ACCT_CPTY
• PS_BANK_ACCT_DEFN
• PS_BANK_BRANCH_TBL
• PS_BANK_CD_TBL
• PS_CURRENCY_CD_TBL
• PS_PAYMENT_TBL
• PS_SETID_TBL
Report Parameters
The following table describes the report input parameters:
Report Tables
The following tables are included in the report:
• PS_BUS_UNIT_TBL_AP
• PS_BUS_UNIT_TBL_FS
• PS_VCHR_ACCTG_LINE
• PS_VCHR_VNDR_INFO
• PS_VENDOR
• PS_VOUCHER
Report Parameters
The following table describes the report input parameters:
Vendor ID <Any> No
Buyer <Any> No
Report Tables
The following tables are included in the report:
• PS_BUS_UNIT_TBL_FS
• PS_BUS_UNIT_TBL_PM
• PS_CNTRCT_CONTROL
• PS_OPR_DEF_TBL_FS
• PS_OPR_DEF_TBL_PM
• PS_PO_HDR
• PS_PO_LINE
• PS_PO_LINE_SHIP
• PS_SETID_TBL
• PS_VENDOR
• PSOPRDEFN
Report Parameters
The following table describes the report input parameters:
As Of Date (yyyy-MM-dd) - No
DateRange: - No
From (yyyy-MM-dd)
To (yyyy-MM-dd)
Report Tables
The following tables are included in the report:
• PS_BUS_UNIT_TBL_FS
• PS_DEPT_TBL
• PS_MASTER_ITEM_TBL
• PS_PO_HDR
• PS_PO_LINE
• PS_PO_LINE_SHIP
• PS_VENDOR
• PS_PO_LINE_DISTRIB
Report Parameters
The following table describes the report input parameters:
To Date (yyyy-MM-dd) - No
Report Tables
The report includes the following tables:
• PS_BUS_UNIT_TBL_FS
• PS_BUS_UNIT_TBL_PM
• PSOPRDEFN
• PS_PO_HDR
• PS_PO_LINE
• AR Deposit Summary
• AR Payment Details
• AR Payment Summary
Report Parameters
The following table describes the report input parameters:
UserID <Any> No
Report Tables
The following tables are included in the report:
• PSXLATITEM
• PS_BANK_ACCT_CPTY
• PS_BANK_ACCT_DEFN
Report Parameters
The following table describes the report input parameters:
User ID <Any> No
Deposit ID <Any> No
Report Tables
The following tables are included in the report:
• PS_DEPOSIT_CONTROL
• PS_PAYMENT
• PS_OPR_DEF_TBL_AR
• PS_OPR_DEF_TBL_FS
• PS_PENDING_ITEM
• PS_GROUP_CONTROL
Report Parameters
The following table describes the report parameters:
User ID <Any> No
Deposit ID <Any> No
Report Tables
The following tables are included in the report:
• PS_BUS_UNIT_TBL_AR
• PSXLATITEM
• PS_BUS_UNIT_TBL_FS
• PS_OPR_DEF_TBL_AR
• PS_OPR_DEF_TBL_FS
• PS_PAYMENT
• PS_DEPOSIT_CONTROL
• GL Trial Balance
• GL Journal Activity
• GL Journal Entry Detail
Report Parameters
The following table describes the report input parameters:
Report Tables
The following tables are included in the report:
• PS_BUS_UNIT_TBL_GL
• PS_BUS_UNIT_TBL_FS
• PS_CAL_DETP_TBL
• PS_CURRENCY_CD_TBL
• PS_GL_ACCOUNT_TBL
• PS_LEDGER
• PS_LED_DEFN_TBL
• PS_PRODUCT_TBL
Report Parameters
The following table describes the report input parameters:
For Date: - No
From Date (yyyy-MM-dd)
To Date (yyyy-MM-dd)
For Period: - No
From
To
Fiscal Year
Detail/Summary Detail No
Report By Account No
Report Tables
The following tables are included in the report:
• PS_BUS_UNIT_TBL_GL
• PS_BUS_UNIT_TBL_FS
• PSXLATITEM
• PS_JRNL_HEADER
• PS_JRNL_LN
• PS_LED_DEFN_TBL
• PS_LED_GRP_TBL
• PS_SOURCE_TBL
Report Parameters
The following table describes the report input parameters:
To Date (yyyy-MM-dd) - No
Journal ID ALL No
Source ALL No
Account <Any> No
Product <Any> No
Report Tables
The following tables are included in the report:
• PS_BUS_UNIT_TBL_GL
• PS_BUS_UNIT_TBL_FS
• PSXLATITEM
• PS_JRNL_HEADER
• PS_JRNL_LN
• PS_LED_DEFN_TBL
• PS_LED_GRP_TBL
• PS_SOURCE_TBL
• PS_GL_ACCOUNT_TBL
• PS_OPEN_ITEM_GL
• PS_PRODUCT_TBL
Running a Report
You can run a report through the Data Visualization area of Data Archive. Each report has two versions. To
run the report, use the version with "Search Form" in the title, for example "Invoice Journal Report - Search
Form.cls."
To ensure that authorized users can run the report, review the user and role assignments for each report in
Data Archive.
Smart Partitioning
This chapter includes the following topics:
As an application database grows, application performance declines. Smart partitioning uses native database
partitioning methods to create segments that increase application performance and help you manage
application data growth.
Segments are sets of data that you create with smart partitioning to optimize application performance. You
can query and manage segments independently, which increases application response time and simplifies
processes such as database compression. You can also restrict access to segments based on application,
database, or operating system users.
You create segments based on dimensions that you define according to your organization's business
practices and the applications that you want to manage. A dimension is an attribute that defines the criteria
to create segments.
• Time. You can create segments that contain data for a certain time period, such as a year or quarter.
• Business unit. You can create segments that contain the data from different business units in your
organization.
• Geographic location. You can create segments that contain data for employees based on the country they
live in.
You assign the data classification to a segmentation group in a segmentation policy. After you run the
segmentation policy to create segments, you can create access policies.
When you use smart partitioning, you complete the following tasks:
1. Create dimensions. Before you can create segments, you must create at least one dimension. A
dimension adds a business definition to the segments that you create. You create dimensions in the
Enterprise Data Manager.
2. Create a segmentation group. A segmentation group defines database and application relationships.
You create segmentation groups in the Enterprise Data Manager.
3. Create a data classification. A data classification defines the data contained in the segments you want
to create. Data classifications apply criteria such as dimensions and dimension slices.
4. Create and run a segmentation policy. A segmentation policy defines the data classification that you
apply to a segmentation group.
5. Create access policies. An access policy determines the segments that an individual user or program
can access.
After you create segments, you can optionally run management operations on them. You can compress
segments, make segments read-only, or move segments to another storage classification.
As time passes and additional transactions enter the segmented tables, you can create more segments.
Dimensions
A dimension determines the method by which the segmentation process creates segments, such as by time
or business unit. Dimensions add a business definition to data, so that you can manage it easily and access it
quickly. You create dimensions during smart partitioning implementation.
The Data Archive metadata includes a dimension called time. You can use the time dimension to classify the
data in a segment by date or time. You can also create custom dimensions based on business needs, such
as business unit, product line, or region.
Segmentation Groups
A segmentation group is a group of tables that defines database and application relationships. A
segmentation group represents a business function, such as order management, accounts payable, or human
resources.
The ILM Engine creates segments by dividing each table consistently across the segmentation group. This
process organizes transactions in the segmentation group by the dimensions you associated with the group
in the data classification. This method of partitioning maintains the referential integrity of each transaction.
If you purchased an application accelerator such as the Oracle E-Business Suite or PeopleSoft accelerator,
some segmentation groups are predefined for you. If you need to create your own segmentation groups, you
must select the related tables that you want to include in the group. After you select the segmentation group
tables you must mark the driving tables, enter constraints, associate dimensions with the segmentation
group, and define business rules.
Data Classifications
A data classification is a policy composed of segmentation criteria such as dimensions and dimension
slices. You create data classifications to apply consistent segmentation criteria to a segmentation group.
Before you can create segments, you must create a data classification and assign a segmentation group to
the data classification. Depending on how you want to create segments, you add one or more dimensions to the data classification.
You can assign multiple segmentation groups to the same data classification. If the segmentation groups are
related, such as order management and purchasing, you might want to apply the same segmentation criteria
to both groups. To ensure that the ILM Engine applies the same partitioning method to both segmentation
groups, assign the groups to the same data classification.
Segmentation Policies
A segmentation policy defines the data classification that you want to apply to a segmentation group. When
you run the segmentation policy, the ILM Engine creates a segment for each dimension slice you configured
in the data classification.
After you run a segmentation policy, choose a method of periodic segment creation for the policy. The
method of periodic segment creation determines how the ILM Engine creates segments for the segmentation
group in the future, as the default segment becomes too large.
Access Policies
An access policy establishes rules that limit user or program access to specified segments. Access policies
increase application performance because they limit the number of segments that a user or program can
query.
Access policies reduce the amount of data that the database retrieves, which improves both query
performance and application response time. When a user or program queries the database, the database
performs operations on only the segments specified in the user or program access policy.
Before you create access policies based on program or user, create a default access policy. The default
access policy applies to anything that can query the database. Access policies that you create for specific
programs or users override the default access policy.
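The practical effect of an access policy on queries is comparable to native partition pruning. The following SQL sketch is a hypothetical illustration only; the table name, column name, and date range are not taken from this guide. It shows the kind of query that, under a policy limited to one year of data, reads only the segments that hold that year rather than every segment in the group.
/* Hypothetical illustration: under an access policy that limits the user to
   one year of data, the database operates only on the segments that hold
   2013 transactions. */
SELECT journal_id, accounting_date, amount
FROM gl_transactions
WHERE accounting_date BETWEEN DATE '2013-01-01' AND DATE '2013-12-31';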
To increase application response time, you want to divide the transactions in a high-volume general ledger
application module so that application users can query specific segments. You decide to create segments
based on calendar year. In the Enterprise Data Manager you create a time dimension that you configure as a
date range. Then you create a general ledger segmentation group that contains related general ledger tables.
In the Data Archive user interface you create a data classification and configure the dimension slices to
create segments for all general ledger transactions by year, from 2010 through the current year, 2013.
When you run the segmentation policy, the ILM Engine divides each table across the general ledger
segmentation group based on the dimension slices you defined. The ILM Engine creates four segments, one
for each year of general ledger transactions from 2010-2012, plus a default segment. The default segment
contains transactions from the current year, 2013, plus transactions that do not meet business rule
requirements for the other segments. As users enter new transactions in the general ledger, the application
inserts the transactions in the default segment where they remain until you move them to another segment or
create a new default segment.
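The segments in this example map onto native database partitions. The following sketch shows, in simplified form, the kind of yearly range partitioning that such segments could correspond to on Oracle. The table, column, and partition names are hypothetical, and the actual DDL that the ILM Engine generates differs.
/* Hypothetical illustration of yearly range partitioning for a general ledger
   table: one partition per closed year (2010-2012) plus a default partition
   that receives current-year and unqualified transactions. */
CREATE TABLE gl_transactions (
    journal_id      NUMBER,
    accounting_date DATE,
    amount          NUMBER
)
PARTITION BY RANGE (accounting_date) (
    PARTITION seg_2010    VALUES LESS THAN (DATE '2011-01-01'),
    PARTITION seg_2011    VALUES LESS THAN (DATE '2012-01-01'),
    PARTITION seg_2012    VALUES LESS THAN (DATE '2013-01-01'),
    PARTITION seg_default VALUES LESS THAN (MAXVALUE)
);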
Before you run the segmentation policy, you must create a data classification and assign a segmentation
group to the data classification. Depending on how you want to create segments, you add one or more
dimensions to a data classification.
After you add dimensions to a data classification, create dimension slices for each dimension in the data
classification. Dimension slices specify how the data for each dimension is organized. When you run the
segmentation policy, the ILM Engine uses native database partitioning methods to create a segment for each
dimension slice in the data classification.
You can assign multiple segmentation groups to the same data classification. If the segmentation groups are
related, such as order management and purchasing, you might want to apply the same segmentation criteria
to both groups. To ensure that the ILM Engine applies the same partitioning method to both segmentation
groups, assign the groups to the same data classification.
Single-dimensional Data Classification
A single-dimensional data classification uses one dimension to divide the application data into segments.
Use a single-dimensional data classification when you want to create segments for application data in one
way, such as by year or quarter. Single-dimensional data classifications often use the time dimension.
For example, you manage three years of application data that you want to divide into segments by quarter.
You create a time dimension and select range as the dimension type and date as the datatype. When you run
the segmentation policy the ILM Engine creates 13 segments, one for each quarter and one default segment.
The data in the default segment includes all of the data that does not meet the requirements for a segment,
such as transactions that are still active. The default segment also includes new transactional data.
You can include more than one dimension in a data classification, so that Data Archive uses all the
dimensions to create the segments. When you apply multidimensional data classifications, the ILM Engine
creates segments for each combination of values.
For example, you want to create segments for a segmentation group by both a date range and sales office
region. The application has three years of data that you want to create segments for. You choose the time
dimension to create segments by year. You also create a custom dimension called region. You configure the
region dimension to create segments based on whether each sales office is in the Eastern or Western sales
territory.
When you run the segmentation policy, the ILM Engine creates seven segments, one for each year of data
from the Western region, one for each year of data from the Eastern region, and a default segment. Each non-
default segment contains the combination of one year of data from either the East or West sales offices. All
remaining data is placed in the default segment. The remaining data includes data that does not meet the
requirements for a segment, such as transactions that are still active, plus all new transactions.
Dimension Slices
Create dimension slices to specify how the data for each dimension is organized. When you run the
segmentation policy, the ILM Engine creates a segment for each dimension slice in the data classification.
Dimension slices define the value of the dimensions you include in a data classification. Each dimension
slice you create corresponds to a segment. You might create a dimension slice for each year or quarter of a
time dimension. When you create the dimension slice you enter the value of the slice.
Every dimension slice has a corresponding sequence number. When you create a dimension slice, the ILM
Engine populates the sequence number field. Sequence numbers indicate the order in which you created the
segments. The slice with the highest sequence number should correspond to the most recent data. If the
slice with the highest sequence number does not correspond to the segment with the most chronologically
recent data, change the sequence number of the slice.
First you create a single-dimensional data classification with the time dimension. Then you add three
dimension slices to the time dimension, one for each year of transactions. When you run the segmentation
policy, the ILM Engine creates a segment for each dimension slice in the classification.
The initial segmentation process creates a segment for each dimension slice in the data classification, plus a
default segment for new transactions and transactions that do not meet the business rule criteria of other
segments. Over time, the default segment becomes larger and database performance degrades. To avoid
performance degradation you must periodically create a new default segment, split the current default
segment, or create a new non-default segment to move data to. Before you redefine the default segment or
create a new one, you must create a dimension slice that specifies the values of the new or redefined
segment.
If you plan to use the time dimension to create segments by year or quarter, you can use a formula to simplify
the process of redefining or creating a new default segment. Before you run a segmentation policy for the
first time, select a formula in the data classification that you plan to apply to the segmentation group. The
formula creates and determines the value of the next logical dimension slice that Data Archive uses to
redefine the default segment or create a new one.
Formulas use a specific naming convention. If you use a formula in a data classification, you must name the
dimension slices you create according to the formula naming convention.
YYYY: Creates a dimension slice with a value of one year, such as January 01, 2013-December 31, 2013. If you use the YYYY formula, you must name the dimension slices for the dimension in the YYYY format, such as "2013."
QQYY: Creates a dimension slice with a value of one quarter, such as January 01, 2013-March 31, 2013. If you use the QQYY formula, you must name the dimension slices for the dimension in the QQYY format, such as "0113" for the first quarter of 2013.
YYYYQQ: Creates a dimension slice with a value of one quarter, such as January 01, 2013-March 31, 2013. If you use the YYYYQQ formula, you must name the dimension slices for the dimension in the YYYYQQ format, such as "201301" for the first quarter of 2013.
YYYY_QQ: Creates a dimension slice with a value of one quarter, such as January 01, 2013-March 31, 2013. If you use the YYYY_QQ formula, you must name the dimension slices for the dimension in the YYYY_QQ format, such as "2013_01" for the first quarter of 2013.
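For example, if a data classification uses the YYYY_QQ formula and the most recently created slice is named "2013_01," the next logical slice that the formula derives is "2013_02," covering April 01, 2013 through June 30, 2013.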
A segmentation policy defines the data classification that you want to apply to a segmentation group. When
you run the segmentation policy, the ILM Engine creates a segment for each dimension slice you configured
in the data classification.
After you run a segmentation policy, choose a method of periodic segment creation for the policy. The
method of periodic segment creation determines how the ILM Engine creates segments for the segmentation
group in the future, as the default segment becomes too large.
After you create segments, you can run management operations on an individual segment or a segment set.
You can make segments read-only, move segments to another storage classification, or compress segments.
Related Topics:
• “Creating a Segmentation Policy” on page 287
• “ Check Indexes for Segmentation Job” on page 28
Segmentation Policy Process
A segmentation policy defines the data classification that you apply to a segmentation group. When you
create a segmentation policy, you configure steps and properties to determine how the policy runs. Create
and run a segmentation policy to create segments.
Before you create a segmentation policy, you must create a data classification. A data classification is a
policy composed of segmentation criteria such as dimensions and dimension slices. Data classifications
apply consistent segmentation criteria to a segmentation group. You can reuse data classifications in
multiple segmentation policies.
When you create a segmentation policy, you complete the following steps:
1. Select the data classification that you want to associate with a segmentation group. A segmentation
group is a group of tables based on business function, such as order management or accounts
receivable. You create segmentation groups in the Enterprise Data Manager. If you want to use interim
table processing to pre-process the business and dimension rules associated with a segmentation
group, you must configure interim table processing in the Enterprise Data Manager.
2. Select the segmentation group that you want to create segments for. A segmentation group is a group of
tables based on business function, such as order management or accounts receivable. You create
segmentation groups in the Enterprise Data Manager. You can run the same segmentation policy on
multiple segmentation groups.
3. After you select the segmentation group, review and configure the tablespace and data file properties for
each segment. The tablespace and data file properties determine where and how Data Archive stores the
segments. When you run the segmentation policy, the ILM Engine creates a tablespace for each
segment.
4. If necessary, you can save a draft of a segmentation policy and exit the wizard at any point in the
creation process. If you save a draft of a segmentation policy after you add a segmentation group to the
policy, the policy will appear on the Segmentation Policy tab with a status of Generated. As long as the
segmentation policy is in the Generated status, you can edit the policy or add a table to the
segmentation group in the Enterprise Data Manager.
5. Configure the segmentation policy steps, then schedule the policy to run. The ILM Engine runs a series
of steps and sub-steps to create segments for the first time. You can skip any step or pause the job run
after any step. You can also configure Data Archive to notify you after each step in the process is
complete. When you configure the policy steps you have the option to configure advanced segmentation
parameters. After you run the policy, it appears as Implemented on the Segmentation Policy tab. Once a
policy is implemented, you cannot edit it. You can run multiple segmentation policies in parallel.
6. If necessary, you can run the unpartition tables standalone job to reverse the segmentation process and
return the tables to their original state. For more information about standalone jobs, see the chapter
Scheduling Jobs.
Related Topics:
• “Creating a Segmentation Policy” on page 287
• “ Check Indexes for Segmentation Job” on page 28
Property Description
Segment Group: Name of the segmentation group you want to create segments for. This field is read-only.
Table Space Name: Name of the tablespace where the segmentation group resides. This field is read-only.
Storage Class: Name of the storage classification where the segmentation group resides.
Force Logging: Forces the generation of redo records for all operations. Select yes or no. Default is yes.
Segment Name: Name of the segment you want to create a tablespace for. This field is read-only.
Compression Type: Type of compression applied to the global tablespace. Choose either No Compression or For OLTP. For information about Oracle OLTP and other types of compression, refer to the database documentation.
Allocation Type: Type of allocation. Select auto allocate or uniform. Default is auto allocate.
File Size: Maximum size of the data file. Default is 100 MB.
Auto Extensible: Increases the size of the data file if it requires more space in the database.
Max File Size: Maximum size to which the data file can extend if you select auto extensible.
Increment By: Size to increase the data file by if you select auto extensible.
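For reference, these properties correspond to standard Oracle tablespace attributes. The following sketch is a hypothetical example of a tablespace defined with comparable settings; the tablespace name, data file path, and sizes are illustrative only, and this is not the statement that Data Archive itself issues.
/* Hypothetical example only: a force-logged tablespace with OLTP compression,
   auto-allocated extents, and a 100 MB data file that auto-extends in 100 MB
   increments up to a 4 GB maximum. */
CREATE TABLESPACE ilm_seg_2013
    DATAFILE '/u01/oradata/ilm_seg_2013_01.dbf' SIZE 100M
    AUTOEXTEND ON NEXT 100M MAXSIZE 4G
    FORCE LOGGING
    DEFAULT COMPRESS FOR OLTP
    EXTENT MANAGEMENT LOCAL AUTOALLOCATE;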
If you choose to skip a step in the policy, Data Archive might also select related steps in the policy to skip.
For example, if you skip the Create Audit Snapshot - Before step, Data Archive also selects Skip for the
Create Audit Snapshot - After and Compare Audit Snapshots - Summary steps. If Data Archive selects Skip
for a job step, you can clear the check box to run the step if required.
Step Description
Create audit snapshot - before: Creates a snapshot of the state of the objects, such as indexes and triggers, that belong to the segmentation group before the ILM Engine moves any data.
Create optimization indexes: Creates optimization indexes to improve smart partitioning performance. Indexes are used during the segment data step to generate the selection statements that create segments. This step is applicable only if you have defined optimization indexes in the Enterprise Data Manager.
Preprocess business rules: Uses interim tables to preprocess the business rules you applied to the segmentation group. This step is visible only if you selected to use interim table processing in the Enterprise Data Manager.
Get table row count per segment: Estimates the size of the segments that will be created when you run a segmentation policy. Default is skip.
Allocate datafiles to mount points: Estimates the size of tables in the segmentation group based on row count and determines how many datafiles to allocate to a particular tablespace. Default is skip.
Generate explain plan: Creates a plan for how the smart partitioning process will run the SQL statements that create segments. This plan can help determine how efficiently the statements will run. Default is skip.
Segment data: Required. Moves the data into the new tablespaces.
Drop optimization indexes: Drops any optimization indexes you created in the Enterprise Data Manager.
Create audit snapshot - after: Creates a snapshot of the state of the objects, such as indexes and triggers, that belong to the segmentation group after the ILM Engine moves the data into the segments.
Compare audit snapshots - summary: Compares and summarizes the before and after audit snapshots.
Compile invalid objects: Compiles any invalid objects, such as stored procedures and views, that remain after the ILM Engine creates segments.
Enable access policy: Enables the access policies you applied to the segmentation group.
Collect segmentation group statistics: Collects database statistics, such as row count and size, for the segmentation group.
Clean up after segmentation: Cleans up temporary objects, such as tables and indexes, that are no longer valid after segment creation.
Field Description
Add Part Column to Non Unique Index: Adds the partition column to a non-unique index.
Index Creation Option: Select global, local, or global for unique indexes. The smart partitioning process recreates indexes according to the segmentation policy you configure. If the source table index was previously partitioned, the segmentation process might drop the original partitions. Default is local.
Bitmap Index Option: Select local bitmap or global regular. Default is local bitmap.
Parallel Degree: The degree of parallelism the ILM Engine uses to create and alter tables and indexes. Default is 4.
Parallel Processes: Number of parallel processes the ILM Engine uses in the compile invalid objects step. Default is 4.
Estimate Percentage: The sampling percentage. Valid range is 0-100. Default is 10.
Apps Stats Option: Select no histogram or all global indexed columns auto histogram.
Drop Managed Renamed Tables: Drops the original tables after you create segments.
Count History Segments: Flag that indicates whether you want to merge archived history data with production data when you run the segmentation policy. If set to Yes, the ILM Engine includes the history data during the Create Audit Snapshots steps and the Compare Audit Snapshots step of the segmentation process.
Drop Existing Interim Tables: If you have enabled interim table processing to pre-process business rules, the partitioning process can drop any existing interim tables for the segmentation group and recreate them. Select Yes to drop existing interim tables and recreate them. If you select No and interim tables exist for the segmentation group, the partitioning process will fail.
Related Topics:
• “Smart Partitioning Segmentation Policies Overview” on page 282
• “Segmentation Policy Process” on page 283
• “ Check Indexes for Segmentation Job” on page 28
When you create a business rule analysis report, you first specify the period of time that you want the report
to cover. You can choose from a yearly, quarterly, or monthly report.
You can generate three types of reports. The transaction summary report is a graphical representation of all
segmentation group records that will be processed using all business rules applied to the driving table. The
report details how many records will be active and how many records will be inactive for the segmentation
group. Active records will remain in the default segment while inactive records will move to the appropriate
segment in accordance with business rules.
The transaction summary by period report details how many records will be active versus inactive by the
report period you specified. For example, if the segmentation group contains data from 2005 to 2013, the
report displays a bar graph for each year that details how many records will be active and how many records
will be inactive for the given year after all business rules are applied.
If you want to view how many records will be active versus inactive for a particular business rule instead of
all business rules, create a business rule details by period report. The business rule details by period report
displays a graphical representation of active versus inactive records by the time period and specific business
rule that you select.
If you copy a segmentation group, enable business rules pre-processing, and try to generate a business rule
analysis report for the copied segmentation group, you will receive an error. You can view business rule
analysis reports for one segmentation group at a time, for example either the original group or the copied
group.
The first time you run a segmentation policy, the ILM Engine creates a segment for each dimension slice in
the data classification, plus a default segment for new transactions and transactions that do not meet the
business rule criteria of other segments.
Over a period of time, the default segment becomes larger and database performance degrades. To avoid
performance degradation you must periodically create a new default segment, split the current default
segment, or create a new non-default segment to move data to. You might also create a new non-default
segment if you need to segment data that was previously missed, or to add a new organization to an
organization dimension in the segmentation policy.
Before you can create a segment for an implemented segmentation group, you must create a dimension slice
to specify the value of the segment. If you applied a data classification that includes a formula to the
segmentation group, the ILM Engine creates and populates the dimension slice when you create the
segment. If you did not apply a data classification that uses a formula, you must manually create the
dimension slice.
You can change the method of periodic segment creation. Before you can change the method of periodic
segment creation, you must delete any segments in the group that are generated and not implemented. When
you delete a generated segment, you must delete the most-recently generated segment first.
If the transactions in the default segment are primarily from the last time period, such as a year or quarter,
and the default does not have a large number of active transactions, you can create a new default segment.
When you create a new default segment, the ILM Engine creates the new default segment and renames the
old default segment based on the time period of the transactions that it contains, for example 2013_Q2.
The ILM Engine moves active transactions and transactions that do not meet the business criteria of the
renamed 2013_Q2 segment from the renamed segment to the new default segment. This method requires
repeated row-level data movement of active transactions. Do not use this method if the default segment has
many open transactions.
Note: You can generate one new default segment at a time. Before you can generate another default
segment, you must either delete or implement any existing generated default segments.
Before you run the job to create the new segment, you can configure the job steps. The following table lists
each of the job steps to create a new default segment:
Step Description
New segment internal task: Required. Internal task that prepares the job to run.
Preprocess business rules: Uses interim tables to preprocess the business rules you applied to the segmentation group. This step is visible only if you selected to use interim table processing in the Enterprise Data Manager.
Create new segments: Required. Creates the table spaces, grants required schema access, and creates the segments.
Incremental move policy segments to default: Estimates the size of tables in the segmentation group based on row count and determines how many datafiles to allocate to a particular tablespace. Default is skip.
Collect segmentation group statistics: Collects database statistics, such as row count and size, for the segmentation group.
Drop interim tables: If you have enabled interim table processing to pre-process business rules, this step drops the interim tables.
Clean up after segmentation: Cleans up temporary objects, such as tables and indexes, that are no longer valid after segment creation.
If you have a large amount of data in the default segment that spans more than one time period, use the split
default segment method. The split default segment method creates multiple new segments from the default
segment and moves the transactions from the default segment into the appropriate segment for each time
period.
For example, the default segment contains data from three quarters, including the current quarter. The split
default method creates three new segments, one for each of the two quarters that have ended, plus a new
default segment. The ILM Engine moves closed transactions to the new segment for the quarter in which
they belong. Then the ILM Engine moves new and active transactions to the new default segment.
The split default method of segment creation works with both single and multidimensional data
classifications. This method is ideal when the application environment has complex business rules with long
transaction life cycles.
Note: If the policy contains multiple segments in a Generated state and you want to delete one of the
segments, you must delete the most recently generated segment first.
Before you run the job to split the default segment, you can configure the job steps. The following table lists
each of the job steps to split the default segment:
Step Description
New segment internal task: Required. Internal task that prepares the job to run.
Create audit snapshot - before: Creates a snapshot of the state of the objects, such as indexes and triggers, that belong to the segmentation group before the ILM Engine moves any data.
Create optimization indexes: Creates optimization indexes to improve smart partitioning performance. Indexes are used during the segment data step to generate the selection statements that create segments. This step is applicable only if you have defined optimization indexes in the Enterprise Data Manager.
Disable access policy: Disables the access policy on the default segment.
Create new segments: Required. Creates the table spaces, grants required schema access, and creates the segments.
Drop optimization indexes: Drops any optimization indexes you created in the Enterprise Data Manager.
Create audit snapshot - after: Creates a snapshot of the state of the objects, such as indexes and triggers, that belong to the segmentation group after the ILM Engine moves the data into the segments.
Compare audit snapshots - summary: Compares and summarizes the before and after audit snapshots.
Compile invalid objects: Compiles any invalid objects, such as stored procedures and views, that remain after the ILM Engine creates segments.
Enable access policy: Enables the access policies you applied to the segmentation group.
Collect segmentation group statistics: Collects database statistics, such as row count and size, for the segmentation group.
Clean up after segmentation: Cleans up temporary objects, such as tables and indexes, that are no longer valid after segment creation.
Create a new non-default segment when the data classification associated with the segmentation group is
multi-dimensional and you want to keep the default segment small by moving data incrementally.
You can create a new non-default segment at any time. When you move data into a new non-default segment,
the process does not affect the default segment and does not require that the database is offline.
You might also create a new non-default segment to move data out of the default segment that was
previously missed.
Note: You can generate one new non-default segment at a time. Before you can generate another non-default
segment, you must either delete or implement any existing generated non-default segments.
Before you run the job to create the new segment, you can configure the job steps. The following table lists
each of the job steps to create a new non-default segment:
Step Description
New segment internal task: Required. Internal task that prepares the job to run.
Create new segments: Required. Creates the table spaces, grants required schema access, and creates the segments.
Incremental move policy segments to default: Moves data from the default segment to the new non-default segment.
Collect segmentation group statistics: Collects database statistics, such as row count and size, for the segmentation group.
Clean up after segmentation: Cleans up temporary objects, such as tables and indexes, that are no longer valid after segment creation.
Before you create a segment for an implemented segmentation policy, create a dimension slice that specifies
the value of the segment. If the data classification you applied to the segmentation group uses a formula, the
ILM Engine will create and populate the dimension slice.
After you run a segmentation policy, you can compress a segment, make it read-only, or move it to another
storage classification. You can also create segment sets to perform an action on multiple segments at one
time.
Segment Merging
You can merge multiple segments into one segment. To merge segments, create a segment set and then run
the merge partitions into single partition standalone job.
You might want to merge segments to combine multiple history segments into one larger segment. For
example, you can merge four different quarter segments into one segment that spans the entire year.
Before you can merge segments, create a segment set that contains the segments that you want to merge.
For example, you might want to merge the segments "2015_01," "2015_02," "2015_03," and "2015_04" into one
history segment that spans data for the year 2015. First, create a segment set that contains the four quarter
segments. The name of the segment set must be unique and descriptive of the segment set.
After you create the segment set, run the merge partitions into single partition standalone job. When you run
the merge partitions job, you select parameters such as the source connection name and the name of the
segment set that you created. You can provide a tablespace for the merged segment, but if you do not, the
job creates the segment in the existing tablespace of the most recent segment. If you provide a different
tablespace, you must create the tablespace before you run the job.
After the job completes successfully, the older segments are merged into the most recent segment. In the
Manage Segmentation window, the older segments have the status "Merged." The most recent segment is
renamed to be a combination of the older segments. For instance, the most recent segment might be named
"2015_01_2015_02_2015_03_2015_04." This segment has the status "Implemented."
If you change your mind about merging a segment set, you can replace the merged segment with the original
segments. To revert the merge process, run the replace merged partitions with original partitions standalone
job. For more on the replace merged partitions job, see the Scheduling Jobs chapter.
You can continue to create other segment sets and then run the merge job on those segment sets, provided
that the segments exist in the same segmentation group. If you want to run the merge job multiple times for
a segmentation group, mark the "FinalMergeJob" parameter as "N" until you run the final merge job for the
segmentation group. On the final merge job for the segmentation group, set the "FinalMergeJob" parameter
to Y.
You can repeat the merge process for each segmentation group in a data classification. You must complete
all of the merge jobs for each segmentation group in a data classification before you run the "clean up after
merged partitions" job. The clean up job drops the empty partitions and tablespaces, and merges the
segment metadata. After you run the clean up job, the older individual segments are removed from the
Manage Segmentation window and only the merged segment remains.
1. Create a segment set that contains the segments that you want to merge together. For more
information, see the topic "Segment Sets" in this chapter.
2. Schedule the merge partitions to single partition standalone job. For more information about the
standalone job, see the Scheduling Jobs chapter.
The merge partitions into single partition job merges the data from the older partitions into the most
recent partition.
3. When the standalone job completes successfully, click Workbench > Manage Segmentation.
4. In the lower pane of the Manage Segmentation window, verify that the segment set has merged
correctly. The older segments in the set will have the status "Merged." The job renames the most recent
segment to a combination of the merged segment names. This segment has the status "Implemented."
5. If for any reason you need to reverse the merge, run the replace merged partitions with original partitions
standalone job before you continue.
6. Repeat steps 1-4 for each segment set that you want to merge.
7. When you finish merging segments, run the clean up after merge partitions standalone job. For more
information, see the Scheduling Jobs chapter.
The clean up after merge partitions job drops the empty partitions and merges the partition
metadata. After you run the job, only the merged partition appears in the Manage Segmentation window.
Segment Compression
You can compress the tablespaces and indexes for a specific segment. Compress a segment if you want to
reduce the amount of disk space that the segment requires. When you compress a segment, you can
optionally configure advanced compression properties.
The ILM Engine uses native database technology to compress a segment. If you want to compress multiple
segments at once, create a segment set and run the compress segments standalone job.
If the segmentation group that contains the segments you want to compress has a bitmap index and any of
the segments in the group are read-only, you must either change the read-only segments to read-write mode or
drop the bitmap index before you compress any segments. This is because Oracle requires that you mark
bitmap indexes as unusable before you compress segments for the first time. When compression is
complete, the segmentation process cannot rebuild the index for segments that are read-only.
Property Description
Segmentation Group Name: The name of the segmentation group that the segment belongs to. This field is read-only.
Segment Name: The name of the segment that you want to compress. You enter the segment name when you create the dimension slice. This field is read-only.
Tablespace Name: The name of the tablespace where the segment resides. This field is read-only.
Storage Class Name: The name of the storage classification that the segment resides in. This field is read-only.
Remove Old Tablespaces: Removes old tablespaces. This option is selected by default.
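Because segment compression relies on native database features, the operation on each segment is conceptually similar to moving a partition with a compression clause and rebuilding its index partitions. The following Oracle sketch is purely illustrative; the table, index, and partition names are hypothetical, and the product generates its own statements.
/* Hypothetical illustration: compress one segment's partition and rebuild
   its local index partition. */
ALTER TABLE gl_transactions MOVE PARTITION seg_2012 COMPRESS FOR OLTP;
ALTER INDEX gl_transactions_n1 REBUILD PARTITION seg_2012;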
Compressing a Segment
When you configure a segment for compression, you can schedule it to run or you can run it immediately.
Read-only Segments
You can make the tablespaces for a specific segment read-only. Make segment tablespaces read-only to
prevent transactional data in a segment from being modified.
Properties Description
Segmentation Group Name: The name of the segmentation group that the segment belongs to.
Segment Name: The name of the segment that you want to make read-only. You enter the segment name when you create the dimension slice.
Tablespace Name: The name of the tablespace where the segment resides.
Storage Class Name: The name of the storage classification that the segment resides in.
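Making a segment read-only corresponds to marking its tablespaces read-only in the database. The following Oracle sketch is a hypothetical illustration; the tablespace name is not taken from this guide.
/* Hypothetical illustration: prevent further modification of a segment's data. */
ALTER TABLESPACE ilm_seg_2012 READ ONLY;
/* To allow modifications again: */
ALTER TABLESPACE ilm_seg_2012 READ WRITE;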
You might want to move a segment to another storage classification if the segment will not be accessed
frequently and can reside on a slower disk. A Data Archive administrator can create storage classifications.
Property - Description
Segmentation Group Name - The name of the segmentation group that the segment belongs to. This field is read-only.
Segment Name - The name of the segment that you want to move. You enter the segment name when you create the dimension slice. This field is read-only.
Tablespace Name - The name of the tablespace where the segment resides. This field is read-only.
Compress Status - Indicates whether the segment is compressed. This field is read-only.
Current Storage Class Name - The name of the storage classification where the segment currently resides. This field is read-only.
Read Only - Indicates whether the segment is read-only. This field is read-only.
Storage Class Name - The storage classification that you want to move the segment to. Choose from the drop-down menu of available storage classifications.
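Conceptually, moving a segment to another storage classification relocates the segment data to a tablespace on the target storage tier. The following is a hedged Oracle sketch with hypothetical object names; the actual mechanism is handled by the ILM Engine and the storage classifications that the Data Archive administrator defines.
-- Relocate the segment partition to a tablespace on slower, lower-cost storage.
ALTER TABLE gl_balances MOVE PARTITION seg_fy2012 TABLESPACE tier2_data_ts;
-- Rebuild the local index partition in the target tablespace.
ALTER INDEX gl_balances_ix REBUILD PARTITION seg_fy2012 TABLESPACE tier2_data_ts;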
If you want to compress multiple segments at once, make multiple segments read-only, or move multiple
segments to a different storage classification, you must create a segment set. After you create a segment
set, you can run a standalone job on the entire segment set. When you run a job on a segment set, the ILM
Engine runs the job on only the segments in the set.
You can add a segment to multiple segment sets. If you want to run another job on a segment set, you can
edit a segment set to include new segments or remove segments.
Access policies reduce the amount of data that the database retrieves, which improves both query
performance and application response time. When a user or program queries the database, the database
performs operations on only the segments specified in the user or program access policy.
When you create an access policy, you apply it to one dimension of a segmentation group. For example, you
want to limit an application user's access on a general ledger segmentation group to one year of data. You
create the user access policy and apply it to the time dimension of the segmentation group. Create an access
policy for each dimension associated with the segmentation group.
Before you create access policies based on program or user, create a default access policy. The default
access policy applies to anything that can query the database. Access policies that you create for specific
programs or users override the default access policy.
Access policies are dynamic and can be changed depending on business needs. Users can optionally
manage their own access policies.
Access Policy Example
Most application users and programs in your organization rarely need access to more than one year of
application data. You configure the default access policy to limit access on all segmentation groups to one
year of data. A program that accesses the general ledger segmentation group to run month-end processes,
however, needs to access only quarterly data. You create an access policy specifically for this program. You
also create an application user access policy for a business analyst who needs access to three years of data
on all segmentation groups.
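The effect of a date-offset policy is comparable to restricting queries to a date range, so that the database reads only the segments that cover that range. The following query is a hypothetical sketch of the restriction that a one-year default policy implies on a general ledger table; the table and column names are illustrative only, and Data Archive enforces the restriction through segment access rather than by rewriting the query.
SELECT gl.*
FROM gl_je_lines gl
-- Only rows, and therefore segments, from the last 12 months are visible.
WHERE gl.effective_date >= ADD_MONTHS(TRUNC(SYSDATE), -12);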
Before you create a default access policy, decide how much data a typical program or user in your
organization needs to access. The default policy should be broad enough to allow most users in your
organization to access the data they need without frequently adjusting their access policies.
You can configure the default access policy to apply to all segmentation groups. If you want to change the
default access policy for a specific segmentation group, you can override the default policy with a policy that
applies to a specific segmentation group.
For example, you can configure the default policy to allow two years of data access for all segmentation
groups. You can then create a default policy for the accounts receivable segmentation group that allows
access to one year of data. When you create the second policy, Data Archive gives it a higher sequence
number so that it overrides the first. Each time you create an access policy, Data Archive assigns a higher
sequence number. A Data Archive administrator can change the sequence number of any policy except the
default policy. The default policy sequence must be number one.
When a user or program queries the database, Data Archive starts with the default access policy and
successively checks each access policy with a higher sequence number. Data Archive applies the policy with
the highest sequence number unless you configure an access policy to override other policies or exempt a
user or program from all access policies.
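The sequence-based resolution can be pictured as selecting the matching policy with the highest sequence number, unless an override or exempt policy applies. The following query is purely illustrative and uses a hypothetical table, not the actual ILM repository schema.
-- Pick the applicable policy for a user querying the general ledger group.
SELECT policy_name, scope_type, offset_value, offset_unit
FROM access_policy                      -- hypothetical illustration table
WHERE assigned_to IN ('GL_ANALYST', 'DEFAULT')
  AND segmentation_group IN ('GENERAL LEDGER', 'ALL')
ORDER BY sequence_number DESC
FETCH FIRST 1 ROW ONLY;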
Before you create an Oracle E-Business Suite access policy in Data Archive, create a user profile with the
Application Developer responsibility in Oracle E-Business Suite. The internal name for the Oracle E-Business Suite user
profile must match the name you want to give the access policy. When you create the access policy in Data
Archive, set the user profile value to match the value that you want Oracle E-Business Suite to return to Data
Archive. For example, set the user profile value to two if you want to allow access to two years of data.
Example
Within Data Archive, you set a default data access policy of two years. You then configure the inventory
manager user profile in Oracle E-Business Suite to allow access to three years of data. When you configure
the access policy in Data Archive, Data Archive connects to Oracle E-Business Suite to retrieve the value of
the Oracle E-Business Suite user profile.
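Data Archive reads the number of years to allow from the Oracle E-Business Suite user profile value. As a hedged illustration, a profile value can be inspected in Oracle E-Business Suite with the standard FND_PROFILE package; the profile internal name below is hypothetical.
-- Returns the profile value for the current EBS session context,
-- for example 3 for the inventory manager profile in this scenario.
SELECT fnd_profile.value('XX_ILM_ACCESS_YEARS') AS access_years
FROM dual;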
Before you create a PeopleSoft access policy in Data Archive, create a user profile within PeopleSoft. The
internal name for the user profile must match the name you give the access policy in Data Archive.
Data Archive identifies the user who accesses the database based on the session value set by PeopleSoft. If
an access policy exists within Data Archive for the PeopleSoft user profile name, the ILM Engine queries only
the segments that the user has access to.
You might create an access policy for a database user when the database user is used to connect to an
interface, such as an application interface. For example, a specific reporting group might use a database user
to connect to the application database.
If your environment has business objects that connect to a general application schema but you
cannot identify the name of the program or database user, you can limit access with an operating system
user access policy. The database stores the operating system user in the session variables table. Data
Archive compares the operating system user in the session variables table to the operating system user
assigned to the access policy within Data Archive. If the values match, Data Archive applies the policy that
you configured for the operating system user.
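On Oracle, the operating system user and other session attributes are exposed through the SYS_CONTEXT function; Data Archive compares this kind of session value against the operating system user assigned to the access policy. A minimal sketch:
-- Session attributes that identify the connecting client.
SELECT SYS_CONTEXT('USERENV', 'OS_USER')      AS os_user,
       SYS_CONTEXT('USERENV', 'SESSION_USER') AS database_user
FROM dual;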
Create an access policy for a program based on the function of the program, such as running reports. For
example, you might create a program access policy for a transaction processor that needs to access current
data. You could limit access for the transaction processor to one quarter of data. You might also create an
access policy for a receivables aging reporting tool that needs to view historical data. You configure the
policy to allow the receivables aging reporting tool access to three years of historical data.
Field - Description
Policy Type - The type of access policy that you want to create. Select from the drop-down menu of available policy types.
Assigned To - The user or program that you want to assign the policy to.
Dimension - The dimension that you want to apply the access policy to. Select from the drop-down menu of available dimensions.
Segmentation Group - The segmentation group that you want to apply the access policy to. Available segmentation groups are listed in the drop-down menu. Select the All button to apply the policy to every segmentation group.
Scope Type - The scope type of the policy. Select date offset, list, or exempt.
- Date offset. Choose date offset when you apply an access policy to a time dimension to restrict access by date.
- List. Choose list when you apply an access policy to a dimension other than time.
- Exempt. Choose exempt if you want to bypass all access policies, including the default policy, to see all data for a segmentation group.
Offset Value - The amount of data that you want the policy holder to be able to access if you chose a scope type of date offset. For example, if you want the user to be able to access two years of data, enter two in the Value field and select Year from the Unit drop-down menu. You can select a unit of year, quarter, or month.
Override - Option to override all access policies. If you want the access policy to override access policies with higher sequence numbers, select Override.
Appendix A
If you have custom datatypes on your source database, you must manually map each custom datatype to a
Data Vault datatype before you run the Data Vault Loader job. For information on how to map custom
datatypes, see the Data Archive Administrator Guide.
Data Vault
The Data Vault Loader job converts native datatypes to Data Vault datatypes based on the source database.
Oracle Datatypes
The following table describes how the Data Vault Loader job converts Oracle native datatypes to Data Vault
datatypes:
Oracle Datatype Data Vault Datatype
DATE TIMESTAMP
TIMESTAMP TIMESTAMP
BFILE BLOB
BLOB BLOB
CLOB CLOB
LONG BLOB
NUMBER DECIMAL
DECIMAL DECIMAL
DEC DECIMAL
INT INTEGER
INTEGER INTEGER
SMALLINT SMALLINT
ROWID VARCHAR(18)
UROWID VARCHAR
NUMBER If the length is more than 120, change the datatype to DOUBLE in the metadata XML file.
DATE TIMESTAMP
TIME VARCHAR
TIMESTAMP TIMESTAMP
BLOB BLOB
SMALLINT SMALLINT
INTEGER INTEGER
BIGINT DECIMAL(19)
DECIMAL DECIMAL
CHAR(8000) CHAR(8000)
TEXT CLOB
VARCHAR VARCHAR
VARCHAR(8000) VARCHAR(8000)
VARCHAR(MAX) CLOB
DATE TIMESTAMP
DATETIME2(7) VARCHAR
DATETIME TIMESTAMP
DATETIMEOFFSET VARCHAR(100)
SMALLDATETIME TIMESTAMP
TIME VARCHAR(100)
BIGINT DECIMAL(19)
BIT VARCHAR(10)
DECIMAL DECIMAL
INT INTEGER
MONEY DECIMAL(19,4)
NUMERIC DECIMAL(19)
SMALLINT SMALLINT
SMALLMONEY INTEGER
TINYINT INTEGER
BINARY BLOB
BINARY(8000) BLOB
IMAGE BLOB
VARBINARY BLOB
VARBINARY(8000) BLOB
VARBINARY(MAX) BLOB
Salesforce Datatypes
The following table describes how the Data Vault Loader job converts Salesforce datatypes to Data Vault
datatypes:
ID VARCHAR
TEXT VARCHAR
PERCENT DECIMAL
EMAIL VARCHAR
LONGTEXTAREA CLOB
MULTISELECTPICKLIST VARCHAR
DATETIME TIMESTAMP
PICKLIST VARCHAR
HTML CLOB
TEXTAREA VARCHAR
ENCRYPTEDTEXT VARCHAR
AUTONUMBER VARCHAR
REFERENCE VARCHAR
DATE DATE
PHONE VARCHAR
URL VARCHAR
CHECKBOX VARCHAR
CURRENCY DECIMAL
COMBOBOX VARCHAR
NUMBER DECIMAL
INT DECIMAL
Teradata Datatypes
The following table describes how the Data Vault Loader job converts Teradata native datatypes to Data
Vault datatypes:
BYTE BLOB
BYTE(64000) BLOB
VARBYTE(64000) BLOB
BLOB BLOB
CHARACTER(32000) CLOB
VARCHAR(32000) CLOB
LONGVARGRAPHIC CLOB
CLOB CLOB
DATE TIMESTAMP
TIME VARCHAR
TIMESTAMP VARCHAR
SMALLINT SMALLINT
INTEGER INTEGER
BIGINT DECIMAL(19)
BYTEINT INTEGER
DECIMAL DECIMAL
PERIOD(DATE) UDCLOB
PERIOD(TIMESTAMP) UDCLOB
Convert the datatypes if you receive one of the following errors in the log file:
Unsupported Datatype Error
If the Data Vault Loader job processes an unsupported datatype and is not able to identify a generic
datatype, you receive an error message in the log file. The log file indicates the unsupported datatype,
the table and column that the datatype is included in, and the location of the metadata XML file that requires
the change.
Conversion Error
If the Data Vault Loader job loads columns that have large numeric data or long strings of multibyte
characters, it generates conversion warnings in the log file. If there is a problem converting data, the log
file indicates the reason why the job failed.
1. Use a text editor to open the sample script installed with Data Archive.
2. Modify the script for the column changes you need to make.
3. Run the script.
4. Resume the Data Vault Loader job.
If the conversion error involves a change in the size of a column, use the following sample script to change
the size of columns in the Data Vault:
-- Create a domain with the new, larger column size.
CREATE DOMAIN "dbo"."D_TESTCHANGESIZE_NAME_TMP" VARCHAR(105);
-- Add a temporary column that uses the new domain.
ALTER TABLE "dbo"."TESTCHANGESIZE" ADD COLUMN "NAME_TMP" "dbo"."D_TESTCHANGESIZE_NAME_TMP";
-- Drop the original column, then rename the temporary column to the original name.
ALTER TABLE "dbo"."TESTCHANGESIZE" DROP COLUMN "NAME";
ALTER TABLE "dbo"."TESTCHANGESIZE" RENAME COLUMN "NAME_TMP" "NAME";
-- Drop the domain that the original column used.
DROP DOMAIN "dbo"."D_TESTCHANGESIZE_NAME";
Before you create a retirement project, check for the presence of special characters in table, column, and
schema names.
Table/Column Names
The following special characters are supported for table and column names in Data Vault: _ $ ~ ! @ ^ & - + = : < > ? | { }
Schema/Database Names
The following special characters are supported for schema and database names in Data Vault: _ $
Appendix C
The retirement job logs in to the SAP system and uses the ABAP import command to read the SAP encoded
cluster data. The job transforms the encoded data into XML format during the extraction process. The XML
structure depends on the cluster ID in the transparent table.
The job reads data from the PCL1, PCL2, PCL3, PCL4, and PCL5 transparent HR tables. The tables store
sensitive human resource data. The job transforms the data by cluster ID. The transformation is not
supported for all cluster IDs. The job also transforms data for all cluster IDs in the STXL text table.
PCL1 Cluster IDs
The retirement job converts encoded data into XML format depending on the HR cluster ID.
The following table lists the PCL1 cluster IDs that the retirement job transforms into XML:
B1 - PDC data
PC - Personnel calendar
The following table lists the PCL2 cluster IDs that the retirement job transforms into XML:
DP - Garnishments (DE)
DQ - Garnishment directory
IF - Payroll interface
PS - Schema
PT - Schema
Q0 - Statements (international)
ZV - SPF data
The following table lists the PCL3 cluster IDs that the retirement job transforms into XML:
AP - Applicant actions
The following table lists the PCL4 cluster IDs that the retirement job transforms into XML:
The retirement job transforms data into XML for the PCL5 AL cluster ID.
Glossary
access policy
A policy that determines the segments that an individual user can access. Access policies can restrict
access to segments based on the application user, database user, or OS user.
administrator
The administrator is the Data Archive super user. The administrator includes privileges for tasks such as
creating and managing users, setting up security groups, defining the system profile, configuring repositories,
scheduling jobs, creating archive projects, and restoring database archives.
application
A list of brand names that identify a range of ERP applications, such as Oracle and PeopleSoft.
application module
A list of supported Application Modules for a particular Application Version.
application version
A list of versions for a particular application.
archive
Informatica refers to a backed-up database as an Archive. From the process perspective, it also means
archiving data from an ERP/CRM instance to an online database using business rules.
archive (Database)
A database archive is defined as moving data from an ERP/CRM instance to another database or flat file.
archive (File)
File archive is defined as moving data from an ERP/CRM instance to one or more BCP files.
archive action
Data Archive enables data to be copied from Data Source to Data Target (Archive Only), deleted from Data
Source after the backup operation (Archive and Purge), or deleted from the Data Source as specified in the
Archive Project (Purge Only).
archive definition
An Archive Definition defines what data is archived, where it is archived from, and where it is archived to.
archive engine
A set of software components that work together to archive data.
archive project
An archive project defines what data is to be archived, where it is archived from, and where it is archived to.
business rule
A business rule is a set of criteria that determines whether a transaction is eligible to be archived or moved
from the default segment.
business table
Any Table that (with other Tables and Interim Tables) contributes to defining an Entity.
custom object
An Object (Application / Entity) defined by an Enterprise Data Manager Developer.
dashboard
A component that generates information about the rate of data growth in a given source database. The
information provided by this component is used to plan the archive strategy.
Data Archive
A product of the Information Lifecycle Management Suite, which provides flexible archive / purge
functionality that quickly reduces the overall size of production databases. Archiving for performance,
compliance, and retirement are the three main use cases for Data Archive.
data classification
A policy that defines the data set in segments based on criteria that you define, such as dimensions.
data destination
The database containing the archived data.
data source
The database containing the data to be archived.
Data Subset
A product of the Information Lifecycle Management Software, which enables organizations to create smaller,
targeted databases for project teams that maintain application integrity while taking up a fraction of the
space.
data target
The database containing the archived data. This might alternatively be referred to as History Database.
de-referencing
The process of specifying inter-Column Constraints for JOIN queries when more than two Tables are
involved. Enterprise Data Manager dictates that two Tables are specified first and then a Column in the
Second Table is De-Referenced to specify a Third Table (and so on) for building the final JOIN query.
dimension
An attribute that defines the criteria, such as time, to create segments when you perform smart partitioning.
dimension rule
A rule that defines the parameters, such as date range, to create segments associated with the dimension.
dimension slice
A subset of dimensional data based on criteria that you configure for the dimension.
entity
An entity is a hierarchy of ERP/CRM tables whose data collectively comprise a set of business transactions.
Each table in the hierarchy is connected to one or more other tables via primary key/foreign key relationships.
Entity table relationships are defined in a set of metadata tables.
ERP application
ERP applications are software suites used to create business transaction documents such as purchase
orders and sales orders.
expiration date
The date at which a record is eligible for deletion. The expiration date equals the retention period plus a date
that is determined by the retention policy rules. The Purge Expired Records job uses this date to determine
which records to delete from the Data Vault. A record expires on its expiration date at 12:00:00 a.m. The time
is local to the machine that hosts the Data Vault Service. A record with an indefinite retention period does not
have an expiration date.
expression-based retention
A rule that you apply to a retention policy to base the retention period for records in an entity on a date value
that an expression returns. The expiration date for each record in the entity equals the retention period plus
the date value from the expression. For example, records in the CUSTOMER table expire five years after the
last order date, which is stored as an integer.
formula
A procedure that determines the value of a new time dimension slice based on the end value of the previous
dimension slice.
general retention
A rule that you apply to a retention policy to base the retention period for records in an entity on the archive
job date. The expiration date for each record in the entity equals the retention period plus the archive job
date. For example, records in an entity expire five years after the archive job date.
history database
Database used to store data that is marked for archive in a database archive project.
home database
The database containing metadata and other tables used to persist application data. This is contained in the
Schema, usually known as “AMHOME”.
interim
Temporary tables generated for an Entity for data archive.
job
A process scheduled for running at a particular point in time. Jobs in Data Archive include Archive Projects or
Standalone Jobs.
metadata
Metadata is data about data. It not only contains details about the structure of database tables and objects,
but also information on how data is extracted, transformed, and loaded from source to target. It can also
contain information about the origin of the data.
Metadata contains Business Rules that determine whether a given transaction is archivable.
production database
Database which stores Transaction data generated by the ERP Application.
restore (Cycle)
A restore operation based on a pre-created Archive Project.
restore (Transaction)
A restore operation that is carried out by precisely specifying an Entity, Interim Tables and related WHERE
clauses for relevant data extractions.
retention management
The process of storing records in the Data Vault and deleting records from the Data Vault. The retention
management process allows you to create retention policies for records. You can create retention policies for
records as part of a data archive project or a retirement archive project.
retention policy
The set of rules that determine the retention period for records in an entity. For example, an organization
must retain insurance policy records for five years after the most recent message date in the policy, claims,
or client message table. You apply a retention policy to an entity. You can apply a retention policy rule to a
subset of records in the entity.
security group
A Security Group is used to limit a User (during an Archive Project) to archive only data that conforms to
certain scope restrictions for an Entity.
segment
A set of data that you create through smart partitioning to optimize application performance. After you run a
segmentation policy to create segments, you can perform operations such as limiting the data access or
compressing the data.
segmentation group
A group of tables based on a business function, such as order management or human resources, for which
you want to perform smart partitioning. A segmentation group defines the database and application
relationships for the tables.
segmentation policy
A policy that defines the data classification that you apply to a segmentation group. When you create a
policy, you also choose the method of periodic segment creation and schedule the policy to run.
segment set
A set of segments that you group together so you can perform a single action on all the segments, such as
compression.
smart partitioning
A process that divides application data into segments based on rules and dimensions that you configure.
staging schema
A staging schema is created at the Data Source for validations during data backup.
standard object
An Object (Application / Entity) which is pre-defined by the ERP Application and is a candidate for Import into
Enterprise Data Manager.
system-defined role
Default role that includes a predefined set of privileges. You assign roles to users. Any user with the role has
all of the privileges that are included in the role.
system profile
The System Profile comprises technical information about certain parameters, usually related to the Home
Schema or Archive Users.
transaction data
Transaction data contains the information within the business documents created using the master data, such
as purchase orders and sales orders. Transaction data can change often and is not constant.
Transaction data is created using ERP applications.
Transaction data is located in relational database table hierarchies. These hierarchies enforce top-down data
dependencies, in which a parent table has one or more child tables whose data is linked together using
primary key/foreign key relationships.
translation column
A column that is a candidate for a JOIN query on two or more Tables. This is applicable to Independent
Archive, as Referential Data is backed up exclusively for XML-based Archives and not Database Archives,
which contain only Transaction Data.
Note that such a scenario occurs when a Data Archive job execution extracts data from two Tables that are
indirectly related to each other through Constraints on an intermediary Table.
user
Data Archive is accessible through an authentication mechanism only to individuals who have an account
created by the Administrator. The level of access is governed by the assigned system-defined roles.
user profile
Basic information about a user, such as name, email address, password, and role assignments.