User Guide on
Restoring Data Dumps
&
Importing to
Analytical Tools
Centre for Data Management and Analytics
Preface
Data analytics is a multistep process, comprising collecting data, restoring it and connecting it
to software for further analysis, applying tools and techniques, generating insights, etc.
Contemporary software employed for data analytics is easy to learn and use, many of
the features being intuitive and GUI based. However, the initial steps of preparing data and
connecting a data analytics software to dataset(s)/database(s), found in various formats, can appear
intimidating and difficult to a beginner.
CDMA has been steering the use of data analytics for audit in IAAD. Field offices often
approach us with issues concerning the restoration of backup files created in an audited entity's
environment and the connection of data analytics software with data files of different formats. Hence,
we felt the need for a user guide that explains these processes in a step-by-step manner.
The present user guide is an endeavour to address such issues. The content of the guide has
been finalised based on the types of databases that one comes across in audit. At present, KNIME,
an open-source software, and Tableau, a proprietary software, apart from Excel and other
options, are being used for data analytics in the Department. Accordingly, the relevant topics
have been included in the user guide.
The user guide is practice-oriented and intends to provide hands-on experience in restoring
data from various formats and connecting the same with different analytical tools. Personnel
with a basic grounding in data analytics tools and techniques will be able to follow
and retrace the steps, elucidated with the help of screenshots.
As the field of data analytics is dynamic and new tools and techniques keep coming up, we
welcome any feedback or suggestions to make this guide more relevant and useful.
Table of Contents
Preface
Chapter 1: Restoration of Backup/Dumps from Different Databases
1.1 Pre-Requisite for Importing or Restoring a Dump
1.2 Restoration from Microsoft SQL Server Database
1.3 Restoration from PostgreSQL Server Database
1.4 Restoration from IBM DB2 Server Database
1.5 Restoration from Oracle Database
1.6 Restoration from MySQL Database
Chapter 2: Tableau Connectivity with Different Databases
2.1 Oracle Connectivity with Tableau
2.2 Microsoft SQL Server Connectivity with Tableau
2.3 MySQL Connectivity with Tableau
2.4 IBM DB2 Connectivity with Tableau
2.5 PostgreSQL Connectivity with Tableau
2.6 MS-Excel Connectivity with Tableau
2.7 MS-Access Connectivity with Tableau
2.8 JSON File Connectivity with Tableau
2.9 Text File Connectivity with Tableau
2.10 PDF File Connectivity with Tableau
Chapter 3: CaseWare IDEA Connectivity with Tableau
3.1 Installation of IDEA Driver for Connectivity with Tableau
3.2 CaseWare IDEA Connectivity with Tableau
Chapter 4: KNIME Connectivity with Different Databases
4.1 Uploading JAR Files in KNIME
4.2 Microsoft SQL Server Connectivity with KNIME
4.3 MySQL Connectivity with KNIME
4.4 Oracle Connectivity with KNIME
4.5 IBM DB2 Connectivity with KNIME
4.6 PostgreSQL Connectivity with KNIME
4.7 MS Excel Connectivity with KNIME
4.8 Text File Connectivity with KNIME
4.9 JSON Connectivity with KNIME
4.10 MS-Access Connectivity with KNIME
Chapter 1: Restoration of Backup/Dumps from Different Databases
A data dump is a large amount of data moved from one computer system, file, or device
to another. A database dump is a bulk output of data that can help users either back up or
duplicate a database. For example, a database can perform a data dump to another computer or
server on a network, where it can then be utilised by other software applications or analysed
by a person.
Currently, a large number of auditee organisations are computerised, and it is a challenge for
auditors to conduct audit in an IT-based environment. Over a period, the size of the databases
of each organisation has increased, coupled with the use of various Relational Database
Management Systems (RDBMS). As a result, getting the relevant data tables from an RDBMS
platform becomes difficult, partly due to inadequate information about the data structure, data
definitions, etc. related to the databases of audited entities. Hence, for practical reasons, or
sometimes owing to the scope of the audit, databases from servers are often backed up into dump
files and provided to Audit. Data dumps are to be restored using the relevant tools,
and then the required table(s) and fields are identified for analysis. A schematic diagram of the
said process is depicted above.
The methodology used to restore a dump may vary depending on the RDBMS used and
its version. For example, a database dump created using a newer version of an RDBMS may not
be restorable using an older version, and vice versa.
The present chapter is prepared to share the experience of CDMA in handling data
dump/backup files and restoring the same.
1.1 Pre-Requisite for Importing or Restoring a Dump
[Screenshots illustrating this section are not reproduced here.]
1.2 Restoration from Microsoft SQL Server Database
[The step-by-step screenshots for this section are not reproduced here.]
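Since the screenshots cannot be reproduced, a minimal command-line sketch of an equivalent restoration is given below, assuming a backup file D:\Dumps\AuditDB.bak and illustrative logical file names (these should first be verified with RESTORE FILELISTONLY):
Syntax: C:\> sqlcmd -S <Server Name> -E -Q "RESTORE FILELISTONLY FROM DISK = 'D:\Dumps\AuditDB.bak'" <Enter>
C:\> sqlcmd -S <Server Name> -E -Q "RESTORE DATABASE AuditDB FROM DISK = 'D:\Dumps\AuditDB.bak' WITH MOVE 'AuditDB' TO 'D:\Data\AuditDB.mdf', MOVE 'AuditDB_log' TO 'D:\Data\AuditDB_log.ldf'" <Enter>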
1.3 Restoration from PostgreSQL Server Database
[The step-by-step screenshots for this section are not reproduced here; the final screenshot shows the restoration starting.]
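A minimal command-line sketch of an equivalent restoration, assuming a custom-format dump file D:\Dumps\auditdb.backup (all names are illustrative); a plain SQL dump would instead be run through psql:
Syntax: C:\> createdb -U postgres auditdb <Enter>
C:\> pg_restore -U postgres -d auditdb D:\Dumps\auditdb.backup <Enter>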
1.4 Restoration from IBM DB2 Server Database
[The step-by-step screenshots for this section are not reproduced here; the screenshots show the process starting.]
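A minimal command-line sketch of an equivalent restoration from the DB2 command window, assuming a backup image of database AUDITDB kept in D:\Dumps (database name, path and timestamp are illustrative):
Syntax: C:\> db2 RESTORE DATABASE AUDITDB FROM D:\Dumps TAKEN AT 20190101120000 <Enter>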
1.5 Restoration from Oracle Database
Prior to the launch of Oracle Database 11g Release 2 (11.2), the original Export (exp) and
Import (imp) utilities were used. In that version, a new feature, Data Pump, was introduced
for faster loading/unloading of data using the new export (expdp) and import (impdp) utilities.
A brief overview of Data Pump is given below.
Oracle Data Pump provides a server-side infrastructure for fast data and metadata movement
between Oracle databases. Data Pump automatically manages multiple, parallel streams of
unload and load for maximum throughput. The degree of parallelism can be adjusted as per
requirement.
Data Pump Import
Data Pump Import (hereinafter referred to as Import for ease of reading)
is a utility for loading an export dump file set into a target system. The dump file set is made
up of one or more disk files that contain table data, database object metadata, and control
information. The files are written in a proprietary, binary format. During an import operation,
the Data Pump Import utility uses these files to locate each database object in the dump file set.
Import can also be used to load a target database directly from a source database with no
intervening dump files. This is known as a network import.
The Data Pump Import utility is invoked using the impdp command and can be interacted
with using a command line, a parameter file, or an interactive-command interface.
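As an illustration of the parameter-file option, the command-line parameters can be placed in a plain text file and passed with the PARFILE parameter (the file name and its contents are illustrative):
Contents of import.par:
DIRECTORY=Data_dump_dir
DUMPFILE=<Dump File Name.dmp>
LOGFILE=<Log File Name.log>
SCHEMAS=<Schema Name>
Syntax: C:\impdp User/Password PARFILE=import.par <Enter>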
Data Pump Import Modes: One of the most significant characteristics of an import operation
is its mode, because the mode largely determines what is imported. The specified mode applies
to the source of the operation, either a dump file set or another database if the
NETWORK_LINK parameter is specified.
When the source of the import operation is a dump file set, specifying a mode is optional. If no
mode is specified, then Import attempts to load the entire dump file set in the mode in which
the export operation was run.
The mode is specified on the command line, using the appropriate parameter. The available
modes are as follows:
Full Import Mode
A full import is specified using the FULL parameter. In full import mode, the entire content of
the source (dump file set or another database) is loaded into the target database. This is the
default for file-based imports. You must have the IMP_FULL_DATABASE role if the source
is another database.
Cross-schema references are not imported for non-privileged users. For example, a trigger
defined on a table within the importing user's schema, but residing in another user's schema, is
not imported.
Schema Mode
A schema import is specified using the SCHEMAS parameter. In a schema import, only objects
owned by the specified schemas are loaded. The source can be a full, table, tablespace, or
schema-mode export dump file set or another database. If you have the
IMP_FULL_DATABASE role, then a list of schemas can be specified and the schemas
themselves (including system privilege grants) are created in the database in addition to the
objects contained within those schemas.
Cross-schema references are not imported for non-privileged users unless the other schema is
remapped to the current schema. For example, a trigger defined on a table within the importing
user's schema, but residing in another user's schema, is not imported.
Table Mode
A table-mode import is specified using the TABLES parameter. In table mode, only the
specified set of tables, partitions, and their dependent objects are loaded. The source can be a
full, schema, tablespace, or table-mode export dump file set or another database. You must
have the IMP_FULL_DATABASE role to specify tables that are not in your own schema.
You can use the transportable option during a table-mode import by specifying the
TRANSPORTABLE=ALWAYS parameter in conjunction with the TABLES parameter. Note
that this requires use of the NETWORK_LINK parameter, as well.
Tablespace Mode
A tablespace-mode import is specified using the TABLESPACES parameter. In tablespace
mode, all objects contained within the specified set of tablespaces are loaded.
Transportable Tablespace Mode
A transportable tablespace import is specified using the TRANSPORT_TABLESPACES
parameter. In this mode, the metadata from a transportable tablespace export dump file set is
loaded. The datafiles, specified by the TRANSPORT_DATAFILES parameter, must be made
available from the source system for use in the target database, typically by copying them over
to the target system.
Encrypted columns are not supported in transportable tablespace mode.
This mode requires the IMP_FULL_DATABASE role.
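Before invoking Import, a directory object pointing to the folder containing the dump file must be created in the database. A minimal sketch, assuming the dump file is kept in D:\CDMA and the directory object is named Data_dump_dir (the names used in the examples below):
Syntax: SQL> CREATE OR REPLACE DIRECTORY Data_dump_dir AS 'D:\CDMA'; <Enter>
SQL> GRANT READ, WRITE ON DIRECTORY Data_dump_dir TO <User>; <Enter>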
The execution of the syntax and response thereof in the command prompt can be seen in the
image below:
Note: After giving the above command at the SQL prompt, the directory object will be created,
but the directory (Data_dump_dir) will not be shown as a sub-directory because it is a virtual
directory (only a pointer to the physical directory), as can be seen below: D:\CDMA is empty:
Case-1: Importing the full database
The FULL parameter is used to import the entire content of the dump file set. The following is
an example of the full import syntax:
Syntax: C:\impdp User/Password FULL=Y DIRECTORY=Data_dump_dir
DUMPFILE=<Dump File Name.dmp> LOGFILE= <Log File Name.log> <Enter>
The execution of the syntax and response thereof in the command prompt can be seen in the
image below:
Case-2: Importing selected schemas from the database
The OWNER parameter of imp has been replaced by the SCHEMAS parameter, which is used
to specify the schemas to be imported. The following is an example of the schema import
syntax:
Syntax: C:\impdp User/Password SCHEMAS=<Schema Name> DIRECTORY=Data_dump_dir
DUMPFILE=<Dump File Name.dmp> LOGFILE= <Log File Name.log> <Enter>
The execution of the syntax and response thereof in the command prompt can be seen in the
image below:
OR
Case-3: Importing selected tables from the database
The TABLES parameter is used to specify the tables that are to be imported. The following is
an example of the table import syntax:
Syntax: C:\impdp User/Password TABLES=<Table Name_1>,<Table Name_2>,<Table
Name_3> DIRECTORY=Data_dump_dir DUMPFILE=<Dump File Name> LOGFILE=
<Log File Name.log> <Enter>
The execution of the syntax and response thereof in the command prompt can be seen in the
image below:
1.6 Restoration from MySQL Database
Click on ‘Restore’.
Click on ‘open back up file’.
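The above steps refer to the graphical restore wizard shown in the screenshots. Equivalently, a dump taken with mysqldump can be restored from the command line; a minimal sketch (database and file names are illustrative):
Syntax: C:\> mysql -u <User> -p -e "CREATE DATABASE auditdb" <Enter>
C:\> mysql -u <User> -p auditdb < D:\Dumps\auditdb.sql <Enter>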
Chapter 2: Tableau Connectivity with Different Databases
Tableau can help anyone see and understand their data; it connects to almost any database and is
highly user friendly with its drag-and-drop features. In this section, the stepwise process of
connecting Tableau with different RDBMS/DBMS and various file formats, present on the desktop or a
local server, has been explained with the help of screenshots:
[The step-by-step screenshots for Sections 2.1 to 2.10 are not reproduced here.]
Chapter 3: CaseWare IDEA Connectivity with Tableau
IDEA has been used in the Department for a long time as a tool for CAATs. There may be
scenarios where one would need to import an IDEA file into Tableau for further analysis or
visualisation. The procedure is as follows.
3.1 Installation of IDEA Driver for Connectivity with Tableau
[Screenshots of the driver installation wizard are not reproduced here; the steps are:]
Click Next.
Click Install.
Click Finish.
3.2 CaseWare IDEA Connectivity with Tableau
[Screenshots of the connection dialog are not reproduced here; the steps are:]
Click on DSN.
Choose IDEA for Tableau.
Click on Connect.
Chapter 4: KNIME Connectivity with Different Databases
KNIME, the Konstanz Information Miner, is a free and open-source data analytics, reporting
and integration platform. KNIME integrates various components for machine learning and data
mining through its modular data-pipelining concept. A few examples are illustrated below of
connecting the KNIME tool with a local desktop or a local database server to import data for
the purpose of analytics. Before starting analytics with KNIME, we should know the format of
the data, and we must have the credentials/authorisation to set up the connection with the
database on the server. In addition, KNIME should be updated to its latest version so that it has
the various utilities, such as the database connectivity (.JAR) files, required to connect to server
databases. After that, the proper node is selected to import the data for analytics.
Note: It may be noted that sometimes KNIME does not connect with the database server due
to incompatibility of utilities (for example, .JAR files) across different versions. In such cases,
the corresponding .JAR files need to be downloaded and then registered in the preferences
(File -> Preferences -> KNIME -> Databases -> Add file -> path of the downloaded file).
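Once the driver is available, the KNIME database connector node needs a JDBC URL for the target database. For reference, the typical URL formats are sketched below (host names, ports and database names are illustrative):
jdbc:mysql://localhost:3306/auditdb
jdbc:sqlserver://localhost:1433;databaseName=auditdb
jdbc:oracle:thin:@localhost:1521:orcl
jdbc:postgresql://localhost:5432/auditdb
jdbc:db2://localhost:50000/auditdb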
[The step-by-step screenshots for Sections 4.1 to 4.10 are not reproduced here; in each workflow, the configured node is run by clicking on Execute.]