CSC 5301 - Paper - DB & DW Security
Table of Contents
I. Introduction
II. Software Security Engineering
III. Database Security Management
  a. Access Control (SQL's DCL)
    i. Mandatory Access Control (MAC)
    ii. Discretionary Access Control (DAC)
    iii. Role-Based Access Control (RBAC)
  b. Encryption
  c. Database Auditing
  a. Database Vulnerabilities
    i. SQL Injections
    ii. Denial of Service
    iii. Default, blank, and weak username & password
    iv. Privilege Escalation
    v. Improper access controls and misconfigurations
    vi. Database Inference
IV. Data warehouse Security
  a. Access Control
  b. Metadata Based Security
  c. Data Warehouse Striping Technique
V. Conclusion
VI. References
I.
Introduction
Taken literally, the Latin root of the word secure, as described by Landwehr (2001), can be broken down as follows: "se" means "without", and "cure" means "to care for" or "to be concerned about" [1]. The premise is that we should take action to remove causes for concern (for example, buying a strongbox in which to store valuable belongings) or, more simply, to attain an acceptable level of peace of mind. In information technology, however, security is a field in its own right. It is a set of activities and measures undertaken to ensure three characteristics: Confidentiality, Integrity and Availability [2.1].
In Software Engineering specifically, security is an essential attribute of any dependable software, and dependability is usually the most crucial property of a computer system [3]. Security, in this context, is the ability of the system to protect itself against accidental or deliberate intrusion [4]. Databases are managed by a collection of programs [2.2], namely Database Management Systems (DBMS), which include modules in charge of securing the structure that stores end-user data together with its metadata [2.2]. Such systems are therefore necessarily equipped with mechanisms designed from the outset to improve data security.
Once databases had become popular and part of everyday operational activity, the need emerged for a new kind of data repository and data management (OLAP), one that would relieve the operational databases and support strategic decision making based on historical data: the data warehouse was born. A data warehouse is meant to be an openly accessible system, so any security measure risks compromising its design and, consequently, undermining its purpose.
II.
Software Security Engineering
First, no one can attack a software system that is isolated from every network; second, a software system cannot be abused when it is used by a single user. Since the widespread adoption of the internet in the 1990s, new types of threats and vulnerabilities have had to be taken into account. When considering security issues, both the infrastructure and the software (application) itself have to be studied. As a result, security may be compromised at any of the layers shown in Figure II.1:
Application
Reusable Components and Libraries
Middleware
Database Management
Generic, Shared Applications (Browsers, E-mail, Etc.)
Operating System
Figure II.1: Application & Infrastructure Layers (Sommerville)
There is an important distinction between application security and infrastructure security. The former is a design issue: the system must be built to resist attacks. The latter is a management issue: the system must be configured to resist attacks. Because they are permanently available, infrastructure components (such as database servers) can be probed by attackers for weaknesses and eventually used to gain unauthorized access to the system and its data. The discipline of system security management, which is responsible for securing infrastructure components, is characterized by the following three activities:
1. User permission management (privileges)
2. System deployment & maintenance (vulnerability avoidance, security repair patches)
3. Attack monitoring, detection and recovery (detecting suspicious behavior, backup)
Let us now place these activities in a more practical context: database security management.
III.
Database Security Management
We saw in the previous section that database management is part of the infrastructure of a given system; as another source puts it, "Databases and database management systems provide the infrastructure on which other organizational information systems are built" [6]. Securing a database therefore requires configuring (managing) its DBMS so as to obtain a safe database environment.
Traditionally, database security focused on ensuring that only authenticated users perform authorized activities at authorized times (privileges). Such a measure alone has proved insufficient, given the rising number of database breaches [7]. We know that the risk of threats increases when the system accepts several users, but this is not the system's only source of weakness. To help identify potential risks within a system, the following table summarizes database security essentials [2.3]:
Data              Users
Protected         Identifiable
Reconstructable   Authorized
Auditable         Monitored
The database security essentials can be matched to the database security characteristics as follows:
                  Data              User
Confidentiality   Protected         Identifiable, Monitored
Integrity         Reconstructable   Authorized
Availability      Auditable
To achieve the attributes mentioned above, we consider the following security topics: Access Control, Database Inference, Encryption, Database Auditing, and Database Vulnerabilities.
a. Access Control (SQL's DCL)
i. Mandatory Access Control (MAC)
Access is granted when a user's credentials match an object's label [9]. Pitfall: MAC requires a lot of planning before implementation, and it incurs high system management overhead whenever new data and new users are added.
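The label matching described above can be sketched as follows. This is a minimal illustration, not a method taken from the cited sources: the sensitivity levels, the `can_read` helper, and the rule that a clearance must be at least as high as the object's label are all assumptions made for the example.

```python
from enum import IntEnum


class Level(IntEnum):
    # Hypothetical sensitivity levels, ordered from least to most sensitive.
    PUBLIC = 0
    CONFIDENTIAL = 1
    SECRET = 2


def can_read(user_clearance: Level, object_label: Level) -> bool:
    """Grant read access only when the user's clearance covers the object's label."""
    return user_clearance >= object_label


print(can_read(Level.SECRET, Level.CONFIDENTIAL))  # True
print(can_read(Level.PUBLIC, Level.SECRET))        # False
```

Every new object needs a label and every new user a clearance before any access is possible, which is where the management overhead mentioned above comes from.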
ii. Discretionary Access Control (DAC)
Pitfall: the access control policy is dispersed (users choose privileges), which potentially compromises the consistency of the global policies [9].
iii. Role-Based Access Control (RBAC)
Pitfall: users cannot change the permissions granted to them through their role.
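A minimal sketch of a role-based check, assuming a fixed mapping maintained by an administrator. The role names, user names, and the `is_allowed` helper are hypothetical and serve only to illustrate that privileges reach a user through a role rather than directly.

```python
from types import MappingProxyType

# Hypothetical roles and the privileges an administrator attached to them.
# MappingProxyType gives a read-only view, so application code cannot edit it.
ROLE_PRIVILEGES = MappingProxyType({
    "clerk": frozenset({"SELECT"}),
    "manager": frozenset({"SELECT", "INSERT", "UPDATE"}),
})

# Hypothetical user-to-role assignment, also maintained by the administrator.
USER_ROLES = MappingProxyType({"amina": "clerk", "youssef": "manager"})


def is_allowed(user: str, privilege: str) -> bool:
    """A user holds a privilege only through the role assigned to them."""
    role = USER_ROLES.get(user)
    if role is None:
        return False
    return privilege in ROLE_PRIVILEGES.get(role, frozenset())


print(is_allowed("youssef", "UPDATE"))  # True
print(is_allowed("amina", "UPDATE"))    # False
```

In SQL, the same idea is expressed with DCL statements that grant privileges to a role and then grant the role to users.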
b. Encryption [16]
Encryption can provide strong security for data at rest, but a database encryption strategy must take many factors into consideration, chiefly the level at which encryption is applied:
Storage level:
  Advantage: transparent; no changes are needed at the database level.
  Drawback: insensitive to user privileges.
Database level:
  Advantage: data are fully secured, encryption is part of the database design, and user privileges may be taken into account.
  Drawback: major degradation of DBMS performance (indexes become useless).
Application level (where the data are produced):
  Advantage: keys are kept separate from the encrypted data stored in the database.
  Drawback: the application (the database client) has to be modified, and there is a performance overhead.
Figure III.d.1: the storage level, database level and application level encryption (Bouganim, Guo)
c. Database Auditing
Auditing the changes to a database is critical for identifying malicious behavior, maintaining data quality, and improving system performance. However, an accurate audit log is a historical record that can itself pose a serious threat to privacy [3]. Database auditing (also called Data Access Auditing, Data Monitoring, or Data Activity Monitoring (DAM)) is the examination of audit or transaction logs for the purpose of tracking changes to data or to the database structure [12]. It is used to identify who accessed the database objects, what actions were performed, and what data was changed. It does not prevent security breaches, but it allows breaches to be identified after they take place [7.6].
There are two types of database auditing:
The first technique is trace-based auditing, which is usually built directly into the native DBMS. Its overhead is nonetheless expensive when audit tracing is enabled.
The second technique is scanning and parsing the database transaction logs, which DBMSs maintain for recovery purposes.
Drawbacks:
Serious threat to confidentiality.
Doesn't prevent security breaches.
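To make concrete what an audit record captures, here is a small sketch using Python's built-in sqlite3 module. Note that it uses a trigger-based audit table, which is a stand-in for illustration and is neither of the two techniques listed above (native tracing or transaction log parsing). The table and column names are assumptions, and since SQLite has no notion of database users, the "who" part of an audit trail is not recorded here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (cus_code INTEGER PRIMARY KEY, cus_lname TEXT);

-- Hypothetical audit table: one row per change, with old and new values.
CREATE TABLE customer_audit (
    changed_at TEXT DEFAULT CURRENT_TIMESTAMP,
    cus_code   INTEGER,
    old_lname  TEXT,
    new_lname  TEXT
);

-- Record every update of a customer row in the audit table.
CREATE TRIGGER trg_customer_update AFTER UPDATE ON customer
BEGIN
    INSERT INTO customer_audit (cus_code, old_lname, new_lname)
    VALUES (OLD.cus_code, OLD.cus_lname, NEW.cus_lname);
END;
""")

conn.execute("INSERT INTO customer VALUES (1, 'Alami')")
conn.execute("UPDATE customer SET cus_lname = 'Bennani' WHERE cus_code = 1")

audit_rows = conn.execute(
    "SELECT cus_code, old_lname, new_lname FROM customer_audit"
).fetchall()
print(audit_rows)  # [(1, 'Alami', 'Bennani')]
```

The audit table keeps the old value of the row, which illustrates both drawbacks above: the update was recorded but not prevented, and the log itself now holds data that must be protected.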
a. Database Vulnerabilities
Database vulnerabilities are an integral part of the security field, and being aware of them is as important as mastering the techniques covered previously.
i. SQL Injections
SQL injections are the best-known database security issue, and they are liable to occur when SQL statements are created dynamically [7.4]. Injections are made possible by SQL syntax itself, notably the two contiguous dashes "--", which cause everything that follows them to be ignored. Consider an application that builds the following statement from user input:
SELECT CUS_CODE, CUS_LNAME, CUS_FNAME
FROM CUSTOMER
WHERE CUS_CODE = I.CUS_CODE
AND CUS_LOGIN = <content from text box 1>
AND CUS_PASSWORD = <content from text box 2>;
A user with good intentions would enter the expected values:
SELECT CUS_CODE, CUS_LNAME, CUS_FNAME FROM CUSTOMER WHERE CUS_LOGIN = 'Mohamed' AND CUS_PASSWORD = 'Morocco';
Nothing, however, prevents a malicious user from providing a value that compromises the confidentiality of the data. The value provided by this malicious user could be:
' OR 1 = 1 --
The resulting condition evaluates to TRUE for every row: the injected quote closes the string literal, the injected OR bypasses the check on the field, and the two contiguous dashes cause the rest of the statement to be ignored. The malicious user would thus easily gain unauthorized access. Using parameterized queries (or stored procedures that do not assemble SQL text from user input), so that parameters are always treated as string values rather than as SQL text, overcomes this weakness.
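The attack and its remedy can be reproduced in a few lines with Python's built-in sqlite3 module. This is a sketch under assumptions: the CUSTOMER table is reduced to the columns used above, the two helper functions are hypothetical, and the password is stored in clear text only to mirror the example in this section.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CUSTOMER (CUS_CODE INTEGER, CUS_LNAME TEXT, CUS_FNAME TEXT, "
    "CUS_LOGIN TEXT, CUS_PASSWORD TEXT)"
)
conn.execute("INSERT INTO CUSTOMER VALUES (1, 'Alami', 'Mohamed', 'Mohamed', 'Morocco')")


def login_vulnerable(login, password):
    # Unsafe: the user's input is pasted directly into the SQL text.
    sql = (
        "SELECT CUS_CODE FROM CUSTOMER "
        f"WHERE CUS_LOGIN = '{login}' AND CUS_PASSWORD = '{password}'"
    )
    return conn.execute(sql).fetchall()


def login_parameterized(login, password):
    # Safe: the "?" placeholders bind the input as values, never as SQL text.
    sql = "SELECT CUS_CODE FROM CUSTOMER WHERE CUS_LOGIN = ? AND CUS_PASSWORD = ?"
    return conn.execute(sql, (login, password)).fetchall()


payload = "' OR 1 = 1 --"
print(login_vulnerable(payload, "anything"))     # [(1,)]  access granted
print(login_parameterized(payload, "anything"))  # []      access denied
print(login_parameterized("Mohamed", "Morocco")) # [(1,)]  legitimate login
```

With the placeholder version, the payload is compared as an ordinary string against CUS_LOGIN, so the quote, the OR, and the dashes lose their special meaning.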
IV.
Data Warehouse Security
a. Access Control
Both databases and data warehouses are repositories of data, but they differ in terms of design and, consequently, of purpose: databases store data solely for transactional operations, whereas data warehouses allow redundancy for analytical activities. Although databases and data warehouses are different, the underlying data structure does not necessarily differ. What we call a Table in OLTP is called a Dimension (or Fact Table) in OLAP: both are SQL tables managed by a DBMS. The Access Control concepts examined in the Database Security section therefore remain valid in a data warehouse context.
b. Metadata Based Security
Structural metadata: describe the structure and the content of the data warehouse.
Access metadata: describe the dynamic relationship between end-user applications and the data warehouse. They describe facts of the enterprise, user-defined names and aliases, the data warehouse server, data marts (databases), and tables.
That being said, metadata can also describe the security mechanisms of a data warehouse environment: security rules are stored as metadata. A security model prototype based on metadata was developed within the framework of the WWW-EIS-DWH project. When a user accesses data in the data warehouse, the Secure Query Management Layer (SQML) verifies their credentials against the corresponding access authorizations by analyzing the security metadata (access metadata). Figure IV.b.1 depicts what a data warehouse metadata security file looks like. Such a file may thoroughly describe a data warehouse. This solution gives different user groups of the same data warehouse the flexibility to see different subsets of its data.
Figure IV.b.2 portrays the structure of the discussed prototype. It is composed of three layers:
Extraction Layer:
  ETL
R-OLAP Layer:
  description of the DW content (post ETL)
  physical database -> data warehouse
Presentation Layer:
  access rights
  decoding encoded queries
  encoding query results that are not cached
  registration, definition, and administration of users and user groups
The previous layers of the WWW-EIS-DWH project can be represented using the so called
The Security Query Management Layer (SQML):
(1) SQML runs on an information server component.
(2) It assigns the metadata layer to the information server component.
(3) It performs a syntactic analysis of the received queries.
(4) It assigns each query to a predefined user group after evaluating the user's login and password, then checks whether the query falls within the scope of the allowed data area (inspection phase).
(5) Access rules, which are user-group-dependent metadata, and filtered metadata are assigned on the fly (MQL).
(6) MQL describes the structure of the fact tables, dimensions, aggregations, and attributes available to a specific user.
(7) The MView of WWW-EIS-DWH takes advantage of the star schema to represent the data.
Having listed the stages a user query goes through, we should also describe its outcome. The OLAP engine takes care of rendering the query: before the user can consume the result of the submitted query, the query passes through the OLAP engine, which analyses the user's domain (reduced view) and sends the result to the client's browser. The result may be cached on its way to the client, so that OLAP calculations can be returned immediately instead of being recalculated each time.
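The inspection phase of step (4) can be sketched as a lookup in access metadata. This is only an illustration of the idea: the structure of `ACCESS_METADATA`, the group names, and the `inspect_query` helper are hypothetical and do not reproduce the metadata format or the MQL of the WWW-EIS-DWH prototype.

```python
# Hypothetical access metadata: for each user group, the fact tables and
# dimensions that the group is allowed to see.
ACCESS_METADATA = {
    "sales_analysts": {
        "facts": {"SALES"},
        "dimensions": {"TIME", "PRODUCT"},
    },
    "executives": {
        "facts": {"SALES", "COSTS"},
        "dimensions": {"TIME", "PRODUCT", "REGION"},
    },
}


def inspect_query(group, fact, dimensions):
    """Return True when the query stays inside the group's allowed data area."""
    allowed = ACCESS_METADATA.get(group)
    if allowed is None:
        return False  # unknown group: nothing is visible
    return fact in allowed["facts"] and set(dimensions) <= allowed["dimensions"]


print(inspect_query("sales_analysts", "SALES", ["TIME", "PRODUCT"]))  # True
print(inspect_query("sales_analysts", "SALES", ["REGION"]))           # False
print(inspect_query("executives", "COSTS", ["REGION"]))               # True
```

Because the rules live in metadata rather than in the data itself, two groups can query the same warehouse and each sees only its own reduced view.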
c. Data Warehouse Striping Technique
One of the main problems faced by system administrators is the protection of data against unauthorized access or corruption due to malicious actions. The three characteristics of security therefore emerge again in the Data Warehouse Striping (DWS) technique, as follows.
Data confidentiality is achieved by encrypting the dimension data. Fact data are not encrypted because of performance issues (encrypting large tables is a heavy process that typically ruins system performance). Nevertheless, to improve confidentiality, fact data are obfuscated by adding spurious records to the fact tables in order to mislead the attacker.
Data integrity is guaranteed by using signatures in all records of the data warehouse and by concurrent detection of malicious data modifications.
Finally, data availability is achieved using replication. Each star schema of the same DW is distributed over an arbitrary number of nodes having the same star schema. The data of each dimension table are replicated in each node of a given cluster, whereas the data of the fact tables are distributed over the fact tables of the several nodes using strict row-by-row round-robin partitioning or hash partitioning. Replicating dimension tables is not problematic, since they occupy between 1% and 5% of the total space occupied by the DW repository. As a result, OLAP queries can be executed on a DWS cluster in parallel by all the available nodes, and the results are merged by a DWS middleware.
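The partitioning and merging described above can be sketched as follows. The example is a sequential simulation under assumptions: the node count, the row layout, and the two helper functions are hypothetical, and encryption, signatures, and spurious records are left out.

```python
def round_robin_partition(fact_rows, node_count):
    """Assign row i to node i % node_count (strict row-by-row round robin)."""
    nodes = [[] for _ in range(node_count)]
    for i, row in enumerate(fact_rows):
        nodes[i % node_count].append(row)
    return nodes


def merged_sum_by_product(nodes, product_dim):
    """Each node aggregates its own slice; the partial sums are then merged."""
    merged = {}
    for node_rows in nodes:
        partial = {}
        for row in node_rows:
            # Join against the dimension table, which every node holds in full.
            name = product_dim[row["product_id"]]
            partial[name] = partial.get(name, 0) + row["amount"]
        # Merge step, playing the role of the DWS middleware.
        for name, value in partial.items():
            merged[name] = merged.get(name, 0) + value
    return merged


# Hypothetical dimension (replicated on every node) and fact rows (partitioned).
product_dim = {1: "laptop", 2: "phone"}
facts = [
    {"product_id": 1, "amount": 10},
    {"product_id": 2, "amount": 5},
    {"product_id": 1, "amount": 7},
    {"product_id": 2, "amount": 3},
    {"product_id": 1, "amount": 1},
]

nodes = round_robin_partition(facts, 3)
print([len(n) for n in nodes])                    # [2, 2, 1]
print(merged_sum_by_product(nodes, product_dim))  # {'laptop': 18, 'phone': 8}
```

Sums can be merged this way because a total is the sum of its partial totals; an aggregate such as an average would need each node to return both a sum and a count.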
V.
Conclusion
Database and data warehouse security is an important discipline that keeps evolving, given that vulnerabilities, threats, and risks can never be covered exhaustively. Patches and corrections, together with communication among DBAs about security issues, are nonetheless major assets for database and data warehouse security. Confidentiality, Integrity and Availability are the three characteristics to strive for once a repository is built to serve a purpose.
We have seen that database security is a mature, long-established discipline that is widely covered, whereas data warehouse security is a recent field still under research and testing, albeit one built on top of the sturdy results obtained in database security. Several models and techniques exist for securing a data warehouse. We have covered the Metadata Based Model and the DW Striping Technique. The former takes advantage of access metadata to reduce the view of the data, so that security measures do not hinder navigation of the data contained in the data warehouse, while the latter relies on signatures to ensure integrity, encryption to ensure confidentiality, and replication to ensure availability.
VI.
References
[1] Weippl E. (2010), Security in Data Warehouses.
[2] Coronel C., Morris S., & Rob P. (2011), Database Principles - Fundamentals of Design, Implementation and Management, 9th edition. [2.1] page 609; [2.2] page 7; [2.3] page 611.
[3] Maxim B., Software Dependability, Lecture 9, CIS 376 - Fall 2008, University of Michigan-Dearborn.
[4] Sommerville I. (2011), Software Engineering, 9th edition. [4.1] page 293; [4.2] pages 368-9.
[5] Gupta S.L., Sonali M., & Palak M. (2012), Data Warehouse Vulnerability and Security.
[6] Courtney J., Paradice D., Brewer K., Graham J. (2002), Database Systems for Management, 3rd edition, page 7.
[7] Murray M. (2010), Database Security: What students need to know, Volume 9. [7.1] page IIP-62; [7.2] page IIP-63; [7.3] page IIP-64; [7.4] page IIP-68; [7.5] page IIP-71; [7.6] page IIP-73.
[8] Elmasri R., Navathe S. (2004), Fundamentals of Database Systems, 4th edition. [8.1] page 735; [8.3] page 736; [8.3] page 740; [8.4] page 741; [8.4] page 744.
[9] Gupta S., Mathur S., Modi P. (2012), Data Warehouse Vulnerability and Security.
[10] Tawfik M., "Denial-of-service attack", http://securedb.blogspot.com/2010/07/denial-of-service-attack.html.
[11] AppliCure Technologies, "Prevent Denial of Service (DoS) Attacks", http://www.applicure.com/solutions/prevent-denial-of-service-attacks.
[12] Katic N., Quirchmayr G., Schiefer J., Stolba M., Tjoa M. (1998), A Prototype Model for Data Warehouse Security Based on Metadata.
[13] Vieira M., Vieira J., Madeira H. (2008), Towards Data Security in Affordable Data Warehouses.
[14] Raut S. (2011), A Literature Survey on Data Warehouse Security Aspects.
[15] Application Security, Inc. (AppSecInc) (2009), "Addressing The Top 5 Database Vulnerabilities Plaguing Federal Agencies".
[16] Bouganim L., Guo Y. (2011), Database Encryption.
[17] Oracle (2005), "Security and the Data Warehouse" [White Paper].