
Database Design and Modelling
Database Security and Optimisation
Database Security
Database security refers to the measures and mechanisms put in place to protect databases and
their contents from unauthorized access, misuse, modification, or destruction. It involves
implementing various security controls, protocols, and best practices to ensure the confidentiality,
integrity, and availability of data stored in databases. Database security aims to prevent
unauthorized users or malicious entities from gaining access to sensitive information, tampering
with data, or disrupting database operations.

Key components of database security include:

• Access Control: Regulating who can access the database and what actions they can perform.
• Data Encryption: Protecting data by encoding it into unreadable formats that can only be
decrypted by authorized users.
• Auditing and Monitoring: Tracking database activities and generating logs to detect and
investigate suspicious or unauthorized behavior.
• Authentication: Verifying the identity of users or entities accessing the database.
• Data Masking and Anonymization: Concealing sensitive data to prevent unauthorized
disclosure while maintaining usability.
Importance of Securing Databases
Securing databases is crucial for several reasons:
• Protection of Sensitive Information: Databases often contain sensitive and
confidential data, such as personal information, financial records, intellectual property, and
trade secrets. Securing databases ensures that this information remains confidential and
protected from unauthorized access.
• Compliance Requirements: Many industries and regulatory bodies have strict data
protection and privacy regulations (e.g., GDPR, HIPAA) that mandate organizations to
implement robust security measures to safeguard sensitive data. Failure to comply with
these regulations can result in legal consequences and penalties.
• Prevention of Data Loss or Theft: Database breaches or unauthorized access can lead
to data loss, theft, or exposure, causing reputational damage, financial losses, and legal
liabilities for organizations.
• Maintaining Business Continuity: Ensuring the availability and reliability of databases
is essential for business continuity. Databases must be protected against threats such as
cyberattacks, system failures, or natural disasters that could disrupt operations.
Common Threats to Database Security
There are various threats that pose risks to database security:
• SQL Injection: Attackers inject malicious SQL code into input fields or parameters of
database queries to manipulate or access unauthorized data.
• Unauthorized Access: This includes unauthorized users gaining access to the database
through weak authentication mechanisms, stolen credentials, or exploiting vulnerabilities.
• Data Breaches: Incidents where sensitive data is accessed, stolen, or exposed to
unauthorized parties, often resulting from security vulnerabilities or insider threats.
• Malware and Ransomware: Malicious software (malware) can infect databases, leading
to data corruption, theft, or ransomware attacks where attackers demand payment to
restore access to data.
• Insider Threats: Malicious or negligent actions by employees, contractors, or trusted
entities can compromise database security by intentionally or accidentally leaking
sensitive information or abusing privileges.
Security Considerations in Database Design
Authentication and Authorization
Different Types of Authentication Mechanisms
1. Password-Based Authentication: Users authenticate with a username and password. It's widely used but vulnerable to password-guessing attacks if weak passwords are used.
2. Multi-Factor Authentication (MFA): Requires users to provide two or more forms of identification, such as a password, security token, fingerprint, or facial recognition, enhancing security.
3. Biometric Authentication: Uses unique biological traits like fingerprints, iris scans, or facial recognition for authentication, offering a high level of security but requiring specialized hardware.

Implementing Role-Based Access Control (RBAC):

• RBAC assigns roles to users based on their responsibilities and grants permissions accordingly.
• Roles define what actions users can perform (e.g., read, write, execute) on specific database objects (e.g., tables, views).
• Administrators manage roles and assign them to users, enforcing least-privilege access and reducing the risk of unauthorized actions (a minimal sketch follows this list).
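The Python sketch below illustrates the core RBAC idea of a deny-by-default permission check; the role names, permission sets, and helper function are hypothetical, not tied to any particular DBMS.

# Illustrative deny-by-default RBAC check; roles and permissions are hypothetical.
ROLE_PERMISSIONS = {
    "analyst": {"read"},
    "clerk": {"read", "write"},
    "admin": {"read", "write", "execute"},
}

def is_allowed(role, action):
    # Unknown roles get an empty permission set, so access is denied by default.
    return action in ROLE_PERMISSIONS.get(role, set())

if not is_allowed("analyst", "write"):
    raise PermissionError("analyst role may not write")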
Data Encryption
Types of Encryption:
1. Symmetric Encryption: Uses a single key for both encryption and decryption. Fast and suitable for large amounts of data but requires secure key management practices.
2. Asymmetric Encryption: Uses a pair of keys (public and private) for encryption and decryption. Slower but offers stronger security guarantees and enables secure key exchange without transmitting a secret key.
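As a concrete illustration of symmetric encryption, here is a minimal Python sketch using the third-party cryptography package's Fernet recipe (AES-based). The package is an assumption and must be installed separately, and real deployments need a key-management strategy that this sketch omits.

# Symmetric encryption sketch with the "cryptography" package (pip install cryptography).
from cryptography.fernet import Fernet

key = Fernet.generate_key()    # one shared secret key encrypts and decrypts
cipher = Fernet(key)

token = cipher.encrypt(b"account=12345")   # ciphertext safe to store at rest
plain = cipher.decrypt(token)              # only key holders can recover the data
print(plain)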

Securing Data at Rest and in Transit:


• Data at Rest: Encrypts data stored in databases or files to prevent
unauthorized access if physical or digital security measures fail.
• Data in Transit: Encrypts data transmitted over networks using protocols like
SSL/TLS to protect against eavesdropping and man-in-the-middle attacks.
Auditing and Logging
Importance of Auditing Database Activities
• Auditing tracks and records database activities, including user logins, queries, modifications, and access attempts.
• It helps detect suspicious behavior, comply with regulations, investigate security incidents, and maintain accountability.
Logging Mechanisms and Best Practices
• Database Logs: Capture detailed information about database events, errors, and transactions.
• Logging Best Practices: Enable auditing and logging features, store logs securely, regularly review logs for anomalies, and implement automated alerting for critical events.
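A minimal application-side audit-logging sketch using Python's standard logging module is shown below; the log file name and event fields are illustrative choices, and production systems would typically rely on the database's native audit facilities as well.

# Minimal audit trail with Python's standard logging module; field names are illustrative.
import logging

audit = logging.getLogger("db.audit")
handler = logging.FileHandler("db_audit.log")
handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
audit.addHandler(handler)
audit.setLevel(logging.INFO)

def log_event(user, action, obj):
    # Record who performed which action on which database object.
    audit.info("user=%s action=%s object=%s", user, action, obj)

log_event("alice", "SELECT", "employees")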
Data Masking and Anonymization
Techniques for Masking Sensitive Data
1. Static Masking: Replaces sensitive data with fictional or scrambled values in non-production environments.
2. Dynamic Masking: Masks data based on user roles or access permissions, showing full data to authorized users and masked data to others.
Anonymization Methods to Protect Privacy
• Generalization: Aggregates data to a higher level (e.g., replacing specific ages with age ranges) to anonymize individual records.
• Randomization: Randomly alters data values while preserving overall data utility, making it difficult to identify individuals.
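The short Python sketch below illustrates static masking and generalization; the specific rules (keep one character of the local part, 10-year age bands) are illustrative choices, not a standard.

# Static masking and generalization sketches; the rules shown are illustrative.
def mask_email(email):
    # Keep the first character and the domain, hide the rest of the local part.
    local, _, domain = email.partition("@")
    return local[:1] + "***@" + domain

def generalize_age(age):
    # Replace an exact age with a 10-year range.
    low = (age // 10) * 10
    return f"{low}-{low + 9}"

print(mask_email("john.doe@example.com"))   # j***@example.com
print(generalize_age(34))                   # 30-39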
Secure Coding Practices for Database Applications

1. Principle of Least Privilege: Grant users and applications only the minimum permissions required to perform their tasks. Avoid using privileged accounts for routine operations.
2. Input Validation and Sanitization: Validate and sanitize user inputs to prevent malicious input from being executed as part of SQL queries or commands. This helps prevent SQL injection attacks.
3. Error Handling: Implement robust error handling mechanisms to handle exceptions gracefully without leaking sensitive information that attackers could exploit (a sketch follows this list).
4. Secure Configuration: Configure databases and applications securely by disabling unnecessary features, enforcing strong authentication mechanisms, and regularly applying security patches and updates.
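The following Python sketch, using the standard sqlite3 and logging modules, illustrates the error-handling practice above: full details are logged server-side while the caller sees only a generic message. The function name and table layout are illustrative assumptions.

# Error-handling sketch: log details privately, expose only a generic error.
import logging
import sqlite3

log = logging.getLogger("app")

def fetch_user(conn, user_id):
    try:
        cur = conn.execute("SELECT name, age FROM users WHERE id = ?", (user_id,))
        return cur.fetchone()
    except sqlite3.Error:
        # Full traceback goes to the server log, not to the user.
        log.exception("query failed for user_id=%s", user_id)
        raise RuntimeError("An internal error occurred") from None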
Input Validation and Sanitization
• Input Validation: Verify that user inputs conform to expected formats, lengths, and types before processing them. Use server-side validation in addition to client-side validation for enhanced security.
• Input Sanitization: Sanitize inputs by removing or escaping special characters that could be interpreted as SQL commands or injection payloads. Use libraries or frameworks that offer built-in sanitization functions.
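A minimal server-side validation sketch using Python's standard re module is shown below; the username policy it enforces is an illustrative assumption, not a universal rule.

# Allow-list validation: accept only inputs matching an expected pattern.
import re

USERNAME_RE = re.compile(r"^[A-Za-z0-9_]{3,30}$")

def validate_username(value):
    if not USERNAME_RE.fullmatch(value):
        raise ValueError("invalid username")
    return value

validate_username("john_doe")                     # passes
# validate_username("x'; DROP TABLE users; --")   # raises ValueError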
Parameterized Queries to Prevent SQL Injection
• Parameterized Queries: Use parameterized queries or prepared statements instead of dynamically constructing SQL queries with user inputs. Parameters are placeholders for user inputs that the database engine binds strictly as data, never as executable SQL, which removes the classic injection vector.
import sqlite3

# Establish a connection to the database
conn = sqlite3.connect('example.db')
cursor = conn.cursor()

# Create the target table if needed so the example is self-contained
cursor.execute("CREATE TABLE IF NOT EXISTS users (name TEXT, age INTEGER)")

# Parameterized insert: the ? placeholders bind name and age as data,
# so the inputs can never be executed as SQL
name = "John"
age = 30
cursor.execute("INSERT INTO users (name, age) VALUES (?, ?)", (name, age))

# Commit changes and close the connection
conn.commit()
conn.close()
Using Stored Procedures for Security
• Stored Procedures: Utilize stored procedures to encapsulate SQL logic and execute predefined database operations. Stored procedures can enhance security by:
• Reducing direct SQL access, limiting exposure to SQL injection vulnerabilities.
• Enforcing business rules and access controls within the database.
• Improving performance by reducing network traffic for complex queries.

-- MySQL syntax: the DELIMITER change lets the procedure body contain semicolons
DELIMITER //

CREATE PROCEDURE GetUserInfo(IN userId INT)
BEGIN
    SELECT * FROM users WHERE id = userId;
END //

DELIMITER ;
Database Security Tools
Overview of Security Tools
Firewalls:
• Network Firewalls: Monitor and control incoming and outgoing network traffic
based on predetermined security rules. They can be hardware-based or software-
based.
• Application Firewalls: Provide an additional layer of security by inspecting and
filtering traffic at the application layer to protect against application-level attacks.
Intrusion Detection Systems (IDS):
• Network-based IDS (NIDS): Monitor network traffic for suspicious activity and
potential security threats, such as intrusion attempts, unauthorized access, or
anomalies.
• Host-based IDS (HIDS): Monitor activities and events on individual hosts or servers
to detect unauthorized changes, malware infections, or suspicious behavior.
Database-Specific Security Tools and Features
Database Firewalls:
• Purpose: Protect databases from unauthorized access, SQL injection attacks, and other
malicious activities.
• Features: Monitor and analyze database traffic, enforce security policies, and block
suspicious or unauthorized queries in real time.
• Examples: Imperva SecureSphere Database Firewall, Oracle Database Firewall, IBM
Guardium.
Encryption Tools:
• Purpose: Encrypt sensitive data at rest and in transit to protect against unauthorized access
and data breaches.
• Types of Encryption: Symmetric encryption (AES; the older DES is now deprecated), asymmetric encryption (RSA, ECC), and hashing algorithms (SHA-256; MD5 is considered broken and should be avoided).
• Examples: Oracle Advanced Security, Microsoft SQL Server Transparent Data Encryption
(TDE), and open-source libraries like OpenSSL for encryption.
Database-Specific Security Tools and Features
Database Activity Monitoring (DAM):
• Purpose: Monitor and audit database activities, including user access, queries, modifications,
and administrative actions.
• Features: Real-time monitoring, logging, auditing, and alerting capabilities to detect
suspicious or unauthorized behavior.
• Examples: IBM Guardium Data Activity Monitor, Imperva SecureSphere Database Activity
Monitoring, McAfee Database Security.
Database Auditing and Logging:
• Purpose: Capture and record database events, transactions, and changes for auditing,
compliance, and forensic analysis purposes.
• Features: Generate audit logs with details such as user actions, timestamps, IP addresses,
and affected database objects.
• Examples: Database-native auditing features (e.g., Oracle Audit Vault and Database Firewall,
SQL Server Audit), third-party logging tools.
Security Management and Compliance Tools
• Security Information and Event Management (SIEM): Collect,
correlate, and analyze security event data from various sources,
including databases, to detect and respond to security incidents.
• Compliance Management: Tools that help organizations adhere to
regulatory requirements and industry standards (e.g., PCI DSS, GDPR) by
implementing security controls, conducting audits, and managing
compliance documentation.
Case Study: Yahoo Data Breaches
Overview:
• Dates: 2013 (first breach) and 2014 (second breach), both publicly disclosed in 2016
• Impact: Over 3 billion user accounts were compromised, including email addresses, passwords, and personal information.
Lessons Learned and Best Practices:
1. User Authentication: Yahoo's weak password-storage practices (passwords hashed with MD5) made it easier for attackers to crack them. Use strong, modern password-hashing algorithms (e.g., bcrypt, PBKDF2) for password storage; a sketch follows this list.
2. Incident Response: Having a robust incident response plan is crucial to detect and respond to data breaches promptly. Timely disclosure and communication with affected users and stakeholders are essential for transparency and trust.
3. Data Retention and Deletion: Limit the retention of user data and implement secure data deletion practices for unused or obsolete data to reduce the impact of breaches and data exposure.
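Below is a minimal password-hashing sketch using PBKDF2 from Python's standard hashlib module; the iteration count is an illustrative choice rather than a mandated value, and dedicated schemes such as bcrypt are an equally valid option.

# Salted, slow password hashing with PBKDF2 (Python standard library).
import hashlib
import hmac
import os

def hash_password(password, salt=None):
    # A fresh random salt defeats precomputed (rainbow-table) attacks.
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    return salt, digest

def verify_password(password, salt, expected):
    # Constant-time comparison avoids leaking information via timing.
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)

salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))  # True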
Database Optimization and Performance Tuning
Importance of Optimizing Database Performance
1. Improved Efficiency: Optimizing database performance leads to faster response times, reduced latency, and improved overall efficiency of database operations.
2. Enhanced User Experience: Faster query execution and data retrieval times result in a better user experience, especially for applications handling large volumes of data or concurrent user requests.
3. Scalability: Optimal database performance facilitates scalability, allowing applications to handle increasing workloads and user demands without compromising performance.
4. Cost Savings: Efficient database operations require fewer hardware resources (e.g., CPU, memory), reducing infrastructure costs and operational expenses.
5. Data Integrity and Availability: Optimized databases are less prone to bottlenecks, failures, and downtime, ensuring data integrity and availability for critical business processes.
Key Performance Metrics
1. Response Time
• Definition: The time taken by the database to process a query or transaction and return results
to the user or application.
• Importance: Lower response times indicate faster query processing and better performance,
contributing to improved user satisfaction and productivity.

2. Throughput
• Definition: The number of transactions or operations processed by the database within a specific
time period (e.g., transactions per second, queries per minute).
• Importance: Higher throughput indicates the database's ability to handle a larger workload and
process multiple transactions concurrently, enhancing scalability and performance under heavy
loads.

3. CPU Utilization
• Definition: The percentage of CPU resources utilized by the database server to execute queries,
processes, and transactions.
• Importance: Monitoring CPU utilization helps identify performance bottlenecks, optimize query
execution plans, and ensure efficient resource utilization.
Key Performance Metrics
4. Memory Usage
• Definition: The amount of memory (RAM) consumed by the database server to store data, execute queries, and cache frequently accessed data.
• Importance: Efficient memory usage reduces disk I/O operations, improves query performance, and enhances overall system responsiveness.
5. Disk I/O Operations
• Definition: The input/output operations performed on disk storage by the database server to read or write data.
• Importance: Monitoring disk I/O operations helps identify storage bottlenecks, optimize data retrieval and storage mechanisms, and improve database performance.
6. Lock Contention
• Definition: The occurrence of lock conflicts or contention between concurrent transactions accessing shared database resources (e.g., tables, rows).
• Importance: Managing lock contention reduces transaction conflicts, improves concurrency, and prevents performance degradation due to locking issues.
Indexing

• Indexing is a database optimization technique that involves creating data structures (indexes) to improve the speed of data retrieval operations, such as SELECT queries. Indexes store pointers to rows in a table, allowing the database engine to quickly locate and access specific data based on indexed columns.
What is Indexing?
1. Data Structure: An index is a data structure that maps keys to the locations of corresponding values. In the context of databases, keys are usually the values of one or more columns in a table, and the values are pointers or references to the actual rows in the table.
2. Search Optimization: Indexing is used to optimize the search and retrieval of rows in a table based on the values of indexed columns. Without indexes, the database system would need to scan the entire table sequentially to find rows that match certain criteria, which can be highly inefficient for large datasets.
3. Types of Indexes:
   1. Primary Index: Typically created automatically by the database system to enforce the uniqueness of the primary key column(s) and to facilitate fast retrieval of rows based on the primary key.
   2. Secondary Index: Created manually by users to improve the performance of queries that involve columns other than the primary key.
Types of Indexes
How Indexing Works:
1. B-tree Structures: Most databases use B-tree or B+-tree structures for indexing. These structures organize index keys in a balanced tree format, allowing for efficient search, insertion, and deletion operations.
2. Index Scan: When a query involves indexed columns, the database optimizer may choose to perform an index scan, which involves traversing the index tree to locate the relevant rows based on the search criteria specified in the query.
3. Index Seek: In some cases, an index seek can be performed, where the database directly accesses the desired rows based on the indexed values without scanning the entire index tree.
Real Life Example
Employee Database:
• Consider an employee database with a table named employees. One
common query might be to retrieve employee information based on their
employee ID (emp_id). By creating an index on the emp_id column, the
database can quickly locate the corresponding employee record(s) without
scanning the entire table.
e.g. CREATE INDEX idx_emp_id ON employees(emp_id);
E-commerce Website:
• In an e-commerce database, there's often a need to search for products
based on their attributes, such as category, brand, or price range. By
creating indexes on these attribute columns, queries like product searches or
filtering by category can be significantly accelerated.
e.g. CREATE INDEX idx_category ON products(category);
CREATE INDEX idx_brand ON products(brand);
CREATE INDEX idx_price ON products(price);
Guidelines for Choosing and Maintaining Indexes
• Identify Frequently Queried Columns: Analyze query
patterns to identify columns frequently used in WHERE
clauses or JOIN conditions.
• Consider Cardinality: Index columns with high cardinality
(unique values) for better selectivity and query performance.
• Regular Maintenance: Monitor index usage and
performance regularly. Remove or update redundant or
unused indexes to avoid overhead.
Query Optimization
Understanding Query Execution Plans
• Query execution plans provide insights into how the database engine processes and executes queries.
• Use EXPLAIN or equivalent commands to view query execution plans (a sketch follows below).
• Logical query processing order: FROM, JOIN, WHERE, GROUP BY, HAVING, SELECT, DISTINCT, ORDER BY, and finally LIMIT/OFFSET.
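As a minimal, runnable illustration, SQLite's EXPLAIN QUERY PLAN variant can be inspected from Python; the table and index names match the earlier employee example, and other engines (MySQL, PostgreSQL) expose similar EXPLAIN output in their own formats.

# Inspecting a query plan with SQLite's EXPLAIN QUERY PLAN from Python.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (emp_id INTEGER, name TEXT)")
conn.execute("CREATE INDEX idx_emp_id ON employees(emp_id)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM employees WHERE emp_id = ?", (42,)
).fetchall()
print(plan)  # reports a SEARCH using idx_emp_id instead of a full table SCAN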
Techniques for Improving Query Performance
1. Use Indexes:
• Utilize appropriate indexes to speed up query execution by reducing the number of rows scanned.
• Example:
SELECT * FROM employees WHERE department_id = 10;
CREATE INDEX idx_department_id ON employees(department_id);

2. Query Rewriting:
• Rewrite queries to optimize performance, such as adding selective WHERE clauses that let the engine use indexes efficiently.
• Example (assuming only recent pending orders are needed, so the extra predicate narrows the scan):
-- Original query
SELECT COUNT(*) FROM orders WHERE status = 'pending';

-- Rewritten query
SELECT COUNT(*) FROM orders WHERE status = 'pending' AND order_date > '2022-01-01';
Data Modeling
Normalization vs. Denormalization
• Normalization:
• Organize data into multiple related tables to reduce redundancy and improve data
integrity.
• Follow normalization forms (1NF, 2NF, 3NF) to eliminate data anomalies.
• Denormalization:
• Combine related data into fewer tables or add redundant data for performance optimization (contrasted in the schema sketch after this section).
• Suitable for read-heavy operations or reporting requirements.
• Designing Efficient Data Models for Performance
• Balance normalization and denormalization based on:
• Query patterns (read vs. write operations).
• Performance requirements (query response time, data retrieval speed).
• Scalability considerations (handling large data volumes).
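As a concrete contrast, the sketch below creates a normalized pair of tables and a denormalized reporting table in SQLite; the schema is an illustrative assumption.

# Normalized vs. denormalized schema sketch (illustrative tables, SQLite).
import sqlite3

conn = sqlite3.connect(":memory:")

# Normalized: customer data stored once; reads join the two tables.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    total REAL)""")

# Denormalized: customer_name duplicated into the orders table so
# read-heavy reports skip the JOIN, at the cost of redundant data.
conn.execute("""CREATE TABLE orders_report (
    id INTEGER PRIMARY KEY,
    customer_name TEXT,
    total REAL)""")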
Caching and Buffering

Caching Strategies
1. Query Caching: Cache frequently executed queries and their results to reduce database load.
2. Result Caching: Cache query results at the application level to avoid redundant database queries (a sketch follows this section).
Buffering Mechanisms
• Buffer Pool: Cache frequently accessed data pages in memory to reduce disk I/O operations.
• Write-Ahead Logging (WAL): Stage write operations in a log buffer before committing changes to disk for improved transaction durability and performance.
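Below is a minimal application-level result-caching sketch using functools.lru_cache from Python's standard library; cache sizing and invalidation are deliberately omitted, and cached results go stale if the underlying table changes.

# Application-level result cache; repeated identical lookups skip the database.
from functools import lru_cache
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (category TEXT, name TEXT)")

@lru_cache(maxsize=128)
def products_in_category(category):
    rows = conn.execute(
        "SELECT name FROM products WHERE category = ?", (category,)
    ).fetchall()
    return tuple(rows)   # immutable result, safe to share from the cache

products_in_category("books")   # hits the database
products_in_category("books")   # served from the cache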
Impact of Hardware on Database Performance

CPU (Central Processing Unit)
• Role: Handles processing tasks, including query execution, data manipulation, and computation.
• Impact: Faster CPUs can improve query processing speed and overall database performance, especially for CPU-intensive operations.
Memory (RAM)
• Role: Stores data and frequently accessed information for faster retrieval.
• Impact: Sufficient memory reduces disk I/O operations, improves caching
efficiency, and enhances query performance by reducing data access latency.
Storage (Disk or SSD)
• Role: Stores database files, including data, indexes, and transaction logs.
• Impact: Faster storage systems (SSDs) can reduce read/write latency,
improve I/O throughput, and speed up data retrieval operations.
Scalability Options
Vertical Scaling (Scaling Up)
• Description: Increasing the capacity of a single server by upgrading hardware
components (CPU, memory, storage).
• Advantages: Simplified management, easy to implement, suitable for small to
medium workloads.
• Limitations: Limited scalability beyond hardware constraints, and potential
downtime during upgrades.
Horizontal Scaling (Scaling Out)
• Description: Adding more servers or nodes to distribute workload across
multiple machines.
• Advantages: Improved scalability, fault tolerance, and performance for large-
scale applications.
• Limitations: Requires distributed systems architecture, data partitioning
strategies, and synchronization mechanisms.
Database Clustering and Replication
Clustering for High Availability:
• Description: Combines multiple database servers into a cluster for high availability
and failover resilience.
• Advantages: Redundancy and fault tolerance, automatic failover, improved
availability and uptime.
• Technologies: Examples include MySQL Cluster, PostgreSQL streaming replication,
and Microsoft SQL Server Always On Availability Groups.
Replication for Performance Scaling:
• Description: Copies and synchronizes data between primary and replica database
servers for read scalability and load balancing.
• Advantages: Offload read-heavy queries to replicas, distribute read workload, and
improve performance and response times.
• Technologies: Common solutions include master-slave replication, multi-master
replication, and distributed database systems like Cassandra.
