mod8_dbms
Core Philosophy
An OODBMS adopts the principles of object-oriented programming—specifically, the
encapsulation of state and behavior within objects—and brings them into the realm of
persistent storage. That is, objects created in an OOP language can be stored and retrieved
directly from the database, maintaining their identity, structure, and behavior.
Objects are instances of classes, and classes define both the data (attributes) and the
methods (operations) that can be performed on the data. In an OODBMS, these objects are
not transient—meaning they do not cease to exist once the application terminates. Rather,
the system manages their persistence, supporting operations such as object creation,
deletion, updates, and queries.
Characteristics
1. Object Identity: Unlike relational databases, where identity is based on primary key
values, an OODBMS supports intrinsic object identity. Every object has a unique,
system-generated object identifier (OID) that remains constant throughout its lifetime,
even when the object's attribute values change.
2. Encapsulation: Objects encapsulate both data and behavior. Ideally, object state is
accessed only through the object's defined methods.
3. Complex Objects: The system can manage arbitrarily complex data structures,
including sets, lists, nested records, and user-defined types.
4. Inheritance: OODBMS supports both single and multiple inheritance. This allows
classes to inherit properties and behaviors from other classes, encouraging reuse and
polymorphism.
5. Persistence: Objects can be made persistent without requiring conversion to tables or
flat files. This aligns well with applications built in OOP languages.
6. Programming Language Integration: Many OODBMSs are tightly coupled with object-
oriented programming languages (e.g., C++, Java), reducing the impedance mismatch
between the programming model and the data model.
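As a rough illustration of object identity, encapsulation, and persistence without table mapping, here is a minimal Python sketch. The `ObjectStore` class, its `persist`/`update`/`fetch` methods, and the `Account` class are hypothetical names invented for this example; real OODBMS products expose their own APIs. The sketch assumes OIDs are simple counters and keeps stored objects in memory as deep copies, which only imitates durable storage.

```python
import copy
from dataclasses import dataclass


@dataclass
class Account:
    """An ordinary application class: state plus behavior."""
    owner: str
    balance: float = 0.0

    def deposit(self, amount: float) -> None:
        # State changes go through a method (encapsulation).
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balance += amount


class ObjectStore:
    """Toy store (hypothetical): maps system-generated OIDs to stored objects."""

    def __init__(self):
        self._next_oid = 1
        self._objects = {}

    def persist(self, obj):
        # The OID is assigned by the store, not derived from attribute values.
        oid = self._next_oid
        self._next_oid += 1
        self._objects[oid] = copy.deepcopy(obj)
        return oid

    def update(self, oid, obj):
        # The object's state changes; its OID does not.
        if oid not in self._objects:
            raise KeyError(oid)
        self._objects[oid] = copy.deepcopy(obj)

    def fetch(self, oid):
        # The object comes back whole, methods included; no table mapping.
        return copy.deepcopy(self._objects[oid])


store = ObjectStore()
oid_a = store.persist(Account("Ana", 100.0))
oid_b = store.persist(Account("Ana", 100.0))  # same values, different identity

acct = store.fetch(oid_a)
acct.deposit(50.0)
store.update(oid_a, acct)
```

Two accounts with identical attribute values still receive distinct OIDs, and updating one leaves its OID unchanged. In a relational design, identity would instead depend on a primary key value.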
Conceptual Foundation
In an ORDBMS, data is still represented in terms of tables (relations), but these tables can
now contain complex data types, user-defined types (UDTs), and inheritance structures.
Rather than breaking complex objects down into multiple flat tables, ORDBMS allows these
objects to be represented more natively, while still supporting SQL-based operations.
This system retains the declarative querying power of SQL, and incorporates object-
oriented modeling constructs such as encapsulation, polymorphism, and extensibility into
the relational framework.
Key Features
1. Extended Type System: ORDBMS introduces abstract data types (ADTs) or user-
defined types (UDTs) that can encapsulate structure and behavior. These are more
expressive than primitive data types.
2. Inheritance and Polymorphism: Tables (or types) can inherit from one another. This
allows for schema evolution and reuse. Queries on parent types can automatically
include data from all subtypes.
3. Encapsulated Methods: Methods can be defined alongside UDTs. These methods are
typically written in a procedural SQL dialect (e.g., PL/SQL) or an external language (e.g.,
Java, C) and can be invoked from within SQL.
4. Complex Structures: ORDBMS supports arrays, sets, nested tables, and even
multimedia objects (images, audio, video) as columns in a relational schema.
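The inheritance behavior described above, where a query on a parent type also returns rows of its subtypes, can be sketched in Python. This is an analogy rather than ORDBMS syntax: `Person`, `Student`, and the `select` helper are hypothetical names, and a real system would express the same idea with typed tables and SQL.

```python
from dataclasses import dataclass


@dataclass
class Person:  # hypothetical parent type
    name: str

    def label(self) -> str:
        return f"Person {self.name}"


@dataclass
class Student(Person):  # hypothetical subtype: inherits name, adds major
    major: str = ""

    def label(self) -> str:  # overridden method (polymorphism)
        return f"Student {self.name} ({self.major})"


rows = [Person("Ana"), Student("Ben", "Physics"), Person("Caro")]


def select(rows, row_type):
    """Return rows of row_type, including rows of any subtype."""
    return [r for r in rows if isinstance(r, row_type)]


all_people = select(rows, Person)       # includes the Student row
only_students = select(rows, Student)   # subtype rows only
labels = [r.label() for r in all_people]
```

Selecting on `Person` returns all three rows, and each row answers `label()` with the method of its own type.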
Conclusion
Object-Oriented and Object-Relational databases address the limitations of the flat
relational model in different ways. OODBMS is more aligned with pure object-oriented
applications, while ORDBMS extends the relational paradigm with object-oriented
capabilities to support more complex data modeling without abandoning SQL. Both have
their place in database history, but ORDBMS has found broader commercial acceptance
due to its backward compatibility, extensibility, and standardization (object-relational
features entered the SQL standard with SQL:1999).
A logical database defines what data is stored and how it is related, while remaining
independent of storage mechanisms, access paths, and indexing methods.
3. Internal Schema (Physical Level) – Storage structures
The logical schema (or logical database) exists at the middle level, providing a unified and
abstract description of the entire database, which is independent of physical storage and
specific application requirements.
2. Schema Definition
A logical database includes:
Data types
Level | Conceptual level (middle) | Internal level (lowest)
Example:
An E-R diagram may define:
This schema represents the logical database — it defines the data and its structure, but not
how it is stored or accessed.
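To make the separation concrete, the following sketch uses Python's built-in `sqlite3` module. The `student` and `enrollment` tables are hypothetical and are not taken from the E-R example above. The logical schema and the query stay fixed while a physical access path (an index) is added, and the query result does not change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical schema (hypothetical): entities, attributes, keys, relationships.
conn.executescript("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    CREATE TABLE enrollment (
        student_id INTEGER NOT NULL REFERENCES student(student_id),
        course     TEXT NOT NULL,
        PRIMARY KEY (student_id, course)
    );
""")
conn.executemany("INSERT INTO student VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?)",
    [(1, "DBMS"), (2, "DBMS"), (2, "Networks")],
)

QUERY = """
    SELECT s.name
    FROM student s
    JOIN enrollment e ON e.student_id = s.student_id
    WHERE e.course = ?
    ORDER BY s.name
"""

before = conn.execute(QUERY, ("DBMS",)).fetchall()

# Physical-level change: add an access path. The logical schema is untouched.
conn.execute("CREATE INDEX idx_enrollment_course ON enrollment(course)")

after = conn.execute(QUERY, ("DBMS",)).fetchall()
```

The index may change how the rows are located, but `before` and `after` hold the same rows because the query is written against the logical schema only.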
2. Portability
The same logical schema can be mapped to different physical architectures or storage
systems.
3. Centralized Design
Acts as a unified blueprint for managing data across different applications and user
interfaces.
Logical design is DBMS-independent and focuses on business rules, data flow, and
structural constraints.
8. Real-World Relevance
Logical databases are widely used in:
Application frameworks that generate database access layers from logical models
9. Limitations
Logical databases do not account for performance optimization — those concerns are
addressed during physical schema design.
In some cases, logical abstraction may hide important hardware constraints that affect
design decisions (e.g., storage size limits, disk layout).
Conclusion
A logical database is a crucial abstraction layer in DBMS that defines the structure,
constraints, and relationships of data independently of how data is physically stored. It
provides the foundation for data independence, system modularity, and application-
agnostic database access. Mastery of logical schema design is essential for any database
designer or architect aiming for scalable and maintainable systems.
Web databases are integral to modern computing ecosystems, powering online banking,
social media platforms, e-commerce systems, content management systems, and virtually
all interactive websites.
2. Definition
A Web Database is a database system that is accessible over the internet or an intranet
through web technologies such as HTTP, server-side scripting languages (like PHP, Python,
Node.js), and client-server communication protocols.
It typically works behind the scenes of dynamic websites, handling data retrieval,
manipulation, and storage operations requested via web interfaces.
Communicates with the web server via HTTP or asynchronous technologies like AJAX.
Uses server-side technologies such as PHP, Django (Python), Express (Node.js),
ASP.NET, Java Servlets, etc.
Platform Independent: Accessible from various devices and operating systems using
standard internet protocols.
These technologies work in tandem to enable seamless communication between the user
and the database.
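A minimal sketch of the server-side step, assuming a Python backend with an in-memory SQLite database: a function receives the URL of a GET request, runs a query, and returns a status code with a JSON body. The `/products` route, the `product` table, and `handle_get` are hypothetical names chosen for illustration. A real deployment would sit behind a web server or framework.

```python
import json
import sqlite3
from urllib.parse import parse_qs, urlparse

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [(1, "Keyboard", 49.0), (2, "Mouse", 19.5)],
)


def handle_get(url, conn):
    """Map a GET URL such as /products?id=2 to a query; return (status, JSON body)."""
    parsed = urlparse(url)
    if parsed.path != "/products":
        return 404, json.dumps({"error": "not found"})

    params = parse_qs(parsed.query)
    if "id" in params:
        try:
            product_id = int(params["id"][0])
        except ValueError:
            return 400, json.dumps({"error": "id must be an integer"})
        # Parameterized query: user input is never spliced into the SQL text.
        rows = conn.execute(
            "SELECT id, name, price FROM product WHERE id = ?", (product_id,)
        ).fetchall()
    else:
        rows = conn.execute(
            "SELECT id, name, price FROM product ORDER BY id"
        ).fetchall()

    body = [{"id": r[0], "name": r[1], "price": r[2]} for r in rows]
    return 200, json.dumps(body)


status, body = handle_get("/products?id=2", conn)
```

Using `?` placeholders rather than string formatting keeps request parameters out of the SQL text, which matters for any database reachable from the web.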
6. Use Cases and Applications
E-commerce Websites (e.g., Amazon): To manage product catalogs, orders, users,
payments
Social Media Platforms (e.g., Facebook): To handle profiles, posts, interactions, and
feeds
Online Banking Systems: Securely store and manage financial transactions and
customer data
Content Management Systems (e.g., WordPress): Power blogs and websites with
dynamic content
3. Cost Efficiency: No need for specialized client software; browsers act as universal
frontends.
4. Scalability: Can be scaled vertically and horizontally to handle growing user bases.
5. Ease of Integration: Can be connected with third-party APIs, payment gateways, social
login systems, etc.
2. Concurrency Management: Must handle race conditions and maintain ACID properties
under load.
3. Latency: Dependent on network conditions (round-trip time and bandwidth); high latency
can degrade user experience.
5. Downtime Risk: Server or network failures can make the application inaccessible.
9. Example Scenario
Consider a web-based learning management system (LMS):
Upon successful authentication, the student’s dashboard is rendered with data from
multiple tables: courses, assignments, notifications.
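The authentication and dashboard step could look roughly like the sketch below, written with Python's `sqlite3`, `hashlib`, and `hmac` modules. The table layouts, column names, and the `register`, `authenticate`, and `dashboard` functions are assumptions made for illustration. Only the three table names (courses, assignments, notifications) come from the scenario.

```python
import hashlib
import hmac
import os
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, username TEXT UNIQUE,
                           salt BLOB, pw_hash BLOB);
    CREATE TABLE courses (student_id INTEGER, title TEXT);
    CREATE TABLE assignments (student_id INTEGER, title TEXT);
    CREATE TABLE notifications (student_id INTEGER, message TEXT);
""")


def hash_password(password, salt):
    # Salted, slow hash; the iteration count here is an illustrative choice.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def register(username, password):
    salt = os.urandom(16)
    cur = conn.execute(
        "INSERT INTO students (username, salt, pw_hash) VALUES (?, ?, ?)",
        (username, salt, hash_password(password, salt)),
    )
    return cur.lastrowid


def authenticate(username, password):
    """Return the student id on success, None otherwise."""
    row = conn.execute(
        "SELECT id, salt, pw_hash FROM students WHERE username = ?", (username,)
    ).fetchone()
    if row is None:
        return None
    student_id, salt, stored = row
    ok = hmac.compare_digest(stored, hash_password(password, salt))
    return student_id if ok else None


def dashboard(student_id):
    """Collect dashboard data for one student from several tables."""
    def column(sql):
        return [r[0] for r in conn.execute(sql, (student_id,))]

    return {
        "courses": column(
            "SELECT title FROM courses WHERE student_id = ? ORDER BY title"),
        "assignments": column(
            "SELECT title FROM assignments WHERE student_id = ? ORDER BY title"),
        "notifications": column(
            "SELECT message FROM notifications WHERE student_id = ? ORDER BY message"),
    }


sid = register("ana", "s3cret")
conn.execute("INSERT INTO courses VALUES (?, ?)", (sid, "DBMS"))
conn.execute("INSERT INTO assignments VALUES (?, ?)", (sid, "ER diagram"))
conn.execute("INSERT INTO notifications VALUES (?, ?)", (sid, "Quiz on Friday"))

student = authenticate("ana", "s3cret")
page_data = dashboard(student) if student is not None else None
```

The dashboard is assembled only after `authenticate` returns a student id, mirroring the order of events in the scenario.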
Access Mode | Via browser/web interface | Typically via desktop apps or command line
Conclusion
Web databases have become the backbone of modern, data-driven web applications. By
enabling persistent, dynamic interaction between users and content via browsers, they
revolutionize how data is stored, accessed, and manipulated. As web technologies evolve,
web databases continue to be central to digital communication, commerce, education,
governance, and virtually all aspects of online life. Their power lies in their ability to
combine the structure and robustness of traditional databases with the interactivity and
accessibility of the web.
these limitations, database systems evolved into distributed database systems, which
distribute data across multiple physical sites but maintain logical consistency and unified
access.
A Distributed Database (DDB) is a type of database system in which data is logically
related but physically stored at multiple locations, often connected via a network. These
locations could be on different servers, within different geographical regions, or even
spread across cloud environments.
2. Definition
A Distributed Database System (DDBS) is a collection of multiple, logically interrelated
databases distributed over a computer network, where each site is capable of processing
part of the database independently. Despite the distribution, the system appears as a
single unified database to the user.
The server(s) store and manage the distributed data and respond with the requested
results.
Local sites manage their own data and participate in global transactions coordinated by
a central site.
Uniform data models and query languages (e.g., all use PostgreSQL).
Easier to manage and integrate.
Greater flexibility but adds complexity in translation, query processing, and integration.
Vertical Fragmentation: Columns are distributed (e.g., separating personal info and
financial info).
2. Replication
Copies of the same data are stored at multiple sites.
3. Allocation
Decides where to place data or fragments — based on access patterns, costs, and
constraints.
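The three design decisions above can be sketched with plain Python dictionaries standing in for relations and sites. The `customers` data, the site names, and the `vertical_fragment` and `rejoin` helpers are hypothetical. The key point the sketch assumes is that every vertical fragment keeps the primary key, so the original rows can be reconstructed by a join.

```python
customers = [
    {"id": 1, "name": "Ana", "phone": "555-0101", "balance": 250.0},
    {"id": 2, "name": "Ben", "phone": "555-0102", "balance": 90.5},
]


def vertical_fragment(rows, key, columns):
    """Project each row onto key + columns; the key is kept for rejoining."""
    return [{key: r[key], **{c: r[c] for c in columns}} for r in rows]


def rejoin(frag_a, frag_b, key):
    """Reconstruct full rows by joining two vertical fragments on the key."""
    by_key = {r[key]: r for r in frag_b}
    return [{**a, **by_key[a[key]]} for a in frag_a]


# Vertical fragmentation: personal info and financial info are separated.
personal = vertical_fragment(customers, "id", ["name", "phone"])
financial = vertical_fragment(customers, "id", ["balance"])

# Allocation: which site holds which fragment.
# Replication: the "personal" fragment is stored at both sites.
allocation = {
    "site_a": {"personal": personal},
    "site_b": {"financial": financial, "personal": [dict(r) for r in personal]},
}

restored = rejoin(
    allocation["site_a"]["personal"], allocation["site_b"]["financial"], "id"
)
```

Joining the two fragments on `id` yields the original rows, which is the correctness condition a vertical fragmentation scheme has to satisfy.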
6. Key Features
1. Location Transparency
Users can query data without knowing the physical location of data.
2. Replication Transparency
The system hides the fact that data is replicated across sites.
3. Fragmentation Transparency
Users are unaware of how data is divided or fragmented.
4. Concurrency Control
5. Fault Tolerance
Queries can involve data from multiple sites and must be optimized accordingly.
7. Autonomy
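Location transparency, replication transparency, and basic fault tolerance can be illustrated together with a small facade. This is a toy model under stated assumptions: `Site`, `DistributedDB`, and the catalog layout are hypothetical, sites are in-memory objects rather than networked servers, and writes simply skip no consistency checks that a real system would need.

```python
class Site:
    """One node holding part of the data (hypothetical, in-memory)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.up = True

    def read(self, key):
        if not self.up:
            raise ConnectionError(f"{self.name} is unreachable")
        return self.data[key]


class DistributedDB:
    """Facade: callers use keys only; the catalog knows which sites hold them."""

    def __init__(self, sites):
        self.sites = {s.name: s for s in sites}
        self.catalog = {}  # key -> names of sites holding a replica

    def write(self, key, value, site_names):
        for name in site_names:
            self.sites[name].data[key] = value
        self.catalog[key] = list(site_names)

    def read(self, key):
        # Try each replica in turn; the caller never names a site.
        for name in self.catalog[key]:
            try:
                return self.sites[name].read(key)
            except ConnectionError:
                continue
        raise ConnectionError(f"no reachable replica for {key!r}")


mumbai, london = Site("mumbai"), Site("london")
db = DistributedDB([mumbai, london])
db.write("order:17", {"total": 99}, ["mumbai", "london"])

first = db.read("order:17")   # served by the first listed replica
mumbai.up = False             # simulate a site failure
second = db.read("order:17")  # same call, now served by the other replica
```

The caller issues the same `read` before and after the failure and never learns which site answered or that two copies exist.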
7. Distributed Transactions
A distributed transaction is one that accesses or modifies data at multiple sites. It must
satisfy the ACID properties:
2. Phase 2 (Commit/Abort): If all participating sites respond with "ready", the
coordinator sends commit to every site; otherwise, it sends abort.
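The two phases of the two-phase commit protocol can be sketched as follows. `Participant` and `two_phase_commit` are hypothetical names, and the sketch makes simplifying assumptions: participants are in-process objects, and there is no logging, timeout handling, or crash recovery, all of which a real protocol implementation requires.

```python
class Participant:
    """A site taking part in a distributed transaction (toy model)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "initial"

    def prepare(self):
        # Phase 1: vote "ready" only if the local work can be committed.
        self.state = "ready" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator logic: returns 'commit' or 'abort'."""
    # Phase 1 (prepare): collect a vote from every participant.
    votes = [p.prepare() for p in participants]

    # Phase 2 (commit/abort): commit only on a unanimous "ready".
    if all(votes):
        for p in participants:
            p.commit()
        return "commit"
    for p in participants:
        p.abort()
    return "abort"


ok_sites = [Participant("site_a"), Participant("site_b")]
ok_outcome = two_phase_commit(ok_sites)

mixed_sites = [Participant("site_a"), Participant("site_b", can_commit=False)]
mixed_outcome = two_phase_commit(mixed_sites)
```

A single negative vote forces every site to abort, which is how the protocol preserves atomicity across sites.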
2. Scalability
Easily scale out by adding new nodes and redistributing data.
3. Local Autonomy
4. Faster Access
Users can access data stored at nearby sites, reducing response time.
Local processing reduces the need to transfer large amounts of data across the
network.
Query planning must account for network latency, data location, and fragmentation.
3. Data Security
4. Concurrency Control
Telecom Networks: Call records and user data stored across regional servers.
E-commerce: Data centers in multiple regions store product and customer data close to
the users.
Cloud Databases: Systems like Google Spanner or Amazon Aurora replicate and
partition data across multiple zones.
Conclusion
A Distributed Database offers a robust, scalable, and fault-tolerant alternative to
centralized systems, especially for modern applications that demand global access, high
availability, and low latency. While it introduces complexity in terms of query processing,
concurrency, and system management, it significantly enhances the performance,
modularity, and flexibility of enterprise-scale database systems. Mastery of distributed
database concepts is essential for building resilient, large-scale information systems in
today’s interconnected digital world.
To address this, the concept of a Data Warehouse was developed. A Data Warehouse is a
centralized repository that stores integrated, subject-oriented, time-variant, and non-
volatile data from multiple sources, optimized for querying and analysis rather than
transaction processing.
1.2 Definition
A Data Warehouse is a large, centralized system that collects data from various
heterogeneous sources, transforms it into a consistent format, and stores it for analytical
querying and decision-making. It acts as the foundation of business intelligence (BI)
systems.
2. Integrated
Consolidates data from multiple sources (e.g., relational databases, flat files, legacy
systems), resolving naming conflicts and data format inconsistencies.
3. Time-Variant
Maintains historical data to support trend analysis and forecasting. Each record is time-
stamped or associated with a period.
4. Non-Volatile
Once data is loaded into the warehouse, it is not updated or deleted through typical
transactional operations. New data arrives through periodic loads, and existing data is
read-only for analysis purposes.
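The integrated, time-variant, and non-volatile properties can be sketched in a few lines of Python. Everything named here is an assumption for illustration: the two source systems, their field names and formats, and the `Warehouse` class. The sketch converts both sources to one consistent format, stamps each loaded row, and only ever appends.

```python
from datetime import date

# Two hypothetical source systems with different field names and formats.
crm_rows = [{"cust_id": "17", "amt_usd": "120.50", "day": "2024-03-01"}]
legacy_rows = [{"CustomerNo": 17, "AmountCents": 9900, "Date": "01/04/2024"}]  # DD/MM/YYYY


def from_crm(row):
    return {
        "customer_id": int(row["cust_id"]),
        "amount": float(row["amt_usd"]),
        "sale_date": date.fromisoformat(row["day"]),
    }


def from_legacy(row):
    day, month, year = (int(part) for part in row["Date"].split("/"))
    return {
        "customer_id": row["CustomerNo"],
        "amount": row["AmountCents"] / 100,
        "sale_date": date(year, month, day),
    }


class Warehouse:
    """Append-only fact store: rows are loaded and read, never updated in place."""

    def __init__(self):
        self._facts = []

    def load(self, rows, source, loaded_on):
        # Each row is stamped with its origin and load date (time-variant).
        for r in rows:
            self._facts.append({**r, "source": source, "loaded_on": loaded_on})

    def total_by_month(self):
        totals = {}
        for f in self._facts:
            key = (f["sale_date"].year, f["sale_date"].month)
            totals[key] = totals.get(key, 0.0) + f["amount"]
        return totals


wh = Warehouse()
wh.load([from_crm(r) for r in crm_rows], "crm", date(2024, 4, 2))
wh.load([from_legacy(r) for r in legacy_rows], "legacy", date(2024, 4, 2))
monthly = wh.total_by_month()
```

After loading, both sources share one naming scheme and one date type, so an analytical query such as a monthly total can run across them without source-specific logic.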
3. Data Storage
Centralized data warehouse or data marts (smaller, department-specific warehouses).
4. Metadata Repository
Stores information about the data such as its origin, transformations, and structure.
Improves data quality through integration and cleaning
1.6 Challenges
High cost of setup and maintenance
2.2 Definition
Data Mining is the process of automatically discovering patterns, trends, correlations, or
anomalies in large datasets using techniques from statistics, machine learning, and
database systems.
2.4 Data Mining Process (Part of KDD)
1. Data Selection: Identify relevant data from the warehouse
3. Transformation: Convert data into suitable format for mining (e.g., feature extraction)
2. Clustering
Grouping similar data without predefined labels.
4. Regression
Predicting a continuous numeric value.
5. Anomaly Detection
Identifying data points that deviate significantly from the norm.
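Two of the techniques above, regression and anomaly detection, are small enough to sketch with Python's standard `statistics` module. The function names, the sample data, and the z-score threshold of 2.0 are illustrative assumptions. Production work would normally use a dedicated library and more robust methods.

```python
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def zscore_anomalies(values, threshold=2.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Regression: the sample points lie exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
predicted = slope * 5 + intercept

# Anomaly detection: one reading is far from the rest.
outliers = zscore_anomalies([10, 11, 9, 10, 12, 10, 11, 50])
```

The z-score rule is a simple baseline: a single extreme value inflates the standard deviation, so it works best when anomalies are rare.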
2.6 Applications of Data Mining
Marketing: Customer segmentation, recommendation engines
Bias and Fairness: Data mining systems may unintentionally reinforce societal biases
In essence:
Conclusion
Data warehousing and data mining form two crucial pillars of modern data-driven decision-
making systems. A data warehouse enables the efficient collection, storage, and
management of vast volumes of organizational data, while data mining leverages that
stored data to derive actionable insights, patterns, and predictions. Together, they support
strategic planning, operational efficiency, and a deeper understanding of business and user
behavior in nearly every sector.