Difference between DBMS and RDBMS:
| **Criteria** | **DBMS (Database Management System)** | **RDBMS (Relational Database Management System)** |
|--------------|----------------------------------------|----------------------------------------------------|
| **Definition** | A software system for creating, managing, and accessing databases. | A type of DBMS that organizes data into tables with rows and columns, following the relational model. |
| **Data Structure** | Data is generally stored as files, or in hierarchical or navigational structures. | Data is stored in tabular form, with rows (records) and columns (fields). |
| **Normalization** | A non-relational DBMS generally does not support normalization. | Supports normalization to reduce data redundancy. |
| **Relationship Between Data** | Relationships between stored data are generally not enforced by the system. | Tables can be related to each other using foreign keys. |
| **Data Integrity** | Offers limited support for enforcing data integrity. | Provides strong data integrity using constraints such as primary keys and foreign keys. |
| **Transactions** | Often does not fully support ACID properties (Atomicity, Consistency, Isolation, Durability). | Typically supports ACID properties for reliable transactions. |
| **Query Language** | Usually uses simple or proprietary query languages. | Uses SQL (Structured Query Language) for querying and managing data. |
| **Data Redundancy** | Higher data redundancy due to lack of normalization. | Lower data redundancy when the schema is normalized. |
| **Scalability** | Typically suited to small databases. | Designed to handle larger databases with more complex relationships. |
| **Examples** | File systems, XML databases. | MySQL, PostgreSQL, Oracle, Microsoft SQL Server. |
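
The data integrity and relationship rows above can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in for any RDBMS; the `departments` and `employees` tables and their columns are hypothetical names chosen for illustration. The foreign key rejects a row that points at a department that does not exist, so `rejected` ends up `True` and only one employee row is stored.

```python
import sqlite3

# Hypothetical schema for illustration; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE employees (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        department_id INTEGER NOT NULL REFERENCES departments(id)
    )
    """
)

conn.execute("INSERT INTO departments (id, name) VALUES (1, 'Engineering')")
conn.execute("INSERT INTO employees (name, department_id) VALUES ('Asha', 1)")

# Department 99 does not exist, so the foreign key constraint rejects this row.
try:
    conn.execute("INSERT INTO employees (name, department_id) VALUES ('Ben', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print(rejected, employee_count)
```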
---
Difference between WHERE and HAVING:
| **Criteria** | **WHERE Clause** | **HAVING Clause** |
|--------------|------------------|-------------------|
| **Purpose** | Filters rows before the data is grouped or aggregated. | Filters groups (aggregated data) after the `GROUP BY` operation. |
| **Usage** | Used with SELECT, UPDATE, and DELETE statements. | Used in SELECT queries, usually together with `GROUP BY`. |
| **Filtering** | Filters individual rows based on conditions. | Filters groups based on conditions that typically involve aggregate functions (e.g., `COUNT`, `SUM`). |
| **Aggregate Functions** | Cannot use aggregate functions (e.g., `COUNT`, `SUM`) directly; they must be nested in a subquery. | Can use aggregate functions directly to filter groups (e.g., `HAVING COUNT(*) > 5`). |
| **Execution Order** | Evaluated first, filtering rows before grouping or aggregation. | Evaluated after `GROUP BY` and the aggregate functions have been applied. |
| **Performance Impact** | Can improve performance by reducing the number of rows that reach the grouping step. | Applied after aggregation, so it does not reduce the number of rows that must be grouped. |
| **Example** | `SELECT * FROM Employees WHERE salary > 50000;` | `SELECT department, COUNT(*) FROM Employees GROUP BY department HAVING COUNT(*) > 5;` |
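
Both clauses can appear in the same query, which makes the execution order easier to see. The sketch below uses Python's built-in `sqlite3` module and an `Employees` table like the one in the examples above; the sample rows are made up for illustration. `WHERE` first drops the low-salary rows, then `HAVING` drops the groups that have fewer than two remaining rows, leaving only Engineering with a headcount of 2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (name TEXT, department TEXT, salary INTEGER)")

# Made-up sample data for illustration.
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [
        ("Asha", "Engineering", 90000),
        ("Ben", "Engineering", 70000),
        ("Chen", "Engineering", 40000),
        ("Dana", "Sales", 60000),
        ("Eli", "Sales", 30000),
    ],
)

rows = conn.execute(
    """
    SELECT department, COUNT(*) AS headcount
    FROM Employees
    WHERE salary > 50000          -- row filter, applied before grouping
    GROUP BY department
    HAVING COUNT(*) >= 2          -- group filter, applied after aggregation
    ORDER BY department
    """
).fetchall()
print(rows)
```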
---
Drawbacks of RDBMS:
1. **Scalability Limitations**:
   - **Vertical Scalability:** RDBMS are typically scaled vertically, requiring more powerful hardware to handle increased load. Scaling horizontally (adding more servers) is challenging.
   - **Large Data Volume:** Managing huge volumes of unstructured data is inefficient in RDBMS due to the need for predefined schemas and complex joins.
2. **Complexity in Handling Unstructured Data**:
   - RDBMS are optimized for structured data. Handling unstructured or semi-structured data, such as JSON or multimedia, can be cumbersome.
3. **Performance Issues with Large Datasets**:
   - Slow queries, heavy JOIN operations, and increased disk I/O can arise with large datasets.
4. **Rigid Schema**:
   - **Fixed Schema:** Any change to the data structure requires modifying the schema, which can be time-consuming.
   - **Schema Changes Impact:** Frequent structure changes can cause downtime or performance issues.
5. **Cost of Maintenance**:
   - Commercial RDBMS products can involve high licensing, hardware, and administrative costs compared to many non-relational (NoSQL) systems.
6. **Complexity with Distributed Systems**:
   - Implementing data replication and sharding across multiple servers is complex.
7. **Overhead in Handling Complex Queries**:
   - Complex queries over multiple tables and large datasets can lead to performance bottlenecks.
8. **Not Ideal for Real-Time Big Data Processing**:
   - Often not well-suited for real-time analytics or big data workloads.
9. **Limited Flexibility for Advanced Data Models**:
   - RDBMS are not ideal for complex hierarchical or graph-based data models.
10. **Concurrency Control Issues**:
    - High concurrency can lead to performance degradation, for example through lock contention.
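
The rigid schema drawback can be shown with a small sketch, again using Python's built-in `sqlite3` module. The `products` table and its `color` attribute are hypothetical. A record with a new attribute is rejected until the schema is altered. In SQLite the `ALTER TABLE` here is cheap, but on large production tables in other systems a comparable change may be slow or require planning.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table for illustration.
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")

# The table has no "color" column yet, so this insert is rejected.
try:
    conn.execute("INSERT INTO products (name, color) VALUES ('Mug', 'blue')")
    accepted_before = True
except sqlite3.OperationalError:
    accepted_before = False

# The schema has to be changed before the new attribute can be stored.
conn.execute("ALTER TABLE products ADD COLUMN color TEXT")
conn.execute("INSERT INTO products (name, color) VALUES ('Mug', 'blue')")

rows = conn.execute("SELECT name, color FROM products").fetchall()
print(accepted_before, rows)
```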
---
Difference between Subquery and Correlated Subquery:
| **Criteria** | **Subquery** | **Correlated Subquery** |
|--------------|--------------|-------------------------|
| **Definition** | A query nested within another query that is executed independently and returns a result set used by the outer query. | A subquery that references columns from the outer query and is conceptually evaluated once for each row processed by the outer query. |
| **Execution** | Executed only once, regardless of the number of rows processed by the outer query. | Conceptually executed once for each row of the outer query, although an optimizer may rewrite it. |
| **Dependency** | Independent of the outer query and can be executed on its own. | Depends on the outer query for its execution because it references values from the outer query. |
| **Performance** | Generally faster since the subquery is executed only once. | Can be slower due to repeated execution for each row in the outer query. |
| **Use Case** | Used when the outer query is filtered by the result of an independent query. | Used when the filter condition must be evaluated per row, using values from the current outer row. |
| **Example** | `SELECT name FROM Employees WHERE salary > (SELECT AVG(salary) FROM Employees);` | `SELECT name FROM Employees e1 WHERE salary > (SELECT AVG(salary) FROM Employees e2 WHERE e1.department = e2.department);` |
| **Scope** | The inner query is self-contained and does not rely on any data from the outer query. | The inner query uses data from the outer query, making it context-dependent. |
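
The two example queries in the table give different results on the same data, which the following sketch makes concrete. It uses Python's built-in `sqlite3` module, made-up sample rows, and assumes the correlated query matches employees on a `department` column. With these rows the company-wide average salary is 60000, while the department averages are 80000 (Engineering) and 40000 (Sales).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (name TEXT, department TEXT, salary INTEGER)")

# Made-up sample data for illustration.
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [
        ("Asha", "Engineering", 90000),
        ("Ben", "Engineering", 70000),
        ("Chen", "Sales", 50000),
        ("Dana", "Sales", 30000),
    ],
)

# Non-correlated subquery: the inner query does not reference the outer row.
above_company = [
    row[0]
    for row in conn.execute(
        """
        SELECT name FROM Employees
        WHERE salary > (SELECT AVG(salary) FROM Employees)
        ORDER BY name
        """
    )
]

# Correlated subquery: the inner query references e1.department from the outer row.
above_department = [
    row[0]
    for row in conn.execute(
        """
        SELECT name FROM Employees e1
        WHERE salary > (
            SELECT AVG(salary) FROM Employees e2
            WHERE e2.department = e1.department
        )
        ORDER BY name
        """
    )
]

print(above_company)     # employees above the company-wide average
print(above_department)  # employees above their own department's average
```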