
L3...

The document provides an overview of databases and information management, detailing the distinctions between data, information, knowledge, and wisdom. It discusses various database types, including relational and non-relational databases, and highlights the importance of database management systems (DBMS) in organizing and accessing data. Additionally, it covers topics such as big data, business intelligence, data mining, and the significance of data security and knowledge management.

Uploaded by

demro channel
Copyright
© All Rights Reserved

DATABASES AND INFORMATION MANAGEMENT

DATA, INFORMATION AND KNOWLEDGE
• Data: Raw facts without context or intent.
  - Types of data: quantitative and qualitative.
• Information: Processed data with context, relevance, and purpose.
  - Example: Monthly sales calculated from daily sales data.
  - Manipulating data is what uncovers trends and insights.
• Knowledge: Human beliefs or perceptions about relationships among facts.
  - Example: The relationship between the quality of goods and sales.
  - Knowledge facilitates action and decision-making.
  - Distinction between explicit and tacit knowledge:
    Explicit: Easily communicated (e.g., documented processes); usually related to facts.
    Tacit: Insights and intuitions; difficult to transfer.
• Wisdom: Combining knowledge and experience for deeper understanding.
  - Developing wisdom takes patience and experience.
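
The step from data to information can be sketched in a few lines of Python. This is only an illustration of the "monthly sales from daily sales" example: the dates and amounts are made up, and `monthly_totals` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical daily sales records (ISO date, amount); the figures are invented.
daily_sales = [
    ("2024-01-03", 120.0),
    ("2024-01-17", 80.0),
    ("2024-02-02", 200.0),
    ("2024-02-20", 50.0),
]

def monthly_totals(records):
    """Turn raw daily amounts (data) into per-month totals (information)."""
    totals = defaultdict(float)
    for date, amount in records:
        month = date[:7]  # "YYYY-MM" prefix of the ISO date
        totals[month] += amount
    return dict(totals)

print(monthly_totals(daily_sales))
```

The daily rows carry no meaning on their own; grouping them by month gives them the context needed to compare one period with another.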

WHY DATABASE
• Control redundant data
• Prevent violations of data integrity
• Avoid dependency on human error

DATA TYPES
- Text
- Numbers
- Boolean
- Currency
- Date/Time
- Paragraph
- Object

INTRODUCTION TO DATABASES
Database:
- Collection of related files containing records on people, places, or things
Entity:
- Generalized category representing a person, place, or thing
- E.g., SUPPLIER, PART
Attributes:
- Specific characteristics of each entity:
- SUPPLIER: name, address
- PART: description, unit price, supplier
Types:
- Relational
- Non-relational

RELATIONAL DATABASE MODEL


Organizes data into two-dimensional tables (relations) with columns and rows
One table for each entity:
- Fields (columns) store data representing an attribute
- Rows store data for separate records, or tuples
Key field: uniquely identifies each record
- Primary key
- Secondary key

ENTITY STUDENT

RELATIONAL DATA BASE EX: SUPPLIER ENTITY


DESIGNING A DATABASE
Entity-relationship diagram
• Used to clarify table relationships in a relational database
Relational database tables may have:
• Zero-to-many relationship
• Many-to-zero relationship
• One-to-one relationship
• One-to-many relationship
• N-to-many relationship
• Many-to-many relationship
  - Requires a “join table” or intersection relation that links the two tables to join information
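
A many-to-many relationship resolved through a join table might look like the following sketch. It uses Python's built-in `sqlite3` module; the STUDENT, COURSE, and ENROLLMENT tables and their rows are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (Student_ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE COURSE  (Course_ID  INTEGER PRIMARY KEY, Title TEXT);
-- Join table: one row per (student, course) pair.
CREATE TABLE ENROLLMENT (
    Student_ID INTEGER REFERENCES STUDENT(Student_ID),
    Course_ID  INTEGER REFERENCES COURSE(Course_ID),
    PRIMARY KEY (Student_ID, Course_ID)
);
INSERT INTO STUDENT VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO COURSE  VALUES (10, 'Databases'), (20, 'Statistics');
INSERT INTO ENROLLMENT VALUES (1, 10), (1, 20), (2, 10);
""")

def courses_for(student_name):
    """List the course titles a student is enrolled in, via the join table."""
    rows = conn.execute(
        "SELECT COURSE.Title FROM STUDENT"
        " JOIN ENROLLMENT ON ENROLLMENT.Student_ID = STUDENT.Student_ID"
        " JOIN COURSE ON COURSE.Course_ID = ENROLLMENT.Course_ID"
        " WHERE STUDENT.Name = ? ORDER BY COURSE.Title",
        (student_name,),
    ).fetchall()
    return [title for (title,) in rows]

print(courses_for("Ana"))
```

Each student can appear in many ENROLLMENT rows and so can each course, which turns one many-to-many relationship into two one-to-many relationships.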
THE FINAL DATABASE DESIGN WITH SAMPLE RECORDS

ENTITY-RELATIONSHIP DIAGRAM FOR THE DATABASE WITH FOUR TABLES

DESIGNING A DATABASE
• Normalization: Reduces data redundancy and ensures integrity
  - Streamlines complex groups of data
  - Minimizes awkward many-to-many relationships
  - Increases stability and flexibility
• Referential integrity rules
  - Ensure that relationships between coupled tables remain consistent
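
As a minimal sketch of a referential integrity rule in practice, the example below declares a foreign key from PART to SUPPLIER in SQLite and shows that a part pointing at a nonexistent supplier is rejected. The sample rows and the `insert_part` helper are assumptions for illustration; the `PRAGMA foreign_keys = ON` line is specific to SQLite, which does not enforce foreign keys unless asked.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enforcement is off by default
conn.execute(
    "CREATE TABLE SUPPLIER ("
    " Supplier_Number INTEGER PRIMARY KEY,"
    " Supplier_Name TEXT)"
)
conn.execute(
    "CREATE TABLE PART ("
    " Part_Number INTEGER PRIMARY KEY,"
    " Part_Name TEXT,"
    " Supplier_Number INTEGER NOT NULL REFERENCES SUPPLIER(Supplier_Number))"
)
conn.execute("INSERT INTO SUPPLIER VALUES (1, 'Acme Supply')")  # illustrative row

def insert_part(part_number, name, supplier_number):
    """Return True if the row is accepted, False if it breaks an integrity rule."""
    try:
        conn.execute(
            "INSERT INTO PART VALUES (?, ?, ?)",
            (part_number, name, supplier_number),
        )
        return True
    except sqlite3.IntegrityError:
        return False

print(insert_part(137, "Door latch", 1))    # supplier 1 exists
print(insert_part(151, "Side mirror", 99))  # supplier 99 does not exist
```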
DATABASE MANAGEMENT SYSTEMS (DBMS)

• Definition:
  - Software that facilitates the creation, manipulation, and administration of databases
  - Software for creating, storing, organizing, and accessing data from a database
• Separates the logical and physical views of the data
  - Logical view: how end users view data
  - Physical view: how data are actually structured and organized
Examples: Oracle, SQL Server, MySQL, Access.

EXAMPLE: DIFFERENT VIEWS OF HEALTHCARE DATABASE

OPERATIONS OF RELATIONAL DBMS

• Select: Creates a subset of all records meeting stated criteria
• Join: Combines relational tables to present the user with more information than is available from individual tables
• Project: Creates a subset consisting of columns in a table
  - Permits user to create new tables containing only desired information
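
The three operations can be tried out against a small in-memory SQLite database. The sketch below is illustrative only: the table layout follows the SUPPLIER and PART entities used in these notes, but the rows, the `Unit_Price` column name, and the supplier names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SUPPLIER (Supplier_Number INTEGER PRIMARY KEY, Supplier_Name TEXT);
CREATE TABLE PART (Part_Number INTEGER PRIMARY KEY, Part_Name TEXT,
                   Unit_Price REAL, Supplier_Number INTEGER);
INSERT INTO SUPPLIER VALUES (1, 'Acme Supply'), (2, 'Bolt Works');
INSERT INTO PART VALUES (137, 'Door latch', 22.0, 1),
                        (145, 'Side mirror', 12.0, 2),
                        (150, 'Door seal', 6.0, 1);
""")

# Select: keep only the rows (records) that meet the stated criteria.
selected = conn.execute(
    "SELECT * FROM PART WHERE Part_Number IN (137, 150) ORDER BY Part_Number"
).fetchall()

# Join: combine each part with the matching row of the SUPPLIER table.
joined = conn.execute(
    "SELECT PART.Part_Number, PART.Part_Name,"
    " SUPPLIER.Supplier_Number, SUPPLIER.Supplier_Name"
    " FROM PART JOIN SUPPLIER"
    " ON PART.Supplier_Number = SUPPLIER.Supplier_Number"
    " WHERE PART.Part_Number IN (137, 150) ORDER BY PART.Part_Number"
).fetchall()

# Project: keep only the chosen columns of a table.
projected = conn.execute(
    "SELECT Part_Number, Part_Name FROM PART ORDER BY Part_Number"
).fetchall()

print(selected, joined, projected, sep="\n")
```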


EXAMPLES OF OPERATION OF DBMS

CAPABILITIES OF DATABASE MANAGEMENT SYSTEMS


• Data definition capabilities:
  - Specify the structure of the content of the database
• Data dictionary:
  - Automated or manual file storing definitions of data elements and their characteristics
• Querying and reporting:
  - Data manipulation language
  - Structured query language (SQL)
  - Report generation, e.g., Crystal Reports

SQL (STRUCTURED QUERY LANGUAGE)


• The standard language for interacting with relational databases.

• SQL allows users to perform various operations, such as retrieving data, inserting new records, and updating existing ones.

• Example:

SELECT PART.Part_Number, PART.Part_Name, SUPPLIER.Supplier_Number, SUPPLIER.Supplier_Name
FROM PART, SUPPLIER
WHERE PART.Supplier_Number = SUPPLIER.Supplier_Number
AND (PART.Part_Number = 137 OR PART.Part_Number = 150);
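
SQL evaluates AND before OR, so how the two part-number conditions are grouped changes the result of a query like this one. The sketch below compares the grouped and ungrouped forms on a small in-memory SQLite database; the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SUPPLIER (Supplier_Number INTEGER PRIMARY KEY, Supplier_Name TEXT);
CREATE TABLE PART (Part_Number INTEGER PRIMARY KEY, Part_Name TEXT,
                   Supplier_Number INTEGER);
INSERT INTO SUPPLIER VALUES (1, 'Acme Supply'), (2, 'Bolt Works');
INSERT INTO PART VALUES (137, 'Door latch', 1),
                        (145, 'Side mirror', 2),
                        (150, 'Door seal', 1);
""")

BASE = "SELECT PART.Part_Number, SUPPLIER.Supplier_Name FROM PART, SUPPLIER WHERE "
MATCH = "PART.Supplier_Number = SUPPLIER.Supplier_Number"
ORDER = " ORDER BY PART.Part_Number, SUPPLIER.Supplier_Name"

# Without parentheses: (match AND part 137) OR part 150.
# Part 150 is then paired with every supplier, matching or not.
ungrouped = conn.execute(
    BASE + MATCH + " AND PART.Part_Number = 137 OR PART.Part_Number = 150" + ORDER
).fetchall()

# With parentheses: match AND (part 137 OR part 150).
grouped = conn.execute(
    BASE + MATCH + " AND (PART.Part_Number = 137 OR PART.Part_Number = 150)" + ORDER
).fetchall()

print(ungrouped)
print(grouped)
```

Only the grouped form returns each requested part exactly once, paired with its own supplier.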
Access
• Microsoft Access has a rudimentary data dictionary capability that displays information about the size, format, and other characteristics of each field in a database.
PERSONAL VS. ENTERPRISE DATABASES
- Personal DBMS: Microsoft Access, OpenOffice Base.
- Enterprise DBMS: Oracle, MySQL

OTHER DATABASE TYPES


• Hierarchical
• Document-centric
• NoSQL
• Distributed database
• Cloud database

NON-RELATIONAL DATABASES


• “NoSQL”
• Handle large data sets that are not easily organized into tables, columns, and rows
• Use a more flexible data model
  - Don’t require extensive structuring
• Can manage unstructured data, such as social media and graphics
• Easier to scale
• E.g., Amazon’s SimpleDB, MongoDB (as used at MetLife)
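
The "flexible data model" idea can be illustrated without any particular NoSQL product. The sketch below uses plain Python dictionaries as stand-ins for documents in a document store; the `posts` collection and the `find` and `with_field` helpers are hypothetical and do not correspond to the API of SimpleDB or MongoDB.

```python
import json

# Hypothetical collection: each document has its own set of fields,
# with no table schema that every record must follow.
posts = [
    {"id": 1, "user": "ana", "text": "Great service!", "tags": ["review"]},
    {"id": 2, "user": "ben", "image": {"url": "photo.jpg", "width": 800}},
    {"id": 3, "user": "ana", "text": "Where is my order?", "location": "Cairo"},
]

def find(collection, **criteria):
    """Return the documents whose fields equal every given criterion."""
    return [
        doc for doc in collection
        if all(doc.get(key) == value for key, value in criteria.items())
    ]

def with_field(collection, field):
    """Return the ids of documents that happen to carry the given field."""
    return [doc["id"] for doc in collection if field in doc]

print(json.dumps(find(posts, user="ana"), indent=2))
print(with_field(posts, "image"))
```

A relational table would need a column (mostly empty) for every field that any post might carry; here a new kind of field can appear in one document without restructuring the rest.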

DISTRIBUTED DATABASE
A distributed database consists of multiple interconnected databases located in different geographical areas, functioning as a single cohesive unit.
Key Features
• Data Distribution
• High Availability
• Data Replication
• Fault Tolerance

• Examples: Apache Cassandra, Google Spanner, CockroachDB
CLOUD DATABASE

• Definition: a database that runs on cloud computing platforms, allowing users to store and manage data without maintaining their own physical hardware.

• Key Features
• Scalability
• Accessibility
• Automatic Updates
• Backup and Recovery
• Examples: Amazon RDS, Google Cloud SQL, Microsoft Azure SQL Database

BIG DATA

• Massive quantities of unstructured and semi-structured data from the Internet and other sources
  - 3Vs: volume, variety, velocity
  - Measured in petabytes and exabytes
• Big datasets can reveal more patterns and insights than smaller datasets
• Requires new technologies and tools

BUSINESS INTELLIGENCE INFRASTRUCTURE

Business intelligence (BI) refers to technologies and practices that help organizations analyze data to inform decision-making.
It is the process organizations use to analyze the data they collect in the hope of obtaining a competitive advantage.
BI tools allow users to collect, process, and visualize data, providing insights into business performance.
• Data visualization
• Data warehouses and data marts
• Hadoop
• In-memory computing
• Analytical platforms
DATA VISUALIZATION

• The graphical representation of information and data.
• Summarizes data to provide intuitive insights
• Examples: Tableau, Google Data Studio

DATA WAREHOUSES

• Data warehouse:
  - Database that stores current and historical data that may be of interest to decision makers
  - Consolidates and standardizes data from many systems, operational and transactional databases
  - Data can be accessed but not altered; it is loaded through ETL (extract, transform, load)
• Data mart:
  - Subset of a data warehouse that is highly focused and isolated for a specific population of users

HADOOP
• Open-source software framework for big data
• Handles large quantities of unstructured data
• Breaks a data task into sub-problems and distributes the processing to many inexpensive computer processing nodes
• Combines the results into a smaller data set that is easier to analyze
  - Example use: finding the best air fare
• Key services
  - Hadoop Distributed File System (HDFS): storage
  - MapReduce: high-performance parallel data processing
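
The split, process, and combine pattern behind MapReduce can be sketched in plain Python with a word count. This is a single-process illustration of the idea, not Hadoop's API: the function names are hypothetical, and the list of chunks only stands in for data blocks spread across processing nodes.

```python
from collections import defaultdict

def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(mapped_chunks):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce step: collapse each group into one small result."""
    return {key: sum(values) for key, values in groups.items()}

def word_count(chunks):
    # On a real cluster each chunk would be mapped on a different node.
    return reduce_phase(shuffle(map_phase(chunk) for chunk in chunks))

chunks = ["big data needs new tools", "Big data big insights"]
print(word_count(chunks))
```

Because each chunk is mapped independently, the map step can run on many inexpensive machines at once; only the much smaller grouped results need to be brought together.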
IN-MEMORY COMPUTING

• Relies on computer’s main memory (RAM) for data storage
• Benefits:
  - Eliminates bottlenecks in retrieving and reading data
  - Dramatically shortens query response times
  - Lowers processing costs
• Enabled by high-speed processors and multicore processing
• Leading commercial products:
  - SAP’s High Performance Analytics Appliance (HANA)
  - Oracle Exalytics

ANALYTIC PLATFORMS
Preconfigured hardware-software systems designed for query processing and analytics

Use both relational and non-relational technology to analyze large data sets

Include in-memory systems, NoSQL DBMS

E.g., IBM PureData System for Analytics

Integrated database, server, storage components

Data lakes
BUSINESS INTELLIGENCE TECHNOLOGY INFRASTRUCTURE

ANALYTICAL TOOLS: RELATIONSHIPS, PATTERNS, TRENDS


• Once data are gathered, tools are required to consolidate and analyze them and to use the insights to improve decision making
  - Multidimensional data analysis (OLAP)
  - Data mining
ONLINE ANALYTICAL PROCESSING (OLAP)

- Supports multidimensional data analysis, enabling users to view the same data in different ways using multiple dimensions
- Enables users to obtain online answers to ad hoc questions fairly rapidly
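
Viewing the same data along different dimensions can be sketched as follows. The sales figures and the `rollup` helper are hypothetical; a real OLAP tool would precompute and index such aggregates rather than scan the rows on every question.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, region, quarter, units_sold)
sales = [
    ("nuts", "East", "Q1", 100),
    ("nuts", "West", "Q1", 150),
    ("bolts", "East", "Q1", 80),
    ("nuts", "East", "Q2", 120),
    ("bolts", "West", "Q2", 60),
]
DIMENSIONS = {"product": 0, "region": 1, "quarter": 2}

def rollup(rows, *dims):
    """Total units sold, grouped by whichever dimensions the user names."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[DIMENSIONS[d]] for d in dims)
        totals[key] += row[3]
    return dict(totals)

print(rollup(sales, "product"))            # one view: totals per product
print(rollup(sales, "region", "quarter"))  # another view of the same rows
```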
DATA MINING
• Finds hidden patterns and relationships in large databases and infers rules from them to predict future behavior
• Machine learning
  - Supervised
  - Unsupervised
  - Reinforcement learning
• Types of information obtainable from data mining
  - Associations: occurrences linked to a single event
  - Sequences: events linked over time
  - Classifications: patterns describing the group to which an item belongs
  - Clustering: discovering as yet unclassified groupings
  - Forecasting: uses a series of values to forecast future values
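
As one small illustration of the "associations" type, the sketch below computes the two standard measures used in association analysis, support and confidence, over a handful of shopping baskets. The baskets are invented, and this is only the scoring step, not a full rule-mining algorithm.

```python
# Hypothetical purchase transactions (one set of items per basket).
baskets = [
    {"chips", "cola"},
    {"chips", "cola", "salsa"},
    {"chips", "salsa"},
    {"cola"},
    {"chips", "cola"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in the itemset."""
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimated share of antecedent baskets that also hold the consequent."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)

print(support({"chips", "cola"}, baskets))
print(confidence({"chips"}, {"cola"}, baskets))
```

A rule such as "customers who buy chips also buy cola" is considered interesting when both its support and its confidence are high enough for the business question at hand.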
TEXT MINING
Unstructured data (mostly text files) is estimated to account for 80 percent of an organization’s useful information.
Text mining allows businesses to extract key elements from, discover patterns in, and summarize large unstructured datasets.
Example: Sentiment analysis
- Mines text comments online or in email to measure customer sentiment
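
A very rough sketch of sentiment analysis is a word-list scorer like the one below. The two word lists are tiny, hand-picked assumptions; production tools rely on much larger lexicons or trained models and handle negation, sarcasm, and context, which this sketch does not.

```python
import re

# Hypothetical mini-lexicons chosen only for this illustration.
POSITIVE = {"great", "love", "fast", "helpful"}
NEGATIVE = {"slow", "broken", "late", "rude"}

def sentiment_score(comment):
    """Count positive words minus negative words in a comment."""
    words = re.findall(r"[a-z']+", comment.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives

def label(comment):
    """Map the score to a coarse sentiment label."""
    score = sentiment_score(comment)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

for comment in ["Great product, fast delivery!", "Arrived late and broken."]:
    print(comment, "->", label(comment))
```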

WEB MINING
Discovery and analysis of useful patterns and information from the web
Types of Web Mining
• Content mining – mines content of websites
• Structure mining – mines website structural elements, such as links
• Usage mining – mines user interaction data gathered by web servers
ESTABLISHING AN INFORMATION POLICY
Information policy
States organization’s rules for organizing, managing, storing, and sharing information
Data administration
Responsible for specific policies and procedures through which data can be managed as a resource
Database administration
Database design and management group responsible for defining and organizing the structure and content of the database, and maintaining the database.
DATA SECURITY AND INTEGRITY
• Importance of data security.
• Techniques for ensuring data integrity.
KNOWLEDGE MANAGEMENT
• Knowledge management is the process of creating, capturing, indexing, storing, and sharing the company’s knowledge in order to benefit from the experiences and insights that the company has gained during its existence.
METADATA AND DATA DICTIONARIES
• Metadata: data that describes data
• Examples of metadata: field size, data type.
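
Most DBMSs can report their own metadata. As a sketch, SQLite exposes column definitions through `PRAGMA table_info`, which is specific to SQLite; other systems offer their own catalog views. The table definition and the `describe` helper below are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SUPPLIER ("
    " Supplier_Number INTEGER PRIMARY KEY,"
    " Supplier_Name VARCHAR(40) NOT NULL)"
)

def describe(table):
    """Return (column name, declared type, not-null flag, primary-key flag) per column."""
    # PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [
        (name, col_type, bool(notnull), bool(pk))
        for _cid, name, col_type, notnull, _default, pk in rows
    ]

for column in describe("SUPPLIER"):
    print(column)
```

The output describes the fields themselves (their names, types, and constraints) rather than any supplier, which is what makes it metadata.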

BLOCKCHAIN

Distributed database of transactions

Operates on a network without central authority

Maintains a growing list of records called blocks

Once recorded, blocks cannot be changed

Reduces cost of processing transactions and enhances security
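
Why recorded blocks cannot be quietly changed can be sketched with a hash chain: each block stores the hash of the block before it, so editing an old block breaks every later link. This is a single-machine illustration with hypothetical names and made-up transactions; it leaves out the network, consensus, and digital signatures that a real blockchain depends on.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block

def block_hash(index, data, previous_hash):
    """SHA-256 fingerprint of a block's contents and its link to the previous block."""
    payload = json.dumps(
        {"index": index, "data": data, "prev": previous_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def add_block(chain, data):
    """Append a block that points at the hash of the current last block."""
    previous_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    chain.append({
        "index": index,
        "data": data,
        "prev": previous_hash,
        "hash": block_hash(index, data, previous_hash),
    })

def is_valid(chain):
    """Recompute every hash and check every link back to the start."""
    for i, block in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i else GENESIS_HASH
        if block["prev"] != expected_prev:
            return False
        if block["hash"] != block_hash(block["index"], block["data"], block["prev"]):
            return False
    return True

ledger = []
add_block(ledger, "Ana pays Ben 5")
add_block(ledger, "Ben pays Cy 2")
print(is_valid(ledger))
```

Changing the data in the first block after the fact makes `is_valid` return False, because the stored hash no longer matches the block's contents.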
