DATA, INFORMATION AND KNOWLEDGE
Data: Raw facts without context or intent.
Types of data:
- Quantitative
- Qualitative
Information: Processed data with context, relevance and purpose.
Example: Monthly sales calculated from daily sales data
Importance of manipulating data to uncover trends and insights.
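The monthly-sales example can be sketched in a few lines of Python. This is only an illustration: the dates and amounts are invented, and `monthly_totals` is a hypothetical helper name, not part of any library.

```python
from collections import defaultdict

# Hypothetical daily sales records (ISO date, amount); figures are invented.
daily_sales = [
    ("2024-01-05", 120.0),
    ("2024-01-17", 80.0),
    ("2024-02-02", 200.0),
    ("2024-02-20", 50.0),
]

def monthly_totals(records):
    """Turn raw daily data into information: one total per month."""
    totals = defaultdict(float)
    for date, amount in records:
        month = date[:7]          # 'YYYY-MM' prefix of the ISO date
        totals[month] += amount
    return dict(totals)

monthly = monthly_totals(daily_sales)
print(monthly)  # {'2024-01': 200.0, '2024-02': 250.0}
```

The daily rows are data; the monthly totals are information, because they have been processed for a purpose (comparing months).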
Knowledge: Human beliefs or perceptions about relationships among facts.
Example: Relationship between the quality of goods and sales.
Knowledge facilitates action and decision-making.
Distinction between explicit and tacit knowledge:
Explicit: Easily communicated (e.g., documented processes), usually related to facts
Tacit: Insights and intuitions, difficult to transfer.
Wisdom: Combining knowledge and experience for deeper understanding.
- Importance of patience and experience in developing wisdom
ENTITY STUDENT
DESIGNING A DATABASE
Normalization: Reduce data redundancy and ensure integrity
- Streamlining complex groups of data
- Minimizes awkward many-to-many relationships
- Increases stability and flexibility
Referential integrity rules
- Ensure that relationships between coupled tables remain consistent
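A minimal sketch of a referential integrity rule, using Python's built-in sqlite3 module. The table layout is an assumption made for illustration: each part row points to a supplier row through a hypothetical `supplier_number` foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    "CREATE TABLE supplier ("
    " supplier_number INTEGER PRIMARY KEY,"
    " supplier_name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE part ("
    " part_number INTEGER PRIMARY KEY,"
    " part_name TEXT NOT NULL,"
    " supplier_number INTEGER NOT NULL REFERENCES supplier(supplier_number))"
)

conn.execute("INSERT INTO supplier VALUES (1, 'Acme')")
conn.execute("INSERT INTO part VALUES (100, 'Bolt', 1)")  # supplier 1 exists

try:
    # Supplier 99 does not exist, so this row would break the relationship.
    conn.execute("INSERT INTO part VALUES (101, 'Nut', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)  # True
```

Storing the supplier name once, in its own table, is also a small instance of normalization: the name is not repeated on every part row.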
DATABASE MANAGEMENT SYSTEMS (DBMS)
Definition:
Software that facilitates the creation, manipulation, and administration of databases
Software for creating, storing, organizing, and accessing data from a database
Separates the logical and physical views of the data
Logical view: how end users view data
Physical view: how data are actually structured and organized
Examples: Oracle, SQL Server, MySQL, Access.
SQL allows users to perform various operations, such as retrieving data, inserting new records, and updating existing ones
Example:
SELECT PART.Part_Number, PART.Part_Name, SUPPLIER.Supplier_Number,
       SUPPLIER.Supplier_Name
FROM PART, SUPPLIER
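The query above can be tried with Python's built-in sqlite3 module. The sample rows and the `Supplier_Number` column in PART are assumptions made for this sketch. Note that listing two tables in FROM with no join condition pairs every part with every supplier; a WHERE clause that matches the supplier numbers keeps only the related rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SUPPLIER (Supplier_Number INTEGER PRIMARY KEY, Supplier_Name TEXT);
CREATE TABLE PART (Part_Number INTEGER PRIMARY KEY, Part_Name TEXT,
                   Supplier_Number INTEGER);
INSERT INTO SUPPLIER VALUES (1, 'Supplier A'), (2, 'Supplier B');
INSERT INTO PART VALUES (137, 'Door latch', 1), (150, 'Door handle', 2),
                        (152, 'Door seal', 1);
""")

select_list = (
    "SELECT PART.Part_Number, PART.Part_Name, "
    "SUPPLIER.Supplier_Number, SUPPLIER.Supplier_Name "
    "FROM PART, SUPPLIER"
)

# No join condition: every part is paired with every supplier (3 x 2 rows).
cross_rows = conn.execute(select_list).fetchall()

# With a join condition: each part appears once, next to its own supplier.
joined_rows = conn.execute(
    select_list
    + " WHERE PART.Supplier_Number = SUPPLIER.Supplier_Number"
    + " ORDER BY PART.Part_Number"
).fetchall()

print(len(cross_rows))   # 6
for row in joined_rows:
    print(row)
```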
DISTRIBUTED DATABASE
A distributed database consists of multiple interconnected databases located in different geographical areas, functioning
as a single cohesive unit.
Key Features
• Data Distribution
• High Availability
• Data Replication
• Fault Tolerance
• Examples: Apache Cassandra, Google Spanner, CockroachDB
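Replication and fault tolerance can be illustrated with a toy in-process model. This is a sketch under strong assumptions, not how Cassandra, Spanner, or CockroachDB work internally: `ReplicatedStore` and the node names are hypothetical, every write is copied to every available node, and consistency and recovery of failed nodes are ignored.

```python
class ReplicatedStore:
    """Toy key-value store that keeps a full copy of the data on each node."""

    def __init__(self, node_names):
        self.nodes = {name: {} for name in node_names}
        self.down = set()  # names of nodes that are currently unavailable

    def put(self, key, value):
        # Data replication: write to every node that is up.
        for name, data in self.nodes.items():
            if name not in self.down:
                data[key] = value

    def get(self, key):
        # Fault tolerance: any available replica can answer the read.
        for name, data in self.nodes.items():
            if name not in self.down and key in data:
                return data[key]
        raise KeyError(key)

store = ReplicatedStore(["eu", "us", "asia"])
store.put("order-1", "shipped")
store.down.add("eu")            # simulate losing one site
value = store.get("order-1")
print(value)  # shipped
```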
CLOUD DATABASE
• Definition: A database that runs on a cloud computing platform, allowing users to store and manage data without
maintaining their own physical hardware.
• Key Features
• Scalability
• Accessibility
• Automatic Updates
• Backup and Recovery
• Examples: Amazon RDS, Google Cloud SQL, Microsoft Azure SQL Database
BIG DATA
smaller datasets
Business intelligence (BI) refers to technologies and practices that help organizations analyze data to inform decision-making.
It is the process organizations use to analyze the data they collect in the hope of obtaining a competitive advantage.
BI tools allow users to collect, process, and visualize data, providing insights into business performance.
Data visualization
Data warehouses and Data marts
Hadoop
In-memory computing
Analytical platforms
DATA VISUALIZATION
DATA WAREHOUSES
Data warehouse:
Database that stores current and historical data that may be of interest to decision makers
Consolidates and standardizes data from many systems, including operational and transactional databases
Data can be accessed but not altered; it is loaded through ETL (extract, transform, load)
Data mart:
Subset of a data warehouse that is highly focused and isolated for a specific population of users
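The "consolidates and standardizes" step of ETL might look like the sketch below. The two source systems, their field names, and their formats are all invented for illustration; the point is only that differently shaped records end up in one common layout.

```python
from datetime import datetime

# Extract: hypothetical rows from two source systems with different conventions.
crm_rows = [{"customer": "Ana", "total": "120.50", "date": "2024-01-05"}]
web_rows = [{"cust_name": "BEN", "amount_cents": 9900, "day": "05/01/2024"}]

def transform_crm(row):
    """CRM stores amounts as text and dates already in ISO format."""
    return {
        "customer": row["customer"].title(),
        "amount": float(row["total"]),
        "date": row["date"],
    }

def transform_web(row):
    """Web shop stores amounts in cents and dates as DD/MM/YYYY."""
    day = datetime.strptime(row["day"], "%d/%m/%Y").date().isoformat()
    return {
        "customer": row["cust_name"].title(),
        "amount": row["amount_cents"] / 100,
        "date": day,
    }

# Load: append the standardized rows to the (in-memory) warehouse table.
warehouse = []
warehouse.extend(transform_crm(r) for r in crm_rows)
warehouse.extend(transform_web(r) for r in web_rows)

for row in warehouse:
    print(row)
```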
HADOOP
Open-source software framework for big data
Handles large quantities of unstructured data
Breaks a data task into sub-problems and distributes the processing to many inexpensive computer processing nodes
Combines the results into a smaller data set that is easier to analyze
Example use: finding the best air fare
Key services
Hadoop Distributed File System (HDFS): storage
MapReduce: high-performance parallel data processing.
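The MapReduce idea (split the task, process the pieces independently, combine the results) can be sketched with a word count in plain Python. This runs in a single process and does not use the Hadoop API; the function names and the two text chunks are assumptions made for illustration.

```python
from collections import defaultdict

def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine each key's values into one small result."""
    return {key: sum(values) for key, values in groups.items()}

# Each chunk stands in for a block of data handled by a separate node.
chunks = ["big data big insight", "data nodes process data"]

mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
word_counts = reduce_phase(shuffle(mapped))
print(word_counts)
# {'big': 2, 'data': 3, 'insight': 1, 'nodes': 1, 'process': 1}
```

In a real cluster the map calls would run in parallel on different nodes; here they run one after another to keep the sketch self-contained.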
IN-MEMORY COMPUTING
ANALYTIC PLATFORMS
Preconfigured hardware-software systems designed for query processing and analytics
Use both relational and non-relational technology to analyze large data sets
Data lakes
BUSINESS INTELLIGENCE TECHNOLOGY INFRASTRUCTURE
- Supports multidimensional data analysis, enabling users to view the same data in different ways using multiple dimensions
- Enables users to obtain online answers to ad hoc questions in a fairly short amount of time
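Viewing the same data along different dimensions, often called online analytical processing (OLAP), can be sketched as follows. The sales facts and the `roll_up` helper are hypothetical; real tools precompute and index such aggregates rather than scanning a list.

```python
from collections import defaultdict

# Hypothetical sales facts with three dimensions and one measure (units).
facts = [
    {"product": "bolts", "region": "East", "month": "Jan", "units": 10},
    {"product": "bolts", "region": "West", "month": "Jan", "units": 5},
    {"product": "nuts",  "region": "East", "month": "Feb", "units": 7},
    {"product": "bolts", "region": "East", "month": "Feb", "units": 3},
]

def roll_up(rows, dimensions, measure="units"):
    """Total the measure for every combination of the chosen dimensions."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)

# The same facts answer two different ad hoc questions.
by_product = roll_up(facts, ["product"])
by_region_month = roll_up(facts, ["region", "month"])

print(by_product)        # {('bolts',): 18, ('nuts',): 7}
print(by_region_month)
# {('East', 'Jan'): 10, ('West', 'Jan'): 5, ('East', 'Feb'): 10}
```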
DATA MINING
Finds hidden patterns and relationships in large databases and infers rules from them to predict
future behavior
Machine learning
Supervised
Unsupervised
Reinforcement learning
Types of information obtainable from data mining
Associations: occurrences linked to a single event
Sequences: events linked over time
Classifications: patterns describing the group to which an item belongs
Clustering: discovering as yet unclassified groupings
Forecasting: uses a series of values to forecast future values
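Associations are commonly measured with support (how often items occur together) and confidence (how often the second item appears when the first does). The sketch below computes both for an invented set of shopping baskets; the items and helper names are assumptions, and a real data mining tool would search for such rules automatically.

```python
# Hypothetical purchase transactions (one set of items per basket).
transactions = [
    {"chips", "cola"},
    {"chips", "cola", "salsa"},
    {"chips"},
    {"cola"},
    {"chips", "salsa"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Of the baskets with the antecedent, the share that also has the consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)

chips_cola_support = support({"chips", "cola"}, transactions)
chips_to_cola = confidence({"chips"}, {"cola"}, transactions)

print(chips_cola_support)  # 0.4  (2 of 5 baskets)
print(chips_to_cola)       # 0.5  (2 of the 4 baskets with chips)
```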
TEXT MINING
Unstructured data (mostly text files) is believed to account for about 80 percent of an organization’s useful information.
Text mining allows businesses to extract key elements from, discover patterns in, and summarize large
unstructured datasets.
Ex: Sentiment analysis
- Mines text comments online or in email to measure customer sentiment
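A very rough idea of sentiment analysis can be given with a word-list approach. The word lists and comments below are invented, and counting positive and negative words is only one simple technique; production tools typically use trained language models and handle negation, sarcasm, and context.

```python
import re

# Tiny hypothetical sentiment lexicons.
POSITIVE = {"great", "love", "excellent", "fast"}
NEGATIVE = {"bad", "slow", "broken", "terrible"}

def sentiment_score(text):
    """Positive-word count minus negative-word count."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def label(text):
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

comments = [
    "Great product, fast delivery!",
    "Terrible support and slow shipping.",
    "It arrived on Tuesday.",
]
labels = [label(c) for c in comments]
print(labels)  # ['positive', 'negative', 'neutral']
```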
WEB MINING
Discovery and analysis of useful patterns and information from the web
Types of Web Mining
Content mining – mines content of websites
Structure mining – mines website structural elements, such as links
Usage mining – mines user interaction data gathered by web servers
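Structure mining starts from a page's links. The sketch below collects them with Python's standard html.parser module; the HTML snippet and the `LinkCollector` class name are made up for illustration, and a real crawler would also fetch pages and follow the links it finds.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# Hypothetical page content.
page = (
    '<html><body>'
    '<a href="/products">Products</a> '
    '<a href="https://example.com/about">About</a>'
    '<p>No link here</p>'
    '</body></html>'
)

collector = LinkCollector()
collector.feed(page)
links = collector.links
print(links)  # ['/products', 'https://example.com/about']
```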
ESTABLISHING AN INFORMATION POLICY
Information policy
States the organization’s rules for organizing, managing, storing, and sharing information
Data administration
Responsible for specific policies and procedures through which data can be
managed as a resource
Database administration
Database design and management group responsible for defining and
organizing the structure and content of the database, and maintaining the
database.
DATA SECURITY AND INTEGRITY
Importance of data security.
Techniques for ensuring data integrity.
KNOWLEDGE MANAGEMENT
Knowledge management is the process of formalizing the creation, capture, indexing,
storage, and sharing of the company’s knowledge in order to benefit from the
experiences and insights that the company has gained during its existence.
METADATA AND DATA DICTIONARIES
Metadata: data that describes data
Examples of metadata: field size, data type
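Most database systems can report their own metadata. As a sketch, the snippet below asks SQLite (through Python's built-in sqlite3 module) to describe a table and builds a small data dictionary from the answer. The `student` table and its columns are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " student_id INTEGER PRIMARY KEY,"
    " name VARCHAR(50) NOT NULL,"
    " gpa REAL)"
)

# PRAGMA table_info returns one row per column:
# (cid, name, declared type, notnull flag, default value, primary-key position)
columns = conn.execute("PRAGMA table_info(student)").fetchall()

data_dictionary = {
    name: {
        "type": col_type,
        "required": bool(notnull),
        "primary_key": bool(pk),
    }
    for _cid, name, col_type, notnull, _default, pk in columns
}

for field, meta in data_dictionary.items():
    print(field, meta)
```

The declared type `VARCHAR(50)` carries both the data type and the field size, the two metadata examples named above.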
BLOCKCHAIN