Foundations of Business Intelligence: Databases and Information Management
Chapter 6
Problems with the traditional file environment (files maintained separately by different departments)
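Data redundancy: Presence of duplicate data in multiple files
Data inconsistency: Same attribute has different values in different files
Program-data dependence: Changes in a program require changes to the data the program accesses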
Database
Serves many applications by centralizing data and controlling redundant data
Relational DBMS
Represents data as two-dimensional tables called relations or files
Each table contains data on an entity and its attributes
Many DBMS have report generation capabilities for creating polished reports (e.g., Crystal Reports)
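The idea of a table holding one entity and its attributes can be sketched with Python's built-in sqlite3 module. The supplier table, its column names, and the sample rows are hypothetical and chosen only for illustration; the final list is a very rough stand-in for a report generator, not a model of any particular product.

```python
import sqlite3

# In-memory database; the supplier table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE supplier ("
    " supplier_number INTEGER PRIMARY KEY,"  # key field identifying each row
    " supplier_name TEXT NOT NULL,"
    " supplier_city TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO supplier VALUES (?, ?, ?)",
    [(8259, "CBM Inc.", "Dayton"), (8261, "B. R. Molds", "Cleveland")],
)

# Each row is one instance of the entity; each column is an attribute.
rows = conn.execute(
    "SELECT supplier_name, supplier_city FROM supplier ORDER BY supplier_number"
).fetchall()

# A very small "report" built from the query result.
report_lines = [f"{name:<15}{city}" for name, city in rows]
```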
Designing Databases
Conceptual (logical) design: Abstract model from business perspective
Physical design: How database is arranged on direct-access storage devices
Design process identifies:
Relationships among data elements, redundant database elements
Most efficient way to group data elements to meet business requirements, needs of application programs
Normalization
Streamlining complex groupings of data to minimize redundant data elements and awkward many-to-many relationships
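Normalization can be illustrated with a small sketch, assuming a hypothetical flat list of parts in which supplier details repeat on every row. The table and column names are invented for the example. Splitting the data into a supplier table and a part table stores each supplier once, and a join rebuilds the flat view when it is needed.

```python
import sqlite3

# Hypothetical unnormalized rows: supplier details repeat for every part.
flat_rows = [
    (137, "Door latch", 8259, "CBM Inc."),
    (145, "Side mirror", 8444, "Bryant Corp."),
    (150, "Door molding", 8259, "CBM Inc."),
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (
        supplier_number INTEGER PRIMARY KEY,
        supplier_name   TEXT NOT NULL
    );
    CREATE TABLE part (
        part_number     INTEGER PRIMARY KEY,
        part_name       TEXT NOT NULL,
        supplier_number INTEGER NOT NULL REFERENCES supplier(supplier_number)
    );
""")

for part_number, part_name, supplier_number, supplier_name in flat_rows:
    # INSERT OR IGNORE stores each supplier once, however often it repeats.
    conn.execute("INSERT OR IGNORE INTO supplier VALUES (?, ?)",
                 (supplier_number, supplier_name))
    conn.execute("INSERT INTO part VALUES (?, ?, ?)",
                 (part_number, part_name, supplier_number))

supplier_count = conn.execute("SELECT COUNT(*) FROM supplier").fetchone()[0]

# A join rebuilds the original flat view on demand.
rebuilt = conn.execute(
    "SELECT p.part_number, p.part_name, s.supplier_number, s.supplier_name "
    "FROM part p JOIN supplier s ON p.supplier_number = s.supplier_number "
    "ORDER BY p.part_number"
).fetchall()
```

Three flat rows carry the supplier name three times; after the split it is stored twice, once per distinct supplier.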
Entity-relationship diagram
Used by database designers to document the data model
Illustrates relationships between entities
AN ENTITY-RELATIONSHIP DIAGRAM
Data warehouse:
Stores current and historical data from many core operational transaction systems
Consolidates and standardizes information for use across the enterprise, but data cannot be altered
Data warehouse system will provide query, analysis, and reporting tools
Data marts:
Subset of a data warehouse
Summarized or highly focused portion of the firm's data for use by a specific population of users
Typically focuses on a single subject or line of business
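The relationship between a warehouse and a mart can be sketched as follows, assuming a hypothetical warehouse_sales table that already consolidates sales from several systems. The mart here is simply a summarized subset covering one product line; real marts are usually built with dedicated loading tools rather than a single query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical warehouse table consolidating sales from several systems.
conn.execute(
    "CREATE TABLE warehouse_sales ("
    " sale_year INTEGER, region TEXT, product_line TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?, ?)",
    [
        (2022, "East", "Fasteners", 120.0),
        (2022, "West", "Fasteners", 80.0),
        (2023, "East", "Fasteners", 150.0),
        (2023, "East", "Tools", 300.0),
    ],
)

# The mart keeps a single line of business, summarized by year.
conn.execute(
    "CREATE TABLE fasteners_mart AS "
    "SELECT sale_year, SUM(amount) AS total_amount "
    "FROM warehouse_sales WHERE product_line = 'Fasteners' "
    "GROUP BY sale_year"
)

mart_rows = conn.execute(
    "SELECT sale_year, total_amount FROM fasteners_mart ORDER BY sale_year"
).fetchall()
```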
Business Intelligence:
Tools for consolidating, analyzing, and providing access to vast amounts of data to help users make better business decisions
E.g., Harrah's Entertainment analyzes customers to develop gambling profiles and identify most profitable customers
Principal tools include:
Software for database query and reporting
Online analytical processing (OLAP)
Data mining
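OLAP is often described as viewing the same data along several dimensions. A minimal sketch of that idea in plain Python, assuming hypothetical sales facts with product, region, and quarter dimensions; the roll_up helper is illustrative and not part of any OLAP product.

```python
from collections import defaultdict

# Hypothetical sales facts: (product, region, quarter, units sold).
facts = [
    ("nuts", "East", "Q1", 50),
    ("nuts", "West", "Q1", 30),
    ("bolts", "East", "Q1", 20),
    ("nuts", "East", "Q2", 70),
]
DIMENSIONS = ("product", "region", "quarter")


def roll_up(facts, by):
    """Total units grouped by the chosen dimensions (a tuple of names)."""
    positions = [DIMENSIONS.index(name) for name in by]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[i] for i in positions)
        totals[key] += row[3]
    return dict(totals)


# The same facts viewed along one dimension, then along two.
by_product = roll_up(facts, ("product",))
by_product_region = roll_up(facts, ("product", "region"))
```

Changing the tuple passed as by gives a different view of the same facts without changing the stored data.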
FIGURE 6-13
Data mining:
More discovery driven than OLAP
Finds hidden patterns, relationships in large databases and infers rules to predict future behavior
E.g., finding patterns in customer data for one-to-one marketing campaigns or to identify profitable customers
Types of information obtainable from data mining:
Associations
Sequences
Classification
Clustering
Forecasting
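Of the types listed, associations are the simplest to illustrate. The sketch below assumes a handful of hypothetical shopping baskets and computes how often one item appears together with another (the rule's confidence). It is a toy calculation, not a full association-mining algorithm.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets, one set of items per transaction.
baskets = [
    {"chips", "cola"},
    {"chips", "cola", "salsa"},
    {"chips", "salsa"},
    {"cola", "bread"},
]

# How many baskets contain each item, and each unordered pair of items.
item_counts = Counter(item for basket in baskets for item in basket)
pair_counts = Counter(
    pair for basket in baskets for pair in combinations(sorted(basket), 2)
)


def confidence(antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    pair = tuple(sorted((antecedent, consequent)))
    return pair_counts[pair] / item_counts[antecedent]


chips_to_cola = confidence("chips", "cola")
```

Here chips appear in three baskets and two of those also contain cola, so the rule "chips implies cola" holds in two thirds of the relevant baskets.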
Predictive analysis
Uses data mining techniques, historical data, and assumptions about future conditions to predict outcomes of events
E.g., probability a customer will respond to an offer
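A very simple way to estimate the probability that a customer responds to an offer is the historical response rate of similar customers. The sketch below assumes hypothetical segments and past outcomes, and it assumes future behavior resembles the past; real predictive models are considerably more elaborate.

```python
from collections import defaultdict

# Hypothetical history: (customer segment, responded to a past offer?).
history = [
    ("frequent", True), ("frequent", True), ("frequent", False),
    ("occasional", False), ("occasional", True),
    ("occasional", False), ("occasional", False),
]


def response_rates(history):
    """Estimated response probability per segment from past offers."""
    sent = defaultdict(int)
    responded = defaultdict(int)
    for segment, did_respond in history:
        sent[segment] += 1
        responded[segment] += int(did_respond)
    return {segment: responded[segment] / sent[segment] for segment in sent}


rates = response_rates(history)
```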
Text mining
Extracts key elements from large unstructured data sets (e.g., stored e-mails)
Web mining
Discovery and analysis of useful patterns and information from the World Wide Web
E.g., to understand customer behavior, evaluate effectiveness of Web site, etc.
Data governance:
Policies and processes for managing availability, usability, integrity, and security of enterprise data, especially as it relates to government regulations
Database administration:
Defining, organizing, implementing, maintaining database; performed by database design and management group
Data quality audit
Structured survey of the accuracy and level of completeness of the data in an information system
Survey samples from data files, or
Survey end users for perceptions of quality
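Surveying a sample of a data file for completeness can be sketched as below. The customer records, field names, and sample size are hypothetical, and the check covers only missing values; auditing accuracy would require comparing values against a trusted source.

```python
import random

# Hypothetical customer records; None or "" marks a missing value.
records = [
    {"id": 1, "email": "a@example.com", "zip": "45402"},
    {"id": 2, "email": "", "zip": "44101"},
    {"id": 3, "email": "c@example.com", "zip": None},
    {"id": 4, "email": "d@example.com", "zip": "10001"},
]


def completeness(sample, fields):
    """Fraction of records in `sample` with a non-empty value, per field."""
    return {
        field: sum(1 for rec in sample if rec.get(field)) / len(sample)
        for field in fields
    }


rng = random.Random(0)  # fixed seed so the sample is reproducible
sample = rng.sample(records, k=3)
sample_report = completeness(sample, ["email", "zip"])

# Surveying the entire file instead of a sample, for comparison.
full_report = completeness(records, ["email", "zip"])
```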
Data cleansing
Software to detect and correct data that are incorrect, incomplete, improperly formatted, or redundant
Enforces consistency among different sets of data from separate information systems
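A minimal sketch of what such software does, assuming hypothetical contact records merged from two systems. The clean_phone and cleanse helpers, the ten-digit phone rule, and the name formatting are illustrative choices, not a description of any commercial cleansing tool.

```python
import re

# Hypothetical records merged from two separate systems.
raw = [
    {"name": " Ann Lee ", "phone": "(937) 555-0101"},
    {"name": "ann lee", "phone": "937.555.0101"},  # redundant, other format
    {"name": "Bo Chen", "phone": "555-0199"},      # incomplete phone number
]


def clean_phone(phone):
    """Return 10 digits as NNN-NNN-NNNN, or None if the number is incomplete."""
    digits = re.sub(r"\D", "", phone)
    if len(digits) != 10:
        return None
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


def cleanse(records):
    """Standardize formats, drop redundant copies, flag uncorrectable rows."""
    cleaned, flagged, seen = [], [], set()
    for rec in records:
        name = " ".join(rec["name"].split()).title()
        phone = clean_phone(rec["phone"])
        if phone is None:
            flagged.append(rec)  # cannot be corrected automatically
            continue
        if (name, phone) in seen:
            continue  # drop the redundant copy
        seen.add((name, phone))
        cleaned.append({"name": name, "phone": phone})
    return cleaned, flagged


cleaned, flagged = cleanse(raw)
```

Records that cannot be repaired automatically are returned separately so that a person can review them rather than having them silently dropped.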