CH2 Data
TOPIC 2 :
Outline
1. Definition of Dataset, Attribute
2. Types of Attribute
3. Types of Dataset
4. Issues related to Dataset
DATA SOURCE
• Other names: data object, record, point, event, case, sample, observation, entity
dataset
• A data set is a file that consists of records (also called objects, patterns, cases, or samples) in rows and attributes (also called fields, dimensions, or variables) in columns.
• Example file name: Student.xls
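As a rough sketch of this row/column layout, a small data set can be held in plain Python as a list of records. The attribute names and values below are invented for illustration and are not taken from Student.xls.

```python
# Hypothetical student data set: each dict is one record (row),
# each key is one attribute (column). All values are made up.
students = [
    {"student_id": 1, "name": "Aina", "age": 20, "cgpa": 3.45},
    {"student_id": 2, "name": "Ben", "age": 22, "cgpa": 2.98},
    {"student_id": 3, "name": "Chen", "age": 21, "cgpa": 3.70},
]

# The attributes are the column names shared by every record.
attributes = list(students[0].keys())

# One record is one row; one attribute is one column across all rows.
second_record = students[1]
age_column = [record["age"] for record in students]

print(attributes)
print(second_record)
print(age_column)
```

Reading along a row gives everything known about one student; reading down a column gives one attribute for every student.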
Attribute
DATAset : attribute
• An attribute is a property or characteristic of a record that may vary, either from one object to another or from one time to another.
What attributes can describe this aeroplane? This grasshopper?
DATAset : attribute & record
[Figure: a data set with its records and attributes labelled]
DATAset : types of attribute
• Discrete: [Nominal and Ordinal]
• Continuous: [Interval and Ratio]
NUMERIC
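To make the four attribute types more concrete, the sketch below tags a few made-up attributes with a type and lists the comparisons that are commonly considered meaningful for each. The attribute names and the `MEANINGFUL_OPERATIONS` table are illustrative assumptions, not part of the slides. The `SLIDE_GROUP` mapping simply mirrors the discrete/continuous grouping shown above.

```python
from enum import Enum


class AttributeType(Enum):
    NOMINAL = "nominal"
    ORDINAL = "ordinal"
    INTERVAL = "interval"
    RATIO = "ratio"


# Grouping as given on the slide.
SLIDE_GROUP = {
    AttributeType.NOMINAL: "discrete",
    AttributeType.ORDINAL: "discrete",
    AttributeType.INTERVAL: "continuous",
    AttributeType.RATIO: "continuous",
}

# Commonly cited meaningful operations; each type adds to the previous one.
MEANINGFUL_OPERATIONS = {
    AttributeType.NOMINAL: ["=", "!="],
    AttributeType.ORDINAL: ["=", "!=", "<", ">"],
    AttributeType.INTERVAL: ["=", "!=", "<", ">", "+", "-"],
    AttributeType.RATIO: ["=", "!=", "<", ">", "+", "-", "*", "/"],
}

# Hypothetical attributes, chosen only as examples.
example_attributes = {
    "eye_colour": AttributeType.NOMINAL,
    "grade": AttributeType.ORDINAL,
    "temperature_celsius": AttributeType.INTERVAL,
    "height_cm": AttributeType.RATIO,
}

for name, attribute_type in example_attributes.items():
    group = SLIDE_GROUP[attribute_type]
    operations = " ".join(MEANINGFUL_OPERATIONS[attribute_type])
    print(f"{name}: {attribute_type.value} ({group}), operations: {operations}")
```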
types of dataset
• Record data
• Graph-based data
• Ordered data
Record data
• A collection of records, each of which consists of a fixed set of attributes
• Stored in a flat file or a relational database
• Types: Market-Basket Data (Transaction data), Data Matrix, Sparse Data Matrix
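The three record-data types listed here are closely related, which a small sketch can show: transaction data can be turned into a 0/1 data matrix, and that matrix can be stored sparsely by keeping only its non-zero cells. The baskets and item names are invented for illustration, and the coordinate-list layout is just one possible sparse representation.

```python
# Hypothetical market-basket (transaction) data: one set of items per purchase.
transactions = [
    {"bread", "milk"},
    {"bread", "diaper", "beer", "eggs"},
    {"milk", "diaper", "beer", "cola"},
]

# Fixed set of attributes: one column per distinct item, in sorted order.
items = sorted(set().union(*transactions))

# Data matrix: one row per transaction, 1 if the item was bought, else 0.
data_matrix = [
    [1 if item in basket else 0 for item in items]
    for basket in transactions
]

# Sparse form: store only the (row, column) positions of the non-zero cells.
sparse_entries = [
    (row_index, col_index)
    for row_index, row in enumerate(data_matrix)
    for col_index, value in enumerate(row)
    if value != 0
]

print(items)
for row in data_matrix:
    print(row)
print(sparse_entries)
```

With only a handful of items the saving is small, but when each transaction touches a tiny fraction of thousands of items, storing only the non-zero cells avoids keeping a matrix that is almost entirely zeros.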
Graph-based data
• Data is represented in the form of a graph, e.g. relationships between objects in a graph, or links between pages of a website
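As a minimal sketch of graph-based data, the links of a small website can be stored as an adjacency list, where each page maps to the pages it links to. The page names and links below are hypothetical.

```python
from collections import defaultdict

# Hypothetical website: each pair is a link (source page, target page).
links = [
    ("home", "about"),
    ("home", "products"),
    ("products", "home"),
    ("about", "contact"),
]

# Adjacency list: each page (node) maps to the pages it links to (edges).
graph = defaultdict(list)
for source, target in links:
    graph[source].append(target)

# Number of outgoing links per page that has at least one link.
outgoing_counts = {page: len(targets) for page, targets in graph.items()}

print(dict(graph))
print(outgoing_counts)
```

Here the information lives in the relationships between objects, not only in the attributes of each object.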
Ordered data: the attributes have relationships that involve order in time or space.
Examples:
• Sequential data (temporal data): has a time associated with it.
• Sequence data: a data set that is a sequence of individual entities (e.g. a sequence of words or letters). No time stamps.
• Time series data: a special type of sequential data in which each record is a time series (a series of measurements taken over time).
• Spatial data: such as positions or areas.
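The difference between sequence data and time series data can be sketched in a few lines: a sequence carries order but no time stamps, while a time series pairs each measurement with a time and is only meaningful once sorted by it. The words, time stamps, and temperature readings below are invented for illustration.

```python
from datetime import datetime

# Sequence data: the order of the entities matters, but there are no time stamps.
word_sequence = ["data", "mining", "is", "fun"]

# Time series data: hypothetical temperature readings, deliberately out of order.
readings = [
    (datetime(2024, 1, 1, 12, 0), 30.1),
    (datetime(2024, 1, 1, 10, 0), 28.4),
    (datetime(2024, 1, 1, 11, 0), 29.0),
]

# Sort by time stamp so the measurements follow the order in which they were taken.
time_series = sorted(readings, key=lambda reading: reading[0])
values_in_time_order = [value for _, value in time_series]

# Change between consecutive measurements, rounded to one decimal place.
changes = [
    round(later - earlier, 1)
    for earlier, later in zip(values_in_time_order, values_in_time_order[1:])
]

print(word_sequence)
print(values_in_time_order)
print(changes)
```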
Data in reality!
• There is a great deal of data. However, it is far from perfect!
http://archive.ics.uci.edu/ml/datasets.html