Data Mining Query Language
A data mining language must be designed to facilitate flexible and effective knowledge discovery.
Having a query language for data mining may help standardize the development of
platforms for data mining systems. But designing such a language is challenging because data mining covers a wide spectrum of
tasks, and each task has different requirements. Hence, the design of a language requires a deep understanding of the limitations and ...
DMQL allows mining of different kinds of knowledge from relational databases and data warehouses at multiple levels of abstraction
The hope is to achieve an effect similar to the one SQL has had on relational databases.
Design
Syntax for task-relevant data specification
Syntax for specifying the kind of knowledge to be mined
Syntax for concept hierarchy specification
Syntax for pattern presentation and visualization
Putting it all together: a DMQL query
Syntax of DMQL
<DMQL_Statement> ::= <Data_Mining_Statement>
    | <Concept_Hierarchy_Definition_Statement>
    | <Visualization_and_Presentation>

<Data_Mining_Statement> ::=
    use database <database_name> | use data warehouse <data_warehouse_name>
    {use hierarchy <hierarchy_name> for <attribute_or_dimension>}
    <Mine_Knowledge_Specification>
    in relevance to <attribute_or_dimension_list>
    from <relation(s)/cube(s)>
    [where <condition>]
    [order by <order_list>]
    [group by <grouping_list>]
    [having <condition>]

<Mine_Knowledge_Specification> ::= <Mine_Char> | <Mine_Discr> | <Mine_Assoc> | <Mine_Class>

<Mine_Discr> ::= mine comparison [as <pattern_name>]
    for <target_class> where <target_condition>
    {versus <contrast_class_i> where <contrast_condition_i>}
    analyze <measure(s)>

<Mine_Class> ::= mine classification [as <pattern_name>]
    analyze <classifying_attribute_or_dimension>

<Concept_Hierarchy_Definition_Statement> ::=
    define hierarchy <hierarchy_name> [for <attribute_or_dimension>]
    on <relation_or_cube_or_hierarchy> as <hierarchy_description>
    [where <condition>]

<Visualization_and_Presentation> ::= display as <result_form>
    | {<Multilevel_Manipulation>}
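The clause order of a data mining statement can be made concrete with a small sketch. The class below is a hypothetical helper, not part of DMQL or of any existing library; the database, pattern, and attribute names in the usage lines are made up for illustration. It only assembles the clauses in the order the grammar lists them and does not validate their contents.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class DataMiningStatement:
    """Hypothetical container for the clauses of one DMQL data mining statement."""
    source: str                       # "database <name>" or "data warehouse <name>"
    mine_spec: str                    # the mine-knowledge specification clause
    relevant: List[str]               # attributes or dimensions
    relations: List[str]              # relations or cubes
    where: Optional[str] = None
    order_by: Optional[str] = None
    group_by: Optional[str] = None
    having: Optional[str] = None
    hierarchies: List[Tuple[str, str]] = field(default_factory=list)  # (hierarchy, attribute)

    def render(self) -> str:
        """Emit the clauses in the order given by the grammar, skipping unset ones."""
        lines = [f"use {self.source}"]
        for name, attribute in self.hierarchies:
            lines.append(f"use hierarchy {name} for {attribute}")
        lines.append(self.mine_spec)
        lines.append("in relevance to " + ", ".join(self.relevant))
        lines.append("from " + ", ".join(self.relations))
        optional = (("where", self.where), ("order by", self.order_by),
                    ("group by", self.group_by), ("having", self.having))
        for keyword, value in optional:
            if value:
                lines.append(f"{keyword} {value}")
        return "\n".join(lines)


# Made-up example values, for illustration only.
stmt = DataMiningStatement(
    source="database sales_db",
    mine_spec="mine classification as creditClasses analyze credit_rating",
    relevant=["C.age", "C.income"],
    relations=["customer C"],
    where="C.age >= 20",
)
text = stmt.render()
print(text)
```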
Names of the relevant database or data warehouse, conditions, and relevant attributes or dimensions
Characterization
Mine_Knowledge_Specification ::= mine characteristics [as pattern_name] analyze measure(s)
Discrimination
Mine_Knowledge_Specification ::= mine comparison [as pattern_name] for target_class where target_condition {versus contrast_class_i where contrast_condition_i} analyze measure(s)
Specifies that discriminant descriptions are to be mined, comparing a given target class of objects with one or more contrasting classes (thus referred to as comparison).
Analyze specifies aggregate measures.
Example: mine comparison as purchaseGroups for bigSpenders where avg(I.price) >= $100 versus budgetSpenders where avg(I.price) < $100 analyze count
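To make the target and contrasting classes concrete, here is a minimal sketch of how the two class conditions partition customers and how the count measure is then taken per class. It assumes a cutoff of 100 on the average item price and uses made-up purchase records; it covers only the class split and the count, not the mining of discriminant descriptions itself.

```python
from collections import defaultdict

# Made-up purchase records: (customer_id, item_price).
purchases = [
    ("c1", 250.0), ("c1", 150.0),
    ("c2", 40.0), ("c2", 80.0),
    ("c3", 100.0),
    ("c4", 20.0),
]


def split_by_average_price(records, cutoff=100.0):
    """Count customers per class, using the average price of their purchases."""
    totals = defaultdict(lambda: [0.0, 0])   # customer -> [sum of prices, number of items]
    for customer, price in records:
        totals[customer][0] += price
        totals[customer][1] += 1

    counts = {"bigSpenders": 0, "budgetSpenders": 0}
    for total, n in totals.values():
        # Target class: average >= cutoff; contrasting class: average < cutoff.
        label = "bigSpenders" if total / n >= cutoff else "budgetSpenders"
        counts[label] += 1
    return counts


counts = split_by_average_price(purchases)
print(counts)
```

Because the two conditions are complementary, every customer with at least one purchase falls into exactly one class.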
Association
Mine_Knowledge_Specification ::= mine associations [as pattern_name] [matching metapattern]
Classification
Mine_Knowledge_Specification ::= mine classification [as pattern_name] analyze classifying_attribute_or_dimension
Specifies that patterns for data classification are to be mined
For categorical attributes or dimensions, each value represents a class (such as low-risk, medium-risk, high-risk).
For numeric attributes, each class is defined by a range (such as 20-39, 40-59, 60-89 for age).
Example: mine classifications as classifyCustomerCreditRating analyze credit_rating
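The two ways of obtaining classes can be sketched as follows. The function names and the choice to return None for values outside the listed classes are assumptions made for illustration; only the class labels and age ranges come from the text.

```python
from bisect import bisect_right

# Categorical attribute: each distinct value is itself a class.
RISK_CLASSES = {"low-risk", "medium-risk", "high-risk"}

# Numeric attribute: classes are ranges. The first three entries are the lower
# bounds of the ranges; the last entry is the exclusive upper bound.
AGE_BOUNDS = [20, 40, 60, 90]
AGE_CLASSES = ["20-39", "40-59", "60-89"]


def risk_class(value):
    """Return the class of a categorical value, or None if it is not a known class."""
    return value if value in RISK_CLASSES else None


def age_class(age):
    """Return the range class containing `age`, or None if no range covers it."""
    if age < AGE_BOUNDS[0] or age >= AGE_BOUNDS[-1]:
        return None
    return AGE_CLASSES[bisect_right(AGE_BOUNDS, age) - 1]


print(risk_class("medium-risk"), age_class(39), age_class(40))
```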
To specify which concept hierarchies to use:
    use hierarchy <hierarchy_name> for <attribute_or_dimension>

schema hierarchies

set-grouping hierarchies
    ... as
        level1: {young, middle_aged, senior} < level0: all
        level2: {20, ..., 39} < level1: young
        level2: {40, ..., 59} < level1: middle_aged
        level2: {60, ..., 89} < level1: senior

operation-derived hierarchies

rule-based hierarchies
    define hierarchy ... on item as
        level_1: low_profit_margin < level_0: all
            if (price - cost) < $50
        level_1: medium_profit_margin < level_0: all
            if ((price - cost) > $50) and ((price - cost) <= $250)
        level_1: high_profit_margin < level_0: all
            if (price - cost) > $250
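The set-grouping and rule-based definitions above can be read as functions that map a raw value to a concept at a chosen level. The sketch below is one way to express them, reading the profit-margin thresholds as $50 and $250; the function names are hypothetical. The rules are translated literally, so a margin of exactly $50 matches none of them.

```python
# Set-grouping hierarchy for age: level-1 concept -> the level-2 values it groups.
AGE_GROUPS = {
    "young": range(20, 40),
    "middle_aged": range(40, 60),
    "senior": range(60, 90),
}


def generalize_age(age, level):
    """Return the concept for `age` at level 2 (raw value), 1 (group) or 0 (all)."""
    if level not in (0, 1, 2):
        raise ValueError("level must be 0, 1 or 2")
    if level == 2:
        return age
    if level == 0:
        return "all"
    for group, ages in AGE_GROUPS.items():
        if age in ages:
            return group
    return None  # ages outside 20..89 are not covered by the hierarchy


def profit_margin_level(price, cost):
    """Apply the rule-based hierarchy to one item and return its level-1 concept."""
    margin = price - cost
    if margin < 50:
        return "low_profit_margin"
    if 50 < margin <= 250:
        return "medium_profit_margin"
    if margin > 250:
        return "high_profit_margin"
    return None  # a margin of exactly 50 satisfies none of the rules as written


print(generalize_age(45, 1), profit_margin_level(price=400, cost=100))
```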
DMQL provides syntax that lets users specify the display of discovered patterns in one or more forms:
display as <result_form>
result_form = rules, tables, crosstabs, pie or bar charts, decision trees, cubes, curves, or surfaces
To facilitate interactive viewing at different concept levels, the following syntax is defined:
Multilevel_Manipulation ::= roll up on attribute_or_dimension
    | drill down on attribute_or_dimension
    | add attribute_or_dimension
    | drop attribute_or_dimension
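As a rough illustration of interactive viewing at different concept levels, the sketch below tracks one concept level per displayed attribute, assuming the four operations are roll up, drill down, add, and drop. The class, its method names, and the level names are hypothetical; starting an added attribute at its most specific level is also an assumption.

```python
class PatternView:
    """Hypothetical viewer state: one concept level per displayed attribute."""

    def __init__(self, hierarchies):
        # hierarchies: attribute -> list of level names, most general first.
        self.hierarchies = hierarchies
        self.levels = {}  # displayed attribute -> index into its level list

    def add(self, attribute):
        # Assumption: a newly added attribute starts at its most specific level.
        self.levels[attribute] = len(self.hierarchies[attribute]) - 1

    def drop(self, attribute):
        self.levels.pop(attribute, None)

    def roll_up(self, attribute):
        # Move one level toward the most general concept, stopping at the top.
        self.levels[attribute] = max(0, self.levels[attribute] - 1)

    def drill_down(self, attribute):
        # Move one level toward the most specific concept, stopping at the bottom.
        deepest = len(self.hierarchies[attribute]) - 1
        self.levels[attribute] = min(deepest, self.levels[attribute] + 1)

    def current(self):
        return {a: self.hierarchies[a][i] for a, i in self.levels.items()}


view = PatternView({
    "age": ["all", "age_group", "age"],
    "address": ["all", "country", "city"],
})
view.add("age")
view.add("address")
view.roll_up("age")
view.drop("address")
print(view.current())
```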
Putting it all together: a DMQL query
    ...
    use hierarchy location_hierarchy for B.address
    ...
    from ... S, works_at W, branch B
    where I.item_ID = S.item_ID and S.trans_ID = P.trans_ID
        and P.cust_ID = C.cust_ID and P.method_paid = "AmEx"
        and P.empl_ID = W.empl_ID and W.branch_ID = B.branch_ID
        and B.address = "Canada" and I.price >= 100
    with noise threshold = 0.05
    display as table
OLE DB for DM (Microsoft 2000)
Based on OLE, OLE DB, OLE DB for OLAP
MineRule (Meo, Psaila and Ceri '96)
Query flocks based on Datalog syntax (Tsur et al. '98)
Hierarchy Specification
A hierarchy is a root member of an alternate hierarchy, which is always at generation 2 of a dimension. Member value expressions are not allowed as hierarchy arguments.
Alternate hierarchies are applicable to aggregate storage databases only.
The dimension of the hierarchy argument passed to a function must match the dimension of the other arguments passed to the function. If they do not match, an error is returned and the query is
aborted.
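The matching rule can be pictured as a check that runs before a function is evaluated. The sketch below is purely illustrative: the exception class, the function name, and the idea of passing dimensions as plain strings are all assumptions, not the behavior or API of any particular query engine.

```python
class DimensionMismatchError(ValueError):
    """Hypothetical error raised when arguments come from different dimensions."""


def check_same_dimension(hierarchy_dimension, other_dimensions):
    """Raise if any other argument's dimension differs from the hierarchy's dimension."""
    for dimension in other_dimensions:
        if dimension != hierarchy_dimension:
            raise DimensionMismatchError(
                f"hierarchy argument is in {hierarchy_dimension!r}, "
                f"but another argument is in {dimension!r}"
            )


# Made-up dimension names, for illustration only.
check_same_dimension("Product", ["Product", "Product"])   # passes silently
try:
    check_same_dimension("Product", ["Product", "Market"])
    aborted = False
except DimensionMismatchError:
    aborted = True  # a caller would abort the query at this point
print(aborted)
```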