Hybrid AI systems often use symbolic reasoning for high-level decision-making and machine learning for data-driven tasks, which can make them more robust and flexible. This POC explores one such hybrid approach: integrating the Drools rule engine with emla, a machine learning framework that learns rules from data.
The POC currently demonstrates two basic use-cases:
- automatic DRL file (rules) creation from data,
- DRL validation and automatic updates from data, to cover gaps or fix errors.
Combining new rules with existing ones involves more than checking that a rule does not already exist, but for the purposes of this POC only that check is performed.
The DataaccessApp example.
The tabular dataset below holds resource access requests from different users and the decision made for each request (allow, deny).
Dataset
| role | experience | age | access |
|---|---|---|---|
| admin | senior | 40 | allow |
| admin | senior | 45 | allow |
| admin | senior | 42 | allow |
| contributor | senior | 42 | allow |
| contributor | junior | 30 | deny |
| contributor | junior | 32 | deny |
| contributor | senior | 43 | allow |
| guest | junior | 30 | deny |
| guest | senior | 45 | deny |
- DRL from data use-case
- use `emla` to learn rules for the entire dataset, select one rule for `allow` and one rule for `deny` to initiate the DRL.
- the goal is to develop a DRL that supports the dataset with `100% coverage` and `0 errors`.
- `emla` returns the following rules per predictor (`role`, `experience`, `age`):
** rules for predictor 'role':
- IF role == admin THEN allow, (coverage=0.33, accuracy=1, assessment=0.33)
- IF role == contributor THEN allow, (coverage=0.44, accuracy=0.5, assessment=0.22)
- IF role == guest THEN deny, (coverage=0.22, accuracy=1, assessment=0.22)
** rules for predictor 'experience':
- IF experience == senior THEN allow, (coverage=0.67, accuracy=0.83, assessment=0.56)
- IF experience == junior THEN deny, (coverage=0.33, accuracy=1, assessment=0.33)
** rules for predictor 'age':
- IF (age > 36.0 AND age <= 45.0) THEN allow, (coverage=0.67, accuracy=0.83, assessment=0.56)
- IF age <= 36.0 THEN deny, (coverage=0.33, accuracy=1, assessment=0.33)
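The figures listed above can be reproduced by hand. The sketch below is not emla code: it assumes that coverage is the share of datapoints a rule matches, accuracy is the share of matched datapoints with the predicted decision, and assessment is their product. That last assumption is inferred from the listed values (for example 0.67 × 0.83 ≈ 0.56) and is not confirmed by the source; `Row`, `DATASET`, and `rule_metrics` are hypothetical names.

```python
from collections import namedtuple

Row = namedtuple("Row", "role experience age access")

DATASET = [
    Row("admin", "senior", 40, "allow"),
    Row("admin", "senior", 45, "allow"),
    Row("admin", "senior", 42, "allow"),
    Row("contributor", "senior", 42, "allow"),
    Row("contributor", "junior", 30, "deny"),
    Row("contributor", "junior", 32, "deny"),
    Row("contributor", "senior", 43, "allow"),
    Row("guest", "junior", 30, "deny"),
    Row("guest", "senior", 45, "deny"),
]


def rule_metrics(rows, condition, predicted):
    """Return (coverage, accuracy, assessment) of one IF/THEN rule."""
    matched = [r for r in rows if condition(r)]
    if not matched:
        return 0.0, 0.0, 0.0
    coverage = len(matched) / len(rows)
    accuracy = sum(r.access == predicted for r in matched) / len(matched)
    # Assumption: assessment = coverage * accuracy (consistent with the listed values).
    return coverage, accuracy, coverage * accuracy


cov, acc, score = rule_metrics(DATASET, lambda r: r.experience == "senior", "allow")
print(round(cov, 2), round(acc, 2), round(score, 2))  # 0.67 0.83 0.56
```

Running the same function for `role == contributor THEN allow` gives 0.44, 0.5, and 0.22, matching the second rule in the list.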
- the solution selects rules with the highest assessment, a metric that considers both the coverage and the accuracy of the rule.
- for `allow` the solution will select one of the rules with assessment 0.56 and for `deny` one of the rules with assessment 0.33.
- the DRL is initialized with one rule for each {`allow`, `deny`}:
rule 'rule0' when
$a: AgentDatapoint( age > 36.0 && age <= 45.0 )
then
$a.setPrediction( 'allow' );
update( $a );
end
rule 'rule1' when
$a: AgentDatapoint( age <= 36.0 )
then
$a.setPrediction( 'deny' );
update( $a );
end
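Turning a selected rule into DRL text is essentially string templating. The following is a minimal sketch of that step, assuming the rule layout shown in the two rules above; `render_rule` and `render_drl` are hypothetical helpers, not part of emla or Drools, and the package and import declarations that a complete DRL file needs are left out because the source does not show them.

```python
def render_rule(name, condition, prediction):
    """Render one rule in the layout used by the DRL snippets in this document."""
    return "\n".join([
        f"rule '{name}' when",
        f"    $a: AgentDatapoint( {condition} )",
        "then",
        f"    $a.setPrediction( '{prediction}' );",
        "    update( $a );",
        "end",
    ])


def render_drl(selected):
    """Join (condition, prediction) pairs into DRL text named rule0, rule1, ..."""
    return "\n\n".join(
        render_rule(f"rule{i}", cond, pred)
        for i, (cond, pred) in enumerate(selected)
    )


drl = render_drl([
    ("age > 36.0 && age <= 45.0", "allow"),
    ("age <= 36.0", "deny"),
])
print(drl)
```

Numbering the rules by their position keeps later additions (`rule2`, `rule3`) consistent with the order in which they were learned.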
- the DRL is then validated against the entire dataset: `Coverage=100%, Errors=1`
- `rule0` erroneously supports the last datapoint in the dataset:

| role | experience | age | access |
|---|---|---|---|
| guest | senior | 45 | deny |

- the process will try to learn a rule to fix this error.
- the learning/selection of rules is repeated, but this time `emla` uses only the subset of data that `rule0` supports.
- the following rules are returned:
** rules for predictor 'role':
- IF role == admin THEN allow, (coverage=0.5, accuracy=1, assessment=0.5)
- IF role == contributor THEN allow, (coverage=0.33, accuracy=1, assessment=0.33)
- IF role == guest THEN deny, (coverage=0.17, accuracy=1, assessment=0.17)
** rules for predictor 'experience':
- IF experience == senior THEN allow, (coverage=1, accuracy=0.83, assessment=0.83)
** rules for predictor 'age':
- IF age <= 45.0 THEN allow, (coverage=1, accuracy=0.83, assessment=0.83)
- the learning process will select one rule for `deny` to fix the error and update the DRL with `rule2`:
rule 'rule2' when
$a: AgentDatapoint( role == 'guest' )
then
$a.setPrediction( 'deny' );
update( $a );
end
- the DRL is then re-validated against the entire dataset: `Coverage=100%, Errors=0`.
- at this point the learning goal is reached and the process terminates.
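The validate, relearn, and re-validate cycle of this use-case can be traced in plain Python. This is a sketch under stated assumptions, not the POC's Drools-based validation: rule conditions are mirrored as lambdas, an error is a supported datapoint whose prediction differs from `access`, and when several rules match a datapoint the prediction of the most recently added rule is kept. The source does not say how such conflicts are resolved, so that last point is an assumption; `fired` and `validate` are hypothetical names.

```python
from collections import namedtuple

Row = namedtuple("Row", "role experience age access")

DATASET = [
    Row("admin", "senior", 40, "allow"),
    Row("admin", "senior", 45, "allow"),
    Row("admin", "senior", 42, "allow"),
    Row("contributor", "senior", 42, "allow"),
    Row("contributor", "junior", 30, "deny"),
    Row("contributor", "junior", 32, "deny"),
    Row("contributor", "senior", 43, "allow"),
    Row("guest", "junior", 30, "deny"),
    Row("guest", "senior", 45, "deny"),
]

# (name, condition, prediction), in the order the rules were added to the DRL.
RULES = [
    ("rule0", lambda r: 36.0 < r.age <= 45.0, "allow"),
    ("rule1", lambda r: r.age <= 36.0, "deny"),
]
RULE2 = ("rule2", lambda r: r.role == "guest", "deny")


def fired(rules, row):
    """Name and prediction of the last matching rule, or None if unsupported."""
    hits = [(name, pred) for name, cond, pred in rules if cond(row)]
    # Assumption: the most recently added matching rule decides the prediction.
    return hits[-1] if hits else None


def validate(rules, rows):
    """Return (coverage in percent, [(row, rule name)] for wrong predictions)."""
    covered, errors = 0, []
    for row in rows:
        hit = fired(rules, row)
        if hit is None:
            continue
        covered += 1
        if hit[1] != row.access:
            errors.append((row, hit[0]))
    return 100.0 * covered / len(rows), errors


coverage1, errors1 = validate(RULES, DATASET)
print(f"Coverage={coverage1:.0f}%, Errors={len(errors1)}")

# Relearning is restricted to the datapoints that the failing rule supports.
bad_row, bad_rule = errors1[0]
bad_cond = next(cond for name, cond, _ in RULES if name == bad_rule)
subset = [r for r in DATASET if bad_cond(r)]
print(bad_rule, len(subset))

coverage2, errors2 = validate(RULES + [RULE2], DATASET)
print(f"Coverage={coverage2:.0f}%, Errors={len(errors2)}")
```

The subset has six datapoints, which is why the relearned `role == guest` rule shows a coverage of 0.17 (one datapoint in six).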
- Validating & updating existing DRL use-case
- use-case (2) shows how the `drools-verifier` module can be incorporated into the solution to ensure that a rule is missing before adding it to the DRL.
- for this integration we can currently use only numerical fields.
- the initial DRL contains the following rules:
rule 'rule0' when
$a: AgentDatapoint( role == 'admin' )
then
$a.setPrediction( 'allow' );
update( $a );
end
rule 'rule1' when
$a: AgentDatapoint( role == 'guest' )
then
$a.setPrediction( 'deny' );
update( $a );
end
rule 'rule2' when
$a: AgentDatapoint( role == 'contributor' , age > 39 )
then
$a.setPrediction( 'allow' );
update( $a );
end
- the DRL is validated against the entire dataset: `Coverage=77.78%, errors=2`
- two datapoints are not supported:

| role | experience | age | access |
|---|---|---|---|
| contributor | junior | 30 | deny |
| contributor | junior | 32 | deny |

- the process will use `emla` to find rules for these two datapoints.
- the following rules are returned:
** rules for predictor 'role':
- IF role == contributor THEN deny, (coverage=1, accuracy=1, assessment=1)
** rules for predictor 'experience':
- IF experience == junior THEN deny, (coverage=1, accuracy=1, assessment=1)
** rules for predictor 'age':
- IF (age >= 30.0 AND age <= 32.0) THEN deny, (coverage=1, accuracy=1, assessment=1)
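The coverage gap that triggers this round of learning can be located with a few lines of Python. The sketch below mirrors the three initial rules as plain predicates (an assumption for illustration; the POC evaluates them in Drools) and lists the datapoints that no rule supports. `unsupported` and `coverage_pct` are hypothetical helpers.

```python
from collections import namedtuple

Row = namedtuple("Row", "role experience age access")

DATASET = [
    Row("admin", "senior", 40, "allow"),
    Row("admin", "senior", 45, "allow"),
    Row("admin", "senior", 42, "allow"),
    Row("contributor", "senior", 42, "allow"),
    Row("contributor", "junior", 30, "deny"),
    Row("contributor", "junior", 32, "deny"),
    Row("contributor", "senior", 43, "allow"),
    Row("guest", "junior", 30, "deny"),
    Row("guest", "senior", 45, "deny"),
]

# Initial DRL of use-case (2) as (condition, prediction) pairs.
RULES = [
    (lambda r: r.role == "admin", "allow"),
    (lambda r: r.role == "guest", "deny"),
    (lambda r: r.role == "contributor" and r.age > 39, "allow"),
]
# The age-based candidate learned from the unsupported datapoints.
RULE3 = (lambda r: 30.0 <= r.age <= 32.0, "deny")


def unsupported(rules, rows):
    """Datapoints that no rule condition matches."""
    return [r for r in rows if not any(cond(r) for cond, _ in rules)]


def coverage_pct(rules, rows):
    """Share of datapoints matched by at least one rule, in percent."""
    return 100.0 * (len(rows) - len(unsupported(rules, rows))) / len(rows)


gaps = unsupported(RULES, DATASET)
print(f"Coverage={coverage_pct(RULES, DATASET):.2f}%")  # Coverage=77.78%
for row in gaps:
    print(row)
print(f"Coverage={coverage_pct(RULES + [RULE3], DATASET):.2f}%")  # Coverage=100.00%
```

Because the learner sees only the two unsupported datapoints, every candidate rule trivially reaches coverage 1 and accuracy 1 on that subset.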
- all rules have the highest assessment 1 and can be selected by the process to update the DRL.
- the process will select a numerical predictor to create `rule3` to showcase the integration potential with the `drools-verifier` module.
rule 'rule3' when
$a: AgentDatapoint( age >= 30.0 && age <= 32.0 )
then
$a.setPrediction( 'deny' );
update( $a );
end
- the DRL is then re-validated against the entire dataset: `Coverage=100%, Errors=0`.
- at this point the learning goal is reached and the process terminates.
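The "is this rule already present?" check on a numerical field can be pictured as an interval comparison. The sketch below is a plain-Python stand-in for that idea and does not call `drools-verifier`, whose API is not shown in the source. It assumes each rule is reduced to a single closed range on one field, and it ignores other constraints such as `role` as well as the difference between `>` and `>=`; `NumericRule`, `subsumes`, and `is_missing` are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class NumericRule:
    """A rule reduced to one closed numeric range on a field plus its prediction."""
    field: str
    low: float
    high: float
    prediction: str


def subsumes(existing, candidate):
    """True if `existing` already covers every value that `candidate` covers."""
    return (
        existing.field == candidate.field
        and existing.prediction == candidate.prediction
        and existing.low <= candidate.low
        and candidate.high <= existing.high
    )


def is_missing(candidate, existing_rules):
    """True if no existing rule subsumes the candidate, so it may be added."""
    return not any(subsumes(rule, candidate) for rule in existing_rules)


# Simplified numeric view of the initial DRL: only rule2 constrains `age`.
# Its role constraint and the strictness of `age > 39` are ignored here.
existing = [NumericRule("age", 39.0, math.inf, "allow")]
rule3 = NumericRule("age", 30.0, 32.0, "deny")

print(is_missing(rule3, existing))            # True: safe to add
print(is_missing(rule3, existing + [rule3]))  # False: would be a duplicate
```

A single-field range check like this is deliberately narrow: it cannot decide whether a rule with several constraints is redundant.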