
Reengineering for Knowledge in Knowledge Based Systems

Anne Håkansson¹ and Ronald L. Hartung²

¹ Department of Information Science, Computer Science, Uppsala University, Box 513,
SE-751 20, Uppsala, Sweden
[email protected]
² Department of Computer Science, Franklin University, 201 S. Grant Avenue,
Columbus, Ohio 43215, USA
[email protected]

Abstract. This paper presents an approach to reengineering knowledge-based
systems. Commonly, reengineering is used to modify systems that have functioned
for many years, but are no longer able to accomplish the tasks required, and
therefore need to be updated. Reengineering can also be used to modify and extend
the knowledge contained in these systems. This is an intricate task if the systems
are large, complex and poorly documented. The rules in the knowledge base must
be gathered, analyzed and understood. In this paper, we apply reengineering to the
knowledge and the functionality of knowledge-based systems. The outcome of the
reengineering process is presented in graphic representations using Unified
Modeling Language diagrams.

1 Introduction

A wide variety of knowledge-based systems (KBS) have been developed to support
decision-making across a great number of areas, with the oldest and best established
being the field of medical services. Developing a system takes a considerable number
of man-months and, during this period, the requirements for the system may change.
The understanding of the problem may also change, or new discoveries may be made.
Thus, as a result of change and invention, even a system that has been newly
implemented for an organization can be out of date immediately [3].
Changing the system at this point may not be difficult or time-consuming, if, for
example, the alteration required is only a minor modification in the source code and if
extensive documentation is available for the system. However, it can present a
considerable problem if a profound alteration is required in a large and complex
system with complicated and highly interrelated source code. In such a situation,
problems such as inconsistency, incompleteness and redundancy can easily arise.
Reverse engineering (also known as reengineering) is commonly used to
maintain and reuse source code [4]. Reengineering can also be used to transform the
architecture of expert systems to obtain an object-oriented architecture [1].
During the process, the reengineering can avoid incorrectness, check for
inconsistencies and detect incompleteness. However, since reengineering needs to be
able to deal with a large amount of code, it is a cumbersome process [4]. Because of
this, automated assistance is needed for reengineering of the domain knowledge and
the functionality of a system.

B. Gabrys, R.J. Howlett, and L.C. Jain (Eds.): KES 2006, Part I, LNAI 4251, pp. 342 – 351, 2006.
© Springer-Verlag Berlin Heidelberg 2006

In this paper, we present an approach to KBS-development based on
reengineering principles. The reverse engineering for KBSs has to be as general as
possible and should work with as many kinds of knowledge representation as
possible. Common representations for these kinds of systems are rules, fuzzy
rules, frames, neural networks and hybrid systems that use a mix of these
representations. In our approach, we utilize reengineering to handle systems
developed with rule-based representation techniques.
If it is to work successfully, reengineering needs to gather not only all the rules
in the knowledge base, but also capture the relationships between each rule
and all others. It also collects all the pre-stored facts, the user-given facts
received in response to the questions posed to the end users, and the conclusions that
can be presented to the end users. In addition, it is important to capture the
relationships between the rules and the facts and, also, between the rules and the
conclusions. Taken together, these relationships constitute the reasoning strategy of
the system.
During rule gathering, the reengineering process must be able to ensure that
correctness, consistency and completeness are maintained. The definitions of these
terms differ somewhat, but in this work, correctness means that the system’s
solution matches the opinions of human experts. Consistency is used in the
sense that a knowledge base contains no rules that are conflicting, redundant,
subsumed or circular. Completeness indicates that the rule base has no dead-end rules,
i.e. no rules that have a premise that can never be satisfied [14]. The outcome of
this reengineering process is presented as a graphic representation using Unified
Modeling Language (UML) [9] diagrams.

2 Related Work

Several reengineering models have been developed over the years. Some operate on
the source code level, and others recover the design and specifications [4]. These
frequently use a knowledge-based approach to transform the architecture of a system
from one form to another. An example of such a transformation is a model developed
for the reengineering of expert systems to obtain an object-oriented architecture [1].
This model reengineers non-object-oriented systems into object-oriented architectures
by using a knowledge-based approach. Other models have been used to change
conventional architectures to object-oriented architectures [4] and to transform
procedural programs into object-oriented programs, such as CORET (object-oriented
reverse engineering) [10].
In our approach, we are not transforming architectures, but instead, using
reengineering in KBSs to collect and represent the knowledge contained in the
systems. This approach allows us to modify the knowledge and to extend the
knowledge base with new knowledge. The collected knowledge is presented in a
graphic form in UML diagrams to support modification by the developers.

3 Reengineering in Knowledge Based Systems

The existence of large quantities of undocumented code makes reengineering a
common and painful problem in industry. This code needs to be understood before it
can be fixed, modified or moved to a new system. Several techniques have been
developed to gain an understanding of the code [2]. The
most common of these techniques is the ad hoc approach of a person reading the code
and piecing together a picture of the system. There are also some tool-based
techniques that have been applied and can certainly help [12]. However, the
understanding still requires intense work by experienced software professionals.
In the case of KBSs and rule-based systems, this is also a severe problem. One
advantage in conventional KBSs is modularity: the modular structure of the
code, whether object-oriented classes or older procedural code, helps guide the
discovery process. However, it is typical for some of the older KBSs to lack a modular
structure and to be driven by data. These KBSs lack the explicit control structures of
conventional programs, so the practitioner is faced with a tangle of propositions,
facts and rules.
Three issues confronting the reengineering tool are to find:
• the part of the rule set that drives a particular conclusion
• the conclusions that are possible from a set of rules or inputs
• the difference set between the sets of rules or inputs that result in two different
conclusions.
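
The three issues can be illustrated on a small dependency structure. The following is a minimal Python sketch, not the tool described in this paper; the rule base RULES, its rule names and the helper functions are hypothetical and were chosen only for illustration. It assumes that each rule is known together with its premises (other rules or inputs) and an optional conclusion.

```python
# Hypothetical toy rule base: rule name -> (premises, conclusion).
# A premise is either the name of another rule or the name of an input.
RULES = {
    "general disease": ({"fever"}, None),
    "allergic purpura": ({"general disease", "red spot rash"},
                         "find doctor immediately"),
    "common cold": ({"general disease", "runny nose"}, "rest at home"),
}


def support(name, rules):
    """Return the rule `name` plus every rule and input it depends on."""
    seen = set()
    stack = [name]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if node in rules:
            stack.extend(rules[node][0])
    return seen


def rules_for_conclusion(conclusion, rules):
    """Issue 1: the part of the rule set that drives a conclusion."""
    result = set()
    for name, (_, concl) in rules.items():
        if concl == conclusion:
            result |= support(name, rules)
    return result


def reachable_conclusions(inputs, rules):
    """Issue 2: conclusions obtainable from a set of inputs (forward chaining)."""
    known = set(inputs)
    conclusions = set()
    changed = True
    while changed:
        changed = False
        for name, (premises, concl) in rules.items():
            if name not in known and premises <= known:
                known.add(name)
                if concl is not None:
                    conclusions.add(concl)
                changed = True
    return conclusions


def difference(first, second, rules):
    """Issue 3: what supports only the first, and only the second, conclusion."""
    a = rules_for_conclusion(first, rules)
    b = rules_for_conclusion(second, rules)
    return a - b, b - a
```

In this sketch the shared rule "general disease" and the shared input "fever" drop out of the difference, which leaves only the parts that separate the two conclusions.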

3.1 Reengineering for Knowledge in the Knowledge Base

Reengineering involves collecting all kinds of knowledge in a system and taking into
account any special functionality, e.g., procedural code, utilized by the system to
make use of the knowledge, for example in an inference mechanism. The knowledge
is often represented as production rules, heuristic rules and facts. The reengineering
must collect the different kinds of rules, together with their relationships to all other
rules, to cover all of the knowledge contained in the knowledge base. Moreover, as pre-stored
facts are utilized within some of the rules, these must not be omitted since the
execution of the rules depends on these facts. Finally, since the
reengineering must collect all the information that is involved in the reasoning of the
system, it is necessary to track down the user-given facts, in the form of answers
given or the alternative responses associated with the questions posed, and the
conclusions that will be presented for the different responses.
The reengineering uses different search methods to locate the knowledge in the
system. Different search criteria are used to search for knowledge with each set of
criteria depending on the system’s representation and syntax. Consequently, we use
different approaches to find rules and facts.
The first and simplest approach is to search for rules as predicates in the
representation. The reengineering can easily pick up the rules using a keyword, e.g.,
“rule”, if that is the word used to denote rules. See the example, rule number 105:

rule(11, 'allergic purpura object', 'Find doctor immediately text', 1000) :-
    check('general disease rule', 'general disease text', 1000),
    fact('fever object', '>', 'normal'),
    reply('one rash with strong red spot object', 'Yes'),
    reply('several symptoms: Vomit / Headache / Oversensitive to light / Pain when you bend your head forward', 'No').

Fig. 1. Example of source code for a rule
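
A keyword search of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the implementation used in this work: the names SOURCE, HEAD and find_rule_heads are hypothetical, the rule text is a shortened copy of the rule in Figure 1, and the pattern assumes that rule heads start at the beginning of a line and that quoted names contain no escaped quotes.

```python
import re

# Shortened copy of the rule in Figure 1, used as sample input.
SOURCE = """
rule(11, 'allergic purpura object', 'Find doctor immediately text', 1000) :-
    check('general disease rule', 'general disease text', 1000),
    fact('fever object', '>', 'normal'),
    reply('one rash with strong red spot object', 'Yes').
"""

# A rule head: the keyword "rule" at the start of a line, followed by the
# rule number and the quoted rule name.
HEAD = re.compile(r"^rule\(\s*(\d+)\s*,\s*'([^']*)'", re.MULTILINE)


def find_rule_heads(source):
    """Return (number, name) for every clause whose head uses the keyword."""
    return [(int(number), name) for number, name in HEAD.findall(source)]
```

Because the search is anchored to the keyword, it finds nothing if the system denotes its rules with another word, which is the limitation that motivates the further approaches in this section.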

The reengineering procedure continues by searching for all predicates referenced
in the rule. Many of the associated rules can be found by
using the content of the rules and relationships to other rules. One rule will usually
reference several others, which can therefore be found by investigating the content of
the rule body, e.g., through the predicate “check”. In general, the relationships between
the rules result in a complex graph of dependencies.
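
The graph of dependencies can be sketched as a mapping from each rule to the rules it references through “check”. The Python fragment below is a simplified illustration, not the actual tool: rule 12 in the sample text is hypothetical, the name dependency_graph is an assumption, and clauses are split naively at a period that ends a line.

```python
import re

# Sample input: the first rule follows Figure 1 (shortened), the second
# rule is hypothetical and only serves to complete the example.
SOURCE = """
rule(11, 'allergic purpura object', 'Find doctor immediately text', 1000) :-
    check('general disease rule', 'general disease text', 1000),
    fact('fever object', '>', 'normal').
rule(12, 'general disease rule', 'general disease text', 1000) :-
    fact('fever object', '>', 'normal').
"""


def dependency_graph(source):
    """Map each rule name to the rule names referenced via check(...)."""
    graph = {}
    # Naive clause split: a clause is assumed to end with ".<newline>".
    for clause in re.split(r"\.\s*\n", source):
        head = re.match(r"\s*rule\(\s*\d+\s*,\s*'([^']*)'", clause)
        if not head:
            continue
        _, _, body = clause.partition(":-")
        graph[head.group(1)] = re.findall(r"check\(\s*'([^']*)'", body)
    return graph
```

A real knowledge base would need a proper Prolog reader instead of regular expressions, but the resulting structure, a directed graph over rule names, is the same.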
By searching the internal contents of these rules, many of the pre-stored facts can
be found, i.e., those labeled “fact”, see Figure 1. These pre-stored facts are often
stored in a database and are present even though the system does not execute
commands involving them.
User-given facts are knowledge that is required by the system, but which,
unfortunately, can only be found in the database during a session, i.e., “reply”, as
shown in Figure 1. Since the user-given facts are not collected from the rules, the
reengineering must reconstruct the user-given facts hypothetically. This is accomplished by
inspecting the rules and determining what possible values will be given. This step is
required because the facts inserted into the system will depend on the consultation, and
the values obtained will vary from one session to another.
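
One way to reconstruct the user-given facts hypothetically is to collect, for every question, the answer values that some rule tests for. The sketch below illustrates this in Python under stated assumptions: rule 12 and the shortened question texts are hypothetical, and possible_answers is an illustrative name rather than part of the tool.

```python
import re
from collections import defaultdict

# Sample input: rule 11 follows Figure 1 (shortened); rule 12 is hypothetical.
SOURCE = """
rule(11, 'allergic purpura object', 'Find doctor immediately text', 1000) :-
    reply('one rash with strong red spot object', 'Yes'),
    reply('several symptoms', 'No').
rule(12, 'harmless rash object', 'Wait and see text', 1000) :-
    reply('one rash with strong red spot object', 'No').
"""

# reply('<question>', '<expected answer>')
REPLY = re.compile(r"reply\(\s*'([^']*)'\s*,\s*'([^']*)'\s*\)")


def possible_answers(source):
    """Map each question to the set of answers that the rules test for."""
    answers = defaultdict(set)
    for question, value in REPLY.findall(source):
        answers[question.strip()].add(value)
    return dict(answers)
```

The result only lists the answers that the rules mention, so it is a lower bound on what an end user might actually reply during a session.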
Despite conducting a comprehensive search of all of the rules, there is no
guarantee that all kinds of rules will be identified and picked up because, for
example, some rules take on different structures depending on the purpose of the
rule. These rules are different because they will accept different numbers of
arguments depending upon the call, e.g., “rule(105, 'sign of allergic
purpura', 'Find a doctor immediately text', 1000)” which
has four arguments and “check('general disease rule', 'general
disease text', 1000)” which has three arguments, and where the second
alternative omits the rule number.
The second, and more complicated, approach is to search for the syntactic form of the
rules. It is used when the rules have a different appearance from that described
above and are not denoted by the presumed word “rule”. Rules frequently have a predetermined
pattern that can be used by the interpretation engine when reengineering gathers the
rules in a knowledge base. This is referred to as the pattern of antecedent-consequent
rules. Thus, determining the structure of the rules is the starting point for a search of
this type, for which it is necessary to identify the first rule and then use the pattern
revealed to find the rest of the rules.
The engine for interpretation can use the built-in predicate for interpreting
antecedent-consequent rules. This predicate, called clause in Prolog,
separates the head of a clause from its body, so that the head and the body can be
checked separately. The reengineering
utilizes this clause predicate by tracking the interpreter while forcing the system to
execute the engine as though someone were consulting the system. The clause
predicate uses the predicate’s internal structure to interpret the rules, pre-stored facts
and user-given facts to reach a conclusion or conclusions. This search works for a
couple of KBSs, namely, those with predetermined patterns. However, some systems
use other kinds of self-developed interpreters.
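
The separation of head and body can also be imitated outside Prolog, which may help when the source text is analyzed without running the system. The following Python sketch is only an approximation of what the clause predicate provides: split_clause is a hypothetical helper that works on the text of a single clause, and it assumes that quotes are balanced and that “:-” does not occur inside a quoted name.

```python
def split_clause(clause):
    """Split the text of one clause into its head and a list of body goals."""
    head, _, body = clause.strip().rstrip(".").partition(":-")
    goals = []
    depth = 0        # nesting depth of parentheses
    quoted = False   # True while inside a quoted name
    current = []
    for ch in body:
        if ch == "'":
            quoted = not quoted
        elif not quoted:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            elif ch == "," and depth == 0:
                # A comma outside quotes and parentheses separates two goals.
                goals.append("".join(current).strip())
                current = []
                continue
        current.append(ch)
    last = "".join(current).strip()
    if last:
        goals.append(last)
    return head.strip(), goals
```

Applied to the rule in Figure 1, the head is the “rule” term and the goals are the “check”, “fact” and “reply” terms in their original order; a clause without a body yields an empty list of goals.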
In the third approach, the reengineering tool looks for the rules without using any
presumed word or pattern. If the relevant word or the pattern it should seek is
unknown, the rules can be found by counting the number of occurrences of a
predicate with the same name. The search will find predicates that occur frequently in
the code. In a system, rules are usually the most common predicates, and the number
of occurrences of that predicate will generally exceed the number of times the other
predicates occur. Again, the reengineering uses the same method as described in the first
approach to determine the pattern intrinsic to the rule and then uses the pattern or
patterns to find all the other rules in the
knowledge base. However, it is not certain that the reengineering procedure will find
the rules of the system. Instead, the reengineering tool may find a predicate that is the
code for functionality of the system rather than rules or facts. The knowledge
engineer must analyze whether the predicate belongs to the knowledge or the
functionality.
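
Counting the occurrences of predicate names can be sketched as follows. The Python fragment is an illustration only: the sample clauses in SOURCE are hypothetical, head_functor_counts is an assumed name, and the sketch counts clause heads, which is one possible reading of “occurrences of a predicate with the same name”.

```python
import re
from collections import Counter

# Hypothetical sample: three rules, one helper predicate and one fact.
SOURCE = """
rule(1, 'a', 'text a', 1000) :- fact('x', '>', 'normal').
rule(2, 'b', 'text b', 1000) :- check('a', 'text a', 1000).
rule(3, 'c', 'text c', 1000) :- reply('q', 'Yes').
average(List, Avg) :- sum_list(List, Sum), length(List, N), N > 0, Avg is Sum / N.
fact('x', '>', 'normal').
"""


def head_functor_counts(source):
    """Count how often each predicate name appears as the head of a clause."""
    counts = Counter()
    # Naive clause split: a clause is assumed to end with ".<newline>".
    for clause in re.split(r"\.\s*\n", source):
        match = re.match(r"\s*([a-z][A-Za-z0-9_]*)\(", clause)
        if match:
            counts[match.group(1)] += 1
    return counts
```

In the sample, “rule” is the most frequent head, but the count for “average” shows why the knowledge engineer still has to decide whether a frequent predicate is knowledge or functionality.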
In addition to obtaining the rules, the other predicates, in the form of facts,
questions used to obtain the user-given facts and conclusions all need to be found by
the reengineering using the same kind of searching procedures as mentioned above.
To find the questions and the conclusion, the rules are investigated. However, the
connections between the rules, questions and the conclusions must also be collected.

3.2 Relationships Within the Set of Rules


The outcome of the reengineering is the knowledge, partitioned into rules, facts,
which can be either pre-stored or user-given, and conclusions. Each
of these can be considered a distinct set; thus there will be one set of rules, one
set of pre-stored facts, and so on. Since a knowledge base can contain several
thousand rules, the set of rules will be huge and will involve a complex network of
relationships. As a result, collecting the rules into a rule set will have an impact
upon the comprehensibility of the rules. The rule set also contains the relationships
between rules and between rules and facts, which the reengineering must reveal.
Reengineering to determine the relationships is accomplished by traversing the
network of rules. Almost every rule utilizes another rule because of the need to
simplify the complex knowledge in the system and, hence, a complex structure of
rules is created during the development of a knowledge base. Rules are built from the
domain expert’s expertise, but these rules usually utilize facts and not other rules.
The knowledge engineer transfers the domain knowledge from an expert to a system
and, in so doing, learns how problem-specific rules are used. During the development
of the knowledge base, different rules are generated to facilitate the handling of the
production or heuristic rules, e.g., meta-rules, inferred rules and derived rules. These
rules are highly interrelated with other rules.
Some rules are not really related to the domain problem [11] at all, e.g., meta-rules.
Meta-rules are used to direct the problem solving and to determine how best to solve
problems. These meta-rules are used when deciding which rules to apply to avoid a
conflict within the reasoning strategy [8]. A meta-rule devises a strategy for the use of
task-specific rules in the system [13] and is, therefore, dependent on other rules.
Some rules are specially constructed to enable them to be utilized by many other
rules. The most commonly constructed rules of this kind are inferred rules. Inferred rules are derived
from the initial rules and represent the process of deriving new information from old
[11]. An inferred rule links several different rules (e.g., A->B, B->C) by concluding
that A implies C. In the inferred rules, the antecedent of the initial rule
appears in the body of the rule. If the relationships between inferred rules and initial
rules are found, it is important that they are preserved since they are affected by
changes to the rule base.
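
The link between an inferred rule and the initial rules it was built from can be recorded explicitly, so that a later change to an initial rule reveals which inferred rules are affected. The sketch below is a simplified Python illustration: rules are reduced to (antecedent, consequent) pairs, inferred_rules is a hypothetical name, and only one chaining step (A->B, B->C gives A->C) is performed.

```python
def inferred_rules(initial):
    """Return {inferred rule: [initial rules it was chained from]}.

    `initial` is a set of (antecedent, consequent) pairs. If several
    intermediate steps lead to the same inferred rule, only one of them
    is kept in this sketch.
    """
    derived = {}
    for a, b in initial:
        for b2, c in initial:
            if b == b2 and a != c and (a, c) not in initial:
                derived[(a, c)] = [(a, b), (b2, c)]
    return derived
```

For the pair of rules A->B and B->C, the result contains the single inferred rule A->C together with both of its sources, which is the relationship that has to be preserved when the rule base is modified.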
Derived rules, as their name suggests, are rules that are derived from other rules.
They are similar to inferred rules since they involve the process of linking new
information. However, instead of being inferred, the rules are derived from other
rules, making a derived rule a short form for several lines of primitive rules. The
derived rules may have direct relationships to the rules that produced them, for which
they are short cuts. These relationships, too, are important and must be retained.
As mentioned above, in the rule set, the rules also have a relationship to the facts,
both pre-stored and user-given. Moreover, the majority of the rules are connected to
the conclusions presented to the end users. The connections to the facts and
conclusions must be taken into consideration.

3.3 Reengineering for Source Code

Reengineering must operate on the source code, which implements the functionality,
e.g. the functions and the predicates, used to handle the knowledge of the system. The
functionality is essential for working with the knowledge, e.g., for using the
interpretation engine, performing calculations and for consideration of factors relating
to uncertainty.
Reengineering collects the functionality by searching in the source code. It follows
the code and searches for the first place where the engine invokes rules. Other
functionality is evident when the rules themselves need to calculate a result, e.g., to
find averages. In these cases, the reengineering searches for the functionality from the
rules, i.e., it searches for the rules that invoke this functionality. An additional
functionality is the certainty factor. The interpreter usually calculates these certainty
factors at the end of a session. To collect this functionality, the reengineering must
again search in the source code. Hence, the reengineering operates on the code for the
interpretation engine.

3.4 Reengineering to Handle Problems Related to Consistency and Completeness

Consistency means that a knowledge base has no rules that are conflicting, redundant, subsumed
or circular. If the knowledge base has circular rules, the reengineering tool will run into
an infinite loop, which must be taken care of during the reengineering. The structure of
circular rules must be recognized to be able to avoid the potential loops.
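
Circular rules can be recognized before the traversal loops by marking the rules that are currently being followed. The paper does not state how the tool does this; the Python sketch below uses a standard depth-first search with three node states as one possible technique. The name find_cycle and the graph format (rule name mapped to the list of referenced rule names) are assumptions.

```python
def find_cycle(graph):
    """Return one circular chain of rules as a list, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in graph.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: a circular reference.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for start in graph:
        if color.get(start, WHITE) == WHITE:
            found = visit(start)
            if found:
                return found
    return None
```

For a rule base with several thousand rules, the recursion would probably have to be replaced by an explicit stack, but the marking scheme stays the same.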
Completeness means having a rule base without dead-end rules, i.e., avoiding the
problems caused by having a premise that can never be reached or which does not
exist. The reengineering would try to find the rule and run into a search problem.
The tool works through the code until it finds the rule corresponding to a premise, but
if the premise is lacking, the reengineering tool is searching for a non-existent rule.
The reengineering tool keeps track of all the rules it has encountered and, when a
certain time limit is exceeded, concludes that the rule is missing.
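
Once all rules have been collected, a missing rule can also be detected without a time limit, by comparing the referenced rule names with the names that are defined. The Python sketch below shows this alternative to the time-limit strategy described above; it is not the strategy used by the tool, and missing_rules and the graph format (rule name mapped to the list of referenced rule names) are assumptions.

```python
def missing_rules(graph):
    """Map each rule to the referenced rules that are defined nowhere."""
    defined = set(graph)
    report = {}
    for name, references in graph.items():
        missing = sorted(set(references) - defined)
        if missing:
            report[name] = missing
    return report
```

A rule that appears in the report has a premise that can never be satisfied, i.e., it is a dead-end rule in the sense used here.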
The definition of correctness corresponds to the match of the conclusion of the system
with the opinions of domain experts. Since correctness is human based, reengineering
cannot make this kind of check, because such a check would require the system to reflect
a domain expert’s knowledge and reasoning to reach the same conclusions.
Graphic modeling is used to make the knowledge more accessible and the experts
can use the models to check the rules and facts that are involved in a conclusion. If
the models are easy to follow and understand, then it will be more straightforward for
the users to judge consistency, completeness and correctness.

4 Presenting the Sets in UML Diagrams

The outcome of the reengineering tool has to be presented in a manner that is
comprehensible for the engineer, enabling them to maintain the system and reuse the
knowledge. When a graphic representation is required, UML diagrams can be used to
present knowledge such as rules, facts, conclusions and reasoning strategies in KBSs,
as has been shown in the references [5, 6]. In this paper, however, the UML diagrams
are used to present the outcome of the reengineering.

[Figure 2: UML sequence diagram for the “allergic purpura rule”. Elements along the
lifeline: check rule “general disease rule”; fact “fever object” > normal; reply fact
“one rash with strong red spot object”: Yes; reply fact “several symptoms object:
Vomit / Headache / Oversensitive to light / Pain when bending head”: No;
conclusion “Contact doctor immediately text”.]

Fig. 2. Sequence diagram for the rule


The example presented in Figure 2 illustrates the reengineering operation on the rule
“allergic purpura rule”, previously presented in Figure 1. Within the rule, another rule
is used, the “general disease rule”. The rule also incorporates the fact “fever object”,
which corresponds to a high body temperature, the fact “one rash with strong red spot
object” with the user-given answer “Yes” and the fact “Several symptoms object:
Vomit / Headache / Oversensitive to light / Pain when bending head forward” with
the user-given answer “No”. The conclusion “Find doctor immediately text” is
connected to the rule. When a rule has been investigated, it can be folded into
a UML package, as demonstrated by the “general disease rule” in Figure 2.
The internal content of a rule is specified inside a package.
If a user is to be able to grasp the outcome of the reengineering, the tool must
generate diagrams to answer specific questions posed by the user, for example, how a
specified conclusion can be obtained from the rules. The diagram provides a graphic
view of the answers.
The relationships between the rules together with the order of the system invoking
the rules are presented along a lifeline. This lifeline illustrates the order with arrows
presenting the rules, the pre-stored facts and the user-given facts.
At the end of the lifeline the conclusion is presented, i.e., the “Contact doctor
immediately text” in Figure 2. In this example, the two conclusions that are attainable
are the one in the package “general disease rule”, which cannot be seen in this example, and
the one at the end of the lifeline.
In addition, all kinds of relationships between the sets of rules or inputs that
produce two conclusions will be presented in a sequence diagram. However, to
illustrate this more clearly we use a UML object diagram.

Fig. 3. Object diagram for the rule

Figure 3 presents the connections between questions, rules and conclusions using
object diagrams. This example illustrates the same rule as was presented in Figure 2,
revealing how the object diagram presents the connections between different parts
more clearly than the sequence diagram.

5 Conclusions and Further Work


The problem with the current generation of KBSs and expert systems is the poor
documentation and the deficiency of written information. This is complicated by the
fact that not all expert systems have a modular structure. Some systems are developed
without applying an architecture, and the code is not structured into different bases,
such as knowledge bases.
In this paper, we have shown that reengineering can be used to gather knowledge.
The reengineering has been applied to problems related to knowledge, relationships
and consistency. The first and simplest approach, which is to search for rules, has
been tested in the KANAL-system [7]. The system is an intelligent rule editor system,
which is used to support a domain expert during the development of knowledge-based
systems [7].
To comprehend the outcome of the reengineering and thereby to modify the
knowledge, we present it through graphical representations, such as UML sequence
diagrams. However, we have not worked with the application of reengineering to the
source code handling the interpretation engine. It has previously been shown that
UML activity diagrams and state chart diagrams can be used to illustrate source code in
KBSs, which is why they may be suitable for illustrating the outcome of the reengineering of the
functionality.
Further work is needed to extend our reengineering tool to enable it to handle all
kinds of knowledge representations. Reengineering operates on rule-based systems
but it can easily be expanded to handle knowledge representation such as fuzzy rules
and frames. Moreover, the principle of reengineering can be extended to handle
neural networks and hybrid representation.

References
[1] Babiker, E., Simmons, D., Shannon, R., & Ellis, N.: A model for reengineering legacy
expert systems to object-oriented architecture, Expert systems with applications, vol. 12,
no. 3 (1997) 363-371.
[2] Carriere, J., O'Brian, L., and Verhoef, C.: Reconstruction Software Architecture.
Appearing in Bass L., Clements P., and Kazman R., Software Architecture in Practice
(2nd edition), ISBN 0-321-15495-9, Addison-Wesley, 2003.
[3] Durkin, J.: Expert System Design and Development. Prentice Hall International Editions.
MacMillian Publishing Company, New Jersey (1994).
[4] Gall, H., Klösch, R. & Mittermeir, R.: Using Domain Knowledge to Improve Reverse
Engineering. International Journal on Software Engineering and Knowledge
Engineering, World-Scientific Publishing, Vol. 6, No. 3, (1996) 477-505.
[5] Håkansson, A.: UML as an approach to Modelling Knowledge in Rule-based Systems.
(ES2001) The Twenty-first SGES International Conference on Knowledge Based Systems
and Applied Artificial Intelligence. Peterhouse College, Cambridge, UK; December 10th-
12th (2001).
[6] Håkansson, A.: Supporting Illustration and Modification of the Reasoning Strategy by
Visualisation. (SCAI'03) The Eighth Scandinavian Conference on Artificial Intelligence,
Bergen, Norway, November 2nd-4th (2003).
[7] Håkansson, A., Widmark, A. and Edman, A.: The KANAL-system, an intelligent
rule editor supporting the knowledge acquisition process. Swedish Artificial Intelligence
Society Annual Workshop 2000 (SAIS 2000) (2000).
[8] Jackson, P.: Introduction to Expert Systems. Addison-Wesley Longman Limited. ISBN 0-
201-87686-8 (1999).
[9] Jacobson, I. Rumbaugh, J. & Booch, G.: The Unified Modeling Language User Guide.
Addison-Wesley, USA (1998).
[10] Mittermeir, R., Rauner-Reithmayer, D., Taschwer, M., Gall, H. & Weidl, J.: CORET
Object-Oriented Re-Structuring. CORET Methodology and Tool. In: Mittermeir et al
(Hrsg.): CORET Methodology and Tool 1 Klagenfurt, Wien: Inst. für Informatik-Systeme
26. November (1998).
[11] Merritt, D.: Building Expert Systems in Prolog. Springer-Verlag, New York Inc. ISBN 0-
387-97016-9 (1989).
[12] Müller H., Orgun M., Tilley S. and Uhl J.: A Reverse Engineering Approach To
Subsystem Structure Identification. In Journal of Software Maintenance Research and
Practice, vol. 5, no.4, (1993)
[13] Negnevitsky, M.: Artificial Intelligence – A Guide to Intelligent Systems. Addison-
Wesley, Pearson Education ISBN 0201-71159-1 (2002).
[14] Polat F. & Guvenir H.A.: UVT: A Unification-Based Tool for Knowledge Base
Verification, IEEE Expert: Intelligent Systems and Their Applications, vol. 8, no. 3, June,
(1993) 69-75.
