Unit-2 Reverse Engineering
Definitions
The system is altered by first reverse engineering and then forward engineering.
– Control-flow-driven restructuring:
This involves the imposition of a clear control structure within the source
code and can be either intermodular or intramodular in nature.
– Efficiency-driven restructuring:
This involves restructuring a function or algorithm to make it more efficient. A
simple example of this form of restructuring is the replacement of an
IF-THEN-ELSIF-ELSE construct with a CASE construct.
– Adaptation-driven restructuring:
This involves changing the coding style in order to adapt the program to a
new programming language or new operating environment, for instance
changing an imperative program in Pascal into a functional program in Lisp.
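The efficiency-driven example above (replacing an IF-THEN-ELSIF-ELSE chain with a CASE construct) can be sketched in Python. This is only an illustration: the function names and fee values are hypothetical, and a dictionary lookup stands in for the CASE construct so that the sketch also runs on Python versions that lack the `match` statement (added in Python 3.10).

```python
def fee_if_chain(account_type: str) -> float:
    """Original form: a chain of IF-THEN-ELSIF-ELSE tests."""
    if account_type == "savings":
        return 0.0
    elif account_type == "current":
        return 2.5
    elif account_type == "loan":
        return 5.0
    else:
        return 10.0


# Restructured form: one table lookup plays the role of the CASE construct.
# The fee values are made up for illustration.
_FEES = {"savings": 0.0, "current": 2.5, "loan": 5.0}
_DEFAULT_FEE = 10.0


def fee_case(account_type: str) -> float:
    """Restructured form: behaviour is unchanged, only the structure differs."""
    return _FEES.get(account_type, _DEFAULT_FEE)


if __name__ == "__main__":
    # Restructuring must preserve behaviour, so both forms have to agree.
    for kind in ("savings", "current", "loan", "unknown"):
        assert fee_if_chain(kind) == fee_case(kind)
```

The important property is that both forms return the same result for every input; restructuring changes how the code is organised, not what it does.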
3. Reengineering
• This is the process of examining and altering a
target system to implement the desired
modification
• Reengineering consists of two steps
– Firstly, reverse engineering is applied to the target system
so as to understand it and represent it in a new form
– Secondly, forward engineering is applied, implementing
and integrating any new requirements, thereby giving
rise to a new and enhanced system
Benefits
1. Maintenance
– The basic understanding gained through reverse engineering can
benefit maintenance activities in various ways: corrective change,
adaptive/perfective change and preventive change.
2. Software Reuse
– In general terms, software reuse refers to the application of knowledge
about a software system - usually the source and object code - to
develop or maintain other software systems. The software components
that result from a reverse engineering process can be reused.
3. Reverse Engineering and Associated Techniques in Practice
Reverse engineering and associated techniques such as reengineering and
restructuring have had practical applications within different sectors of the
computing industry.
Case study
• A bank has a substantial investment in a Cobol software system that is at
least one million lines of code in length and has been running for over 20
years. It is used on a daily basis to perform various operations such as
managing customer accounts and loans. After several years of modification -
both planned and ad hoc - the system has become too expensive to
maintain. As a result, the bank wants some advice on the next step to take.
Suppose that you have been employed as a software maintenance
consultant.
• What advice would you give the bank?
• Indicate the reasons for any recommendations you
make.
Reuse and Reusability
• To improve maintainability
Approaches to Reuse
Five dimensions of successful software reuse (SR)
Classic software reuse examples
• High-level programming languages (e.g. Java, SQL)
• Library of generic (parameterized) components (e.g. Math library)
• Parser-generators and application generators (e.g. YACC, JavaCC, ANTLR, automake, Eclipse)
• Menu/table-driven mechanism for specifying parameters (e.g. GUI widgets)
• Application frameworks (e.g. Smalltalk, Motif, Swing/SWT)
• Aspects: pointcuts and advice (e.g. AspectJ)
• Internationalization/Localization (i18n/l10n) (e.g. tag transformations)
• Document generation (e.g. Javadoc/XDoclet, DocBook, LaTeX, CSS, RSS, XSLT)
• Commercial off-the-shelf (COTS) components through middleware (e.g. OLE/ActiveX, CORBA, Web Services)
• Plug-ins, skins, themes, macros, extensions (e.g. Eclipse, Word, WinAmp, Mozilla Firefox)
• Domain engineering and application generation (e.g. SAP)
• Domain-specific languages (DSL) and transformation systems (e.g. Draco, TXL)
• 4G languages (e.g. SQL, wizards, templates, MIL/ADL)
Over 90% of source code in new applications is reused code
1. Composition-Based Reuse
• In the composition approach, the components
being reused are atomic building blocks that are
assembled to compose the target system.
• The components retain their basic characteristics
even after they have been reused.
– Examples of such building blocks are program modules, routines,
functions and objects.
• A simple example of such a mechanism is the
UNIX pipe. It is a way of connecting the output
of one program to the input of another, and as
such, it can be used to create a large program
by combining smaller ones
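The pipe idea can be imitated inside a single program. The following is a minimal sketch, not the UNIX mechanism itself: small generator functions act as the atomic components, and a hypothetical `pipe` helper feeds the output of each stage into the input of the next. All names here are illustrative.

```python
from functools import reduce
from typing import Callable, Iterable, Iterator

# A stage consumes a stream of lines and produces a stream of lines.
Stage = Callable[[Iterable[str]], Iterator[str]]


def grep(pattern: str) -> Stage:
    """Build a stage that keeps only the lines containing `pattern`."""
    def stage(lines: Iterable[str]) -> Iterator[str]:
        return (line for line in lines if pattern in line)
    return stage


def upper(lines: Iterable[str]) -> Iterator[str]:
    """Stage that converts every line to upper case."""
    return (line.upper() for line in lines)


def head(count: int) -> Stage:
    """Build a stage that passes through at most `count` lines."""
    def stage(lines: Iterable[str]) -> Iterator[str]:
        for index, line in enumerate(lines):
            if index >= count:
                break
            yield line
    return stage


def pipe(source: Iterable[str], *stages: Stage) -> Iterable[str]:
    """Connect the output of each stage to the input of the next one."""
    return reduce(lambda data, stage: stage(data), stages, source)


if __name__ == "__main__":
    records = ["loan a", "account b", "loan c", "loan d"]
    for line in pipe(records, grep("loan"), upper, head(2)):
        print(line)
```

Each stage keeps its own behaviour regardless of what it is combined with, which is the defining property of composition-based reuse.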
Black-box reuse
• In black-box reuse, the component is reused
without modification. Since the user does not
need to modify the component prior to reusing it,
only information on what it does is made available.
• Examples of well-understood reusable components
can be found in UNIX and mathematical
applications in Fortran, as reflected in the high
proportion of software engineers reusing them
White-box reuse
• In white-box reuse, the component is
reused after modification. This approach
to reuse requires that the user be
supplied with information on both what
the component does and how it works.
2. Generation-Based Reuse
• In the generation approach, the reusable
components are active entities that are used
to generate the target system.
• Here, the reused component is the program
that generates the target product
• Examples are application generators,
transformation-based systems and
language-based systems.
1. Application Generator Systems
• Application generators are software systems that are used
to generate other applications. Provided the specification
of the application to be generated is expressed in some
notation, it is generated automatically.
• Examples of such notations are formal specification, logic specification,
knowledge-based specification, grammatical specification
and algorithmic specification. Application generators are
usually domain-specific.
• A typical example of an application generator is yacc in
UNIX. This is a program that generates a parser when given
an appropriate grammatical specification
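To make the idea of "a program that generates a program" concrete, here is a deliberately tiny sketch. It is not yacc and does not handle grammars: it only assumes a specification that lists the fields of a delimited record, and it emits Python source for a function that parses such records. The names `generate_parser` and `build` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def generate_parser(name: str, fields: List[Tuple[str, str]], sep: str = ",") -> str:
    """Return Python source for a function that parses one delimited record.

    `fields` is the specification: (field name, converter name) pairs in order.
    """
    lines = [
        f"def {name}(text):",
        f"    parts = text.split({sep!r})",
        f"    if len(parts) != {len(fields)}:",
        f"        raise ValueError('expected {len(fields)} fields')",
        "    return {",
    ]
    for index, (field, type_name) in enumerate(fields):
        lines.append(f"        {field!r}: {type_name}(parts[{index}].strip()),")
    lines.append("    }")
    return "\n".join(lines) + "\n"


def build(source: str, name: str) -> Callable[[str], Dict[str, object]]:
    """Compile the generated source and return the generated function."""
    namespace: Dict[str, object] = {}
    exec(source, namespace)  # acceptable here because we generated the source ourselves
    return namespace[name]


if __name__ == "__main__":
    spec = [("account", "str"), ("balance", "float")]
    source = generate_parser("parse_account", spec)
    print(source)
    parse_account = build(source, "parse_account")
    print(parse_account("A-17, 250.5"))
```

The reused asset is the generator, not the generated function: a different specification yields a different parser without any hand-written parsing code.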
2. Transformation-Based Systems
• Transformation-based systems are products that are developed using
an approach whereby high-level specifications of the system are
converted through a number of stages into operational programs
• There are two types of transformation that can be used during this
conversion process:
– Step-wise refinement involves continuously refining the high-level
specification by adding more detail until the operational programs are
obtained.
– During linguistic transformation, the operational programs are derived by
transforming the system through different stages. At each of these stages, the
system is represented using an intermediate language, which may be
translated into some other intermediate language until the final
implementation of the system - in a given programming language - is
obtained.
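Linguistic transformation can be sketched on a toy scale. In the illustration below, which is an assumption-laden example and not taken from any real transformation system, a high-level specification (a nested tuple) is first translated into an intermediate stack-machine language and then into Python source. The operation names and both function names are hypothetical.

```python
def to_stack_code(spec):
    """Stage 1: nested-tuple specification -> linear stack-machine code."""
    if isinstance(spec, (int, float)):
        return [("push", spec)]
    operation, left, right = spec
    return to_stack_code(left) + to_stack_code(right) + [(operation, None)]


# Mapping from intermediate-language operations to Python operators.
_SYMBOLS = {"add": "+", "sub": "-", "mul": "*"}


def to_python(code):
    """Stage 2: stack-machine code -> Python expression source."""
    stack = []
    for operation, argument in code:
        if operation == "push":
            stack.append(repr(argument))
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(f"({left} {_SYMBOLS[operation]} {right})")
    return stack.pop()


if __name__ == "__main__":
    spec = ("add", 2, ("mul", 3, 4))   # high-level specification
    code = to_stack_code(spec)          # intermediate representation
    source = to_python(code)            # final implementation language
    print(code)
    print(source, "=", eval(source))
```

Each stage only needs to understand its immediate input and output languages, so a stage can be replaced (for example, to target a different final language) without touching the others.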
Example
• An example of a transformation system is the SETL
language.
• This is an imperative sequential language. Its
philosophy is that computations can be
represented as operations on mathematical
sets. The program specified in SETL is then
translated into a lower-level language called
LITTLE - with semantics that lie between
Fortran and C.
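The idea that a computation can be represented as operations on mathematical sets can be approximated with Python's built-in set type. The sketch below is Python, not SETL, and says nothing about how SETL is translated into LITTLE; the function name is hypothetical.

```python
def primes_below(limit: int) -> set:
    """Compute primes purely with set construction and set difference."""
    candidates = set(range(2, limit))
    # Every composite below `limit` is a product of two candidates.
    composites = {a * b for a in candidates for b in candidates if a * b < limit}
    return candidates - composites


if __name__ == "__main__":
    print(sorted(primes_below(20)))
```

The program states what the result is (candidates that are not composites) rather than how to loop and test, leaving the choice of an efficient realisation to later transformation stages.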
3. Evaluation of the Generator-Based Systems
• In principle it is easy to classify systems according to
the above generation-based taxonomy. In practice,
however, it is difficult to classify a generated system
as belonging to any specific category. Quite often,
the systems are hybrid in nature, borrowing
concepts from more than one of the categories.
• For example, Neighbors' Draco system has features
of both an application generator and a
transformation system.
Domain Analysis
• Domain analysis is a process by which
information used in developing and
maintaining software systems is identified,
captured, and organised with the purpose of
making it reusable when maintaining existing
systems
• Domain analysis is best performed by a
domain expert who has experience of
developing many systems in the same domain
Advantages of domain analysis include the following:
• The repository of information produced
serves as an invaluable asset to an
organization.
• It guards against the effects of a high turnover
of personnel, which would otherwise deprive
organizations of the valuable expertise gained
from previous projects.
Disadvantages
• It requires a substantial upfront investment.
• It is a long-term investment whose benefit will
not be realized until the organization observes
some increase in productivity and a reduction
in the cost of maintenance as a result of reuse.
Components Engineering
• The composition-based approach to reuse
involves composing a new system partly from
existing components. There are two main
ways in which these components can be
obtained.
– The first is through a process known as design for
reuse.
– The second way is through reverse engineering
1. Design for Reuse
Characteristics of Reusable Components
– a. Generality: This means the potential use of a
component for a wide spectrum of application or
problem domains.
– Typical examples of software systems which
exhibit generality are database and spreadsheet
packages which have been designed to
accommodate the needs of a wide variety of
users.
b. Cohesion versus coupling
• Cohesion is an internal property which
describes the degree to which elements such as
program statements and declarations are
related. A module with high cohesion denotes
a high affinity between its constituent elements.
• Coupling, on the other hand, is an external
property which characterizes the
interdependence between two or more
modules in a given system.
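The two properties can be illustrated with a short sketch. The class and function names, and the banking flavour of the example, are hypothetical. The calculator is cohesive because all of its members concern interest computation; the reporting function is loosely coupled because it depends only on a "balance to interest" callable, not on the calculator's internals.

```python
from typing import Callable


class InterestCalculator:
    """High cohesion: every member deals with interest computation only."""

    def __init__(self, annual_rate: float) -> None:
        self._annual_rate = annual_rate

    def monthly_interest(self, balance: float) -> float:
        return balance * self._annual_rate / 12


def statement_line(account: str, balance: float,
                   interest_for: Callable[[float], float]) -> str:
    """Low coupling: needs only a callable, not a particular class."""
    interest = interest_for(balance)
    return f"{account}: balance={balance:.2f} interest={interest:.2f}"


if __name__ == "__main__":
    calculator = InterestCalculator(annual_rate=0.12)
    print(statement_line("A-17", 1000.0, calculator.monthly_interest))
    # Any other callable can be substituted without changing statement_line.
    print(statement_line("A-18", 1000.0, lambda balance: 0.0))
```

Because `statement_line` knows nothing about how interest is computed, either piece can be reused in another system on its own.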
b. Interaction:
• The interaction with the user in terms of the
number of read-write statements per line of
source code should be minimised, but there
should be more interaction with utility
functions - those that can be used for several
purposes.
c. Uniformity and standardisation
• The use of standards across different levels of the
software is likely to promote reusability of software
components. Standards exist for such things as
– user interface design,
– programming style,
– data structure design and documentation.
• For example, standards help towards uniformity in
the techniques used to invoke, control and terminate
functions as well as in the methods used for getting
help during use of the software
d. Data and control abstractions
• Data abstraction encompasses abstract data
types, encapsulation and inheritance
• To allow effective reuse, it is essential to have
a clear separation between the programs that
manipulate data and the data itself.
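A minimal sketch of data abstraction follows, assuming a stack as the abstract data type. The class and helper names are illustrative. Client code uses only the public operations, so the internal list could be replaced by another representation without affecting any caller.

```python
class Stack:
    """Abstract data type: callers see push/pop/is_empty, not the storage."""

    def __init__(self) -> None:
        self._items = []  # encapsulated representation

    def push(self, item) -> None:
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def is_empty(self) -> bool:
        return not self._items

    def __len__(self) -> int:
        return len(self._items)


def reverse_words(sentence: str) -> str:
    """Client code written purely against the Stack interface."""
    stack = Stack()
    for word in sentence.split():
        stack.push(word)
    reversed_words = []
    while not stack.is_empty():
        reversed_words.append(stack.pop())
    return " ".join(reversed_words)


if __name__ == "__main__":
    print(reverse_words("reuse needs abstraction"))
```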
e. Interoperability
• The increasing popularity of interoperability
will aid reuse by allowing systems to take
advantage of remote services.
• Consider for example the issue of patient
identification in clinical systems.
2. Problems with Reuse Libraries
a. The granularity and size dilemma
• When designing a components library, it is important
to have appropriately sized fragments so as to
facilitate understandability and increase the
generic potential.
• This implies that the library will contain many
small components, which poses problems
with classifying and searching.
b. The search problem
• Without an appropriate mechanism for
describing the contents of a components
library, it will be difficult for a user to find
components that match the requirements of
the system to be composed
c. The classification problem
• It is important to store information in the
components library on what components it
contains. However, it is not always obvious
how to specify this information.
• For example, the use of functional
specifications has been suggested as a means
of representing components in the library
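The search and classification problems can be made concrete with a small sketch of a components library. It assumes a simple keyword-based classification, which is only one possible scheme and is not the functional-specification approach mentioned above; all class, method and component names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass(frozen=True)
class Component:
    name: str
    description: str
    keywords: FrozenSet[str] = field(default_factory=frozenset)


class ComponentLibrary:
    """Toy catalogue: components are classified by keyword sets."""

    def __init__(self) -> None:
        self._components: List[Component] = []

    def add(self, component: Component) -> None:
        self._components.append(component)

    def search(self, *keywords: str) -> List[str]:
        """Return names of components classified under all given keywords."""
        wanted = {keyword.lower() for keyword in keywords}
        return [c.name for c in self._components if wanted <= c.keywords]


if __name__ == "__main__":
    library = ComponentLibrary()
    library.add(Component("sort_records", "Sort account records",
                          frozenset({"sort", "records"})))
    library.add(Component("parse_date", "Parse a date string",
                          frozenset({"parse", "date"})))
    library.add(Component("sort_dates", "Sort a list of dates",
                          frozenset({"sort", "date"})))
    print(library.search("sort"))
    print(library.search("sort", "date"))
```

The sketch also shows why classification is hard: a search succeeds only if whoever stored the component chose the same keywords that a later user thinks of.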
d. The specification and flexibility problems
Technical Factors
• 1. Programming Languages
• 2. Representation of Information
• 3. Reuse Library
• 4. Reuse-Maintenance Vicious Cycle
Non-Technical Factors
• 1. Initial Capital Outlay
• 2. Not Invented Here Factor
• 3. Commercial Interest
• 4. Education
• 5. Project Co-ordination
• 6. Legal Issues
Maintenance Measures
• Definitions
• Empirical - capable of being verified or disproved by observation or
experiment
• Entity - either an object (for instance an athlete or chunk of program
code) or an event (for instance sprinting or the design phase in a
software development project).
• Measurement - "the process of empirical, objective encoding of some
property of a selected class of entities in a formal system of symbols so
as to describe them"
• 3. Ease of use: The measures that are finally selected need to be
easy to use, take little time to administer, be unobtrusive, and
possibly be subject to automation.