Conceptual Cohesion of Classes in Object Oriented Systems
Abstract: Cohesion measures for object-oriented software reflect particular interpretations of cohesion. High cohesion positively impacts understanding, reuse, and maintenance. Existing approaches are based on using only the structural information from the source code, such as attribute references in methods, to measure cohesion. This paper proposes a new measure based on analysis of the unstructured information embedded in the source code, such as comments and identifiers. The new measure, named the Conceptual Cohesion of Classes (C3), is based on the mechanisms used to measure textual coherence in cognitive psychology and computational linguistics. The paper presents the principles and the technology that stand behind the C3 measure. A large case study on three open source software systems is presented, which compares the new measure with an extensive set of existing metrics and uses them to construct models that predict software faults. The case study shows that the design concepts and the novel measure capture different aspects of class cohesion compared to any of the existing cohesion measures.

Index Terms: Cohesion, Types of Cohesion, and Design Concepts

functional at one end to coincidental at the other [9]. Since then, various attempts in the object-oriented community have been made to capture cohesion through software metrics [3, 4, 5]. The best known and most investigated of these metrics is the Lack of Cohesion in Methods of a class (LCOM) proposed by Chidamber and Kemerer (C&K) [4]. The LCOM metric rates a class as cohesive if every method uses every instance variable; at the other extreme, a class whose methods use disjoint instance variables is considered uncohesive.

This paper is organized as follows. Section 2 presents cohesion using conceptual classes in object-oriented systems. Section 3 presents the related design concepts using the conceptual cohesion of classes in object-oriented systems. Section 4 presents our proposed system, which uses the conceptual cohesion of classes in object-oriented systems, and compares our study with other works on the subject. Section 5 concludes the paper by presenting lessons learned and future work.
• Coincidental: The elements of a method have nothing in common besides being within the same method.
• Logical: Elements with similar functionality, such as input/output handling and error handling, are collected in one method.
• Temporal: The elements of a method have logical cohesion and are performed at the same time.
• Procedural: The elements of a method are connected by some control flow.
• Communicational: The elements of a method are connected by some control flow and operate on the same set of data.
• Sequential: The elements of a method have communicational cohesion and are connected by a sequential control flow.
• Functional: The elements of a method have sequential cohesion and all elements contribute to a single task of the problem domain. Functional cohesion is the best form of method cohesion since it fully supports the principle of locality and thus minimizes maintenance efforts.

For the discussion of class cohesion and inheritance cohesion we assume that all methods have functional cohesion. The reason is that, in order to determine class/inheritance cohesion, we have to investigate the relationship between methods and instance variables. Methods with low cohesion that access most of the instance variables could fake a high degree of class/inheritance cohesion.

2) Class Cohesion: Class cohesion describes the binding of the elements defined within the same object class, not considering inherited instance variables and inherited methods. Ignoring inheritance, an object class resembles an abstract data type, and the cohesion of abstract data types has been analyzed in detail by Embley and Woodfield in [14]. We therefore build our classification of the various degrees of class cohesion on that of [14] and redefine their definitions according to the idiosyncrasies of object-oriented systems.

Abstract data types in procedure-oriented systems provide functionality to other abstract data types or to modules which are not abstract data types. In contrast, code in object-oriented systems is in general a method bound to a class. Thus for procedure-oriented systems with abstract data types we have to argue which functionality we factor out to abstract data types, whereas in object-oriented systems we have to consider which methods are assigned to which classes.

A further crucial difference between abstract data types in the notion of Embley and Woodfield and classes is implied by the concept of object identity. Whereas a single abstract data type can export different domains, an object class describes exactly one set of objects, where each object is uniquely identified by some system-defined object identifier. Depending on the cohesiveness of a class, its objects represent a single, semantically meaningful data abstraction or several, more or less related data abstractions. In the following we discuss the various degrees of class cohesion from worst, i.e., lowest, to best, i.e., highest.

Separable: The cohesion of a class is rated separable if its objects represent multiple unrelated data abstractions combined in one object. This is often the case if the instance variables and methods of a class can be partitioned into two or more sets such that no method of one set uses instance variables or invokes methods of a different set. In particular, the cohesion of an object class is rated separable if there is a method which neither accesses any instance variable nor invokes any method of the class, or if there is an instance variable which is not referenced by any of the class methods. A class with separable cohesion should be split into several classes, each representing a single data abstraction, i.e., a single semantic concept.

Example: Consider the object class EMPLOYEE as defined:

    class EMPLOYEE {
        …
        int computeCompanyRevenue(SET<PROJECT *> *p);
        …
    };

The method computeCompanyRevenue takes all projects of a company as input parameter and computes the accumulated revenue of that company. It neither accesses any instance variables of EMPLOYEE nor invokes any other method of EMPLOYEE. Thus the cohesion of EMPLOYEE is separable. To improve its cohesion, the method computeCompanyRevenue should be factored out into a different object class, e.g., into class COMPANY.

III. COHESION AND DESIGN QUALITY IN OBJECT ORIENTED SYSTEM

Good design is imperative for building a quality object-oriented system. To assess it, the design properties must be quantified. Several software metrics have been developed to assess and control the design phase and its products. One of the most vital criteria in object-oriented design is cohesion. A module is said to have strong cohesion if it is closely associated with one task of the problem domain and all its components contribute to this single task. Cohesion was introduced by Yourdon and Constantine as “how tightly bound or related the internal elements of a module are to one another”. In terms of design quality, cohesion is an attribute not of code but of a design, and it can be used to predict reusability, maintainability, and changeability.

A. Cohesion and Cohesion Metrics

A class is cohesive if it cannot be partitioned into two or more sets defined as follows. Each set contains instance variables and methods. Methods of one set do not access variables of another set either directly or indirectly. By defining cohesion metrics, many authors have in effect defined class cohesion. As far as the object-oriented model is concerned, almost all of the cohesion metrics are influenced by the LCOM metric defined by Chidamber and Kemerer. According to them, “if an object class has different methods performing different operations on the same set of instance variables, the class is cohesive”. The LCOM (Lack of Cohesion in Methods) defined by them is the result gained
International Journal of Computer Science and Telecommunications [Volume 2, Issue 4, July 2011] 40
from subtracting the number of pairs of methods in a class that share at least one attribute from the number of pairs of methods that have no common attributes. If the value reached in this calculation is negative, the metric is set to zero. This is one metric for assessing cohesion. Likewise, Li and Henry defined LCOM as the number of disjoint sets of methods accessing similar instance variables.

Hitz and Montazeri restate Li and Henry's definition of LCOM in terms of graph theory, defining LCOM as the number of connected components of a graph. A graph consists of vertices and edges. Vertices represent methods. There is an edge between two vertices if the corresponding methods access the same instance variable. Hitz and Montazeri propose dividing a class into smaller, more cohesive classes if LCOM > 1.

B. Design Quality

1) Abstraction: Abstraction is an OOP concept. It provides a facility to hide unimportant information and to expose only the information that is important for client programmers.

For example, consider a car, which has many parts such as wheels, steering, a DVD player, etc. We need to know how to use it. To buy and drive a car, we need not know the structure of all these parts.

Another example is a television. The television has many properties and behaviors, such as height, width, display on, display off, etc., and it also has chips and internal wires which enable the television's functions. But to operate the television we do not need to know these internal things.

2) Architecture: The software architecture of a program or computing system is the structure or structures of the system, which comprise software components, the externally visible properties of those components, and the relationships between them. The term also refers to documentation of a system's software architecture. Documenting software architecture facilitates communication between stakeholders, documents early decisions about high-level design, and allows reuse of design components and patterns between projects.

C. Modularity

Modularity refers to breaking down software into different parts. These parts have different names depending on the programming paradigm (for example, we talk about modules in imperative programming and objects in object-oriented programming). Breaking the project down into pieces makes it (i) easier to fix, because problems can be isolated more easily, and (ii) possible to reuse the pieces.

D. Refinement

In each step, one or several instructions of the given program are decomposed into more detailed instructions. This successive decomposition or refinement of specifications terminates when all instructions are expressed in terms of any underlying computer or programming language.

E. Patterns

Patterns are a way to describe some best practices used in designing software applications. A pattern describes a solution to a recurring design problem. The design patterns are broken down into three categories: creational, structural, and behavioral patterns.

Creational patterns are used to create objects in an application. Patterns like Factory Method are used to defer the instantiation of an object to inherited subclasses, while the Composite pattern allows for a recursive tree structure of containers and elements.

Structural patterns are used to design the structure of modules in an application. For example, the Adapter pattern can be used to modify an existing module to work with a module under development. The Bridge pattern has a similar use. The Composite pattern can also be considered a structural pattern because of the tree structure that is created.

Behavioral patterns describe how objects communicate with each other. The Observer pattern is used to notify many classes of a change in the application. The Mediator pattern can be used to augment communication between classes without all of the classes knowing about each other.

F. Information Hiding

In computer science, information hiding is the principle of segregating the design decisions in a computer program that are most likely to change, thus protecting other parts of the program from extensive modification if a design decision is changed. The protection involves providing a stable interface which protects the remainder of the program from the implementation (the details that are most likely to change).

The term encapsulation is often used interchangeably with information hiding. Not all agree on the distinction between the two, though; one may think of information hiding as the principle and encapsulation as the technique. A software module hides information by encapsulating it in a module or other construct which presents an interface. A common use of information hiding is to hide the physical storage layout for data so that, if it is changed, the change is restricted to a small subset of the total program.

In object-oriented programming, information hiding (by way of nesting of types) reduces software development risk by shifting the code's dependency on an uncertain implementation (design decision) onto a well-defined interface. Clients of the interface perform operations purely through it, so if the implementation changes, the clients do not have to change.

G. Refactoring

Refactoring is a disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior. Its heart is a series of small behavior-preserving transformations. Each transformation (called a 'refactoring') does little, but a sequence of transformations can produce a significant restructuring. Since each refactoring is small, it is less likely to go wrong. The system is also kept fully working after each small refactoring,
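The LCOM variants of Section III.A and the EMPLOYEE example can be tied to the idea of refactoring with a small sketch. It is a minimal illustration under stated assumptions, not the authors' tooling: a class is modelled as a mapping from method name to the set of instance variables the method uses, the Chidamber and Kemerer value is computed in its usual form (pairs without shared attributes minus pairs with shared attributes, floored at zero), and the Hitz and Montazeri value counts connected components using shared instance variables only. The instance variables `name` and `salary` and the methods `getName`, `raiseSalary`, and `payslip` are hypothetical, since the source elides the body of EMPLOYEE.

```python
from itertools import combinations


def lcom_ck(uses):
    """Chidamber-Kemerer LCOM: max(|P| - |Q|, 0), where P holds the method
    pairs with no shared instance variable and Q the pairs sharing one."""
    p = q = 0
    for a, b in combinations(uses, 2):
        if uses[a] & uses[b]:
            q += 1  # the pair shares at least one instance variable
        else:
            p += 1  # the pair has no instance variable in common
    return max(p - q, 0)


def lcom_hm(uses):
    """Hitz-Montazeri LCOM: number of connected components of the graph
    whose vertices are methods and whose edges join methods that access
    the same instance variable."""
    unvisited = set(uses)
    components = 0
    while unvisited:
        components += 1
        stack = [unvisited.pop()]
        while stack:
            method = stack.pop()
            linked = {m for m in unvisited if uses[method] & uses[m]}
            unvisited -= linked
            stack.extend(linked)
    return components


# Hypothetical usage table for EMPLOYEE before the refactoring.
employee_before = {
    "getName": {"name"},
    "raiseSalary": {"salary"},
    "payslip": {"name", "salary"},
    "computeCompanyRevenue": set(),  # touches no EMPLOYEE state
}

# Move-method refactoring: computeCompanyRevenue goes to COMPANY.
employee_after = {
    method: variables
    for method, variables in employee_before.items()
    if method != "computeCompanyRevenue"
}

print(lcom_ck(employee_before), lcom_hm(employee_before))
print(lcom_ck(employee_after), lcom_hm(employee_after))
```

Under these assumptions both metrics flag the original class (the isolated method forms its own component), and both drop once the method is moved out, which is the outcome the separable-cohesion discussion predicts.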
S. Megha Chandrika et al. 41
a class belong together. Most structural metrics define and measure relationships among the methods of a class based on this principle. Cohesion is seen to be dependent on the number of pairs of methods that share instance or class variables one way or another. The differences among the structural metrics are based on the definition of the relationships among methods, system representation, and counting mechanism. A comprehensive overview of graph theory-based cohesion metrics is given by Zhou et al. Somewhat different in this class of metrics are LCOM5 and Coh, which consider that cohesion is directly proportional to the number of instance variables in a class that are referenced by the methods in that class.

Briand et al. defined a unified framework for cohesion measurement in OO systems which classifies and discusses all of these metrics.

Recently, other structural cohesion metrics have been proposed, trying to improve existing metrics by considering the effects of dependent instance variables whose values are computed from other instance variables in the class. Other recent approaches have addressed class cohesion by considering the relationships between the attributes and methods of a class based on dependence analysis. Although different from each other, all of these structural metrics capture the same aspects of cohesion, which relate to the data flow between the methods of a class.

Other cohesion metrics exploit relationships that underlie slicing. A large-scale empirical investigation of slice-based metrics indicated that the slice-based cohesion metrics provide complementary views of cohesion to the structural metrics. Although the information used by these metrics is also structural in nature, the mechanism used and the underlying interpretation of cohesion set these metrics apart from the structural metrics group.

A small set of cohesion metrics was proposed for specific types of applications. Among those are cohesion metrics for knowledge-based and aspect-oriented systems, and dynamic cohesion metrics for distributed applications.

From a measuring methodology point of view, two other cohesion metrics are of interest here since they are also based on an IR approach. However, IR methods are used differently there than in our approach. Patel et al. proposed a composite cohesion metric that measures the information strength of a module. This measure is based on a vector representation of the frequencies of occurrences of data types in a module. The approach measures the cohesion of individual subprograms of a system based on their relationships to each other in this vector space. Maletic and Marcus defined a file-level cohesion metric based on the same type of information that we are using for our proposed metrics here. Even though these metrics were not specifically designed for the measurement of cohesion in OO software, they could be extended to measure cohesion in OO systems.

The designers and the programmers of a software system often think about a class as a set of responsibilities that approximate the concept from the problem domain implemented by the class, as opposed to a set of method-attribute interactions. Information that gives clues about domain concepts is encoded in the source code as comments and identifiers. Among the existing cohesion metrics for OO software, the Logical Relatedness of Methods (LORM) and the Lack of Conceptual Cohesion in Methods (LCSM) are the only ones that use this type of information to measure the conceptual similarity of the methods in a class. The philosophy behind this class of metrics, into which our work falls, is that a cohesive class is a crisp implementation of a problem or solution domain concept. Hence, if the methods of a class are conceptually related to each other, the class is cohesive. The difficult problem here is how conceptual relationships can be defined and measured. LORM uses natural language processing techniques for the analysis needed to measure the conceptual similarity of methods and represents a class as a semantic network. LCSM uses the same information, indexed with LSI, and represents classes as graphs that have methods as nodes. It uses a counting mechanism similar to LCOM.

Fig. 2. Screen 1

Fig. 3. Screen 2
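To make the notion of conceptual similarity between methods more concrete, the following minimal sketch compares methods by the words found in their comments and identifiers. It rests on several assumptions that are not taken from the source: raw term-frequency vectors stand in for an LSI index, cosine similarity is used as the similarity function, and the class-level score is simply the average over all method pairs. The helper names (`terms`, `cosine`, `average_conceptual_similarity`) and the sample method texts are hypothetical, and the sketch should not be read as the published definition of C3, LORM, or LCSM.

```python
import math
import re
from collections import Counter
from itertools import combinations


def terms(text):
    """Split comments and identifiers into lowercase terms.
    camelCase and snake_case identifiers are broken into their words."""
    words = re.findall(r"[A-Za-z][a-z]*", text)
    return Counter(word.lower() for word in words)


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def average_conceptual_similarity(methods):
    """Average pairwise similarity of the methods of one class.
    `methods` maps a method name to the text of its comments and identifiers."""
    vectors = {name: terms(text) for name, text in methods.items()}
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 0.0
    return sum(cosine(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)


# Hypothetical method texts: a focused class and one with an unrelated method.
focused = {
    "getName": "return employee name",
    "rename": "update employee name",
}
mixed = dict(focused, computeCompanyRevenue="sum revenue of company projects")

print(average_conceptual_similarity(focused))
print(average_conceptual_similarity(mixed))
```

In this toy setting the class that also carries the unrelated revenue method scores lower than the focused one, which matches the intuition that a cohesive class implements a single domain concept.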
REFERENCES
[20]. H. Kabaili, R.K. Keller, F. Lustman, and G. Saint-Denis, “Class Cohesion Revisited: An Empirical Study on Industrial Systems,” Proc. Fourth Int’l ECOOP Workshop Quantitative Approaches in Object-Oriented Software Eng., pp. 29-38, 2000.
[21]. H. Kabaili, R.K. Keller, and F. Lustman, “Cohesion as Changeability Indicator in Object-Oriented Systems,” Proc. Fifth European Conf. Software Maintenance and Reeng., 2001.
[22]. W. Li and S. Henry, “Object-Oriented Metrics that Predict Maintainability,” J. Systems and Software, vol. 23, no. 2, pp. 111-122, 1993.
[23]. M. Linton, P.R. Calder, and J.M. Vlissides, “InterViews: A C++ Graphical Interface Toolkit,” Technical Report CSL-TR-88-358, Stanford Univ., 1988, ftp://interviews.stanford.edu/pub.
[24]. J. Rumbaugh, M. Blaha, W. Premerlani, F. Eddy, and W. Lorensen, Object-Oriented Modeling and Design. Prentice Hall, 1991.
[25]. W. Stevens, G. Myers, and L. Constantine, “Structured Design,” IBM Systems J., vol. 12, no. 2, 1974.
[26]. R. Subramanyam and M.S. Krishnan, “Empirical Analysis of CK Metrics for Object-Oriented Design Complexity: Implications for Software Defects,” IEEE Trans. Software Eng., vol. 29, no. 4, pp. 297-310, Apr. 2003.

Ms. S. Megha Chandrika is an Assistant Professor at SCIENT Institute of Technology. She received her B.Tech in Computer Science from Nizam Institute of Engg & Tech (JNTUH) and her M.Tech in Software Engineering from GuruNank Engg College (JNTUH). She has 6 years of academic experience and has guided many UG & PG engineering students. Her papers have been published in national and international journals. Her areas of interest are Software Engineering, Data Mining, Software Testing, Compiler Design, Web Applications, and Unified Modeling Languages.

Mr. E. Suresh Babu is an Assistant Professor at Samskruti Engg College. He received his B.Tech from Vathslaya Institute of Scie & Tech (JNTUH) and his M.Tech from SKTRCE (JNTUH). His areas of interest include Data Mining, Software Engineering, Software Testing Methodology, and Network Security.