Engineering Practices For Building Quality Software
Each of six common development stages will be examined. From design to final project
deployment, quality can be injected into every action taken towards product success. However,
that quality must also be measured, and shown to exceed the level necessary to satisfy the
requirements of each relevant stakeholder.
In this course, learners will demonstrate their knowledge through quizzes and apply what they have learned.
Divided into four weeks, the course provides a sweeping introduction to quality in the software
development lifecycle. In addition to explanatory lectures and opportunities to evaluate a variety
of sources of software quality, a curated set of resources allows the learner to dive deeper into the
content.
Quality: the standard of something as measured against other things of a similar kind; the degree of
excellence of something
Good: of good quality; excellent
Design
What does good design look like?
Coupling, cohesion and more
Quality metrics
Software design patterns
Implementation
Code Style – coding standard
Debugging
Commenting – documentation within the code
Build Process
‘Good’
What is software?
Software quality attributes
Performance
Security
Modifiability
Reliability
Usability
SOLID
S – Single responsibility
O – Open/Closed principle
L – Liskov substitution principle
I – Interface segregation principle
D – Dependency inversion principle
Law of Demeter
Principle of least knowledge
Formally, a method m of an object O may only invoke the methods of the following kinds of
objects:
O itself
m’s parameters
Any objects created/instantiated within m
O’s direct component objects
Global variables accessible by O in the scope of m
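As an illustration only (the Car, Engine and Logger names below are invented for these notes, not taken from the lecture), a method that stays within these rules might look like this:

#include <iostream>

class Engine {
public:
    void start() { std::cout << "engine started\n"; }
};

class Logger {
public:
    void log(const char* msg) { std::cout << msg << '\n'; }
};

class Car {
    Engine engine_;                    // a direct component object of O
public:
    void drive(Logger& logger) {       // logger is a parameter of m
        prepare();                     // method of O itself
        engine_.start();               // method of a direct component
        Logger local;                  // object created within m
        local.log("local diagnostics");
        logger.log("driving");         // method of a parameter
        // Reaching through an intermediate object, e.g. logger.getSink()->write(...),
        // would violate the law (getSink() is a hypothetical accessor).
    }
private:
    void prepare() { std::cout << "pre-drive checks done\n"; }
};

int main() {
    Logger logger;
    Car car;
    car.drive(logger);
}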
What is software?
https://docs.microsoft.com/en-us/previous-versions/msp-n-p/ee658094(v=pandp.10)
Chapter 16: Quality Attributes
Overview
Quality attributes are the overall factors that affect run-time behavior, system design, and user experience. They
represent areas of concern that have the potential for application wide impact across layers and tiers. Some of
these attributes are related to the overall system design, while others are specific to run time, design time, or
user centric issues. The extent to which the application possesses a desired combination of quality attributes
such as usability, performance, reliability, and security indicates the success of the design and the overall quality
of the software application.
When designing applications to meet any of the quality attributes requirements, it is necessary to consider the
potential impact on other requirements. You must analyze the tradeoffs between multiple quality attributes. The
importance or priority of each quality attribute differs from system to system; for example, interoperability will
often be less important in a single use packaged retail application than in a line of business (LOB) system.
This chapter lists and describes the quality attributes that you should consider when designing your application.
To get the most out of this chapter, use the table below to gain an understanding of how quality attributes map to
system and application quality factors, and read the description of each of the quality attributes. Then use the
sections containing key guidelines for each of the quality attributes to understand how that attribute has an
impact on your design, and to determine the decisions you must make to address these issues. Keep in mind
that the list of quality attributes in this chapter is not exhaustive, but provides a good starting point for asking
appropriate questions about your architecture.
The following sections describe each of the quality attributes in more detail, and provide guidance on the key
issues and the decisions you must make for each one:
Availability
Conceptual Integrity
Interoperability
Maintainability
Manageability
Performance
Reliability
Reusability
Scalability
Security
Supportability
Testability
User Experience / Usability
Availability
Availability defines the proportion of time that the system is functional and working. It can be measured as the
percentage of total time that the system is up over a predefined period. Availability will be affected by system errors,
infrastructure problems, malicious attacks, and system load. The key issues for availability are:
A physical tier such as the database server or application server can fail or become unresponsive, causing
the entire system to fail. Consider how to design failover support for the tiers in the system. For example,
use Network Load Balancing for Web servers to distribute the load and prevent requests being directed to
a server that is down. Also, consider using a RAID mechanism to mitigate system failure in the event of
a disk failure. Consider if there is a need for a geographically separate redundant site to failover to in
case of natural disasters such as earthquakes or tornados.
Denial of Service (DoS) attacks, which prevent authorized users from accessing the system, can interrupt
operations if the system cannot handle massive loads in a timely manner, often due to the processing time
required, or network configuration and congestion. To minimize interruption from DoS attacks, reduce
the attack surface area, identify malicious behavior, use application instrumentation to expose unintended
behavior, and implement comprehensive data validation. Consider using the Circuit Breaker or Bulkhead
patterns to increase system resiliency.
Inappropriate use of resources can reduce availability. For example, resources acquired too early and
held for too long cause resource starvation and an inability to handle additional concurrent user requests.
Bugs or faults in the application can cause a system wide failure. Design for proper exception handling in
order to reduce application failures from which it is difficult to recover.
Frequent updates, such as security patches and user application upgrades, can reduce the availability of
the system. Identify how you will design for run-time upgrades.
A network fault can cause the application to be unavailable. Consider how you will handle unreliable
network connections; for example, by designing clients with occasionally-connected capabilities.
Consider the trust boundaries within your application and ensure that subsystems employ some form of
access control or firewall, as well as extensive data validation, to increase resiliency and availability.
Conceptual Integrity
Conceptual integrity defines the consistency and coherence of the overall design. This includes the way that
components or modules are designed, as well as factors such as coding style and variable naming. A coherent
system is easier to maintain because you will know what is consistent with the overall design. Conversely, a
system without conceptual integrity will constantly be affected by changing interfaces, frequently deprecating
modules, and lack of consistency in how tasks are performed. The key issues for conceptual integrity are:
Mixing different areas of concern within your design. Consider identifying areas of concern and
grouping them into logical presentation, business, data, and service layers as appropriate.
Inconsistent or poorly managed development processes. Consider performing an Application Lifecycle
Management (ALM) assessment, and make use of tried and tested development tools and methodologies.
Lack of collaboration and communication between different groups involved in the application lifecycle.
Consider establishing a development process integrated with tools to facilitate process workflow,
communication, and collaboration.
Lack of design and coding standards. Consider establishing published guidelines for design and coding
standards, and incorporating code reviews into your development process to ensure guidelines are
followed.
Existing (legacy) system demands can prevent both refactoring and progression toward a new platform or
paradigm. Consider how you can create a migration path away from legacy technologies, and how to
isolate applications from external dependencies. For example, implement the Gateway design pattern for
integration with legacy systems.
Interoperability
Interoperability is the ability of a system or different systems to operate successfully by communicating and
exchanging information with other external systems written and run by external parties. An interoperable system
makes it easier to exchange and reuse information internally as well as externally. Communication protocols,
interfaces, and data formats are the key considerations for interoperability. Standardization is also an important
aspect to be considered when designing an interoperable system. The key issues for interoperability are:
Interaction with external or legacy systems that use different data formats. Consider how you can enable
systems to interoperate, while evolving separately or even being replaced. For example, use orchestration
with adaptors to connect with external or legacy systems and translate data between systems; or use a
canonical data model to handle interaction with a large number of different data formats.
Boundary blurring, which allows artifacts from one system to diffuse into another. Consider how you can
isolate systems by using service interfaces and/or mapping layers. For example, expose services using
interfaces based on XML or standard types in order to support interoperability with other systems.
Design components to be cohesive and have low coupling in order to maximize flexibility and facilitate
replacement and reusability.
Lack of adherence to standards. Be aware of the formal and de facto standards for the domain you are
working within, and consider using one of them rather than creating something new and proprietary.
Maintainability
Maintainability is the ability of the system to undergo changes with a degree of ease. These changes could
impact components, services, features, and interfaces when adding or changing the application’s functionality in
order to fix errors, or to meet new business requirements. Maintainability can also affect the time it takes to
restore the system to its operational status following a failure or removal from operation for an upgrade.
Improving system maintainability can increase availability and reduce the effects of run-time defects. An
application’s maintainability is often a function of its overall quality attributes but there are a number of key issues
that can directly affect maintainability:
Excessive dependencies between components and layers, and inappropriate coupling to concrete classes,
prevents easy replacement, updates, and changes; and can cause changes to concrete classes to ripple
through the entire system. Consider designing systems as well-defined layers, or areas of concern, that
clearly delineate the system’s UI, business processes, and data access functionality. Consider
implementing cross-layer dependencies by using abstractions (such as abstract classes or interfaces)
rather than concrete classes, and minimize dependencies between components and layers.
The use of direct communication prevents changes to the physical deployment of components and layers.
Choose an appropriate communication model, format, and protocol. Consider designing a pluggable
architecture that allows easy upgrades and maintenance, and improves testing opportunities, by designing
interfaces that allow the use of plug-in modules or adapters to maximize flexibility and extensibility.
Reliance on custom implementations of features such as authentication and authorization prevents reuse
and hampers maintenance. To avoid this, use the built-in platform functions and features wherever
possible.
The logic code of components and segments is not cohesive, which makes them difficult to maintain and
replace, and causes unnecessary dependencies on other components. Design components to be cohesive
and have low coupling in order to maximize flexibility and facilitate replacement and reusability.
The code base is large, unmanageable, fragile, or over complex; and refactoring is burdensome due to
regression requirements. Consider designing systems as well defined layers, or areas of concern, that
clearly delineate the system’s UI, business processes, and data access functionality. Consider how you
will manage changes to business processes and dynamic business rules, perhaps by using a business
workflow engine if the business process tends to change. Consider using business components to
implement the rules if only the business rule values tend to change; or an external source such as a
business rules engine if the business decision rules do tend to change.
The existing code does not have an automated regression test suite. Invest in test automation as you build
the system. This will pay off as a validation of the system’s functionality, and as documentation on what
the various parts of the system do and how they work together.
Lack of documentation may hinder usage, management, and future upgrades. Ensure that you provide
documentation that, at minimum, explains the overall structure of the application.
Manageability
Manageability defines how easy it is for system administrators to manage the application, usually through
sufficient and useful instrumentation exposed for use in monitoring systems and for debugging and performance
tuning. Design your application to be easy to manage, by exposing sufficient and useful instrumentation for use
in monitoring systems and for debugging and performance tuning. The key issues for manageability are:
Lack of health monitoring, tracing, and diagnostic information. Consider creating a health model that
defines the significant state changes that can affect application performance, and use this model to
specify management instrumentation requirements. Implement instrumentation, such as events and
performance counters, that detects state changes, and expose these changes through standard systems
such as Event Logs, Trace files, or Windows Management Instrumentation (WMI). Capture and report
sufficient information about errors and state changes in order to enable accurate monitoring, debugging,
and management. Also, consider creating management packs that administrators can use in their
monitoring environments to manage the application.
Lack of runtime configurability. Consider how you can enable the system behavior to change based on
operational environment requirements, such as infrastructure or deployment changes.
Lack of troubleshooting tools. Consider including code to create a snapshot of the system’s state to use
for troubleshooting, and including custom instrumentation that can be enabled to provide detailed
operational and functional reports. Consider logging and auditing information that may be useful for
maintenance and debugging, such as request details or module outputs and calls to other systems and
services.
Performance
Performance is an indication of the responsiveness of a system to execute specific actions in a given time
interval. It can be measured in terms of latency or throughput. Latency is the time taken to respond to any event.
Throughput is the number of events that take place in a given amount of time. An application’s performance can
directly affect its scalability, and lack of scalability can affect performance. Improving an application’s
performance often improves its scalability by reducing the likelihood of contention for shared resources. Factors
affecting system performance include the demand for a specific action and the system’s response to the demand.
The key issues for performance are:
Increased client response time, reduced throughput, and server resource over utilization. Ensure that you
structure the application in an appropriate way and deploy it onto a system or systems that provide
sufficient resources. When communication must cross process or tier boundaries, consider using coarse-
grained interfaces that require the minimum number of calls (preferably just one) to execute a specific
task, and consider using asynchronous communication.
Increased memory consumption, resulting in reduced performance, excessive cache misses (the inability
to find the required data in the cache), and increased data store access. Ensure that you design an efficient
and appropriate caching strategy.
Increased database server processing, resulting in reduced throughput. Ensure that you choose effective
types of transactions, locks, threading, and queuing approaches. Use efficient queries to minimize
performance impact, and avoid fetching all of the data when only a portion is displayed. Failure to design
for efficient database processing may incur unnecessary load on the database server, failure to meet
performance objectives, and costs in excess of budget allocations.
Increased network bandwidth consumption, resulting in delayed response times and increased load for
client and server systems. Design high performance communication between tiers using the appropriate
remote communication mechanism. Try to reduce the number of transitions across boundaries, and
minimize the amount of data sent over the network. Batch work to reduce calls over the network.
Reliability
Reliability is the ability of a system to continue operating in the expected way over time. Reliability is measured
as the probability that a system will not fail and that it will perform its intended function for a specified time
interval. The key issues for reliability are:
The system crashes or becomes unresponsive. Identify ways to detect failures and automatically initiate a
failover, or redirect load to a spare or backup system. Also, consider implementing code that uses
alternative systems when it detects a specific number of failed requests to an existing system.
Output is inconsistent. Implement instrumentation, such as events and performance counters, that detects
poor performance or failures of requests sent to external systems, and expose information through
standard systems such as Event Logs, Trace files, or WMI. Log performance and auditing information
about calls made to other systems and services.
The system fails due to unavailability of other externalities such as systems, networks, and databases.
Identify ways to handle unreliable external systems, failed communications, and failed transactions.
Consider how you can take the system offline but still queue pending requests. Implement store and
forward or cached message-based communication systems that allow requests to be stored when the
target system is unavailable, and replayed when it is online. Consider using Windows Message Queuing
or BizTalk Server to provide a reliable once-only delivery mechanism for asynchronous requests.
Reusability
Reusability is the probability that a component will be used in other components or scenarios to add new
functionality with little or no change. Reusability minimizes the duplication of components and the
implementation time. Identifying the common attributes between various components is the first step in building
small reusable components for use in a larger system. The key issues for reusability are:
The use of different code or components to achieve the same result in different places; for example,
duplication of similar logic in multiple components, and duplication of similar logic in multiple layers or
subsystems. Examine the application design to identify common functionality, and implement this
functionality in separate components that you can reuse. Examine the application design to identify
crosscutting concerns such as validation, logging, and authentication, and implement these functions as
separate components.
The use of multiple similar methods to implement tasks that have only slight variation. Instead, use
parameters to vary the behavior of a single method.
Using several systems to implement the same feature or function instead of sharing or reusing
functionality in another system, across multiple systems, or across different subsystems within an
application. Consider exposing functionality from components, layers, and subsystems through service
interfaces that other layers and systems can use. Use platform agnostic data types and structures that can
be accessed and understood on different platforms.
Scalability
Scalability is the ability of a system to either handle increases in load without impact on the performance of the
system, or the ability to be readily enlarged. There are two methods for improving scalability: scaling vertically
(scale up), and scaling horizontally (scale out). To scale vertically, you add more resources such as CPU,
memory, and disk to a single system. To scale horizontally, you add more machines to a farm that runs the
application and shares the load. The key issues for scalability are:
Applications cannot handle increasing load. Consider how you can design layers and tiers for scalability,
and how this affects the capability to scale up or scale out the application and the database when
required. You may decide to locate logical layers on the same physical tier to reduce the number of
servers required while maximizing load sharing and failover capabilities. Consider partitioning data
across more than one database server to maximize scale-up opportunities and allow flexible location of
data subsets. Avoid stateful components and subsystems where possible to reduce server affinity.
Users incur delays in response and longer completion times. Consider how you will handle spikes in
traffic and load. Consider implementing code that uses additional or alternative systems when it detects a
predefined service load or a number of pending requests to an existing system.
The system cannot queue excess work and process it during periods of reduced load. Implement store-
and-forward or cached message-based communication systems that allow requests to be stored when the
target system is unavailable, and replayed when it is online.
Security
Security is the capability of a system to reduce the chance of malicious or accidental actions outside of the
designed usage affecting the system, and prevent disclosure or loss of information. Improving security can also
increase the reliability of the system by reducing the chances of an attack succeeding and impairing system
operation. Securing a system should protect assets and prevent unauthorized access to or modification of
information. The factors affecting system security are confidentiality, integrity, and availability. The features
used to secure systems are authentication, encryption, auditing, and logging. The key issues for security are:
Spoofing of user identity. Use authentication and authorization to prevent spoofing of user identity.
Identify trust boundaries, and authenticate and authorize users crossing a trust boundary.
Damage caused by malicious input such as SQL injection and cross-site scripting. Protect against such
damage by ensuring that you validate all input for length, range, format, and type using the constrain,
reject, and sanitize principles. Encode all output you display to users.
Data tampering. Partition the site into anonymous, identified, and authenticated users and use application
instrumentation to log and expose behavior that can be monitored. Also use secured transport channels,
and encrypt and sign sensitive data sent across the network.
Repudiation of user actions. Use instrumentation to audit and log all user interaction for application
critical operations.
Information disclosure and loss of sensitive data. Design all aspects of the application to prevent access
to or exposure of sensitive system and application information.
Interruption of service due to Denial of service (DoS) attacks. Consider reducing session timeouts and
implementing code or hardware to detect and mitigate such attacks.
Supportability
Supportability is the ability of the system to provide information helpful for identifying and resolving issues
when it fails to work correctly. The key issues for supportability are:
Lack of diagnostic information. Identify how you will monitor system activity and performance.
Consider a system monitoring application, such as Microsoft System Center.
Lack of troubleshooting tools. Consider including code to create a snapshot of the system’s state to use
for troubleshooting, and including custom instrumentation that can be enabled to provide detailed
operational and functional reports. Consider logging and auditing information that may be useful for
maintenance and debugging, such as request details or module outputs and calls to other systems and
services.
Lack of tracing ability. Use common components to provide tracing support in code, perhaps though
Aspect Oriented Programming (AOP) techniques or dependency injection. Enable tracing in Web
applications in order to troubleshoot errors.
Lack of health monitoring. Consider creating a health model that defines the significant state changes
that can affect application performance, and use this model to specify management instrumentation
requirements. Implement instrumentation, such as events and performance counters, that detects state
changes, and expose these changes through standard systems such as Event Logs, Trace files, or
Windows Management Instrumentation (WMI). Capture and report sufficient information about errors
and state changes in order to enable accurate monitoring, debugging, and management. Also, consider
creating management packs that administrators can use in their monitoring environments to manage the
application.
Testability
Testability is a measure of how well a system or its components allow you to create test criteria and execute tests to
determine if the criteria are met. Testability allows faults in a system to be isolated in a timely and effective
manner. The key issues for testability are:
Complex applications with many processing permutations are not tested consistently, perhaps because
automated or granular testing cannot be performed if the application has a monolithic design. Design
systems to be modular to support testing. Provide instrumentation or implement probes for testing,
mechanisms to debug output, and ways to specify inputs easily. Design components that have high
cohesion and low coupling to allow testability of components in isolation from the rest of the system.
Lack of test planning. Start testing early during the development life cycle. Use mock objects during
testing, and construct simple, structured test solutions.
Poor test coverage, for both manual and automated tests. Consider how you can automate user interaction
tests, and how you can maximize test and code coverage.
Input and output inconsistencies; for the same input, the output is not the same and the output does not
fully cover the output domain even when all known variations of input are provided. Consider how to
make it easy to specify and understand system inputs and outputs to facilitate the construction of test
cases.
User Experience / Usability
The application interfaces must be designed with the user and consumer in mind so that they are intuitive to use,
can be localized and globalized, provide access for disabled users, and provide a good overall user experience.
The key issues for user experience and usability are:
Too much interaction (an excessive number of clicks) required for a task. Ensure you design the screen
and input flows and user interaction patterns to maximize ease of use.
Incorrect flow of steps in multistep interfaces. Consider incorporating workflows where appropriate to
simplify multistep operations.
Data elements and controls are poorly grouped. Choose appropriate control types (such as option groups
and check boxes) and lay out controls and content using the accepted UI design patterns.
Feedback to the user is poor, especially for errors and exceptions, and the application is unresponsive.
Consider implementing technologies and techniques that provide maximum user interactivity, such as
Asynchronous JavaScript and XML (AJAX) in Web pages and client-side input validation. Use
asynchronous techniques for background tasks, and tasks such as populating controls or performing long-
running tasks.
Additional Resources
For more information on implementing and auditing quality attributes, see the following resources:
https://resources.sei.cmu.edu/library/asset-view.cfm?assetid=12433
Quality Attributes
Abstract
Computer systems are used in many critical applications where a failure can have serious consequences (loss
of lives or property). Developing systematic ways to relate the software quality attributes of a system to the
system's architecture provides a sound basis for making objective decisions about design trade-offs and
enables engineers to make reasonably accurate predictions about a system's attributes that are free from bias
and hidden assumptions. The ultimate goal is the ability to quantitatively evaluate and trade off multiple
software quality attributes to arrive at a better overall system. The purpose of this report is to take a small step
in the direction of developing a unifying approach for reasoning about multiple software quality attributes. In
this report, we define software quality, introduce a generic taxonomy of attributes, discuss the connections
between the attributes, and discuss future work leading to an attribute-based methodology for evaluating
software architectures.
Quality Metrics
Measuring Coupling
Instability
A metric to help identify problematic relationships between entities
Measures the potential for change to cascade throughout a system
Instability, despite the connotation, is not necessarily good or bad
Instability Measurement
Instability is determined by 2 forms of coupling, afferent (incoming) and efferent (outgoing)
For module X
Afferent coupling (AC) counts other modules that depend on X
Efferent coupling (EC) counts other modules that X depends on
Instability is the proportion of EC in the total sum of coupling:
I = EC / (AC + EC)
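A minimal sketch of the calculation (helper function invented for these notes), returning 0 for a module with no coupling at all:

#include <iostream>

// I = EC / (AC + EC)
double instability(int afferentCoupling, int efferentCoupling) {
    const int total = afferentCoupling + efferentCoupling;
    return total == 0 ? 0.0 : static_cast<double>(efferentCoupling) / total;
}

int main() {
    std::cout << instability(3, 1) << '\n';  // 0.25 - mostly depended upon, relatively stable
    std::cout << instability(0, 4) << '\n';  // 1.0  - depends on others only, maximally unstable
}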
Instability examples (diagrams omitted): Example 1 – Client; Example 2 – Service; Example 3 – Schedule Service, Class Ranking, Instructors
http://www.arisa.se/compendium/node109.html#metric:CF
Coupling Factor (CF)
Works with all instances of a common meta-model, regardless of whether they were produced with the
Java or the UML front-end. The respective call, create, field access, and type reference relations
(Java) or association, message and type reference relations (UML) express the coupling (excluding
inheritance) between two classes. They are mapped to relations of type Invokes, Accesses, and
"Is Of Type", respectively, in the common meta-model and further to type coupling in the view.
By defining a view containing only classes and packages as elements, the metric definition can
ignore methods and fields as part of its description, since the relations originating from them are
lifted to the class element.
Description
Coupling Factor (CF) measures the coupling between classes excluding coupling due to
inheritance. It is the ratio between the number of actually coupled pairs of classes in a
scope (e.g., package) and the possible number of coupled pairs of classes. CF is primarily
applicable to object-oriented systems.
Scope: Package. Scale: Absolute. Domain: Integers.
Highly Related Software Quality Properties
Re-Usability 2.4
is negatively influenced by coupling.
Understandability for Reuse 2.4.1:
A part of a system that has a high (outgoing) efferent coupling may be highly inversely
related to understandability, since it uses other parts of the system which need to be
understood as well.
Attractiveness 2.4.4:
Parts that have a high (outgoing) efferent coupling may be highly inversely related to
attractiveness, since they are using other parts of the system which need to be understood
as well, and represent dependencies.
Maintainability 2.6
decreases with increasing CF.
Analyzability 2.6.1:
Parts that have a high (outgoing) efferent coupling may be highly inversely related to
analyzability, since they are using other parts of the system which need to be analyzed as
well.
Changeability 2.6.2:
Parts that have a high (outgoing) efferent coupling may be inversely related to
changeability, since they are using other parts of the system which might need to be
changed as well.
Stability 2.6.3:
Parts showing a high (outgoing) efferent coupling may be inversely related to stability,
since they are using other parts of the system, which can affect them.
Testability 2.6.4:
Parts that have a high (outgoing) efferent coupling may be highly inversely related to
testability, since they are using other parts of the system which increase the number of
possible test paths.
Portability 2.7
decreases with increasing CF.
Adaptability 2.7.1:
Parts that have a high (outgoing) efferent coupling may be inversely related to
adaptability, since they are using other parts of the system which might need to be
adapted as well.
Functionality 2.1
is both negatively and positively influenced by coupling.
Interoperability 2.1.3:
Parts that have a high (outgoing) efferent coupling may be directly related to
interoperability, since they are using/interacting with other parts of the system.
Security 2.1.4:
Parts that have a high (outgoing) efferent coupling may be inversely related to security,
since they can be affected by security problems in other parts of the system.
Reliability 2.2
might decrease with increasing CF.
Fault-tolerance 2.2.2:
Parts that have a high (outgoing) efferent coupling may be inversely related to fault-
tolerance, since they can be affected by faults in other parts of the system.
Re-Usability 2.4
might decrease with increasing CF.
Learnability for Reuse 2.4.2:
Parts that have a high (outgoing) efferent coupling may be inversely related to learnability,
since they are using other parts of the system which need to be understood as well.
Efficiency 2.5
might decrease with increasing CF.
Time Behavior 2.5.1:
Parts that have a high (outgoing) efferent coupling may be inversely related to time
behavior, since they are using other parts of the system, thus execution during test or
operation does not stay local, but might involve huge parts of the system.
References
CF is discussed in [8,2,17,9]; it is implemented in the VizzAnalyzer Metrics Suite.
Since: Compendium 1.0
Measuring Cohesion
LCOM4 Measurement
This is a simple measure of the number of connected components
Having a single connected component is considered high cohesion
Methods are ‘connected’ if
- they both access the same class-level variable, or
- one method calls the other (see the sketch below)
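As a hypothetical illustration (class and member names invented for these notes), the class below has two groups of methods that never share a variable or call each other, so LCOM4 = 2 and the class is a candidate for splitting:

#include <iostream>
#include <string>

class Report {
    int total_ = 0;          // used only by addItem and printTotal
    std::string title_;      // used only by setTitle and printTitle
public:
    void addItem(int value) { total_ += value; }
    void printTotal() const { std::cout << total_ << '\n'; }
    void setTitle(const std::string& t) { title_ = t; }
    void printTitle() const { std::cout << title_ << '\n'; }
};
// {addItem, printTotal} are connected through total_, {setTitle, printTitle}
// through title_; the two components are unrelated, giving LCOM4 = 2.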
https://www.aivosto.com/project/help/pm-oo-cohesion.html#LCOM4
Cohesion metrics
Project Metrics
Cohesion metrics measure how well the methods of a class are related to each other. A cohesive
class performs one function. A non-cohesive class performs two or more unrelated functions. A
non-cohesive class may need to be restructured into two or more smaller classes.
The assumption behind the following cohesion metrics is that methods are related if they work on
the same class-level variables. Methods are unrelated if they work on different variables
altogether. In a cohesive class, methods work with the same set of variables. In a non-cohesive
class, there are some methods that work on different data.
A cohesive class performs one function. Lack of cohesion means that a class performs more than
one function. This is not desirable. If a class performs several unrelated functions, it should be
split up.
LCOM metrics Lack of Cohesion of Methods. This group of metrics aims to detect problem
classes. A high LCOM value means low cohesion.
TCC and LCC metrics: Tight and Loose Class Cohesion. This group of metrics aims to tell
the difference of good and bad cohesion. With these metrics, large values are good and
low values are bad.
Cohesion diagrams visualize class cohesion.
Non-cohesive classes report suggests which classes should be split and how.
LCOM4 is the lack of cohesion metric we recommend for Visual Basic programs. LCOM4 measures
the number of "connected components" in a class. A connected component is a set of related
methods (and class-level variables). There should be only one such a component in each class. If
there are 2 or more components, the class should be split into so many smaller classes.
After determining the related methods, we draw a graph linking the related methods to each
other. LCOM4 equals the number of connected groups of methods.
In the example on the right, we made C access x to increase cohesion. Now the class consists of a
single component (LCOM4=1). It is a cohesive class.
It is to be noted that UserControls as well as VB.NET forms and web pages frequently report high
LCOM4 values. Even if the value exceeds 1, it does not often make sense to split the control, form
or web page as it would affect the user interface of your program. — The explanation with
UserControls is that they store information in the underlying UserControl object. The
explanation with VB.NET is the form designer generated code that you cannot modify.
Implementation details for LCOM4. We use the same definition for a method as with the WMC
metric. This means that property accessors are considered regular methods, but inherited methods
are not taken into account. Both Shared and non-Shared variables and methods are considered. —
We ignore empty procedures, though. Empty procedures tend to increase LCOM4 as they do not
access any variables or other procedures. A cohesive class with empty procedures would have a
high LCOM4. Sometimes empty procedures are required (for classic VB implements, for example).
This is why we simply drop empty procedures from LCOM4. — We also ignore constructors and
destructors (Sub New, Finalize, Class_Initialize, Class_Terminate). Constructors and destructors
frequently set and clear all variables in the class, making all methods connected through these
variables, which increases cohesion artificially.
Suggested use. Use the Non-cohesive classes report and Cohesion diagrams to determine how
the classes could be split. It is good to remove dead code before searching for uncohesive classes.
Dead procedures can increase LCOM4 as the dead parts can be disconnected from the other parts
of the class.
Hitz M., Montazeri B.: Measuring Coupling and Cohesion In Object-Oriented Systems. Proc. Int.
Symposium on Applied Corporate Computing, Oct. 25-27, Monterrey, Mexico, 75-76, 197, 78-84.
(Includes the definition of LCOM4 named as "Improving LCOM".)
LCOM1, LCOM2 and LCOM3 — less suitable for VB
LCOM1, LCOM2 and LCOM3 are not as suitable for Visual Basic projects as LCOM4. They are less accurate
especially as they don't consider the impact of property accessors and procedure calls, which are both
frequently used to access the values of variables in a cohesive way. They may be more appropriate to other
object-oriented languages such as C++. We provide these metrics for the sake of completeness. You can use
them as complementary metrics in addition to LCOM4.
LCOM1 Chidamber & Kemerer
LCOM1 was introduced in the Chidamber & Kemerer metrics suite. It’s also called LCOM or LOCOM, and it’s
calculated as follows:
Take each pair of methods in the class. If they access disjoint sets of instance variables, increase P by one. If
they share at least one variable access, increase Q by one.
LCOM1 = P - Q, if P > Q
LCOM1 = 0 otherwise
LCOM1 = 0 indicates a cohesive class.
LCOM1 > 0 indicates that the class needs or can be split into two or more classes, since its variables belong in
disjoint sets.
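As a small worked example (a hypothetical class): suppose methods A and B both use variable x, while method C uses only variable y. The pairs (A, C) and (B, C) access disjoint variable sets, so P = 2; the pair (A, B) shares x, so Q = 1; therefore LCOM1 = P - Q = 1, hinting that C belongs in a separate class.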
Classes with a high LCOM1 have been found to be fault-prone.
A high LCOM1 value indicates disparateness in the functionality provided by the class. This metric can be used
to identify classes that are attempting to achieve many different objectives, and consequently are likely to
behave in less predictable ways than classes that have lower LCOM1 values. Such classes could be more error
prone and more difficult to test and could possibly be disaggregated into two or more classes that are more
well defined in their behavior. The LCOM1 metric can be used by senior designers and project managers as a
relatively simple way to track whether the cohesion principle is adhered to in the design of an application and
advise changes.
LCOM1 critique
LCOM1 has received its deal of critique. It has been shown to have a number of drawbacks, so it should be
used with caution.
First, LCOM1 gives a value of zero for very different classes. To overcome that problem, new metrics, LCOM2
and LCOM3, have been suggested (see below).
Second, Gupta suggests that LCOM1 is not a valid way to measure cohesiveness of a class. That’s because its
definition is based on method-data interaction, which may not be a correct way to define cohesiveness in the
object-oriented world. Moreover, very different classes may have an equal LCOM1.
Third, as LCOM1 is defined on variable access, it's not well suited for classes that internally access their data
via properties. A class that gets/sets its own internal data via its own properties, and not via direct variable
read/write, may show a high LCOM1. This is not an indication of a problematic class. LCOM1 is not suitable for
measuring such classes.
Implementation details. The definition of LCOM1 deals with instance variables but with all methods of a class.
Class variables (Shared variables in VB.NET) are not taken into account. On the contrary, all the methods are
taken into account, whether Shared or not.
Project Analyzer assumes that a procedure in a class is a method if it can have code in it. Thus, Subs,
Functions and each of Property Get/Set/Let are methods, whereas a DLL declare or Event declaration are not
methods. What is more, empty procedure definitions, such as abstract MustOverride procedures in VB.NET, are
not methods.
Readings for LCOM1
Shyam R. Chidamber, Chris F. Kemerer: A Metrics suite for Object Oriented design. M.I.T. Sloan
School of Management E53-315. 1993.
Victor Basili, Lionel Briand and Walcélio Melo: A Validation of Object-Oriented Design Metrics as
Quality Indicators. IEEE Transactions on Software Engineering. Vol. 22, No. 10, October 1996.
Bindu S. Gupta: A Critique of Cohesion Measures in the Object-Oriented Paradigm. Master of Science
Thesis. Michigan Technological University, Department of Computer Science. 1997.
Implementation details. m is equal to WMC. a contains all variables whether Shared or not. All accesses to a
variable are counted.
LCOM2
LCOM2 = 1 - sum(mA)/(m*a)
where m is the number of methods in the class, a is the number of class-level variables (attributes), mA is the
number of methods that access a given variable, and sum(mA) is that count summed over all the variables.
LCOM2 equals the percentage of methods that do not access a specific attribute averaged over all attributes in
the class. If the number of methods or attributes is zero, LCOM2 is undefined and displayed as zero.
LCOM3 alias LCOM*
LCOM3 = (m - sum(mA)/a) / (m-1)
LCOM3 varies between 0 and 2. Values 1..2 are considered alarming.
In a normal class whose methods access the class's own variables, LCOM3 varies between 0 (high cohesion)
and 1 (no cohesion). When LCOM3=0, each method accesses all variables. This indicates the highest possible
cohesion. LCOM3=1 indicates extreme lack of cohesion. In this case, the class should be split.
When there are variables that are not accessed by any of the class's methods, 1 < LCOM3 <= 2. This happens
if the variables are dead or they are only accessed outside the class. Both cases represent a design flaw. The
class is a candidate for rewriting as a module. Alternatively, the class variables should be encapsulated with
accessor methods or properties. There may also be some dead variables to remove.
If there is no more than one method in a class, LCOM3 is undefined. If there are no variables in a class,
LCOM3 is undefined. An undefined LCOM3 is displayed as zero.
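A small worked example with hypothetical figures: a class with m = 4 methods and a = 2 variables, where variable x is accessed by 3 methods and variable y by 1, gives sum(mA) = 3 + 1 = 4. Then LCOM2 = 1 - 4/(4*2) = 0.5, and LCOM3 = (4 - 4/2)/(4 - 1) = 2/3, roughly 0.67: a moderate lack of cohesion, still below the alarming 1..2 range.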
Readings for LCOM2/LCOM3
Henderson-Sellers, B., L. Constantine and I. Graham: Coupling and Cohesion (Towards a Valid Metrics
Suite for Object-Oriented Analysis and Design). Object-Oriented Systems, 3(3), pp. 143-158, 1996.
Henderson-Sellers, B.: Object-Oriented Metrics: Measures of Complexity. Prentice Hall, 1996.
Roger Whitney: Course material. CS 696: Advanced OO. Doc 6, Metrics. Spring Semester, 1997. San
Diego State University.
TCC and LCC
For TCC and LCC we only consider visible methods (whereas the LCOMx metrics considered all
methods). A method is visible unless it is Private. A method is visible also if it implements an
interface or handles an event. In other respects, we use the same definition for a method as for
LCOM4. With N visible methods there are N*(N-1)/2 method pairs; TCC is the fraction of pairs that
are directly connected, while LCC is the fraction that are directly or indirectly connected.
When 2 methods are related this way, we call them directly connected.
When 2 methods are not directly connected, but they are connected via other methods, we call
them indirectly connected. Example: A - B - C are direct connections. A is indirectly connected to C
(via B).
What are good or bad values? According to the authors, TCC<0.5 and LCC<0.5 are considered
non-cohesive classes. LCC=0.8 is considered "quite cohesive". TCC=LCC=1 is a maximally
cohesive class: all methods are connected.
As the authors Bieman & Kang stated: If a class is designed in an ad hoc manner and unrelated components are
included in the class, the class represents more than one concept and does not model an entity. A class
designed so that it is a model of more than one entity will have more than one group of connections in the
class. The cohesion value of such a class is likely to be less than 0.5.
LCC tells the overall connectedness. It depends on the number of methods and how they group
together.
When LCC=1, all the methods in the class are connected, either directly or indirectly. This
is the cohesive case.
When LCC<1, there are 2 or more unconnected method groups. The class is not cohesive.
You may want to review these classes to see why it is so. Methods can be unconnected
because they access no class-level variables or because they access totally different
variables.
When LCC=0, all methods are totally unconnected. This is the non-cohesive case.
TCC tells the "connection density", so to speak (while LCC is only affected by whether the methods
are connected at all).
TCC=LCC=1 is the maximally cohesive class where all methods are directly connected to
each other.
When TCC=LCC<1, all existing connections are direct (even though not all methods are
connected).
When TCC<LCC, the "connection density" is lower than what it could be in theory. Not all
methods are directly connected with each other. For example, A & B are connected
through variable x and B & C through variable y. A and C do not share a variable, but they
are indirectly connected via B.
When TCC=0 (and LCC=0), the class is totally non-cohesive and all the methods are
totally unconnected.
This example (left) shows the same class as above. The connections considered are marked with
thick violet lines. A and B are connected via variable x. C and D are connected via variable y. E is
not connected because its call tree doesn't access any variables. There are 2 direct ("tight")
connections. There are no additional indirect connections this time.
On the right, we made C access x to increase cohesion. Now {A, B, C} are directly connected via
x. C and D are still connected via y and E stays unconnected. There are 4 direct connections, thus
TCC=4/10. The indirect connections are A–D and B–D. Thus, LCC=(4+2)/10=6/10.
TCC/LCC readings
Bieman, James M. & Kang, Byung-Kyoo: Cohesion and reuse in an object-oriented system.
Proceedings of the 1995 Symposium on Software Reusability. Pages 259-262. ISSN 0163-5948. ACM Press,
New York, NY, USA. (The original definition of TCC and LCC.)
High LCOM4 means non-cohesive class. LCOM4=1 is best. Higher values are non-cohesive.
High TCC and LCC means cohesive class. TCC=LCC=1 is best. Lower values are less
cohesive.
Auxiliary methods (leaf methods that don't access variables) are treated differently.
LCOM4 accepts them as cohesive methods. TCC and LCC consider them non-cohesive. See
method E in the examples above.
Validity of cohesion
Is data cohesion the right kind of cohesion? Should the data and the methods in a class be
related? If your answer is yes, these cohesion measures are the right choice for you. If, on the
other hand, you don't care about that, you don't need these metrics.
There are several ways to design good classes with low cohesion. Here are some examples:
A class groups related methods, not data. If you use classes as a way to group auxiliary
procedures that don't work on class-level data, the cohesion is low. This is a viable,
cohesive way to code, but not cohesive in the "connected via data" sense.
A class groups related methods operating on different data. The methods perform related
functionalities, but the cohesion is low as they are not connected via data.
A class provides stateless methods in addition to methods operating on data. The stateless
methods are not connected to the data methods and cohesion is low.
A class provides no data storage. It is a stateless class with minimal cohesion. Such a class
could well be written as a standard module, but if you prefer classes instead of modules,
the low cohesion is not a problem, but a choice.
A class provides storage only. If you use a class as a place to store and retrieve related
data, but the class doesn't act on the data, its cohesion can be low. Consider a class that
encapsulates 3 variables and provides 3 properties to access each of these 3 variables.
Such a class displays low cohesion, even though it is well designed. The class could well be
split into 3 small classes, yet this may not make any sense.
Defect Density
Cyclomatic Complexity
Cognitive Complexity
Maintainability Rating
Coupling Factor
Lack of Documentation
Defect Density
- One of many post-mortem quality metrics
- Defect count found divided by number of lines of code
- Quick check of quality between classes
Cyclomatic Complexity
- Complexity determined by the number of paths in a method
- All functions start with 1
- Add one for each control flow split (if/while/for, etc.); see the example below
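For instance (an invented function, not from the course materials), the count starts at 1 and gains one for each loop or branch:

// Cyclomatic complexity: 1 (base) + 1 (for) + 1 (first if) + 1 (second if) = 4
int countPositivesCapped(const int* values, int n, int cap) {
    int count = 0;
    for (int i = 0; i < n; ++i) {   // +1
        if (values[i] > 0) {        // +1
            ++count;
        }
    }
    if (count > cap) {              // +1
        count = cap;
    }
    return count;
}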
Cognitive Complexity
- An extension of cyclomatic complexity
- Attempts to incorporate an intuitive sense of complexity
- Favours measuring complex structure over method count
Maintainability Rating
- Measure of the technical debt against the development time to date
- Example rating for differing ratios (used by SonarQube):
o A – less than 5% of the time already spent on development
o B – 6%-10%
o C – 11%-20%
o D – 21%-50%
o E – 50% and above
Coupling Factor
- The ratio of coupled class pairs to the total number of possible coupled class pairs
- Two classes are coupled if one has a dependency on the other
- The coupling factor (CF) of package p with n classes can be expressed as the number of coupled class pairs divided by n*(n-1), the number of possible pairs (a sketch follows)
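A minimal sketch of that calculation (helper invented for these notes, and assuming each dependency direction counts separately, so the denominator is n*(n-1); some definitions use unordered pairs instead):

#include <cstddef>
#include <vector>

// depends[i][j] is true when class i of the package depends on class j (i != j).
double couplingFactor(const std::vector<std::vector<bool>>& depends) {
    const std::size_t n = depends.size();
    if (n < 2) return 0.0;
    std::size_t coupled = 0;
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            if (i != j && depends[i][j]) ++coupled;
    // actually coupled pairs divided by the n*(n-1) possible pairs
    return static_cast<double>(coupled) / static_cast<double>(n * (n - 1));
}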
Lack of Documentation
- A simple ratio of documented code to all code
- One comment per class and one comment per method are expected (see the sketch below)
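A sketch of one plausible reading of that rule (the exact formula used by the compendium is not reproduced in these notes): the expected number of comments is one for the class plus one per method, and LOD is the share of those that are missing.

#include <algorithm>

// Clamped at 0 for classes documented more heavily than the expected minimum.
double lackOfDocumentation(int commentCount, int methodCount) {
    const double expected = 1.0 + methodCount;   // 1 class comment + 1 per method
    return std::max(0.0, 1.0 - commentCount / expected);
}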
http://www.arisa.se/compendium/node121.html#metric:LOD
Lack Of Documentation (LOD)
Works with all instances of a common meta-model, regardless of whether they were produced with
the Java or the UML front-end.
Description
How many comments are lacking in a class, considering one class comment and a
comment per method as optimum. Structure and content of the comments are ignored.
Scope: Class. Scale: Rational. Domain: Rational numbers.
Highly Related Software Quality Properties
Re-Usability 2.4
is inversely influenced by LOD.
Understandability for Reuse 2.4.1:
Understanding if a class is suitable for reuse depends on its degree of documentation.
Maintainability 2.6
decreases with increasing LOD.
Analyzability 2.6.1:
The effort and time for diagnosis of deficiencies or causes of failures in a software entity, or
for identification of parts to be modified, are directly related to its degree of documentation.
Changeability 2.6.2:
Changing a class requires prior understanding, which, in turn, is more complicated for
undocumented classes.
Testability 2.6.4:
Writing test cases for classes and methods requires understanding, which, in turn, is more
complicated for undocumented classes.
Portability 2.7
decreases with increasing LOD.
Adaptability 2.7.1:
As for changeability 2.6.2, the degree of documentation of software has a direct impact.
Each modification requires understanding which is more complicated for undocumented
systems.
Replaceability 2.7.4:
A substitute for a component must imitate its interface. Undocumented interfaces are
difficult to check for substitutability and to actually substitute.
Reliability 2.2
decreases with increasing LOD.
Maturity 2.2.1:
Due to reduced analyzability 2.6.1 and testability 2.6.4, bugs might be left in
undocumented software. Therefore, maturity may be influenced by degree of
documentation.
Re-Usability 2.4
is inversely influenced by LOD.
Attractiveness 2.3.4:
Attractiveness of a class depends on its adherence to coding conventions such as degree of
documentation.
Maintainability 2.6
decreases with increasing LOD.
Stability 2.6.3:
Due to reduced analyzability 2.6.1 and testability 2.6.4, stability may also be influenced
negatively by a lack of documentation.
Complexity
Complexity (complexity)
It is the Cyclomatic Complexity calculated based on the number of paths through the code. Whenever the
control flow of a function splits, the complexity counter gets incremented by one. Each function has a
minimum complexity of 1. This calculation varies slightly by language because keywords and functionalities do.
Cognitive Complexity (cognitive_complexity)
How hard it is to understand the code's control flow. See the Cognitive Complexity White Paper for a complete
description of the mathematical model applied to compute this measure.
Maintainability
The Maintainability Rating scale can be alternately stated by saying that if the outstanding remediation cost is:
<=5% of the time that has already gone into the application, the rating is A
between 6 to 10% the rating is a B
between 11 to 20% the rating is a C
between 21 to 50% the rating is a D
anything over 50% is an E
Technical Debt (sqale_index)
Effort to fix all Code Smells. The measure is stored in minutes in the database. An 8-hour day is assumed when
values are shown in days.
Technical Debt on New Code (new_technical_debt)
Effort to fix all Code Smells raised for the first time in the New Code period.
Technical Debt Ratio (sqale_debt_ratio)
Ratio between the cost to fix the software (remediation cost) and the cost to develop it. The Technical Debt Ratio formula is:
Remediation cost / Development cost
Which can be restated as:
Remediation cost / (Cost to develop 1 line of code * Number of lines of code)
The value of the cost to develop a line of code is 0.06 days.
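As a worked example with hypothetical figures: a project of 10,000 lines of code has an estimated development cost of 10,000 * 0.06 = 600 days; if the remediation effort for its code smells is 30 days, the Technical Debt Ratio is 30 / 600 = 5%, which still falls within an A Maintainability Rating.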
Technical Debt Ratio on New Code (new_sqale_debt_ratio)
Ratio between the cost to develop the code changed in the New Code period and the cost of the issues linked
to it.
Reliability
Bugs (bugs)
Number of bug issues.
New Bugs (new_bugs)
Number of new bug issues.
Reliability Rating (reliability_rating)
A = 0 Bugs
B = at least 1 Minor Bug
C = at least 1 Major Bug
D = at least 1 Critical Bug
E = at least 1 Blocker Bug
Reliability remediation effort (reliability_remediation_effort)
Effort to fix all bug issues. The measure is stored in minutes in the DB. An 8-hour day is assumed when values
are shown in days.
Reliability remediation effort on new code (new_reliability_remediation_effort)
Same as Reliability remediation effort but on the code changed in the New Code period.
Security
Vulnerabilities (vulnerabilities)
Number of vulnerability issues.
Vulnerabilities on new code (new_vulnerabilities)
Number of new vulnerability issues.
Security Rating (security_rating)
A = 0 Vulnerabilities
B = at least 1 Minor Vulnerability
C = at least 1 Major Vulnerability
D = at least 1 Critical Vulnerability
E = at least 1 Blocker Vulnerability
Security remediation effort (security_remediation_effort)
Effort to fix all vulnerability issues. The measure is stored in minutes in the DB. An 8-hour day is assumed when
values are shown in days.
Security remediation effort on new code (new_security_remediation_effort)
Same as Security remediation effort but on the code changed in the New Code period.
Security Hotspots (security_hotspots)
Number of Security Hotspots.
Security Hotspots on new code (new_security_hotspots)
Number of new Security Hotspots in the New Code Period.
Security Review Rating (security_review_rating)
The Security Review Rating is a letter grade based on the percentage of Reviewed (Fixed or Safe) Security
Hotspots.
A = >= 80%
B = >= 70% and <80%
C = >= 50% and <70%
D = >= 30% and <50%
E = < 30%
Security Review Rating on new code (new_security_review_rating)
Security Review Rating for the New Code Period.
Ratio Formula: Number of Reviewed (Fixed or Safe) Hotspots x 100 / (To_Review Hotspots + Reviewed
Hotspots)
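As a hypothetical example: with 6 Reviewed (Fixed or Safe) Hotspots and 4 still To Review, the ratio is 6 x 100 / (4 + 6) = 60%, giving a Security Review Rating of C.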
New Security Hotspots Reviewed
Percentage of Reviewed Security Hotspots (Fixed or Safe) for the New Code Period.
Size
Classes (classes)
Number of classes (including nested classes, interfaces, enums and annotations).
Comment lines (comment_lines)
Number of lines containing either comment or commented-out code.
Non-significant comment lines (empty comment lines, comment lines containing only special characters, etc.)
do not increase the number of comment lines.
Comments (%) (comment_lines_density)
Density of comment lines = Comment lines / (Lines of code + Comment lines) * 100
With such a formula:
50% means that the number of lines of code equals the number of comment lines
100% means that the file only contains comment lines
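For example (invented figures): a file with 150 lines of code and 50 comment lines has a comment density of 50 / (150 + 50) * 100 = 25%.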
Directories (directories)
Number of directories.
Files (files)
Number of files.
Lines (lines)
Number of physical lines (number of carriage returns).
Lines of code (ncloc)
Number of physical lines that contain at least one character which is neither a whitespace nor a tabulation nor
part of a comment.
Lines of code per language (ncloc_language_distribution)
Non Commenting Lines of Code Distributed By Language
Functions (functions)
Number of functions. Depending on the language, a function is either a function or a method or a paragraph.
Projects (projects)
Number of projects in a Portfolio.
Statements (statements)
Number of statements.
Tests
Condition coverage (branch_coverage)
On each line of code containing some boolean expressions, the condition coverage simply answers the
following question: 'Has each boolean expression been evaluated both to true and false?'. This is the density of
possible conditions in flow control structures that have been followed during unit test execution.
Condition coverage = (CT + CF) / (2 * B)
where
CT = number of conditions that have been evaluated to 'true' at least once
CF = number of conditions that have been evaluated to 'false' at least once
B = total number of conditions
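As a hypothetical example: with 10 conditions in total (B = 10), of which 7 have been evaluated to true at least once (CT = 7) and 5 to false at least once (CF = 5), condition coverage is (7 + 5) / (2 * 10) = 60%.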
Design Patterns
Elements of reusable design
Vocabulary to talk about design
Just another abstraction, among many, to reason about design
Observer
Common barrier to better modularity – need to maintain consistent state
Subject – has changing state
Observer – needs to know about changes to that state
Obvious approach – subject informs observers on change
https://sourcemaking.com/design_patterns/observer/cpp/1
Before
The number and type of "user interface" (or dependent) objects is hard-wired in the Subject class. The user has
no ability to affect this configuration.
#include <iostream>
using namespace std;

class DivObserver
{
int m_div;
public:
DivObserver(int div)
{
m_div = div;
}
void update(int val)
{
cout << val << " div " << m_div << " is " << val / m_div << '\n';
}
};
class ModObserver
{
int m_mod;
public:
ModObserver(int mod)
{
m_mod = mod;
}
void update(int val)
{
cout << val << " mod " << m_mod << " is " << val % m_mod << '\n';
}
};
class Subject
{
int m_value;
DivObserver m_div_obj;
ModObserver m_mod_obj;
public:
Subject(): m_div_obj(4), m_mod_obj(3){}
void set_value(int value)
{
m_value = value;
notify();
}
void notify()
{
m_div_obj.update(m_value);
m_mod_obj.update(m_value);
}
};
int main()
{
Subject subj;
subj.set_value(14);
}
Output
14 div 4 is 3
14 mod 3 is 2
After
The Subject class is now decoupled from the number and type of Observer objects. The client has asked for
two DivObserver delegates (each configured differently), and one ModObserver delegate.
#include <iostream>
#include <vector>
using namespace std;

class Observer
{
  public:
    virtual void update(int value) = 0;
};
class Subject
{
    int m_value;
    vector<Observer *> m_views;
  public:
    void attach(Observer *obs)
    {
        m_views.push_back(obs);
    }
    void set_val(int value)
    {
        m_value = value;
        notify();
    }
    void notify()
    {
        for (int i = 0; i < m_views.size(); ++i)
            m_views[i]->update(m_value);
    }
};
class DivObserver: public Observer
{
    int m_div;
  public:
    DivObserver(Subject *model, int div): m_div(div)
    {
        model->attach(this);
    }
    void update(int val)
    {
        cout << val << " div " << m_div << " is " << val / m_div << '\n';
    }
};
class ModObserver: public Observer
{
    int m_mod;
  public:
    ModObserver(Subject *model, int mod): m_mod(mod)
    {
        model->attach(this);
    }
    void update(int val)
    {
        cout << val << " mod " << m_mod << " is " << val % m_mod << '\n';
    }
};
int main()
{
    Subject subj;
    DivObserver divObs1(&subj, 4);
    DivObserver divObs2(&subj, 3);
    ModObserver modObs3(&subj, 3);
    subj.set_val(14);
}
Output
14 div 4 is 3
14 div 3 is 4
14 mod 3 is 2
Strategy Pattern
Strategy
Define a family of algorithms, encapsulate each one and make them interchangeable
Definition
Client
Context
Strategy
ConcreteStrategy
Example 1
Logger can be configured for several types of output
Selecting strategy
Whose responsibility is it to select the strategy to use?
When should the context select? When should it not?
Client or context requests strategy instances from a StrategyFactory, passing properties uninterpreted
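A minimal sketch of the logger example above, using invented names (LogStrategy, ConsoleLog, FileLog, Logger) rather than any particular logging library:
interface LogStrategy {
    void write(String message);
}
class ConsoleLog implements LogStrategy {
    public void write(String message) { System.out.println(message); }
}
class FileLog implements LogStrategy {
    private final java.io.PrintWriter out;
    FileLog(String path) throws java.io.IOException {
        // append to the file and flush on every println
        out = new java.io.PrintWriter(new java.io.FileWriter(path, true), true);
    }
    public void write(String message) { out.println(message); }
}
// Context: the Logger delegates output to whichever strategy it was configured with.
class Logger {
    private LogStrategy strategy;
    Logger(LogStrategy strategy) { this.strategy = strategy; }
    void setStrategy(LogStrategy strategy) { this.strategy = strategy; }
    void log(String message) { strategy.write(message); }
}
class LoggerDemo {
    public static void main(String[] args) throws Exception {
        Logger logger = new Logger(new ConsoleLog());   // the client selects the strategy
        logger.log("starting up");
        logger.setStrategy(new FileLog("app.log"));     // switch strategy at runtime
        logger.log("now writing to a file");
    }
}
A StrategyFactory variant would hide the new ConsoleLog() / new FileLog(...) decision behind a factory method keyed by a configuration property, so neither the client nor the Logger needs to interpret the property itself.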
https://www.oodesign.com/strategy-pattern.html
Strategy
Motivation
There are common situations when classes differ only in their behavior. In these cases it is a good idea to isolate
the algorithms in separate classes so that a different algorithm can be selected at runtime.
Intent
Define a family of algorithms, encapsulate each one, and make them interchangeable. Strategy lets the
algorithm vary independently from clients that use it.
Implementation
Strategy - defines an interface common to all supported algorithms. Context uses this interface to call the
algorithm defined by a ConcreteStrategy.
Context
The context object receives requests from the client and delegates them to the strategy object. Usually the
ConcreteStrategy is created by the client and passed to the context. From this point on, the client interacts only
with the context.
Let's consider an application used to simulate and study the interaction of robots. To begin with, a simple
application is created to simulate an arena where robots interact. We have the following classes:
Robot - The robot is the context class. It keeps or gets context information such as position, close obstacles,
etc., and passes the necessary information to the Strategy class.
In the main section of the application, several robots are created along with several different behaviours. Each
robot is assigned a different behaviour: 'Big Robot' is an aggressive one and attacks any other robot it finds,
'George v.2.1' is really scared and runs away in the opposite direction when it encounters another robot, and
'R2' is pretty calm and ignores any other robot. At some point the behaviours are changed for each robot.
public interface IBehaviour {
    public int moveCommand();
}
class AggressiveBehaviour implements IBehaviour {
    public int moveCommand() {
        System.out.println("\tAggressive Behaviour: if it finds another robot, attack it");
        return 1;
    }
}
class DefensiveBehaviour implements IBehaviour {
    public int moveCommand() {
        System.out.println("\tDefensive Behaviour: if it finds another robot, run from it");
        return -1;
    }
}
class NormalBehaviour implements IBehaviour {
    public int moveCommand() {
        System.out.println("\tNormal Behaviour: if it finds another robot, ignore it");
        return 0;
    }
}
class Robot {
    IBehaviour behaviour;
    String name;
    public Robot(String name)
    {
        this.name = name;
    }
    public void setBehaviour(IBehaviour behaviour)
    {
        this.behaviour = behaviour;
    }
    public IBehaviour getBehaviour()
    {
        return behaviour;
    }
    public void move()
    {
        System.out.println(this.name + ": Based on current position, " +
                "the behaviour object decides the next move:");
        int command = behaviour.moveCommand();
        // ... send the command to the movement mechanisms
        System.out.println("\tThe result returned by the behaviour object " +
                "is sent to the movement mechanisms " +
                "for the robot '" + this.name + "'");
    }
}
class Main {
    public static void main(String[] args) {
        Robot r1 = new Robot("Big Robot");
        Robot r2 = new Robot("George v.2.1");
        Robot r3 = new Robot("R2");
        r1.setBehaviour(new AggressiveBehaviour());
        r2.setBehaviour(new DefensiveBehaviour());
        r3.setBehaviour(new NormalBehaviour());
        r1.move();
        r2.move();
        r3.move();
        System.out.println("\r\nNew behaviours: " +
                "\r\n\t'Big Robot' gets really scared, " +
                "\r\n\t'George v.2.1' becomes really mad because " +
                "it's always attacked by other robots, " +
                "\r\n\tand R2 keeps its calm");
        r1.setBehaviour(new DefensiveBehaviour());
        r2.setBehaviour(new AggressiveBehaviour());
        r1.move();
        r2.move();
        r3.move();
    }
}
When data should be passed, the drawbacks of each method should be analyzed. For example, if some classes
are created to encapsulate additional data, special care should be paid to which fields are included in those
classes. Maybe in the current implementation all required fields are added, but maybe in the future some new
concrete strategy classes will require data from the context which are not included in the additional classes.
It should also be noted that it's very likely that some of the concrete strategy classes will not use the fields
passed to them in the additional classes.
On the other hand, if the context object is passed to the strategy, then we have tighter coupling between
strategy and context.
Hot points
The strategy design pattern splits the behavior (there are many behaviors) of a class from the class itself. This
has some advantages, but the main drawback is that a client must understand how the Strategies differ. Since
clients get exposed to implementation issues, the strategy design pattern should be used only when the
variation in behavior is relevant to them.
https://www.java67.com/2014/12/strategy-pattern-in-java-with-example.html
The strategy pattern has found its place in the JDK, and you know what I mean if you have ever sorted an ArrayList in Java.
Yes, the combination of Comparator, Comparable, and the Collections.sort() method is one of the best real-world
examples of the Strategy design pattern.
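To make that concrete, here is a small illustrative snippet in which each Comparator is a strategy handed to Collections.sort():
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class ComparatorStrategyDemo {
    public static void main(String[] args) {
        List<String> words = new ArrayList<>(Arrays.asList("pear", "fig", "banana"));
        // Strategy 1: natural (alphabetical) order
        Collections.sort(words);
        System.out.println(words);   // [banana, fig, pear]
        // Strategy 2: order by length, supplied without touching the sorting code
        Collections.sort(words, Comparator.comparingInt(String::length));
        System.out.println(words);   // [fig, pear, banana]
    }
}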
To understand it better, let's first find out what the Strategy pattern is. The first clue is the name itself. The
Strategy pattern defines a family of related algorithms, e.g. sorting algorithms like bubble sort, quicksort, insertion
sort and merge sort, compression algorithms like zip, gzip, tar and jar, or encryption algorithms like MD5 and AES,
and lets the algorithm vary independently from the clients that use it.
For example, you can use the Strategy pattern to implement a method which sorts numbers and allows the client to
choose any sorting algorithm at runtime, without modifying the client's code. So essentially the Strategy pattern
provides flexibility, extensibility and choice. You should consider using this pattern when you need to select an
algorithm at runtime.
In Java, a strategy is usually implemented by creating a hierarchy of classes that implement a base interface
known as Strategy. In this tutorial, we will learn some interesting things about the Strategy pattern by writing a
Java program and demonstrating how it adds value to your code. See Head First Design Patterns for more details
on this and other GOF patterns. This book was also updated to Java SE 8 on its 10th anniversary.
If you look at the UML diagrams of the State and Strategy patterns, they look identical, but the intent of the State
pattern is to facilitate state transition while the intent of the Strategy pattern is to change the behavior of a class
by changing its internal algorithm at runtime without modifying the class itself. That's why the Strategy pattern is
part of the behavioral patterns in the GOF's original list.
You can correlate the Strategy pattern with how people use different strategies to deal with different situations;
for example, if you are confronted with a situation, you will either deal with it or run away: two different
strategies.
If you have not read it already, you should also read Head First Design Patterns, one of the best books for learning
the practical use of design patterns in Java applications. Most of my knowledge of the GOF design patterns is
attributed to this book. It's been 10 years since the book was first released; thankfully it is now updated to
cover Java SE 8 as well.
For example, in chess the Rook will move only horizontally or vertically, the Bishop will move diagonally, a Pawn
will move one cell at a time and the Queen can move horizontally, vertically and diagonally. The different
strategies employed by the different pieces for movement are implementations of the Strategy interface, and the
code which moves the pieces is our Context class.
When we change a piece, we don't need to change the Context class. If a new piece is added, the code which
takes care of moving pieces will not need to be modified.
If you use the Strategy pattern, you add a new Strategy by writing a new class that just needs to implement the
Strategy interface. Using an Enum to implement the Strategy pattern, on the other hand, conflicts with the
open-closed principle: though it has some advantages and suits well if you know the major algorithms in advance,
you need to modify your Enum class to add new algorithms, which is a violation of the open-closed principle. To
learn more see here.
Another example is the java.util.Arrays#sort(T[], Comparator<? super T> c) method, which is similar
to the Collections.sort() method, except that it needs an array in place of a List.
You can also see the classic Head First Design pattern book for more real-world examples of Strategy and other
GOF patterns.
/**
 * Java Program to implement Strategy design pattern in Java.
 * Strategy pattern allows you to supply different strategies without
 * changing the Context class, which uses that strategy. You can
 * also introduce a new sorting strategy any time. A similar example
 * is the Collections.sort() method, which accepts a Comparator or Comparable,
 * which is actually a Strategy to compare objects in Java.
 *
 * @author WINDOWS 8
 */
interface Strategy {
    public void sort(int[] numbers);
}
class BubbleSort implements Strategy {
    @Override
    public void sort(int[] numbers) {
        System.out.println("sorting array using bubble sort strategy");
    }
}
class InsertionSort implements Strategy {
    @Override
    public void sort(int[] numbers) {
        System.out.println("sorting array using insertion sort strategy");
    }
}
class QuickSort implements Strategy {
    @Override
    public void sort(int[] numbers) {
        System.out.println("sorting array using quick sort strategy");
    }
}
class MergeSort implements Strategy {
    @Override
    public void sort(int[] numbers) {
        System.out.println("sorting array using merge sort strategy");
    }
}
class Context {
    private final Strategy strategy;
    public Context(Strategy strategy) {
        this.strategy = strategy;
    }
    public void arrange(int[] input) {
        strategy.sort(input);
    }
}
public class Test {
    public static void main(String args[]) {
        int[] numbers = {3, 1, 4, 1, 5, 9};
        Context ctx = new Context(new BubbleSort());
        ctx.arrange(numbers);
        ctx = new Context(new QuickSort());
        ctx.arrange(numbers);
    }
}
Output
sorting array using bubble sort strategy
sorting array using quick sort strategy
1) This pattern defines a set of related algorithms and encapsulates them in separate classes, allowing the
client to choose any algorithm at run time.
2) It allows adding a new algorithm without modifying the existing algorithms or the context class which uses
the algorithms or strategies.
3) The Strategy pattern is based upon the Open/Closed design principle of the SOLID Principles of Object-Oriented
Design.
That's all about how to implement the Strategy pattern in Java. For practice, write a Java program to
implement encoding and allow the client to choose between different encoding algorithms, like Base64. This
pattern is also very useful when you need to behave differently depending upon a type, like writing a method to
process trades: if a trade is of type NEW it will be processed one way, if it is CANCEL it will be processed
differently, and if it is AMEND it will be handled differently again, but each time we need to process the trade.
Adapter Pattern
Adapter limits change when replacing classes
- Legacy code relies on the same interface as before
- Adapter maps the legacy calls to methods of new class
- 2 forms of the adapter – object adapter and class adapter
- 1 of several forms of refactoring pattern
https://www.geeksforgeeks.org/adapter-pattern/
Adapter Pattern
This pattern is easy to understand, as the real world is full of adapters. For example, consider a USB to Ethernet
adapter. We need this when we have an Ethernet interface on one end and USB on the other; since they are
incompatible with each other, we use an adapter that converts one to the other. This example is pretty analogous
to Object-Oriented Adapters. In design, adapters are used when we have a class (Client) expecting some type
of object, and we have an object (Adaptee) offering the same features but exposing a different interface.
To use an adapter:
1. The client makes a request to the adapter by calling a method on it using the target interface.
2. The adapter translates that request into calls on the adaptee using the adaptee interface.
3. The client receives the results of the call and is unaware of the adapter’s presence.
Definition:
The adapter pattern converts the interface of a class into another interface clients expect. Adapter lets classes
work together that couldn’t otherwise because of incompatible interfaces.
Class Diagram:
The client sees only the target interface and not the adapter. The adapter implements the target interface.
Adapter delegates all requests to Adaptee.
Example:
Suppose you have a Bird class with fly() and makeSound() methods, and also a ToyDuck class with a squeak()
method. Let’s assume that you are short on ToyDuck objects and you would like to use Bird objects in their
place. Birds have some similar functionality but implement a different interface, so we can’t use them directly.
So we will use the adapter pattern. Here our client would be ToyDuck and the adaptee would be Bird.
Below is a Java implementation of it.
// Java implementation of Adapter pattern
interface Bird
{
    // birds implement Bird interface that allows
    // them to fly and make sounds (adaptee interface)
    public void fly();
    public void makeSound();
}
class Sparrow implements Bird
{
    // a concrete implementation of Bird
    public void fly() { System.out.println("Flying"); }
    public void makeSound() { System.out.println("Chirp Chirp"); }
}
interface ToyDuck
{
    // target interface
    // toyducks don't fly, they just make a squeaking sound
    public void squeak();
}
class PlasticToyDuck implements ToyDuck
{
    public void squeak() { System.out.println("Squeak"); }
}
class BirdAdapter implements ToyDuck
{
    // the adapter implements the target interface and
    // holds a reference to the adaptee it wraps
    Bird bird;
    public BirdAdapter(Bird bird) { this.bird = bird; }
    public void squeak() { bird.makeSound(); }
}
class Main
{
    public static void main(String args[])
    {
        Sparrow sparrow = new Sparrow();
        ToyDuck toyDuck = new PlasticToyDuck();
        // wrap a bird in a BirdAdapter so that it behaves like a toy duck
        ToyDuck birdAdapter = new BirdAdapter(sparrow);
        System.out.println("Sparrow...");
        sparrow.fly();
        sparrow.makeSound();
        System.out.println("ToyDuck...");
        toyDuck.squeak();
        // toy duck behaving like a bird
        System.out.println("BirdAdapter...");
        birdAdapter.squeak();
    }
}
In the class adapter form, instead of having an adaptee object inside the adapter (composition) to make use of its
functionality, the adapter inherits from the adaptee.
Since multiple inheritance is not supported by many languages, including Java, and is associated with many
problems, we have not shown an implementation using the class adapter pattern.
Advantages:
Helps achieve reusability and flexibility.
Client class is not complicated by having to use a different interface and can use polymorphism to swap
between different implementations of adapters.
Disadvantages:
All requests are forwarded, so there is a slight increase in the overhead.
Sometimes many adaptations are required along an adapter chain to reach the type which is required.
References:
Head First Design Patterns ( Book )
https://refactoring.guru/design-patterns/adapter
Adapter
Intent
Adapter is a structural design pattern that allows objects with incompatible interfaces to collaborate.
Problem
Imagine that you’re creating a stock market monitoring app. The app downloads the stock data from multiple
sources in XML format and then displays nice-looking charts and diagrams for the user.
At some point, you decide to improve the app by integrating a smart 3rd-party analytics library. But there’s a
catch: the analytics library only works with data in JSON format.
You can’t use the analytics library “as is” because it expects the data in a format that’s incompatible with
your app.
You could change the library to work with XML. However, this might break some existing code that relies on
the library. And worse, you might not have access to the library’s source code in the first place, making this
approach impossible.
Solution
You can create an adapter. This is a special object that converts the interface of one object so that another
object can understand it.
An adapter wraps one of the objects to hide the complexity of conversion happening behind the scenes. The
wrapped object isn’t even aware of the adapter. For example, you can wrap an object that operates in meters
and kilometers with an adapter that converts all of the data to imperial units such as feet and miles.
Adapters can not only convert data into various formats but can also help objects with different interfaces
collaborate. Here’s how it works:
1. The adapter gets an interface, compatible with one of the existing objects.
2. Using this interface, the existing object can safely call the adapter’s methods.
3. Upon receiving a call, the adapter passes the request to the second object, but in a format and order
that the second object expects.
Sometimes it’s even possible to create a two-way adapter that can convert the calls in both directions.
Let’s get back to our stock market app. To solve the dilemma of incompatible formats, you can create XML-to-
JSON adapters for every class of the analytics library that your code works with directly. Then you adjust your
code to communicate with the library only via these adapters. When an adapter receives a call, it translates
the incoming XML data into a JSON structure and passes the call to the appropriate methods of a wrapped
analytics object.
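A rough sketch of that idea follows; every name in it (AnalyticsClient, ThirdPartyAnalyticsLib, XmlToJsonAdapter) is invented for illustration, since the text does not name the real library or its API:
// The app's code talks to this interface; the 3rd-party library only accepts JSON.
interface AnalyticsClient {
    void analyze(String xmlData);
}
class ThirdPartyAnalyticsLib {          // stands in for the JSON-only analytics library
    void analyzeJson(String jsonData) {
        System.out.println("analyzing " + jsonData);
    }
}
// The adapter implements the interface the app expects, wraps the library,
// and converts XML to JSON before delegating each call.
class XmlToJsonAdapter implements AnalyticsClient {
    private final ThirdPartyAnalyticsLib lib = new ThirdPartyAnalyticsLib();
    public void analyze(String xmlData) {
        lib.analyzeJson(convertXmlToJson(xmlData));
    }
    private String convertXmlToJson(String xml) {
        // real conversion omitted; a placeholder stands in for it here
        return "{ \"converted-from\": \"" + xml.length() + " chars of XML\" }";
    }
}
class StockApp {
    public static void main(String[] args) {
        AnalyticsClient analytics = new XmlToJsonAdapter();
        analytics.analyze("<stock symbol=\"ACME\" price=\"42\"/>");
    }
}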
Real-World Analogy
Structure
Object adapter
This implementation uses the object composition principle: the adapter implements the interface of one
object and wraps the other one. It can be implemented in all popular programming languages.
1. The Client is a class that contains the existing business logic of the program.
2. The Client Interface describes a protocol that other classes must follow to be able to collaborate with the
client code.
3. The Service is some useful class (usually 3rd-party or legacy). The client can’t use this class directly because it
has an incompatible interface.
4. The Adapter is a class that’s able to work with both the client and the service: it implements the client
interface, while wrapping the service object. The adapter receives calls from the client via the adapter
interface and translates them into calls to the wrapped service object in a format it can understand.
5. The client code doesn’t get coupled to the concrete adapter class as long as it works with the adapter via the
client interface. Thanks to this, you can introduce new types of adapters into the program without breaking
the existing client code. This can be useful when the interface of the service class gets changed or replaced:
you can just create a new adapter class without changing the client code.
Class adapter
This implementation uses inheritance: the adapter inherits interfaces from both objects at the same time.
Note that this approach can only be implemented in programming languages that support multiple
inheritance, such as C++.
1. The Class Adapter doesn’t need to wrap any objects because it inherits behaviors from both the client and the
service. The adaptation happens within the overridden methods. The resulting adapter can be used in place of
an existing client class.
Pseudocode
This example of the Adapter pattern is based on the classic conflict between square pegs and round holes.
// Say you have two classes with compatible interfaces:
// RoundHole and RoundPeg.
class RoundHole is
    constructor RoundHole(radius) { ... }
    method getRadius() is
        // Return the radius of the hole.
    method fits(peg: RoundPeg) is
        return this.getRadius() >= peg.getRadius()

class RoundPeg is
    constructor RoundPeg(radius) { ... }
    method getRadius() is
        // Return the radius of the peg.

// But there's an incompatible class: SquarePeg.
class SquarePeg is
    constructor SquarePeg(width) { ... }
    method getWidth() is
        // Return the square peg width.

// An adapter class lets you fit square pegs into round holes.
// It extends the RoundPeg class to let the adapter objects act
// as round pegs.
class SquarePegAdapter extends RoundPeg is
    // In reality, the adapter contains an instance of the
    // SquarePeg class.
    private field peg: SquarePeg
    constructor SquarePegAdapter(peg: SquarePeg) is
        this.peg = peg
    method getRadius() is
        // The adapter pretends that it's a round peg with a
        // radius that could fit the square peg that the adapter
        // actually wraps.
        return peg.getWidth() * Math.sqrt(2) / 2
Use the Adapter class when you want to use some existing class, but its interface isn’t compatible with the
rest of your code.
The Adapter pattern lets you create a middle-layer class that serves as a translator between your code and a
legacy class, a 3rd-party class or any other class with a weird interface.
Use the pattern when you want to reuse several existing subclasses that lack some common functionality
that can’t be added to the superclass.
You could extend each subclass and put the missing functionality into new child classes. However, you’ll need
to duplicate the code across all of these new classes, which smells really bad.
The much more elegant solution would be to put the missing functionality into an adapter class. Then you
would wrap objects with missing features inside the adapter, gaining needed features dynamically. For this to
work, the target classes must have a common interface, and the adapter’s field should follow that interface.
This approach looks very similar to the Decorator pattern.
How to Implement
1. Make sure that you have at least two classes with incompatible interfaces:
o A useful service class, which you can’t change (often 3rd-party, legacy or with lots of existing
dependencies).
o One or several client classes that would benefit from using the service class.
2. Declare the client interface and describe how clients communicate with the service.
3. Create the adapter class and make it follow the client interface. Leave all the methods empty for now.
4. Add a field to the adapter class to store a reference to the service object. The common practice is to initialize
this field via the constructor, but sometimes it’s more convenient to pass it to the adapter when calling its
methods.
5. One by one, implement all methods of the client interface in the adapter class. The adapter should delegate
most of the real work to the service object, handling only the interface or data format conversion.
6. Clients should use the adapter via the client interface. This will let you change or extend the adapters without
affecting the client code.
Single Responsibility Principle. You can separate the interface or data conversion code from the primary
business logic of the program.
Open/Closed Principle. You can introduce new types of adapters into the program without breaking the
existing client code, as long as they work with the adapters through the client interface.
The overall complexity of the code increases because you need to introduce a set of new interfaces and
classes. Sometimes it’s simpler just to change the service class so that it matches the rest of your code.
Bridge is usually designed up-front, letting you develop parts of an application independently of each other. On
the other hand, Adapter is commonly used with an existing app to make some otherwise-incompatible classes
work together nicely.
Adapter changes the interface of an existing object, while Decorator enhances an object without changing its
interface. In addition, Decorator supports recursive composition, which isn’t possible when you use Adapter.
Adapter provides a different interface to the wrapped object, Proxy provides it with the same interface,
and Decorator provides it with an enhanced interface.
Facade defines a new interface for existing objects, whereas Adapter tries to make the existing interface
usable. Adapter usually wraps just one object, while Facade works with an entire subsystem of objects.
Bridge, State, Strategy (and to some degree Adapter) have very similar structures. Indeed, all of these patterns
are based on composition, which is delegating work to other objects. However, they all solve different
problems. A pattern isn’t just a recipe for structuring your code in a specific way. It can also communicate to
other developers the problem the pattern solves.
Architectural Qualities
Performance
Ease of maintenance
Security
Testability
Usability
Industry work group that created the standard also created this website (please forgive the lack
of web design):
http://www.iso-architecture.org/ieee-1471/getting-started.html
https://en.wikipedia.org/wiki/ISO/IEC_42010
This page provides some starting points for using the Standard (previously IEEE 1471:2000).
The Standard
The best place to start is with the Standard itself. The published version of ISO/IEC/IEEE 42010 can
be obtained from ISO or from IEEE.
The Standard has several parts. The first part is a conceptual model of architecture descriptions (ADs).
The conceptual model, sometimes called the “metamodel”, defines a set of terms and their relations
[see Conceptual Model]. This metamodel has been widely used and discussed in industry and in the
academic literature [see Bibliography]. The conceptual model establishes terms, their definitions and
relations which are then used in expressing the requirements.
Architecture Descriptions
The second part of the Standard describes best practices for creating an architecture description.
The best practices are specified in the Standard as 24 requirements (written as “shalls”) [see ADs]. An
architecture description conforms to the Standard if it satisfies those requirements. Taken together,
the “shalls” imply a process for preparing an AD roughly like this:
1. Identify the system stakeholders who have a stake in the architecture and record them in the
AD. Be sure to consider: the users, operators, acquirers, owners, suppliers, developers,
builders and maintainers of the system – when applicable.
2. Identify the architecture concerns of each stakeholder. Be sure to consider these common
concerns – when applicable:
o purposes of the system;
o suitability of the architecture for achieving system’s purposes;
o feasibility of constructing the system;
o potential risks and impacts of the system to its stakeholders, throughout the life cycle;
and
o maintainability and evolvability of the system
3. Choose one or more architecture viewpoints for expressing the architecture, such that each
concern [see 2, above] is framed by at least one viewpoint.
4. Record those viewpoints in the AD and provide a rationale for each choice of viewpoint.
5. Document each chosen viewpoint by writing a viewpoint definition [see below] or citing a
reference to an existing definition. Each viewpoint definition links the architecture
stakeholders and concerns to the kinds of notations and models to be used. Each viewpoint
definition helps readers of the AD interpret the resulting view.
6. Produce architecture views of the system, one for each chosen viewpoint. Each view must
follow the conventions of its viewpoint and include:
o one or more models;
o identifying information;
7. Document correspondences between view and model elements. Analyze consistency across
all views. Record any known inconsistencies.
8. Record in the AD the rationale(s) for the architecture decisions made, and give evidence of the
consideration of multiple architectures and the rationale for the final choice.
Templates for architecture descriptions and for documenting viewpoints are available
[see: Templates].
Architecture Viewpoints
At the core of the Standard is the idea of architecture viewpoints. A viewpoint is a way of looking at
the architecture of a system – or a set of conventions for a certain kind of architecture modeling. An
architecture viewpoint is determined by the stakeholders and concerns it frames and by the kinds of
notations, models and conventions used to address them.
The third part of the Standard specifies best practices for architecture frameworks.
The Standard only specifies best practices on the documentation of architectures. It is intended to be
used within a life cycle and/or within the context of a method or architecture framework – used to
develop ADs. The Standard may be used to structure an AD when applied with various architecture
frameworks. For example, the Zachman framework matrix essentially identifies a set of stakeholders
and concerns to be addressed, and the cell entries suggest the (viewpoint) notations, languages and
models. [See the Survey of Architecture Frameworks, which includes a number of “viewpoint sets”
for the application domains of enterprises, systems and software.]
Architectural Styles
Layered architecture
The only communication is between adjacent layers
Layers form abstraction hierarchy
Dependencies are top down
Client server is an example of a layered architecture
Architectural styles convey essential aspects
An architectural style constrains the design to benefit certain attributes
Ensure that all components within the system work together well
Incorporates ideas at the enterprise level rather than class design
Established architectural styles should be recognisable and repeatable
Views
A view captures a model meant to address some concern
An architecture is the combination of all views
Each view is defined by a single concern called a Viewpoint
Viewpoint
View is the vehicle for portraying architecture
Viewpoint is the best practice for portraying some specific concern
A set of viewpoints should address artefacts, development and execution
Perspective
For cross-cutting concerns that don’t fit into a single view
Quality attributes come into play here
Apply to all the viewpoints in the system
Writing Scenarios
Architectural quality
A system must exhibit values of quality attributes to be successful
Quality concerns tend to cut across views
Documenting (and testing) these qualities is difficult
Environment
System is operating normally, system has fewer than 50 concurrent users. Internet latency is less than 100 ms
from customer browser to site
Artefact
WAGAU tomcat web site
Response
Round trip from customer clicking ‘add to cart’ button to customer browser update showing updated cart
Response measure
95% of the time under 2.5 seconds
99.9% of the time under 10 seconds
Assignment
The learner will select a preferred website. Considering the website, the learner will identify three
quality attributes that would be important to the site's success and write one reasonable scenario
for each quality attribute, presented in one of the formats presented in class.
Select a public website that you use enough to be familiar with what a typical user may want to
do. This website should not require the peer reviewer to sign up for an account or pay to use the
site in any way. The website should also not, to the best of your knowledge, serve malware, use JavaScript to
perform cryptocurrency mining, or engage in any other negative practice that might harm the peer reviewer.
Select three quality attributes that are likely to be important when deciding a website architecture
for the website you chose. You can use usability, security, performance, reliability, or any other
reasonable quality attribute as the basis of your selection. Briefly explain the importance of each
quality attribute as it relates to the software/service you selected.
Then, for each selected quality attribute, write one scenario that would help quantitatively assess
whether the software solution meets its goal. You shall write your scenarios using a format that
was presented in the lectures.
Security Perspective
Security – security is the set of processes and technologies that allow the owners of resources in the system to
reliably control who can perform actions on particular resources
Security Threats
Identify sensitive resources
Red team – people to come in and attack the system to test its security
Security Tactics
Follow best practice to ease the difficulty of meeting and assessing security
Domain experts have identified and organised defence and analysis
Technical and social attacks and defences need to be considered
Security Audit
- Most organizations are not mature enough to organize active deterrence
- Auditing of access and behaviour is essential to discovering a breach
- Logging of all secure authorization and authentication is crucial
Coding Style
Some of this is form over function, there are many defensible choices
Choices still have to be made – consistency is key
A variety of types of rule can be considered
Naming Conventions
Variable vs Routine: variable_name vs RoutineName()
Classes vs Objects: Widget my_widget
Member Variables: _variable or variable
Filenames
Pick a format
Common to use underscores ( _ ) and dashes (-) between words
- my_shape.cc
- my-shape.cc
- myshape.cc
Even extensions could need specification
Indentation
A timeless battle – tabs vs spaces (see the reading)
Generally speaking, just be consistent
Important – Indent once per nesting level
Special rules for when code word wraps at line length limit
Placement of braces
Use of whitespace (e.g. placing pointer symbols)
Can make a real difference in understanding
double * a;
double* a;
double *a;
Coding Style
A matter of ease of understanding
Attempts to ease cognitive load
Consistency allows reader to focus on logic
Automated processes and tools exist – beautifier
https://wiki.c2.com/?TabsVersusSpaces
Tabs Versus Spaces
This is one of the eternal HolyWars among programmers: should source-code lines be indented using tab
characters or space characters?
People generally don't mind reading code that is consistently indented using tabs, or code that is consistently
indented using spaces. The problems arise when some lines are indented with tabs, and others with spaces. In
such cases, the code will only display nicely if the reader has the same tab stop settings as the authors and if all
the authors used consistent settings. One way to solve this problem is to force everyone to be "tab people" or
"space people".
The case for spaces:
Consistent display across all display hardware, text viewer/editor software, and user configuration
options.
Gives author control over visual effect.
Tabs are not as "visible" (that is, a tab generally looks just like a bunch of spaces)
Other wrinkles:
In some programming languages and tools, tabs and spaces do have different meanings. This
argument doesn't apply to those languages, except in relation to LanguagePissingMatches.
VersionControl systems generally do consider whitespace to be significant. If a developer reformats a
file to have different indentation, makes a minor change, and then checks in the change, the system is
likely to treat it as if the developer changed every line in the file.
In PythonLanguage, indentation is semantically significant. Either tabs or spaces may be used for
indentation, though it's recommended that programmers use one or the other and stick to it. To
quote the language reference, "tabs are replaced (from left to right) by one to eight spaces such that
the total number of characters up to and including the replacement is a multiple of eight (this is
intended to be the same rule as used by Unix)."
In some editors, hitting the TAB key inserts a tab character at the current position. However,
programmers' editors can be configured to use the TAB key to trigger the editors' auto-indent
function, which may generate multiple tab characters or multiple spaces.
Some text editors can determine the proper tab settings for a file by analyzing it or by reading special
markup contained within. Such editors are nice for those people who have them, but they do not
solve the general problem.
Programmers also argue about whether it is acceptable to use tab characters inside source code lines for
purposes other than left-side indentation. People may use these to align elements of a multi-line statement or
function call, or to align end-of-line comments. The issues are similar to those above, except that tabs inside
lines cause additional problems.
Most programming languages provide a special character sequence (such as "\t" in C, C++, Java, and others) to
allow a programmer to make the difference between tabs and spaces more obvious inside strings and
character literals. In such languages, it is generally considered bad style to put actual tab characters inside
string and character literals.
" With these three interpretations, the ASCII TAB character is essentially being used as a compression
mechanism, to make sequences of SPACE-characters take up less room in the file." The rant totally ignores the
etymology of the tab, and this sentence kind of sums that up: typewriters (with their tab stops, often custom-
settable) had no concern for how many bytes it took to encode the content they were putting onto paper. Tabs
were essentially being used as an indentation mechanism, to ensure that the indentation was consistent ("Did I
hit <space> six times or only five?").
Yeah, but if you *really* need parameter 1 to line up, you can do it like this:
do_something_nifty(
meaningful_name_source,
meaningful_name_for_dest,
case_insensitive,
other_relevant_parameters
);
which works with either spaces or tabs, both monospace and proportional fonts.
Tabs for indentation, spaces for alignment
Emacs: http://www.emacswiki.org/SmartTabs
Vim: http://www.vim.org/scripts/script.php?script_id=231
For source code, indentation can in most cases be generated from the code's structure. Don't represent
"indent N levels" with tabs or spaces, you already have all you need stored as "if (...) {" on the line above.
Remember, source code is not text.
Then WhatIsSourceCode ?
As you can see from the example, the difference between "indent 1 level" and "indent to the next tab stop"
sometimes means the difference between the number of tab stops used to achieve the same visual
representation.
...and if programming languages were XML might be written something like <tab/>.
Most programming languages ignore the tab character as a whitespace. You might as well just not write it at
all, especially if your programming language were XML.
The problem is information entropy: it is difficult and sometimes impossible to compute "indent distance" from
a file which has no tabs...
That's exactly why you have to know what other people's preferred "number of columns" is if you are using tab
characters in programming languages which treat tabs as whitespaces.
There are two uses of "indent to the next tab stop" in such programming languages:
The first is to duplicate in a visually efficient form the structural information already contained in the code ("a
poor man's PrettyPrinter"). You don't lose this information if you convert whitespaces to whitespaces, and a
good PrettyPrinter will do much more than your tab surrogate.
The second is to provide a reader with the visual representation which isn't contained in the code itself. In this
case converting tabs to the corresponding space-indents loses nothing, but losing the tab stops layout makes a
mess from the visual representation.
I might create an editor that uses colors instead of actually offsetting the beginning of lines!
And you might create an editor that uses colors instead of letters. But most editors in use don't do that, and
rightly so. -- NikitaBelenki
There are surely more "meanings" than these four that just haven't occurred to me yet. Replacing tabs with
spaces is an incredibly BRAINDAMAGED approach because:
Brian Cantwell-Smith presents a compelling model that is perhaps relevant: He suggests that we contemplate
three different sets:
Notation: A particular "marking". "Three", "3", and "III" are three (hah!) different notations for the
same thing. Some notations for "tab" include "<Tab/>, "\t", "0x0009" and "0x09" (there may be
others).
Symbol: A unique token manipulated by a formal system. Various programs have various symbols for
tab -- some of the more broken programs require very elaborate symbols (such as Zawinski's
proposed combination of comments and spaces).
Meaning: An interpretation that we apply to a symbol. Four are enumerated above.
My suggestion is that a "file" can use whatever Notation it likes for tab, so long as the system that reads it has
a mapping onto its tab Symbol, and we are then able to apply whatever Meaning we desire. My own
experience has been that embedding an 0x09 in an 8-bit ascii file is easiest to read by the largest variety of
programs that I use, and results in the most straightforward way for me to reflect the various meanings that
tab has.
For the "meaning" that you get out of tabs, you don't get much mileage. So someone who prefers 4-space tabs
can be happy viewing the code you wrote assuming 3-space tabs. Is it worth the anal-retentiveness? It's really
hard to get it exactly right all the time. I have never seen anyone so anal-retentive. Part of the problem is that
by default in most editors, tab characters are indistinguishable from spaces. So you don't even know if you've
got it right by just casually looking at your own code.
Having spent the slight majority of my 11 years as a software developer doing maintenance programming, and
having downloaded a lot of source from SourceForge, I can tell you the defacto standard is Tabs Randomly
Interspersed In Varying Densities. I don't recall ever seeing a correctly tabbed source file, but I am very
thankful when I see a source file with no tab characters at all. They are readable, whether the indentation is 2,
3, or 4. What makes it unreadable is those random tabs. First, I try to guess the author's tab settings. It's
usually 2, 3, 4, 6, or 8. If that fails, or if there are several source files in the same project by different authors
with different tab settings, I reformat it (with vi's automatic reformatting (=)). But that's not preferable, since
continuation lines and multi-line comments tend to get strangely indented by that.
So, you can continue hoping the whole world turns anal-retentive. But I'm with jwz.
Problem: tabs . . . solution: write a program to convert source to your ideal format using editable rules (for ease
of maintenance).
So, you can continue hoping the whole world turns anal-retentive. But I'm with jwz. -- AnonymousDonor
ProgrammingIsHard. Close counts in horseshoes, grenades, and atom bombs. Not programming. If you aren't
"anal-retentive", then why (how? --ed) are you programming?
Source code works the same way regardless of whether it has tabs or spaces. That is why most people don't
worry much about it: ProgrammingIsHard and they have more important things to do than futz around with
making sure their tabs are consistent. Maybe they should, but they don't, and your silly warfare metaphors
won't change that. That's the author's point - tabs make things unnecessarily complicated and life would be
easier if everyone used spaces. -- KrisJohnson
I hope you aren't working on the makefiles for my next major system build. Oversimplifying complex things is a
source of much of today's increasingly unreliable software. Inconsistent tab/space formats that break already
fragile IDE's (because the IDE developers in turn seem to rely on oversimplified mechanisms to identify
syntactic elements) drive whole development teams into using brute-force character editors such as VI and
EMACS more than two decades after superior technologies became widely available. If you need a convincing
illustration, try debugging even the most straightforward JavaScript code under Venkman, the latest greatest
state-of-the-art JavaScript debugger from the Mozilla folks. I know, you don't need a debugger - all you have to
do is insert lots of print statements and then look at the log files. For that matter, tell me about how your
"uncomplicated" tab-to-space-converting java editor will let you develop and debug the tab-separated output
that your client needs to import into their accounting packages. I guess you'll just have a simple little
enhancement that will distinguish the tabs you're trying to emit to the output from the tabs you're indenting
your code with - or perhaps you'll just skip all that stupid wasted whitespace and left-align everything.
I thought it was pretty clear that this discussion is related only to those cases where tabs and spaces are
equivalent. We aren't talking about changing the syntax of make or destroying tab-delimited files. I don't think
any automated tab-to-space conversion tool would be so brain-damaged that it would convert the "\t"
sequence in a Java string to a space. If you are putting literal tab characters into your literal Java strings, rather
than using \t, you and the poor souls maintaining your code are going to get some nasty surprises someday.
-- KrisJohnson
Where did anything on this page limit this discussion "only to those cases where tabs and spaces are
equivalent"? It sounds to me as though your convention is painting you into a corner where you are required
to jump through very special hoops to use your "simplification". Are you sure the source display in your java
debugger understands your spaces and tab stops the same way as your editor? Don't you also advocate this
for all the other block-structured languages - C, C++, PERL, Smalltalk, etc? Where does JavaScript fit into that
mix? Don't you indent tags in XML? Do you ever line up the RHS's of blocks of initializations, for legibility?
Don't you sometimes like to be able to read javadoc headers, in source code, that have something
approximating a tabular format?
My argument is that your one very specific oversimplification breaks a huge number of things humans do with
code to make it legible. Perhaps if you didn't always convert tabs into spaces, you might have more first-hand
experience with the huge number of things you exclude by your conversion.
You've created a clumsy special case when the general works just fine.
The article by jwz which started this page is clearly about the visual formatting of source code. I believe
everything else on the page up until your comments falls in line with that.
Obviously, tabs are not the same as spaces and replacing all the tabs in the world with spaces would be a
disaster. I do not advocate any such position, and you insult me by assuming that I do.
I went for several years being a "tab person" and several years being a "space person". I am aware of the
strengths and weaknesses of both positions, and use whichever convention is favored by whatever people I am
working with. I have used both conventions with several block structured languages and have never run into a
problem with any debuggers or other tools with either convention. If you are aware of a specific real-world
case where using spaces instead of tabs to indent Java or C++ source code causes problems, I'd be interested to
hear about it. (BTW, why would a source debugger fail if space characters are present???)
I don't understand why you ask whether I ever indent code or align things vertically. Obviously I do or I wouldn't
be participating in this discussion. Yes, I indent tags in XML. Yes, I line up the RHS's of blocks of initializations
for legibility. Yes, I sometimes like to be able to read javadoc headers, in source code, that have something
approximating a tabular format. All these things can be done using tabs or spaces. What have I written that
implies I favor doing away with all indentation? -- KrisJohnson
Please consider the simple case of a javadoc comment. Here is an excerpt from the javadoc header for
java.lang.Object.clone() (I'm using version 1.47, 10/01/98 as distributed in VisualAge Java for this example).
* @exception CloneNotSupportedException if the object's class does not
* support the <code>Cloneable</code> interface. Subclasses
* that override the <code>clone</code> method can also
* throw this exception to indicate that an instance cannot
* be cloned.
* @exception OutOfMemoryError if there is not enough memory.
* @see java.lang.Cloneable
*/
protected native Object clone() throws CloneNotSupportedException;
Please note that I've added a leading space to each line in order to avoid extra wiki prettification. Now,
although this surely makes me guilty of being anal retentive, please note that the continuation of the
"@exception" line is misaligned. Let me ask which, to you, is more legible - the above, or the following:
* @exception CloneNotSupportedException if the object's class does not
* support the
<code>Cloneable</code>
* interface. Subclasses that
override
* the <code>clone</code>
method can also
* throw this exception to indicate
that an
* instance cannot be cloned.
*
* @exception OutOfMemoryError if there is not enough memory.
* @see java.lang.Cloneable
*/
protected native Object clone() throws CloneNotSupportedException;
My clients and coworkers generally prefer the latter. Your approach precludes it. Is this specific enough for
you?
I'd like to add, parenthetically, that I despise the need to embed multiple tabs in the continuation lines - I
would think that by 2002 I'd be able to set tab stops as I've been able to do in any word processor for nearly 30
years. But that is clearly asking too much of the developer tool community.
There are (and have been for some time) plenty of editors which allow one to set tabs at arbitrary
positions. The problem with setting them so that this particular documentation lines up "correctly", is
that it's going to mess up the tab positions elsewhere in your source file -- AndySawyer
Your second example is a great illustration of the basic problem with tabs: your example only looks good if tab
stops are set at 8 characters, whereas many other authors' examples only look good if tab stops are set at 3, 4,
or 5 characters. Here's your second example with spaces used instead of tabs:
* @exception CloneNotSupportedException if the object's class does not
* support the <code>Cloneable</code>
* interface. Subclasses that override
* the <code>clone</code> method can also
* throw this exception to indicate that an
* instance cannot be cloned.
*
* @exception OutOfMemoryError if there is not enough memory.
* @see java.lang.Cloneable
*/
protected native Object clone() throws CloneNotSupportedException;
This looks identical (ignoring the question marks inserted by wiki), and it doesn't suffer when displayed in an
editor with 4-character tab stops. (BTW, it has been a while since I've used javadoc; if it just doesn't grok
spaces, then I'm completely wrong here.) -- KrisJohnson
Of course you can substitute spaces for tabs in my example; I created my example by substituting tabs for
spaces in the original. Why do you think my second example breaks if tab stops are set to other than 8? I just
tried this out using my handy-dandy MSWord and a monospace font. Yes, it's true that because of the brain-
damaged tab support offered by most editors (including the lame textedit box in our browsers), the number of
extra tabs has to be adjusted - but even that can be done programmatically for about the same effort as the tab-
to-space conversion macro. In my opinion, a modern source code editor should simply provide for tab stops in
the same way that every word processor has done it for decades.
That's exactly the point I've been trying to make. You can't open your tabbed example in any old source code
editor and expect it to look like you intend it to. But you can open the spaced example in practically any editor
with a monospaced font and it will look fine. -- kj
Meanwhile, how much work is it when we now have to change text of the comment? The tab-free approach
guarantees that *every* character added or removed anywhere in the line prior to the rightmost tabstop
forces a compensating manual realignment of the intervening whitespace. By the way, the misalignment that
results from the Wiki question-mark provides an illustrative example - it broke your tab-free approach.
I've found that a lot of work is required to edit examples such as the above with tabs or with spaces. With tabs,
you may need to hit fewer keys to get things lined up, but you do still have to deal with a variable number of
tabs between columns. To avoid these issues, I tend to format things much like your first example, whether I'm
using tabs or spaces. -- kj
Your approach also precludes the use of proportionally-spaced fonts for rendering source code. Proportionally-
spaced fonts have been demonstrated to be more legible for hundreds of years. In a proportionally-spaced
font, there is no single width for a "space" - the equivalent of a tab stop is the only option.
There is no single width for a tab-stop either. I actually do write code using proportional fonts once in a while,
just for a change of pace. You still have the problem that you don't know how many tabs you need to manually
insert to get your columns to line up, and everything gets screwed up if you change the font later. -- kj
Finally, I would suggest that all of this tab stuff would be even more elegantly simplified by providing effective
table support (because our Javadoc examples are really just tables).
Is the horse dead yet?
Yeah, rigor is setting in. I'll just note that much of what you state here about benefits of tabs is contingent upon
having editors that handle tabs in a much different way than most existing source code editors do. I agree that
it would be great if we had programming languages with support for smarter word-processor-like editors that
knew how to lay out tables of vertically aligned columns with proportional fonts and which would
automatically resize and reflow themselves as they were edited. Whenever we do get those, we probably
won't be using space characters or tab characters in the files to control the layout; the files will use some sort of
structured format like HTML, XML, or RTF. Until we do get those editors, we have to choose plain-old-ASCII
workarounds for the tools we do have. Sometimes tabs work best, and sometimes spaces work best. There is
no correct answer - both options suck. -- KrisJohnson
An alternate view on tabs is that when a file is saved, tab stops are at 8-character positions, and your editor
should be smart enough that when you enter a tab character, it indents however far you want it to, and
converts runs of spaces to equivalent tabs and spaces. Obviously for (arguably smelly) file formats like
make, automatic expansion and compression of whitespace as tabs would need to be turned off.
-- MartinRudat who read something like that in a programmer's editor somewhere.
Perhaps neither spaces nor tabs should be in source code, other than spaces to separate symbols. The source
code editor should be able to style the source text independently of the meaningful code, based on styling
rules that can be specified by the user of the editor. In this way the code is just the code, and differences
between versions when being compared will only highlight meaningful differences.
Neither a tabber nor a spacer be!
https://wiki.c2.com/?BadCodingStandards
Cluttering source files with revision histories when it's all available from {cvs rcs sccs pvcs svn...}
Imposing coding standards from one language on another, e.g. requiring variables to be declared at
the beginning of a method in Java, because that's how you do it in C.
Every method should have a header comment describing all parameters, callers, and callees (I'm not
kidding). See MassiveFunctionHeaders
o There've been some that require descriptions of all local variables in a method header as
well.
In C++ every "if", "for", and "while" should be followed by a compound block {} because someone
once put a stray semicolon after an if (do I get a special award for reporting this one? It doesn't even
solve the problem). See BracesAroundBlocks for more discussion of this point.
Importing HungarianNotation into Java "because that's what was useful in C".
HungarianNotation.
o HungarianNotation as it is commonly known is one of the ass-stupidest notions ever inflicted
on programmers and should never have escaped from the dark corner where it was
whelped. HungarianNotation as it was originally designed is a pretty good idea that can help
you 'see' dirty code better by encoding a variable's semantics in its name. See?
Not type. Semantics. Read the wiki page on the issue for more information.
Worrying more about the placement of braces than about the clarity of the code. (Personally, I prefer
K&R style. I like to reserve vertical whitespace for places where I'm separating things for clarity. But, I
can read almost any style and will default to maintaining the style of the original author. Second
choice is to convert the style using a pretty-printer.)
LeastCommonDenominatorRules. I've been told not to use a given language construct (e.g.,
inheritance) because a maintenance programmer might not know what it does.
A variant of the above: "Don't use DesignPatterns because less sophisticated programmers may not
understand them"
Never DocumentYourReasoning. Use DesignPatterns, then don't add a quick comment about why you
choose that particular one among alternative patterns.
Constant for the empty string, StringUtils.EMPTY_STRING, and guess who had to change all the legacy
code to bring it up to "standard". -- BenArnold
o Ben, were they using EclipseIde and getting tired of the non-internationalized string
warnings? Living with the warnings is painful. Turning off the warnings loses a useful check.
Putting in the non-i18n comments is really painful. Replacing the empty string is, as you've
seen, time-consuming. However, why did you have so many empty strings in your code
anyway? --EricJablow
It might have been because concatenation with the empty string behaves like a null-
safe toString().
All field names begin with an underscore, in Java... because it was their C++ standard.
o And it was their C++ because they saw that vendor libraries did it, and concluded that it must
be good.
Which is interesting because the C++ standard explicitly states that you
should NOT do this, because these names are reserved for vendor use, and so may
not work if you use them in your application programs!
o On the other hand, if your editor does name completion, typing an underscore produces a
list of field names.
One of the traps of BadCodingStandards, when paired with ineffectively led code reviews, is that they give
non-productive developers a tool for obstructing productive developers.
Funny how the same deficient beliefs melt away when the developers in question are programming together
trying to pass a UnitTest.
Maybe, maybe not. If the "productive" developer merely appears productive because he can sling thousands
of lines of bad code in a day, he will end up costing you more in the long run. The "unproductive" developer
who spots the defects in the code may just be saving your bacon.
And now for something completely different! Not a 'bad standard' for coding, but standard for 'bad coding',
especially in Java. Serious advice for those who want JobSecurity from obfuscated
code: HowToWriteUnmaintainableCode, by RoedyGreen.
-- DickBotting
Hrm... Points 22 and 47 remind me of the X window system.
I'm gonna try really, really hard not to rub too many people's fur the wrong way here. Coding standards - no
matter how bad - are a Good Thing from where I sit. At least the client has thought about it a little bit. Even if
one of their pimply-faced, aged 23, fresh grads dragged something off of an Internet page and thrashed it into
a three page document that's better than nothing at all. With a coding standard you have something to point
to and say, "See? It's right there." I will code to anybody's coding standard no matter how bizarre or obtuse.
Having even a stupid one eliminates a lot of "religious wars" before they get any steam whatsoever. At my
current client I have had to point out a lot of the flaws in the existing standard, of course. The doc gets revised
whenever holes show up. Simple. -- MartySchrader
I'm sure you're not [making up the try/catch macros] because I've seen worse. As I recall, the GEM developer's
kit (circa mid-80's) had a file called portab.h that went something like this:
#define begin {
#define end }
How anyone could possibly think that making their code look vaguely like Pascal makes it more portable is
beyond me. -- AlexValdez
There was a similar pack of definitions in the source code for the Bourne shell I once looked at... made C look
just like Pascal. -- DickBotting
[As a matter of fact, you can thank Steve Bourne himself for that. He was an Algol fan and did not care for C
braces and such. We made fun of him for that breach of good sense (of redefining the language with macros,
not for preferring Algol) even back in the 1970s.]
I once programmed C at a company where I found, in an .h file, this bit of egregious macro definition:
#define TRUE 1
#define FALSE 0
... and I thought that was bad enough, until a week later, in another .h file, I came across:
#define TRUE 0
#define FALSE 1
Way back in 1969, after I had learned PL/1 and then moved from OS/360 to DOS (which had a rather less-
capable compiler), I worked in an organization where [for efficiency reasons] they insisted that rather than use
procedure calls you had to set variables, set a return label into a label variable, and then GOTO the subroutine
(which would return using a GOTO to its global return label variable). Like:
param1 = "hello"
param2 = "world"
p1ret = next
goto proc1
next:
/* continue program */
I don't think they had much idea of the importance of maintainability on that project.
They probably had scars from previous environments. And maybe the procedure calls were worse than we
would imagine. I mis-read your comment at first and it jogged some memories and I found this: "The original
IBM Algol compiler did a GETMAIN system call on every procedure call"
on http://compilers.iecc.com/comparch/article/96-07-031 ...which discusses why programmer time wasn't
considered valuable compared with efficiency.
My boss requires these banner comments. Fortunately, he doesn't get too upset when I "forget".
Try this: It takes about 2 seconds longer to scroll to the code you are looking for. If you average looking at 300
routines a day (including re-visits to the same one), then you spend 2 x 300 = 600 seconds or about 10 minutes
a day scrolling past them. That's about 0.17 hours a day. If a year averages about 260 work days, then those
banners require a total of about 44 hours a year (0.17 x 260). If your salary plus company overhead is about
$60/hr, then those banners cost your company about $2640 per year (60 x 44). Present that cost analysis to
your boss, and see what he says. Ask him if the marketing value of the banners makes up for that amount.
Please report back to this wiki his/her reaction.
I had a co-worker complain about how my code did not conform to the "standard" 80 characters per line
because their editor (Emacs) was set up to wrap text at 80 characters. I write code on dual flat 20" panel
screens both running at 1600 x 1200. I can see well over 80 characters per line easily. A lot of the code I write
is significantly more legible at greater than 80 characters. I don't have to purposely create variable names x,y,z,
x1,y1,z2, etc. just so I can do computations in a small space. I can name these variables: centerPointX,
centerPointY, rotatingPointX, rotatingPointY, etc. Is there anyone else out there who agrees? And for you
die-hard 80-column people, why can't you get a bigger screen? Seeing more code will make you more productive.
Arggg... -- AnonymousDonor
I DO have a bigger screen. That's so I can see my 80-column Emacs window PLUS another window (e.g., a
browser displaying JavaDoc pages).
What did people who had only 80-column screens do then? Did they have a 40-column editor open,
so they could use their other utilities too? We have these things called depth-arrangable windows for
a reason. --SamuelFalvo
I don't stick to a hard & fast 80 chars, but I think it's good to keep it under about 90. This has more to do
with TenWordLine than legacy terminal displays. Your eyes have to read sideways if the lines get too long,
rather than skimming downwards.
I do tend to break this a lot when the line isn't all that important, for example unit tests and required
parameters whose presence is required by the method name. I never read those anyway, so I don't need
to worry too much about readability.
Why not break the line vertically, eg. after assignment statements or operators, instead of using short variable
names? -- JonathanTang
The 80 character rule is particularly pointless. Based on monitors, eyesight, and screen space taken up by IDEs,
one may be able to type fewer or more than 80 characters on a line. Go ahead and type in code that best fits
the editor environment currently in use. I usually do not worry about a couple of characters that go past the
end of the screen and only fix the format if I find it too annoying. Don't try to predict what some unknown
person may use in the future to read and edit code. Finally, get on board with the rest of the team and use a
common editor and configuration. Don't expect everyone else to do handstands to allow you to use your
"clearly superior" code editor. --WayneMack
FWIW, I stick to a 99 character right column in all my code. Why? So I can print it out without having anything
wrap onto the next line. (I print using Courier New@9pt on A4 paper with line numbers so I can do code
reviews of my own stuff on the train!) -- BevanArps.
What is the advantage of having everyone on the team use the same editor, that outweighs the advantage of
everyone on the team using an editor they find comfortable and productive? -- GarethMcCaughan
If you've ever had to pair with a co-developer who was having some problems navigating the code,
you'd very quickly realize a different IDE or editor would be a source of great frustration. --
SamuelFalvo
o It's not just the same IDE, it's the same IDE configuration. I'm comfortable with a 9pt font,
but my coworker's eyesight is such that he needs at least 36pt. But I'm not about to start
trying to limit my lines to 64 characters - I've got more important things to worry about.
If you're writing open-source software that can and will be hacked on by anyone who wants to bother, it might
be polite to not assume everyone in the universe has a 1600x1200 screen. Besides, life can take one weird
places, and it might be nice to have code one can comfortably mess with while SSH'd into your server from a
VT220 terminal (true situation). At least, that's my excuse for sticking to more-or-less 80 columns.
-- SimonHeath
I don't seem to recall this being an issue back when 40 column screens were the norm, moving
towards 80 column screens. Remember, we have 80 column standards only because IBM mainframes
chose 80 columns per punchcard. --SamuelFalvo
Horizontal scrolling can make code very hard to read. (Unless your language is optimized for horizontally
written content.) Unless everyone agrees to the same limit, some will be left horizontally scrolling or everyone
will have to adopt identical environments.
I've found having a horizontal limit helps to guide me to avoid cramming multiple concepts into one line.
Compare
process_parts(numWidgets + numGadgets, isInitialized && isIdle, context->get_rules_for(PROCESSING));
To
numParts = numWidgets + numGadgets;
isReady = isInitialized && isIdle;
processRules = context->get_rules_for(PROCESSING);
process_parts(numParts, isReady, processRules);
Or
process_parts(
    numWidgets + numGadgets,              // Number of parts
    isInitialized && isIdle,              // Ready?
    context->get_rules_for(PROCESSING));  // Process Rules
Some places strictly restrict lines to 74 characters. No more, or your code will not be allowed.
120 seems to work well for most wide and double-screen environments -- considering multiple windows, side
panels and all.
Limiting width to 80 seems to create problems with reasonably named variables and methods, even if nested
conditions and loops are kept to a reasonable level (like one or two). Maybe it would work with very
highly object-oriented code, but it doesn't seem to work well with most Java I've seen -- at least not
without MAJOR rework. -- JeffGrigg
PaulGraham and some other writers like to maintain a total text width of a few inches for scannability. I
suspect the same applies to code. The rest of the screen space can be used for utilities or for a vertical split
and another batch of code. --JesseMillikan
I would be interested in seeing how he structures his Lisp code, because it is not easy to maintain only a few
inches when you consider how deeply Lisp code nests. Most Lisp coding conventions I've seen prefer
a diagonal layout that can trivially exceed 80 columns, leading to such abhorrences as:
( ....lots of stuff and nesting here; vertical line is right-hand edge of screen ....
(cond |
((condition-1) |
(result-1)) |
((condition-2) |
(result-2)) |
(t |
result-3))) |
)))))...)) |
I look at code like this and cannot help but think that a combination of better factoring and a MUCH wider
screen would be absolutely critical to good, legible Lisp coding. --SamuelFalvo
Actually, no, I don't see any side effects here, because it is not clear whether b is a locally defined and used
variable or not. And, a certainly isn't being modified out from the conditional testing. So, no, there is insufficient
data to say whether there are any side effects happening here. --SamuelFalvo
The lifetime and usage of b doesn't matter. The ++ operator usually has the side effect of changing b's value
(same with --). Even if these operators don't have side effects, there is still the printf statements. --
MartinShobe
I agree with MartinShobe. The 'printf' is cause for a side-effect all on its lonesome (or a 'computation-effect' if
you don't like calling intentional communications 'side-effects', but 'SideEffect' is the term that has come into
wide use). And the statements 'b++' and 'b--' also have a consequence that qualifies for the term 'SideEffect'
(having an effect that extends beyond the statement), with a possible exception being if you overloaded those
operators for 'b' to mean something functional. This is true even if you immediately discard 'b' upon exiting the
procedure in which it is declared, though in that case you might have been able to say (if you also rid yourself
of the 'printf') that the procedure (as a whole) had no side-effects.
In general, if you program with statements rather than expressions, there must be SideEffects. After all, a
statement without any SideEffects is a statement that can be optimized by excising it entirely from your
program.
While what you're saying is correct, the original post did not specify any particular context. Every response to
mine so far has established a more specific context. Do ++ and -- perform a side-effect? Sure. At the
statement level. But across a whole procedure, the answer may not be so clear.
Nonetheless it is quite clear that the above is programmed with side-effects, and it is extremely
reasonable to justify a claim that there are side-effects (by pointing at them). Consider that whole
programs can have the emergent property of being side-effect free but still be written in a side-effect-
abundant manner. Since we're talking about 'coding-standards' we're not talking about the emergent
property... just the nature of the code.
Statements with no side-effects are still useful. if-statements are side-effect free, unless you use sloppily
written macros. All a side-effect is, is something that is not implicit in the inputs and outputs. There is no law
that says a side-effect-free computation must do nothing at all.
If-statements are of the form if(condition) then(statements-if-true) else(statements-if-false), and
(when executed) are very rarely side-effect free - that would require both the if-true and if-false sub-
statements be side-effect free. And, like any other statement with no side-effect, you could optimize
such a statement by being entirely rid of it. If-expressions, OTOH, can be useful even when side-effect
free (if(condition) then return(expression-if-true) else return(expression-if-false)).
And I think that the definition of 'SideEffect' (for a function or expression, any non-local change
beyond returning a value or set of explicitly defined values) pretty much means that a completely
'SideEffect-free' function can, at most, provide exactly the return value and do nothing else. It's worth
noting that no function can be completely and truly and absolutely 'SideEffect-free' in the ideal sense
because that would require: no change in time, no change in energy expended, no change in memory
consumed, no change in anything - just an instantaneous and zero-cost return of the appropriate
value that, indeed, "does nothing at all". Of course, to make the term practical, side-effects of
computation itself (energy consumed, time, memory, heat production, etc.) are not generally
considered 'SideEffects' unless they somehow become critical to the systems with which you're
working (e.g. HardRealTime).
As I understand it, only difference between SideEffect-free and ReferentialTransparency is that with
side-effect-free you can actually look at the world and mutable state and time and otherwise accept
inputs that weren't in your explicit input set (and with referential transparency, you cannot). A side-
effect-free function can return a different value on every call, even given the same inputs, but it still
can't 'do' or 'change' anything.
That being said, I'm still utterly bamboozled why this is here at all; why should side-effects of the form b++/b--
have any bearing on good/bad programming standards, as illustrated above? If the author had written:
void someProcedure(int a) {
    if (a == 5) {
        printf("a is equal to 5\n");
        b++;
    } else {
        printf("a is not equal to 5\n");
        b--;
    }
}
this is obviously a bad thing, since the post-increment/decrement operators are touching stuff that is not
explicitly passed into the procedure as a variable. Rewriting this in a side-effect-free form:
int someFunction(int a) {
    int aEquals5 = a == 5;
    char *perhaps = aEquals5 ? " " : " not ";
    printf("a is%sequal to 5\n", perhaps);
    return b + (aEquals5 ? 1 : -1);
}
if (foo) {
    // ...
} // end if

void foo() {
    // ...
} // end foo
We have somebody at work who does this excessively, even for single-line blocks. Here are my issues with it:
I can understand the reason for using this kind of comment for very long blocks where many braces close
at the same time, but honestly most IDEs can highlight matching braces/parens for you, which renders this
whole thing obsolete.
And if you find the need to do this, I'd suggest you look to refactor your code to make it smaller/less nested.
In order to get a sense of the importance of coding style, this assignment asks the learner to
review an established (i.e. documented) style guide. Learners will identify three style rules that
they agree with and three they disagree with, and provide an explanation of why for each.
Select three items of the style guide that you agree with and, for each, explain why. If there are
not enough items you agree with, give your best estimation as to the reason behind the selection
(in your own words) and the benefits it provides. Note: "There is no reason" and "There are no
benefits" are not acceptable answers here.
Select three items of the style guide that you do not agree with and, for each, explain why. If
there are not enough items you disagree with, give your best estimation as to why someone
might disagree and what possible downside there is to using it.
Debugging
The art of removing defects
2 major forms – print statements and debugging tools
Difficulty in visualizing state of the program
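One common way to cope with the difficulty of visualizing program state is temporary print statements. The sketch below is purely illustrative (the class and method names are invented for this example); it shows trace output exposing the intermediate values a debugger would otherwise show through breakpoints and stepping.

// Illustrative only: print-statement debugging of a small helper.
public class PrintDebugging {
    static int sumOfEvens(int[] values) {
        int sum = 0;
        for (int v : values) {
            // Temporary trace output: shows each value and the running total.
            System.err.println("DEBUG: v=" + v + ", sum so far=" + sum);
            if (v % 2 == 0) {
                sum += v;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumOfEvens(new int[] {1, 2, 3, 4})); // expect 6
    }
}

A debugging tool provides the same view of state without editing the code, which is why the two forms complement each other.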
Error Messages
Error messages are notoriously cryptic in many languages
Even simple mistakes can result in massive walls of error text
Customizing error messages can also help
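As a hedged illustration of customizing error messages, the sketch below (all names are hypothetical, not from any particular codebase) wraps a low-level IOException in an exception whose message states what was being attempted and what the reader can do about it, while keeping the original cause attached.

import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical example: replacing a cryptic low-level failure with a clearer message.
public class ConfigLoader {
    static Properties load(Path path) {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(path)) {
            props.load(in);
        } catch (IOException e) {
            // Re-throw with context the caller can act on, keeping the original cause.
            throw new IllegalStateException(
                "Could not read configuration file '" + path + "'. "
                + "Check that the file exists and is readable.", e);
        }
        return props;
    }
}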
Static Analysis
Types of analysis
- Unused variables
- Empty catch blocks
- Unnecessary object creation
- Duplicated code
- Incorrect Boolean operator
- Some typos
- Use of null pointers
- Division by zero
- Overflows
- Out of bounds checks
- Information leak
- Use of unsanitized input
- Many more…
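To make the list above concrete, here is a contrived Java class (written for this summary, not taken from any tool's documentation) containing several of the defect kinds a static analyzer would typically flag without ever running the program.

// Illustrative only: several defects that static analysis can report.
public class SuspiciousCode {
    public int average(int[] values) {
        int unused = 42;                       // unused variable
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum / values.length;            // division by zero when the array is empty
    }

    public String describe(String name) {
        if (name == null) {
            System.out.println("no name given");
        }
        return name.trim();                    // possible null pointer dereference
    }

    public void readSettings() {
        try {
            java.nio.file.Files.readAllLines(java.nio.file.Paths.get("settings.txt"));
        } catch (java.io.IOException e) {
            // empty catch block: the failure is silently swallowed
        }
    }
}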
FindBugs looks for bugs in Java programs. It is based on the concept of bug patterns. A bug pattern is a code
idiom that is often an error. Bug patterns arise for a variety of reasons.
FindBugs uses static analysis to inspect Java bytecode for occurrences of bug patterns. Static analysis means
that FindBugs can find bugs by simply inspecting a program's code: executing the program is not necessary.
This makes FindBugs very easy to use: in general, you should be able to use it to look for bugs in your code
within a few minutes of downloading it. FindBugs works by analyzing Java bytecode (compiled class files), so
you don't even need the program's source code to use it. Because its analysis is sometimes imprecise,
FindBugs can report false warnings, which are warnings that do not indicate real errors. In practice, the rate of
false warnings reported by FindBugs is less than 50%.
FindBugs supports a plugin architecture allowing anyone to add new bug detectors. The publications
page contains links to articles describing how to write a new detector for FindBugs. If you are familiar with
Java bytecode you can write a new FindBugs detector in as little as a few minutes.
FindBugs is free software, available under the terms of the GNU Lesser General Public License. It is written in Java, and
can be run with any virtual machine compatible with Sun's JDK 1.5. It can analyze programs written for any
version of Java. FindBugs was originally developed by Bill Pugh and David Hovemeyer. It is maintained by Bill
Pugh and a team of volunteers.
FindBugs uses BCEL to analyze Java bytecode. As of version 1.1, FindBugs also supports bug detectors written
using the ASM bytecode framework. FindBugs uses dom4j for XML manipulation.
Coverity Scan is a service by which Synopsys provides the results of analysis on open source coding projects to
open source code developers that have registered their products with Coverity Scan.
Synopsys, the development testing leader, is the trusted standard for companies that need to protect their
brands and bottom lines from software failures. Coverity Scan is powered by Coverity® Quality Advisor.
Coverity Quality Advisor surfaces defects identified by the Coverity Static Analysis Verification Engine (Coverity
SAVE®) for fast and easy remediation.
Synopsys offers the results of the analysis completed by Coverity Quality Advisor on registered projects at no
charge to registered open source developers.
Static analysis is a set of processes for finding source code defects and vulnerabilities.
In static analysis, the code under examination is not executed. As a result, test cases and specially designed
input datasets are not required. Examination for defects and vulnerabilities is not limited to the lines of code
that are run during some number of executions of the code, but can include all lines of code in the codebase.
Additionally, Synopsys's implementation of static analysis can follow all the possible paths of execution
through source code (including interprocedurally) and find defects and vulnerabilities caused by the
conjunction of statements that are not errors independent of each other.
Some examples of defects and vulnerabilities found by Coverity Quality Advisor include:
resource leaks
dereferences of NULL pointers
incorrect usage of APIs
use of uninitialized data
memory corruptions
buffer overruns
control flow issues
error handling issues
incorrect expressions
concurrency issues
insecure data handling
unsafe use of signed values
use of resources that have been freed
The consequences of each type of defect or vulnerability are dependent on the specific instance. For example,
unsafe use of signed values may cause crashes, lead to unexpected behavior, or lead to an exploitable security
vulnerability.
Your project may already be registered with Coverity Scan; please check at scan.coverity.com. If your project is
not already listed, sign up, click on Add project, and register your new project. Finally, fill out the resulting
form and click Submit.
Register your project with Coverity Scan by completing the project registration form found
at scan.coverity.com. Upon your completion of project registration (including acceptance of the Scan User
Agreement) and your receipt of confirmation of registration of your project, you will be able to download the
Software required to submit a build of your code for analysis by Coverity Scan. You may then download the
Software, complete a build and submit your Registered Project build for analysis and review in Coverity Scan.
Coverity Scan is only available for use with open source projects that are registered with Coverity Scan.
To be considered an open source project, your project must meet all of the following criteria (adopted from
the Open Source Initiative definition of open source):
1. Your project must be freely redistributable. Your project license terms must not restrict any party
from selling or giving away the project as a component of an aggregate software distribution
containing programs from several different sources. The license terms must not require a royalty or
other fee for such sale.
2. Your project license terms must include a license to the project source code. Your project must
include source code, and must allow distribution in source code as well as compiled form. Where
some form of a product is not distributed with source code, there must be a well-publicized means of
obtaining the source code for no more than a reasonable reproduction cost, preferably downloading
via the Internet without charge. The source code must be the preferred form in which a programmer
would modify the program. Deliberately obfuscated source code is not allowed. Intermediate forms
such as the output of a preprocessor or translator are not allowed.
3. Users must be able to make and distribute modified and derivative versions of your project. Your
project license terms must allow for user modifications and derived works, and must allow
modifications and derived works to be distributed under the same terms as the license of the original
software.
4. Integrity of your project source code may be restricted. Your project license terms may restrict
source-code from being distributed in modified form only if the license allows the distribution of
"patch files" with the source code for the purpose of modifying the program at build time. Your
license terms must explicitly permit distribution of your project built from modified source code. Your
license terms may require derived works to carry a different name or version number from the
original project.
5. No Discrimination Against Persons or Groups. Your project license terms must not discriminate
against any person or group of persons.
6. No Discrimination Against Fields of Endeavor. Your project license terms must not restrict anyone
from making use of your project in a specific field of endeavor. For example, it may not restrict your
project from being used in a business, or from being used for genetic research.
7. Project license rights must apply to distribution of your project. The rights attached to your project
must apply to all to whom your project is redistributed without the need for execution of an
additional project license by those parties.
8. Your project license terms must not limit use of your project to a specific product or distribution.
The rights attached to the project must not depend on your project being part of a particular software
distribution. If your project is extracted from that distribution and used or distributed within the
terms of your license terms, all parties to whom your project is redistributed should have the same
rights as those that are granted in conjunction your original project distribution.
9. Your project license terms must not restrict other software. Your project license terms must not
place restrictions on other software that is distributed along with your project.
10. Your project license terms must be technology-neutral. No provision of your project license terms
may be predicated on any individual technology or style of interface.
Projects initiated and maintained by for-profit corporations, or projects with license terms outside the
foregoing guidelines, may be approved at Synopsys's discretion.
If your project is already a Registered Project in Scan, but you are not yet a registered user of Scan, you
can register with Scan, and upon registration, you can click on Add Project, find your Registered Project in the
project table, and request access to the Registered Project of your choice. You will be granted access subject to
approval by the Registered Project owner or Scan administrator.
Up to 28 builds per week, with a maximum of 4 builds per day, for projects with fewer than 100K lines
of code
Up to 21 builds per week, with a maximum of 3 builds per day, for projects with 100K to 500K lines of
code
Up to 14 builds per week, with a maximum of 2 builds per day, for projects with 500K to 1 million lines
of code
Up to 7 builds per week, with a maximum of 1 build per day, for projects with more than 1 million
lines of code
Once a project reaches the maximum builds per week, additional build requests will be rejected. You will be
able to re-submit the build request the following week.
Please contact [email protected] if you have any special requirements.
What are the terms applicable to my use of Scan?
Your use of Scan, the analysis provided in connection with the Scan service, and any software provided by
Synopsys for your use of Scan are subject to the terms and conditions of the Coverity Scan User Agreement,
which may be found here. Your use of Coverity Scan constitutes your acceptance of the terms and conditions
of the Scan User Agreement, the Scan Terms of Use, and any additional terms and conditions stated in your
email (delivered to you upon successful registration of your project).
Generally, access to the detailed analysis results for most Registered Projects is granted only to members of
the Registered Project approved by the Registered Project administrator, to ensure that potential security
defects in the Registered Project may be resolved before the general public sees them.
Coverity Scan uses the Responsible Disclosure approach. Scan provides the analysis results to the project
developers only, and does not reveal details to the public until an issue has been resolved. For a thorough
discussion of Responsible Disclosure, you can refer to comments by Bruce Schneier, or Matt Blaze, or
the Wikipedia article on Full Disclosure.
Since projects that do not resolve their outstanding defects are leaving their users exposed to the
consequences of those flaws, Synopsys will work to encourage a project to resolve all of their defects.
Synopsys may set a deadline for the publication of all the analysis results for a project.
In the discussion of Full Disclosure and Responsible Disclosure, focus has always been on the topic of handling
individual coding issues where the impact is somewhat well understood. In the case of automated code testing
tools, the best practices have not been discussed. Testing tools may find large numbers of issues, and those
counts include a range of different levels of impact. Since the results require triage by a developer, they can
sometimes languish - including those defects whose security implications are exposing end-users' systems. In
order to push for those issues to be resolved, in the same spirit as the individual issue disclosure policies,
Synopsys may set planned publication dates for the full analysis results of a project. Projects may negotiate
with us about the date, if they are making progress on resolving the outstanding issues.
Registered Projects, which are also part of Eclipse Foundation, can participate in the Coverity Scan service by
using the Coverity Scan plugin on the Hudson server. This plugin sends build and source code management
information to Coverity Scan server. Coverity Scan server builds and analyzes the code in the cloud for
Registered Projects which are part of Eclipse Foundation, and makes results available online.
Manual Steps:
Background process:
Coverity Scan plugin will send the information about your build environment to Coverity Scan server,
when you build your project in Hudson Server.
Coverity Scan server builds, analyzes and commits the results into Scan database, and results will be
available online
Summary of the defects found during the analysis is available on Hudson server under "Build History"
Login to Coverity Scan to view or triage the defects.
Any contributor of the project can register with Coverity Scan to get access to the analysis result.
Coverity Scan began in collaboration with Stanford University with the launch of Scan occurring on March 6,
2006. During the first year of operation, over 6,000 software defects were fixed across 50 C and C++ projects
by open source developers using the analysis results from the Coverity Scan service.
The project was initially launched under a contract with the Department of Homeland Security to harden open
source software which provides critical infrastructure for the Internet. The National Cyberspace Strategy
document details their priorities to:
· Develop Systems with Fewer Vulnerabilities and Assess Emerging Technologies for Vulnerabilities
DHS had no day-to-day involvement in the Scan project, and the three year contract was completed in 2009.
The result has been overwhelming. With over 6,000 defects fixed in the first year - averaging over 16 fixes
every day of the year, recognition of benefits from the Scan service has been growing steadily. Requests for
access to the results and inclusion of additional projects have shown that the open source community
recognizes the benefits of Scan.
In response, Synopsys is continuing to fund the Scan beyond the requirements of the DHS contract, which
expired in 2009. New registered users will continue to be able to register projects and be given access to the
Scan analysis results on an ongoing basis (time and resources permitting).
Commenting
Documenting in code
- Documenting is key to understanding, now and later
- Self-documenting code stays up to date as the code changes
- Comments provide context that code cannot
- Both forms are likely necessary
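A minimal sketch (names and values invented for illustration) of how the two forms work together: descriptive names make the code largely self-documenting, while a comment records reasoning that the code itself cannot express.

public class RetryPolicy {
    // Why: the upstream service throttles aggressive clients, so we back off
    // exponentially rather than retrying immediately. This rationale is context
    // the code alone cannot carry.
    private static final int MAX_RETRIES = 3;
    private static final long BASE_BACKOFF_MILLIS = 500;

    // What: the method and constant names document the behavior by themselves,
    // and they stay accurate as long as the code is kept correct.
    public long delayBeforeAttempt(int attempt) {
        return BASE_BACKOFF_MILLIS * (1L << Math.min(attempt, MAX_RETRIES));
    }
}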
Commits
Store the newest version of the code-base
Allows inclusion of descriptive notes (‘commit message’)
Can be used for traceability
Branching
Version Control
- Effectively a must in modern software development
- Store backups with contextual history
- Connect changes to bug tracking
- Allows multiple developers to work on the same project at once
https://guides.github.com/introduction/git-handbook/
Build Process
Simple approach using the IDE or command line to compile and execute
Simple, effective, no prior preparation needed
Repeatability, consistency, errors are all concerns at the command line
Understanding and modifiability are issues in an IDE
Automated Build
Develop files which control the build process
Invoke tool commands to accomplish common tasks
Some primary options – Gradle, Maven, Ant
Make Example
Ant Example
Build Process
Modern build tools provide reproducible tasks on demand
Make was the beginning and is still used widely, especially for C/C++
Ant provided the first ‘modern’ build tool
Maven and Gradle are the prevailing build tools for Java
Intro to Make
https://www.gnu.org/software/make/
GNU Make
GNU Make is a tool which controls the generation of executables and other non-source files of a program from
the program's source files.
Make gets its knowledge of how to build your program from a file called the makefile, which lists each of the
non-source files and how to compute it from other files. When you write a program, you should write a
makefile for it, so that it is possible to use Make to build and install the program.
Capabilities of Make
Make enables the end user to build and install your package without knowing the details of how that
is done -- because these details are recorded in the makefile that you supply.
Make figures out automatically which files it needs to update, based on which source files have
changed. It also automatically determines the proper order for updating files, in case one non-source
file depends on another non-source file.
As a result, if you change a few source files and then run Make, it does not need to recompile all of
your program. It updates only those non-source files that depend directly or indirectly on the source
files that you changed.
Make is not limited to any particular language. For each non-source file in the program, the makefile
specifies the shell commands to compute it. These shell commands can run a compiler to produce an
object file, the linker to produce an executable, ar to update a library, or TeX or Makeinfo to format
documentation.
Make is not limited to building a package. You can also use Make to control installing or deinstalling a
package, generate tags tables for it, or anything else you want to do often enough to make it worthwhile
writing down how to do it.
A rule in the makefile tells Make how to execute a series of commands in order to build a target file from
source files. It also specifies a list of dependencies of the target file. This list should include all files (whether
source files or other targets) which are used as inputs to the commands in the rule.
In general, a rule looks like this:

target:   dependencies ...
        commands
        ...
When you run Make, you can specify particular targets to update; otherwise, Make updates the first target
listed in the makefile. Of course, any other target files needed as input for generating these targets must be
updated first.
Make uses the makefile to figure out which target files ought to be brought up to date, and then determines
which of them actually need to be updated. If a target file is newer than all of its dependencies, then it is
already up to date, and it does not need to be regenerated. The other target files do need to be updated, but
in the right order: each target file must be regenerated before it is used in regenerating other targets.
GNU Make has many powerful features for use in makefiles, beyond what other Make versions have. It can
also regenerate, use, and then delete intermediate files which need not be saved.
GNU Make also has a few simple features that are very convenient. For example, the -o file option says
"pretend that source file file has not changed, even though it has changed." This is extremely useful when you
add a new macro to a header file. Most versions of Make will assume they must therefore recompile all the
source files that use the header file; but GNU Make gives you a way to avoid the recompilation, in the case
where you know your change to the header file does not require it.
However, the most important difference between GNU Make and most versions of Make is that GNU Make is
free software.
We have developed conventions for how to write Makefiles, which all GNU packages ought to follow. It is a
good idea to follow these conventions in your program even if you don't intend it to be GNU software, so that
users will be able to build your package just like many other packages, and will not need to learn anything
special before doing so.
These conventions are found in the chapter ``Makefile conventions'' (147 k characters) of the GNU Coding
Standards (147 k characters).
Downloading Make
Make can be found on the main GNU ftp server: http://ftp.gnu.org/gnu/make/ (via HTTP)
and ftp://ftp.gnu.org/gnu/make/ (via FTP). It can also be found on the GNU mirrors; please use a mirror if
possible.
Documentation
Documentation for Make is available online, as is documentation for most GNU software. You may also find
more information about Make by running info make or man make, or by looking
at /usr/share/doc/make/, /usr/local/doc/make/, or similar directories on your system. A brief summary is
available by running make --help.
Mailing lists
bug-make is used to discuss most aspects of Make, including development and enhancement
requests, as well as bug reports.
Announcements about Make and most other GNU software are made on info-gnu (archive).
Security reports that should not be made immediately public can be sent directly to the maintainer. If there is
no response to an urgent issue, you can escalate to the general security mailing list for advice.
Getting involved
Development of Make, and GNU in general, is a volunteer effort, and you can contribute. For information,
please read How to help GNU. If you'd like to get involved, it's a good idea to join the discussion mailing list
(see above).
Test releases
Trying the latest test release (when available) is always appreciated. Test releases of Make can
be found at http://alpha.gnu.org/gnu/make/ (via HTTP) and ftp://alpha.gnu.org/gnu/make/ (via
FTP).
Development
For development sources, issue trackers, and other information, please see the Make project
page at savannah.gnu.org.
Translating Make
To translate Make's messages into other languages, please see the Translation Project page for
Make. If you have a new translation of the message strings, or updates to the existing strings,
please have the changes made in this repository. Only translations from this site will be
incorporated into Make. For more information, see the Translation Project.
Maintainer
Make is currently being maintained by Paul Smith. Please use the mailing lists for contact.
https://www.softwaretestinghelp.com/apache-ant-selenium-tutorial-23/
A quick look at Ant and Maven, with a much longer view into Gradle.
https://www.journaldev.com/7971/gradle
Gradle is a project build and automation tool for Java-based applications, similar to Ivy, Ant, and Maven.
Build tools help us to reduce our development and test time and hence increase productivity.
Gradle
Grunt and Gulp are two popular build and automation tools for JavaScript projects like NodeJS or jQuery. Gant
is a build tool mainly used for Groovy-based applications. SBT stands for Scala Build Tool. SBT is mainly used for
Scala-based applications.
Nowadays, most Java projects use either the Maven or Gradle build tool because of their benefits compared to Ant.
Before discussing Gradle, we will first go through some shortcomings of Ant and Maven. Then we will
discuss why we need the Gradle build tool in our projects.
Drawbacks of Ant
The following are the main drawbacks of using the Ant build tool in a project.
1. We need to write Ant build scripts in XML. If we want to automate a complex project, then we need to
write a lot of logic in XML files.
2. When we execute a complex and large project's Ant build, it produces a lot of verbose output at the console.
3. There is no built-in Ant project structure. Since we can use any structure for our projects, new developers
find it hard to understand the project structure and build scripts.
4. It is very tough to write complex logic using if-then-else statements.
5. We need to maintain all JARs with the required versions in version control. There is no support for dependency
management.
Because of these drawbacks, many projects have been restructured to use the Maven build tool. Even though
Maven provides the following benefits compared to Ant, it still has some drawbacks.
Advantages of Maven
Drawbacks of Maven
1. Maven follows a pre-defined build and automation lifecycle. Sometimes it may not fit our project needs.
2. Even though we can write custom Maven lifecycle methods, it's a bit tough and verbose.
3. Maven build scripts are a bit tough to maintain for very complex projects.
Maven has solved most of the Ant build tool's issues, but it still has some drawbacks. To overcome these issues,
we can use the Gradle build tool. Now we can start our discussion of Gradle.
What is Gradle?
Gradle is an open-source build and automation tool for Java-based projects. Using Gradle, we can reduce
project development time and increase productivity.
Gradle is a multi-language, multi-platform, multi-project and multi-channel build and automation tool.
Gradle advantages
Gradle provides the following advantages compared to Ant and Maven, which is why many new projects are
using Gradle as their build tool.
1. Like Maven, Gradle is an expressive, declarative, and maintainable build tool.
2. Just like Maven, Gradle supports dependency management.
3. Gradle provides very scalable and high-performance builds.
4. Gradle provides a standard project layout and lifecycle, but with full flexibility. We have the option to fully
configure the defaults. This is where it's better than Maven.
5. It's very easy to use the Gradle tool and implement custom logic in our projects.
6. Gradle supports project structures that consist of more than one project to build a deliverable.
7. It is very easy to integrate an existing Ant/Maven project with Gradle.
8. It is very easy to migrate from an existing Ant/Maven project to Gradle.
Gradle combines the strong points of the popular build tools into a single build tool.
In simple terms, Gradle is built by taking all the good things from Ant, Maven, Ivy and Gant.
That means Gradle combines the best features of all the popular build tools.
NOTE: Gant is a Groovy-based build system that internally uses Ant tasks. In Gant, we write build
scripts in Groovy, with no XML. Refer to its official website for more information.
“Make the impossible possible, make the possible easy, and make the easy elegant”.
By reading Gradle's motto, we can understand the main goal and advantages of the Gradle build tool.
As of now, Gradle works as a build and automation tool for many kinds of projects.
Now we are clear that Gradle is the best build and automation tool to use in our projects. We are going to use
Gradle in my Spring Boot Tutorial. As of today, Gradle's latest stable version is 4.1, released on 07-Aug-2017.
We are going to use this version to run our examples.
Gradle Install
1. Download the latest Gradle distribution from https://gradle.org/ using the download button.
2. Extract the archive and set the environment variables, for example:
set GRADLE_HOME=D:\gradle-2.4
set PATH=D:\gradle-2.4\bin;%PATH%
3. Open a command prompt and check whether the setup was done successfully.
We can use either of the following two commands to check the Gradle version.
D:\>gradle -v
Or
D:\>gradle --version
If Gradle reports its version at the command prompt, we can say that the Gradle installation is successful.
https://technologyconversations.com/2014/06/18/build-tools/
Test Selection
Overall Goal – Select the minimum number of tests which identify defects better than random testing
Definitions
Test case
Test data
Test suite
SUT
Actual output
Expected output
Oracle
Code Coverage
Manual testing
Automated testing
Test Selection
Test Adequacy
Numerous Approaches
Manual testing
Exception testing
Boundary testing
Randomized testing
Coverage testing
Requirements testing
Specification testing
Safety testing
Security testing
Performance testing
Randomized testing
Using randomized input values to identify defects in isolated code
Works quite well in finding defects
Automation of test generation allows many random tests to be run
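A hedged sketch of the idea (the test target and names are stand-ins, not a specific framework): generate many random inputs, run the code under test on each, and check a property that must always hold. A fixed seed keeps failures reproducible.

import java.util.Arrays;
import java.util.Random;

// Illustrative randomized test: the property checked is "the output is sorted".
public class RandomizedSortTest {
    public static void main(String[] args) {
        Random random = new Random(42);                 // fixed seed for reproducibility
        for (int trial = 0; trial < 1000; trial++) {
            int[] input = random.ints(random.nextInt(20), -100, 100).toArray();
            int[] sorted = input.clone();
            Arrays.sort(sorted);                        // stand-in for the code under test
            for (int i = 1; i < sorted.length; i++) {
                if (sorted[i - 1] > sorted[i]) {
                    throw new AssertionError("Not sorted for input " + Arrays.toString(input));
                }
            }
        }
        System.out.println("1000 randomized trials passed");
    }
}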
Code Coverage
Using coverage metrics to identify code not covered
Helps discover use cases and execution paths missed in test planning
Generate test cases to meet coverage criteria
Requirements Testing
End to end tests which highlight common and important user actions
Written without knowledge of code structure
Helps determine when you are ‘done’
Test Selection
Finding the smallest number of tests that find the most defects
Variety of methods, many of them using automated means
Can be used together, e.g. randomized coverage testing
Significant research in the area is ongoing
https://www.guru99.com/code-coverage.html
Statement Coverage
Decision Coverage
Branch Coverage
Toggle Coverage
FSM Coverage
Statement Coverage
Statement coverage is a white box test design technique which involves execution of all the executable
statements in the source code at least once. It is used to calculate and measure the number of statements in
the source code which can be executed given the requirements.
Statement coverage is used to derive scenarios based upon the structure of the code under test.
In White Box Testing, the tester concentrates on how the software works. In other words, the tester
concentrates on the internal working of the source code, using control flow graphs or flow charts.
Generally in any software, if we look at the source code, there will be a wide variety of elements like
operators, functions, loops, exception handlers, etc. Based on the input to the program, some of the code
statements may not be executed. The goal of statement coverage is to cover all the possible paths, lines, and
statements in the code.
Scenario to calculate Statement Coverage for given source code. Here we are taking two different scenarios to
check the percentage of statement coverage for each scenario.
Source Code:
Scenario 1:
If A = 3, B = 9
The statements marked in yellow color are those which are executed as per the scenario
Scenario 2:
If A = -3, B = -9
The statements marked in yellow color are those which are executed as per the scenario.
But overall, all the statements are covered when both scenarios are considered. So we can conclude
that overall statement coverage is 100%.
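Since the original listing and its highlighted screenshots are not reproduced here, the following Java sketch (an approximation in the same spirit, not the exact example) shows how the two scenarios together reach 100% statement coverage even though each one alone misses some statements.

public class StatementCoverageDemo {
    static void printSum(int a, int b) {
        int result = a + b;
        if (result > 0) {
            System.out.println("Positive: " + result);  // executed only in scenario 1 (A=3, B=9)
        } else {
            System.out.println("Negative: " + result);  // executed only in scenario 2 (A=-3, B=-9)
        }
    }

    public static void main(String[] args) {
        printSum(3, 9);    // scenario 1: misses the else branch
        printSum(-3, -9);  // scenario 2: misses the if branch
    }
}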
Statement coverage helps to find issues such as:
1. Unused Statements
2. Dead Code
3. Unused Branches
4. Missing Statements
Decision Coverage
Decision coverage reports the true or false outcomes of each Boolean expression. In this coverage, expressions
can sometimes get complicated, so it is very hard to achieve 100% coverage. That's why there are many
different methods of reporting this metric. All these methods focus on covering the most important
combinations. Compared to statement coverage, decision coverage offers better sensitivity to control flow.
void demo(int a) {
    if (a > 5)
        a = a * 3;
    System.out.println(a);
}
Scenario 1:
Value of a is 2
The code highlighted in yellow will be executed. Here the "No" outcome of the decision If (a>5) is checked.
Scenario 2:
Value of a is 6
The code highlighted in yellow will be executed. Here the "Yes" outcome of the decision If (a>5) is checked.
Scenario   Value of a   Output   Decision Coverage
1          2            2        50%
2          6            18       50%
Branch Coverage
In the branch coverage, every outcome from a code module is tested. For example, if the outcomes are binary,
you need to test both True and False outcomes.
It helps you to ensure that every possible branch from each decision condition is executed at least a single
time.
By using the branch coverage method, you can also measure the fraction of independent code segments. It also
helps you to find out which sections of code don't have any branches.
To learn branch coverage, let's consider the same example used earlier
void demo(int a) {
    if (a > 5)
        a = a * 3;
    System.out.println(a);
}
Scenario   Value of a   Output   Decision Coverage   Branch Coverage
1          2            2        50%                 33%
2          6            18       50%                 67%
Condition Coverage
Conditional coverage, or expression coverage, reveals how the variables or subexpressions in a conditional
statement are evaluated. In this coverage, only expressions with logical operands are considered.
For example, if an expression has Boolean operations like AND, OR, or XOR, these determine the total number
of possible combinations. Conditional coverage offers better sensitivity to the control flow than decision
coverage. Condition coverage does not give a guarantee about full decision coverage.
Example: for an expression with two Boolean operands, such as (X && Y), the possible combinations of operand
values are:
TT
FF
TF
FT
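A small Java sketch of condition coverage (invented for this summary, using the non-short-circuit & operator so both operands are always evaluated): the two test cases below make each operand take both true and false values, yet the decision as a whole is false both times, which is why condition coverage does not guarantee decision coverage.

public class ConditionCoverageDemo {
    // Non-short-circuit & is used so that both operands are evaluated on every call.
    static boolean bothPositive(int x, int y) {
        return x > 0 & y > 0;
    }

    public static void main(String[] args) {
        System.out.println(bothPositive(5, -1));  // x>0 is T, y>0 is F  -> combination TF
        System.out.println(bothPositive(-5, 1));  // x>0 is F, y>0 is T  -> combination FT
    }
}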
Finite state machine coverage is certainly the most complex type of code coverage method. This is because it
works on the behavior of the design. In this coverage method, you need to look at how many times specific
states are visited and transitioned through. It also checks how many sequences are included in a finite state machine.
This is certainly the most difficult question to answer. In order to select a coverage method, the tester needs to
weigh the cost of the potential penalty, lost reputation, lost sales, and so on.
The higher the probability that defects will cause costly production failures, the more severe the level of
coverage you need to choose.
Code coverage tells you how well the source code has been exercised by your test bench. Functional coverage
measures how well the functionality of the design has been covered by your test bench.
Clover – Clover also reduces testing time by only running the tests that cover the application code which was
modified since the previous build.
Emma – EMMA supports class, method, line, and basic block coverage, aggregated at the source file, class, and
method levels.
CoView and CoAnt – Coding Software is a code coverage tool for metrics, mock object creation, code testability,
path & branch coverage, etc.
Even when a specific feature is not implemented in the design, code coverage can still report 100%
coverage.
It is not possible to determine whether we tested all possible values of a feature with the help of code
coverage.
Code coverage also does not tell how much and how well you have covered your logic.
In the case where a specified function hasn't been implemented, or is not included in the specification,
structure-based techniques cannot find that issue.
Summary
Code coverage is a measure which describes the degree to which the source code of the program has
been tested
It helps you to measure the efficiency of test implementation
Five Code Coverage methods are 1.) Statement Coverage 2.) Condition Coverage 3) Branch Coverage
4) Toggle Coverage 5) FSM Coverage
Statement coverage involves execution of all the executable statements in the source code at least
once
Decision coverage reports the true or false outcomes of each Boolean expression
In the branch coverage, every outcome from a code module is tested
Conditional coverage reveals how the variables or subexpressions in the conditional statement are evaluated
Finite state machine coverage is certainly the most complex type of code coverage method
In order to select a coverage method, the tester needs to check the cost of the potential penalty, lost
reputation, lost sale, etc.
Code coverage tells you how well the source code has been exercised by your test bench while
Functional coverage measures how well the functionality of the design has been covered
Cobertura, JTest, Clover, Emma, and Kalistick are a few important code coverage tools
Code Coverage allows you to create extra test cases to increase coverage
Code Coverage does not help you to determine whether we tested all possible values of a feature
An example of how to calculate the MC/DC code coverage for a test suite.
https://www.verifysoft.com/en_example_mcdc.html
A variety of standards for the minimum acceptable code coverage for software in various
regulated industries, based on the importance of the software to the continued safe operation of
the product.
After looking at some of these, I hope you appreciate the FDA standard as much as I do.
https://www.bullseye.com/minimum.html
Summary
Code coverage of 70-80% is a reasonable goal for system test of most projects with most coverage metrics.
Use a higher goal for projects specifically organized for high testability or that have high failure costs.
Minimum code coverage for unit testing can be 10-20% higher than for system testing.
Introduction
Empirical studies of real projects found that increasing code coverage above 70-80% is time consuming and
therefore leads to a relatively slow bug detection rate. Your goal should depend on the risk assessment and
economics of the project. Consider the following factors.
Cost of failure. Raise your goal for safety-critical systems or where the cost of a failure is high, such as
products for the medical or automotive industries, or widely deployed products.
Resources. Lower your goal if testers are spread thin or inadequately trained. If your testers are
unfamiliar with the application, they may not recognize a failure even if they cover the associated
code.
Testable design. Raise your goal if your system has special provisions for testing such as a method for
directly accessing internal functionality, bypassing the user interface.
Development cycle status. Lower your goal if you are maintaining a legacy system where the original
design engineers are no longer available.
Many projects set no particular minimum percentage required code coverage. Instead they use code coverage
analysis only to save time. Measuring code coverage can quickly find those areas overlooked during test
planning.
Defer choosing a code coverage goal until you have some measurements in hand. Before measurements are
available, testers often overestimate their code coverage by 20-30%.
Although 100% code coverage may appear to be the best possible effort, even 100% code coverage is estimated to
only expose about half the faults in a system. Low code coverage indicates inadequate testing, but high code
coverage guarantees nothing.
In a large system, achieving 100% code coverage is generally not cost effective. Some reasons are listed below.
Some test cases are expensive to reproduce but are highly improbable. The cost to benefit ratio does
not justify repeating these tests simply to record the code coverage.
Checks may exist for unexpected error conditions. Layers of code might obscure whether errors in
low-level code propagate up to higher level code. An engineer might decide that handling all errors
creates a more robust solution than tracing the possible errors.
Unreachable code in the current version might become reachable in a future version. An engineer
might address uncertainty about future development by investing a little more effort to add some
capability that is not currently needed.
Code shared among several projects is only partially utilized by the project under test.
Generally, the tester should stop increasing code coverage when the tests become contrived. When you focus
more and more on making the coverage numbers better, your motivation shifts away from finding bugs.
You can attain higher code coverage during unit testing than in integration testing or system testing. During
unit testing, the tester has more facilities available, such as a debugger to manipulate data and conditional
compilation to simulate error conditions.
Likewise, higher code coverage is possible during integration testing than in system testing. During integration
testing, the test harness often provides more precise control and capability than the system user interface.
Therefore it makes sense to set progressively lower goals for unit testing, integration testing, and system
testing. For example, 90% during unit testing, 80% during integration testing, and 70% during system testing.
Coverage Metrics
The information in this paper applies to code coverage metrics that consider control structures independently.
Specifically, these are:
Although some of these metrics are less sensitive to control flow than others, they all correlate statistically at a
large scale.
Formal Standards
DO-178B
The aviation standard DO-178B requires 100% code coverage for safety critical systems. This standard specifies
progressively more sensitive code coverage metrics for more critical systems.
These requirements consider neither the probability of a failure nor the cost of performing test cases.
IEC 61508
The safety integrity level (SIL) relates to the probability of unsafe failure. Determining the SIL involves a lengthy
risk analysis.
This standard recommends but does not require 100% coverage. It specifies you should explain any uncovered
code.
This standard does not define the coverage metrics and does not distinguish between condition coverage and
MC/DC.
ISO 26262
ISO 26262:2011 "Road vehicles -- Functional safety" requires measuring code coverage, and specifies that if
the level achieved "is considered insufficient", then a rationale must be provided. The standard recommends
different coverage metrics for unit testing than for integration testing. In both cases, the strenuousness of the
recommendations relates to the criticality.
For unit testing, three coverage metrics are recommended, shown in the table below. The standard does not
provide definitions for these metrics.
[Table omitted: recommended coverage methods by ASIL, from A (least critical) to D (most critical).]
For integration testing, two metrics are recommended, shown in the table below.
[Table omitted: recommended coverage methods by ASIL, from A (least critical) to D (most critical).]
The standard defines function coverage as the percentage of executed software functions, and call coverage as
the percentage of executed software function calls.
The automotive safety integrity level (ASIL) is based on the probability of failure, effect on vehicle
controllability, and severity of harm. The ASIL does not correlate directly to the SIL of IEC 61508.
The code coverage requirements are contained in part 6 "Product development at the software level."
ANSI/IEEE 1008-1987
The IEEE Standard for Software Unit Testing section 3.1.2 specifies 100% statement coverage as a
completeness requirement. Section A9 recommends 100% branch coverage for code that is critical or has
inadequate requirements specification. Although this document is quite old, it was reaffirmed in 2002.
Test Adequacy
Overall Goal – Determine whether a test suite is adequate to ensure the correctness or desired level of
dependability that we want for our software*
- this is very difficult
o in fact for correctness it is generally impossible
Approximating adequacy
Instead of measuring adequacy directly, we measure how well we have covered some aspect of the
- program structure
- program inputs
- requirements
- etc
This measurement provides a way to determine (lack of) thoroughness of a test suite
Adequacy criteria
Adequacy criterion = set of test obligations
A test suite satisfies an adequacy criterion if
- all the tests succeed (pass)
- every test obligation in the criterion is satisfied by at least one of the test cases in the test suite
- Example:
o The statement coverage adequacy criterion is satisfied by test suite S for program P if:
Each executable statement in P is executed by at least one test case in S, and
The outcome of each test execution was ‘pass’
You would not buy a house just because it’s ‘up to code’, but you might avoid it if it’s not
Measuring Coverage
% of satisfied test obligations can be useful
- Progress toward a thorough test suite
- Trouble spots requiring more attention
Or a dangerous seduction
- Coverage is only a proxy for thoroughness or adequacy
- It’s easy to improve coverage without improving a test suite (much easier than designing good test
cases)
- The only measure that really matters is (cost) effectiveness
Test Adequacy
- How to define a notion of ‘thoroughness’ of a test suite
- Defined in terms of covering some information
- Derived from many sources both simple and sophisticated
- Code coverage is likely the most common such adequacy criterion
Tests up front
Focus on what the code should do, not how it will be implemented
Tests are not thrown together at the last minute
‘test-first students on average wrote more tests and tended to be more productive’
Deployment
Continuous Integration
- Automating quality control work through tools
- Each process runs and reports whenever a developer completes an action
- Live reports give managers and developers visualization of quality
Jenkins
- The big player in continuous integration and automation servers
- Plug-ins determine the capabilities of the server
- One of the readings is a link to more information on the Jenkins Pipeline
https://jenkins.io/doc/pipeline/tour/getting-started/
This guided tour introduces you to the basics of using Jenkins and its main feature, Jenkins Pipeline. This tour
uses the "standalone" Jenkins distribution, which runs locally on your own machine.
Prerequisites
For this tour, you will require:
A machine with:
o 256 MB of RAM, although more than 512MB is recommended
o 10 GB of drive space (for Jenkins and your Docker image)
The following software installed:
o Java 8 or 11 (either a JRE or Java Development Kit (JDK) is fine)
o Docker (navigate to Get Docker at the top of the website to access the Docker download
that’s suitable for your platform)
Download and run Jenkins
1. Download Jenkins.
2. Open up a terminal in the download directory.
3. Run java -jar jenkins.war --httpPort=8080.
4. Browse to http://localhost:8080.
5. Follow the instructions to complete the installation.
When the installation is complete, you can start putting Jenkins to work!
Jenkins Pipeline
https://jenkins.io/doc/book/pipeline/
Pipeline
Chapter Sub-Sections
Getting started with Pipeline
Using a Jenkinsfile
Running Pipelines
Branches and Pull Requests
Using Docker with Pipeline
Extending with Shared Libraries
Pipeline Development Tools
Pipeline Syntax
Pipeline Best Practices
Scaling Pipelines
Pipeline CPS Method Mismatches
Table of Contents
What is Jenkins Pipeline?
o Declarative versus Scripted Pipeline syntax
Why Pipeline?
Pipeline concepts
o Pipeline
o Node
o Stage
o Step
Pipeline syntax overview
o Declarative Pipeline fundamentals
o Scripted Pipeline fundamentals
Pipeline example
This chapter covers all recommended aspects of Jenkins Pipeline functionality, including how to:
get started with Pipeline - covers how to define a Jenkins Pipeline (i.e. your Pipeline) through Blue
Ocean, through the classic UI or in SCM,
create and use a Jenkinsfile - covers use-case scenarios on how to craft and construct your Jenkinsfile,
work with branches and pull requests,
use Docker with Pipeline - covers how Jenkins can invoke Docker containers on agents/nodes (from
a Jenkinsfile) to build your Pipeline projects,
extend Pipeline with shared libraries,
use different development tools to facilitate the creation of your Pipeline, and
work with Pipeline syntax - this page is a comprehensive reference of all Declarative Pipeline syntax.
For an overview of content in the Jenkins User Handbook, see User Handbook overview.
A continuous delivery (CD) pipeline is an automated expression of your process for getting software from
version control right through to your users and customers. Every change to your software (committed in
source control) goes through a complex process on its way to being released. This process involves building the
software in a reliable and repeatable manner, as well as progressing the built software (called a "build")
through multiple stages of testing and deployment.
Pipeline provides an extensible set of tools for modeling simple-to-complex delivery pipelines "as code" via
the Pipeline domain-specific language (DSL) syntax. [1]
The definition of a Jenkins Pipeline is written into a text file (called a Jenkinsfile) which in turn can be
committed to a project’s source control repository. [2] This is the foundation of "Pipeline-as-code"; treating the
CD pipeline as a part of the application to be versioned and reviewed like any other code.
Creating a Jenkinsfile and committing it to source control provides a number of immediate benefits:
Automatically creates a Pipeline build process for all branches and pull requests.
Code review/iteration on the Pipeline (along with the remaining source code).
Audit trail for the Pipeline.
Single source of truth [3] for the Pipeline, which can be viewed and edited by multiple members of the
project.
While the syntax for defining a Pipeline, either in the web UI or with a Jenkinsfile, is the same, it is generally
considered best practice to define the Pipeline in a Jenkinsfile and check that in to source control.
Declarative and Scripted Pipelines are constructed fundamentally differently. Declarative Pipeline is a more
recent feature of Jenkins Pipeline.
Why Pipeline?
Jenkins is, fundamentally, an automation engine which supports a number of automation patterns. Pipeline
adds a powerful set of automation tools onto Jenkins, supporting use cases that span from simple continuous
integration to comprehensive CD pipelines. By modeling a series of related tasks, users can take advantage of
the many features of Pipeline:
Code: Pipelines are implemented in code and typically checked into source control, giving teams the
ability to edit, review, and iterate upon their delivery pipeline.
Durable: Pipelines can survive both planned and unplanned restarts of the Jenkins master.
Pausable: Pipelines can optionally stop and wait for human input or approval before continuing the
Pipeline run.
Versatile: Pipelines support complex real-world CD requirements, including the ability to fork/join,
loop, and perform work in parallel.
Extensible: The Pipeline plugin supports custom extensions to its DSL [1] and multiple options for
integration with other plugins.
While Jenkins has always allowed rudimentary forms of chaining Freestyle Jobs together to perform sequential
tasks, [4] Pipeline makes this concept a first-class citizen in Jenkins.
Building on the core Jenkins value of extensibility, Pipeline is also extensible both by users with Pipeline Shared
Libraries and by plugin developers. [5]
The Jenkins documentation includes a flowchart showing an example of one CD scenario easily modeled in
Jenkins Pipeline.
Pipeline concepts
The following concepts are key aspects of Jenkins Pipeline, which tie in closely to Pipeline syntax (see
the overview below).
Pipeline
A Pipeline is a user-defined model of a CD pipeline. A Pipeline’s code defines your entire build process, which
typically includes stages for building an application, testing it and then delivering it.
Node
A node is a machine which is part of the Jenkins environment and is capable of executing a Pipeline.
Stage
A stage block defines a conceptually distinct subset of tasks performed through the entire Pipeline (e.g.
"Build", "Test" and "Deploy" stages), which is used by many plugins to visualize or present Jenkins Pipeline
status/progress.[6]
Step
A single task. Fundamentally, a step tells Jenkins what to do at a particular point in time (or "step" in the
process). For example, to execute the shell command make, use the sh step: sh 'make'. When a plugin extends
the Pipeline DSL, [1] that typically means the plugin has implemented a new step.
The following Pipeline code skeletons illustrate the fundamental differences between Declarative Pipeline
syntax and Scripted Pipeline syntax.
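For comparison with the Scripted skeleton below, here is a minimal Declarative skeleton in the standard Jenkinsfile form (a sketch only; the stage names and echo placeholders are illustrative, not taken from these notes).
Jenkinsfile (Declarative Pipeline)
pipeline {
    agent any
    stages {
        stage('Build') {
            steps {
                echo 'Build steps go here'   // placeholder step
            }
        }
        stage('Test') {
            steps {
                echo 'Test steps go here'    // placeholder step
            }
        }
        stage('Deploy') {
            steps {
                echo 'Deploy steps go here'  // placeholder step
            }
        }
    }
}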
Be aware that both stages and steps (above) are common elements of both Declarative and Scripted Pipeline
syntax.
1. Schedules the steps contained within the block to run by adding an item to the Jenkins queue. As
soon as an executor is free on a node, the steps will run.
2. Creates a workspace (a directory specific to that particular Pipeline) where work can be done on files
checked out from source control.
Caution: Depending on your Jenkins configuration, some workspaces may not get automatically
cleaned up after a period of inactivity. See tickets and discussion linked from JENKINS-2111 for more
information.
Jenkinsfile (Scripted Pipeline)
node {
stage('Build') {
//
}
stage('Test') {
//
}
stage('Deploy') {
//
}
}
Execute this Pipeline or any of its stages, on any available agent.
Defines the "Build" stage. stage blocks are optional in Scripted Pipeline syntax. However,
implementing stage blocks in a Scripted Pipeline provides clearer visualization of each stage’s subset of
tasks/steps in the Jenkins UI.
Perform some steps related to the "Build" stage.
Defines the "Test" stage.
Perform some steps related to the "Test" stage.
Defines the "Deploy" stage.
Perform some steps related to the "Deploy" stage.
Pipeline example
Here is an example of a Jenkinsfile using Declarative Pipeline syntax (the Jenkins documentation also provides
its Scripted syntax equivalent):
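A representative Declarative Jenkinsfile in this spirit is sketched below; the make targets and the JUnit report path are assumptions for illustration only, not taken from these notes.
Jenkinsfile (Declarative Pipeline)
pipeline {
    agent any
    stages {
        stage('Build') {
            steps {
                sh 'make'                    // assumed build command
            }
        }
        stage('Test') {
            steps {
                sh 'make check'              // assumed test command
                junit 'reports/**/*.xml'     // publish JUnit-format test results (assumed path)
            }
        }
        stage('Deploy') {
            steps {
                sh 'make publish'            // assumed deploy command
            }
        }
    }
}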
SonarQube
https://www.sonarqube.org/
http://www.sqale.org/details/details-indices-indicators
For any element of the source code portfolio, the SQALE Method defines the following set of indices:
The SQALE Business Impact Index: SBII, which measures the importance of the debt
Consolidated and density indices (see the complete list in the definition document)
The SQALE Method defines 4 synthesised indicators. They provide powerful analysis capabilities and support
optimised decisions for managing source code quality and Technical Debt.
SonarQube Open Source Project Hosting
https://sonarcloud.io/explore/projects
ovirt-root on SonarCloud
https://sonarcloud.io/dashboard?id=org.ovirt.engine%3Aroot
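SonarQube analysis is typically triggered from a CI pipeline such as the Jenkins Pipelines above. The stages below are a hedged sketch: they assume the SonarQube Scanner for Jenkins plugin is installed, that a server connection named 'MySonarQube' is configured in Jenkins, and that the project builds with Maven; all of these are assumptions for illustration, not taken from these notes.
Jenkinsfile (Declarative Pipeline)
pipeline {
    agent any
    stages {
        stage('SonarQube analysis') {
            steps {
                // Injects the server URL and token configured under the name 'MySonarQube' (assumed name)
                withSonarQubeEnv('MySonarQube') {
                    sh 'mvn -B clean verify sonar:sonar'   // assumed Maven build and scan
                }
            }
        }
        stage('Quality gate') {
            steps {
                // Abort the pipeline if the SonarQube quality gate fails
                timeout(time: 10, unit: 'MINUTES') {
                    waitForQualityGate abortPipeline: true
                }
            }
        }
    }
}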
Canary
- Continuous pipelines are no simple feat
- Many additional tools exist to aid in such automated systems
- Canary releasing helps roll new code out to user groups incrementally
Netflix’s Spinnaker
https://netflixtechblog.com/global-continuous-delivery-with-spinnaker-2a6896c23ba7
Spinnaker
https://www.spinnaker.io/guides/tutorials/videos/
Textbook in the field
https://martinfowler.com/books/continuousDelivery.html
In the late 90's I paid a visit to Kent Beck, then working in Switzerland for an insurance company. He showed
me around his project and one of the interesting aspects of his highly disciplined team was the fact that they
deployed their software into production every night. This regular deployment gave them many advantages:
written software wasn't waiting uselessly before it was used, they could respond quickly to problems and
opportunities, and the rapid turn-around led to a much deeper relationship between them, their business
customer, and their final customers.
In the last decade I've worked at ThoughtWorks and a common theme of our projects has been reducing that
cycle time between idea and usable software. I see plenty of project stories and they almost all involve a
determined shortening of that cycle. While we don't usually do daily deliveries into production, it's now
common to see teams doing bi-weekly releases.
Dave and Jez have been part of that sea-change, actively involved in projects that have built a culture of
frequent, reliable deliveries. They and our colleagues have taken organizations that struggled to deploy
software once a year, into the world of Continuous Delivery, where releasing becomes routine.
The foundation for the approach, at least for the development team, is Continuous Integration (CI). CI keeps a
development team in sync with each other, removing the delays due to integration issues. A couple of years
ago Paul Duvall wrote the book on CI within this series. But CI is just the first step. Software that's been
successfully integrated into a mainline code stream still isn't software that's out in production doing its job.
Dave and Jez's book picks up the story from CI to deal with that 'last mile', describing how to build the
deployment pipelines that turn integrated code into production software.
This kind of delivery thinking has long been a forgotten corner of software development, falling into a hole
between developers and operations teams. So it's no surprise that the techniques in this book rest upon
bringing these teams together, a harbinger of the nascent but growing "devops" movement. This process also
involves testers, as testing is a key element of ensuring error-free releases. Threading through it all is a high
degree of automation so things can be done quickly and without error.
Getting all this working takes effort, but the benefits are profound. Long, high-intensity releases become a thing of
the past. Customers of software see ideas rapidly turned into working code that they can use every day.
Perhaps most importantly we remove one of the biggest sources of baleful stress in software development.
Nobody likes those tense weekends trying to get a system upgrade released before Monday dawns.
It seems to me that a book that can show you how to deliver your software frequently and without the usual
stresses is a no-brainer to read. For your team's sake, I hope you agree.
http://guides.beanstalkapp.com/deployments/best-practices.html
Introduction
This guide is aimed at helping you better understand how to deal with deployments in your development
workflow and at providing some best practices for deployments. Sometimes a bad production deployment can ruin
all the effort you invested in a development process. Having a solid deployment workflow can become one of
the greatest advantages of your team.
Before you start, I recommend reading our Developing and Deploying with Branches guide first to get a general
idea of how branches should be set up in your repository to be able to fully utilize the tips from this guide. It’s a
great read.
Note on Development Branch
In this guide you will see a lot of references to a branch called development. In your repository you can
use master (Git), trunk (Subversion) or default (Mercurial) for the same purpose; there’s no need to create a
branch specifically called “development”. I chose this name because it’s universal for all version control
systems.
The Workflow
Deployments should be treated as part of a development workflow, not as an afterthought. If you are
developing a web site or an application, your workflow will usually include at least three environments:
Development, Staging and Production. In that case the workflow might look like this:
Developers work on bugs and features in separate branches. Really minor updates can be committed
directly to the stable development branch.
Once features are implemented, they are merged into the staging branch and deployed to the Staging
environment for quality assurance and testing.
After testing is complete, feature branches are merged into the development branch.
On the release date, the development branch is merged into production and then deployed to the
Production environment.
Let’s take a closer look at each environment to see the most efficient way to deploy each one of
them.
Development Environment
If you make web applications, you don’t need a remote development environment; every developer should
have their own local setup.
We noticed in Beanstalk that some teams have Development environments set up with automatic
deployments on every commit or push. While this gives developers a small advantage of not installing the site
or the application on their computers to perform testing locally, it also wastes a lot of time. Every tiny change
must be committed, pushed, deployed, and only then it can be verified. If the change was made by mistake, a
developer will have to revert it, push it, then redeploy.
Testing on a local computer removes the need to commit, push and deploy completely. Every change can be
verified locally first, then, once it’s more or less stable, it can be pushed to a Staging environment for proper
quality assurance testing.
We do not recommend using deployments for rapidly changing development environments. Running your
software locally is the best choice for that sort of testing.
Staging Environment
Once the features are implemented and considered fairly stable, they get merged into the staging branch and
then automatically deployed to the Staging environment. This is when quality assurance kicks in: testers go to
staging servers and verify that the code works as intended.
It is very handy to have a separate branch called staging to represent your staging environment. It will allow
developers to deploy multiple branches to the same server simultaneously, simply by merging everything that
needs to be deployed to the staging branch. It will also help testers understand what exactly is on staging
servers at the moment, just by looking inside the staging branch.
We recommend deploying to the staging environment automatically on every commit or push.
Production Environment
Once the feature is implemented and tested, it can be deployed to production. If the feature was implemented
in a separate branch, it should be merged into a stable development branch first. The branches should be
deleted after they are merged to avoid confusion between team members.
The next step is to make a diff between the production and development branches to take a quick look at the
code that will be deployed to production. This gives you one last chance to spot something that’s not ready or
not intended for production. Stuff like debugger breakpoints, verbose logging or incomplete features.
Once the diff review is finished, you can merge the development branch into production and then initialize a
deployment of the production branch to your Production environment by hand. Specify a meaningful message
for your deployment so that your team knows exactly what you deployed.
Make sure to only merge development branch into production when you actually plan to deploy. Don’t merge
anything into production in advance. Merging on time will make files in your production branch match files on
your actual production servers and will help everyone better understand the state of your production
environment.
We recommend always deploying major releases to production at a scheduled time of which the whole
team is aware. Find the time when your application is least active and use that time to roll out updates. This
may sound obvious, but make sure that it’s not too late, because someone needs to be around after the
deployment for at least a few hours to monitor the application and make sure the deployment went fine.
Urgent production fixes can be deployed at any time.
After deployment finishes make sure to verify it. It is best to check all the features or fixes that you deployed to
make sure they work properly in production. It is a big win if your deployment tool can send an email to all
team members with a summary of changes after every deployment. This helps team members to understand
what exactly went live and how to communicate it to customers. Beanstalk does this for you automatically.
Your deployment to production is now complete, pop champagne and celebrate with your team!
Rolling Back
Sometimes deployments don’t go as planned and things break. In that case you have the possibility to rollback.
However, you should be as careful with rollbacks as with production deployments themselves. Sometimes a
rollback brings more havoc than the issue it was trying to fix. So, first of all, stay calm and don’t make any sudden
moves. Before performing a rollback, answer the following questions:
Did it break because of the code that I deployed, or did something else break?
You can only roll back files that you deployed, so if the source of the issue is something else, a rollback won’t
be much help.
If the answer to both questions is “yes”, you can roll back safely. After the rollback is done, make sure to fix the bug
that you discovered and commit it to either the development branch (if it was minor) or a separate bug-fix
branch. Then proceed with the regular bug-fix branch → staging; bug-fix → development → production
integration workflow.
It is important to merge the bug-fix branch to both the development and production branches in this case,
because your production branch should never include anything that doesn’t exist in your stable development
branch. The development branch is where developers work all day, so if your fix is only in the production
branch they will never see it and it can cause confusion.
Permissions
Every developer should be able to deploy to the Staging environment. They just need to make sure they don’t
overwrite each other’s changes when they do. That's exactly why the staging branch is a great help: all changes
from all developers are getting merged into it so it contains all of them.
Your Production environment, ideally, should only be accessible to a limited number of experienced
developers. These guys should always be prepared to fix the servers immediately if a deployment goes
rogue.
Conclusion
We’ve been using this workflow in our team internally for many years to deploy Beanstalk and Postmark. Some
of these things were learned the hard way, through broken production servers. These days our production
deployments are incredibly smooth and don’t cause any stress at all. (And we deploy to several dozens of
servers simultaneously!) We really hope this guide will help you streamline your deployments too.
http://radar.oreilly.com/2009/03/continuous-deployment-5-eas.html
Beyond ‘Continuous’
https://martinfowler.com/bliki/BlueGreenDeployment.html
https://martinfowler.com/bliki/CanaryRelease.html