Building Software: An Artful Science: Michael Hogarth, MD
Software development is risky
“To err is human, to really foul things up requires a computer”
53% of projects that are completed cost an average of 189% of their original estimates.
Only 42% of completed projects retain their original set of proposed features and functions.
A $170 million project was cancelled -- SAIC reaped more than $100 million
Problems
delayed by over a year. In 2004, the system delivered only 1/10th of the intended functionality and was thus largely unusable after
$170 million spent
SAIC delivered what the FBI requested, but the requirements were flawed, poorly planned, and not tied to scheduled deliverables
Now what?
http://www.washingtonpost.com/wp-dyn/content/article/2006/08/17/AR2006081701485_pf.html
Causes of the VCF Failure
Changing requirements (conceived before 9/11, after 9/11 requirements were
altered significantly)
14 different managers over the project lifetime (2 years)
Poor oversight by the primary ‘owner’ of the project (FBI) - did not oversee
construction closely
Did not pay attention to new, better commercial products -- kept head in the sand
because it “had to be built fast”
Hardware was purchased first, waiting on software (common problem) -- if
software is delayed, hardware is “legacy” quickly
http://www.inf.ed.ac.uk/teaching/courses/seoc2/2004_2005/slides/failures.pdf
Washington State Licensing Dept
1990 - Washington State License Application Mitigation Project
$41.8 million over 5 years to automate the State’s vehicle registration and license renewal process
1993 - after $51 million, the original design and requirements were expected to be obsolete when
finally built
Causes
Overly ambitious scope
Feb 2003 - U.S. Treasury Dept. mailed 50,000 Social Security checks without beneficiary names.
Checks had to be ‘cancelled’ and reissued...
May 2005 - Toyota had to install a software fix on 20,000 hybrid Prius vehicles due to problems with
invalid engine warning lights. It is estimated that the automobile industry spends $2-3 billion/year
fixing software problems
Sept 2006 - A U.S. Government student loan service software error made public the personal data of
21,000 borrowers on its web site
2008 - The new automated baggage routing system at Heathrow Airport's Terminal 5 leads to over
20,000 bags being put in temporary storage...
Does it really matter?
Software bugs can kill...
http://www.wired.com/software/coolapps/news/2005/11/69355
When users inadvertently cause
disaster
http://www.wired.com/software/coolapps/news/2005/11/69355?currentPage=2
How does this happen?
Many of the runaway projects are ‘overly ambitious’ -- a major issue (senior
management has unrealistic expectations of what can be done)
Most projects failed because of multiple problems/issues, not one.
Most problems/issues were management related.
In spite of obvious signs of the runaway software project (72% of project members
are aware), only 19% of senior management is aware
Risk management, an important part of identifying trouble and managing it, was
NOT done in any fashion in 55% of major runaway projects.
Causes of failure
Project objectives not fully specified -- 51%
Bad planning and estimating -- 48%
Technology is new to the organization -- 45%
Inadequate/no project management methods -- 42%
Insufficient senior staff on the team -- 42%
Poor performance by suppliers of software/hardware (contractors) --
42%
http://members.cox.net/johnsuzuki/softfail.htm
The cost of IT failures
2006 - $1 Trillion dollars spent on IT hardware, software,
and services worldwide...
18% of all IT projects will be abandoned before delivery
(18% of $1 trillion = $180 billion?)
53% will be delivered late or have cost overruns
1995 - Standish estimated the U.S. spent $81 billion for
cancelled software projects.....
Conclusions
IT projects are more likely to be unsuccessful than
successful
Only 1 in 5 software projects bring full satisfaction
(succeed)
The larger the project, the more likely the failure
http://www.it-cortex.com/Stat_Failure_Rate.htm#The%20Robbins-Gioia%20Survey%20(2001)
Software as engineering
Software has been viewed more as “art” than engineering
This has led to a lack of structured methods and organization for building software systems
Software Analysis
Requirements Analysis
Specification Development
Testing
Deployment
Documentation
Maintenance
Software Facts and Figures
Maintenance consumes 40-80% of software costs during the lifetime of a software system
-- the most important part of the lifecycle
Error correction accounts for 17% of software maintenance costs
Enhancement is responsible for 60% of software maintenance costs -- most of the cost is
adding new capability to old software, NOT ‘fixing’ it.
Relative time spent on phases of the lifecycle
Development -- defining requirements (15%), design (20%), programming (20%),
testing and error removal (40%), documentation (5%)
Maintenance -- defining the change (15%), documentation review (5%), tracing logic
(25%), implementing the change (20%), testing (30%), updating documentation (5%)
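The phase percentages above can be turned into a rough effort calculator; a minimal sketch (the splits are the development figures quoted above, the 1000-hour total is an assumption for illustration):

```python
# Rough effort breakdown for a development project, using the phase
# percentages quoted above (requirements 15%, design 20%,
# programming 20%, testing and error removal 40%, documentation 5%).
DEV_PHASES = {
    "requirements": 0.15,
    "design": 0.20,
    "programming": 0.20,
    "testing": 0.40,
    "documentation": 0.05,
}

def effort_breakdown(total_hours, phases=DEV_PHASES):
    """Split a total effort estimate across lifecycle phases."""
    return {phase: total_hours * share for phase, share in phases.items()}

# A 1000-hour project spends 400 of those hours on testing alone.
print(effort_breakdown(1000))
```

Note how testing dominates development effort, just as enhancement dominates maintenance cost.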
http://en.wikipedia.org/wiki/Scrum_(development)
Scrum and usable software...
A key feature of Scrum is the idea that one creates usable software with each iteration
It forces the team to architect “the real thing” from the start -- not a “prototype” that is only
developed for demonstration purposes
For example, a system would start by using the planned architecture (web based
application using java 2 enterprise architecture, oracle database, etc...)
It helps to uncover many potential problems with the architecture, particularly one that
requires a number of integrated components (drivers that don’t work, connections between
machines, software compatibility with the operating system, digital certificate
compatibility or usability, etc...)
It allows users and management to actually use the software as it is being built....
invaluable!
Scrum team roles
Pigs and Chickens -- think scrambled eggs and bacon -- the chicken is supportive,
but the pig is committed.
Scrum “pigs” are committed to building the software regularly and frequently
Scrum Master -- the one who acts as a project manager and removes impediments to the
team delivering the sprint goal. Not the leader of the team, but buffer between team and
any chickens or distracting influences.
Product owner -- the person who has commissioned the project/software. Also known as
the “sponsor” of the project.
No problems are swept under the carpet -- nobody is penalized for uncovering a
problem
http://en.wikipedia.org/wiki/Scrum_(development)
Typical Scrum Artifacts
Sprint Burn Down Chart
a chart showing the features for that sprint and the daily progress in
completing these
Product Backlog
a list of the high level requirements (in plain ‘user speak’)
Sprint Backlog
A list of tasks to be completed during the sprint
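The three artifacts can be sketched as plain data; a minimal illustration (the stories, tasks, and field names are assumptions, not part of any Scrum standard):

```python
# Minimal sketch of the three Scrum artifacts as plain data.
# All names and numbers here are illustrative.

# Product backlog: high-level requirements in plain 'user speak'.
product_backlog = [
    "As a user, I can log in",
    "As a user, I can reset my password",
]

# Sprint backlog: tasks for this sprint, with remaining hours per day.
sprint_backlog = {
    "build login form": [8, 6, 3, 0],   # remaining hours, day by day
    "wire up auth API": [12, 12, 9, 4],
}

def burn_down(backlog):
    """Total remaining hours per day -- the sprint burn down series."""
    days = max(len(hours) for hours in backlog.values())
    return [sum(task[d] for task in backlog.values()) for d in range(days)]

print(burn_down(sprint_backlog))  # [20, 18, 12, 4]
```

A burn down series that fails to approach zero is the daily, visible signal that the sprint goal is at risk.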
Agile methods and systems
Agile works well for small to medium sized projects (around 50,000 - 100,000
lines of source code)
Difficult to implement in large, complex system development with hundreds of
developers in multiple teams
Requires each team be given “chunks of work” that they can develop
Integration is key -- need to use standard components and standards for coding,
interconnecting, data modeling so each team does not create their own naming
conventions and interfaces to their components.
Quality assurance
The MOST IMPORTANT ASPECT of software development
Quality Assurance does not start with “testing”
Quality Assurance starts at the requirements gathering stage
“software faults” -- when the software does not perform as the user intended
bugs
requirements are good/accurate, but the programming causes a crash or other
abnormal state that is unexpected
requirements were wrong, programming was correct -- still a bug from the
user’s perspective
Some facts about bugs
Bugs in the form of poor requirements gathering or poor communication
with programmers are by far the most expensive in a software development
effort
Bugs caught at the requirements or design stage are cheap
Bugs caught in the testing phase are expensive to fix
Bugs not caught are VERY EXPENSIVE in many ways
loss of customers/user trust
need to “fix” it quick -- lends itself to yet more problems because
everyone is panicking to get it fixed asap.
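The cheap-early/expensive-late point can be made concrete with a toy calculation; the multipliers below are illustrative assumptions, not figures from this lecture:

```python
# Illustrative only: relative cost of fixing the same defect depending
# on the phase in which it is caught.  The multipliers are assumed for
# the sake of the example.
FIX_COST_MULTIPLIER = {
    "requirements": 1,
    "design": 5,
    "testing": 20,
    "production": 100,
}

def fix_cost(base_cost, phase_caught):
    """Cost to fix a defect, scaled by how late it was caught."""
    return base_cost * FIX_COST_MULTIPLIER[phase_caught]

# A defect costing 100 to fix at requirements time costs 10,000
# once it has escaped to production (plus lost user trust).
print(fix_cost(100, "requirements"), fix_cost(100, "production"))
```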
Software testing
System Testing
“black box” testing
“white box” testing
Regression Testing
Black box testing
Treats software as a black-box without knowledge of its interior
workings
It focuses simply on testing the functionality according to the
requirements
Tester inputs data, and sees the output from the process
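A black-box test exercises only inputs and outputs against the stated requirement; a minimal sketch, where the function under test and its fee rules are hypothetical stand-ins:

```python
# Black-box testing: the tester knows only the requirement
# ("renewal fee is $30, plus a $25 late penalty after expiry"),
# not the implementation.  The function below is a stand-in.
def renew_license(days_past_expiry):
    fee = 30
    if days_past_expiry > 0:
        fee += 25
    return fee

# The tester feeds inputs and checks outputs against the requirement,
# without ever looking inside renew_license.
assert renew_license(0) == 30    # on-time renewal
assert renew_license(10) == 55   # late renewal incurs the penalty
```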
White box testing
Tester has knowledge of the internal data structures and algorithms
Code Coverage - The tester creates tests to cause all statements in the program to be executed at
least once
Mutation Testing - the software is modified slightly to emulate typical programmer
mistakes (using the wrong operator or variable name). Meant to test whether the test
suite detects the change.
Fault injection - Introduce faults in the system on purpose to test error handling. Makes sure the
error occurs as expected and the system handles the error rather than crashing or causing an
incorrect state or response.
Static testing - primarily syntax checking and manual reading of the code to check errors (code
inspections, walkthroughs, code reviews)
Test Plan
Outlines the ways in which tests will be developed, the naming and classification for the various
failed tests (critical, show stopper, minor, etc..)
Outlines the features to be tested, the approach to be used, suspension criteria (the conditions
under which testing is halted)
Describes the environment -- the test environment, including hardware, networking, databases,
software, operating system, etc..
Acceptance criteria - an objective quality standard that the software must meet in order to be
considered ready for release (minimum defect count and severity levels, minimum test
coverage, etc...)
Steps -- list of steps describing how to perform the test (log in, select patient A,
select medication list, pick Amoxicillin, click ‘submit to pharmacy’, etc..)
Expected results - describe the expected results up front so the tester knows
whether it failed or passed.
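A test case's steps and expected result can be recorded as structured data so the pass/fail verdict is unambiguous; a sketch using the medication-order steps from this slide (field names are assumptions):

```python
# A test case as plain data: numbered steps plus an expected result
# written down up front, so the tester knows pass from fail.
test_case = {
    "id": "TC-014",
    "severity": "critical",
    "steps": [
        "log in",
        "select patient A",
        "select medication list",
        "pick Amoxicillin",
        "click 'submit to pharmacy'",
    ],
    "expected": "order appears in pharmacy queue",
}

def verdict(actual, case):
    """Compare the observed outcome against the pre-stated expectation."""
    return "pass" if actual == case["expected"] else "fail"

print(verdict("order appears in pharmacy queue", test_case))   # pass
print(verdict("order silently dropped", test_case))            # fail
```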
Regression testing
designed to find ‘software regressions’ -- when previously working functionality is
now not working because of changes made in other parts of the system
As software is versioned, this is the most common type of bug or “fault”
The list of ‘regression tests’ grows
a test for the functions in all previous versions
a test for any previously found bugs -- create a test to test that scenario
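The growing suite described above can be treated as an append-only list; a minimal sketch (test names and checks are illustrative):

```python
# A regression suite only grows: each version's functional tests and a
# test for every previously found bug are kept permanently.
regression_suite = []

def add_regression_test(name, check):
    """Register a test; it is never removed in later versions."""
    regression_suite.append((name, check))

def run_suite():
    """Run every accumulated test; return the names that regressed."""
    return [name for name, check in regression_suite if not check()]

# v1 functionality, plus a test pinned to a bug found (and fixed) in v1.
add_regression_test("login works", lambda: True)
add_regression_test("bug #231: empty password rejected", lambda: True)
print(run_suite())  # [] -- no regressions in this version
```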
DeMarco and Lister. Waltzing with Bears: Managing Risk on Software Projects. 2003.
But don’t be blind to risk
Sometimes those who are big risk takers have a tendency to
emphasize positive thinking by ignoring the consequences of the
risk they are taking
If there are things that could go wrong, don’t be blind to them --
they exist and you need to recognize them.
If you don’t think of it, you could be blind-sided by it
DeMarco and Lister. Waltzing with Bears: Managing Risk on Software Projects. 2003.
Examples of risks
“Risk management often gives you more reality than you want.”
-- Mike Evans, Senior VP, ASC Corporation
Some may or may not have alternative actions to avoid or mitigate the risk if it comes to pass --
“is there a feasible plan B”
“Problem” -- a risk is a problem that is yet to occur, a problem is a risk that has occurred
“Risk transition” -- when a risk becomes a problem, thus it is said the risk ‘materialized’
“Transition indicator” -- things that suggest the risk may transition to a problem. Example -- Russia
masses troops on the Georgian border...
DeMarco and Lister. Waltzing with Bears: Managing Risk on Software Projects. 2003.
Managing risks
Mitigation - steps you take before the transition or after to make corrections (if
possible) or to minimize the impact of the now “problem”.
Steps in risk management
risk discovery
exposure analysis (impact analysis)
contingency planning -- creating planB, planC, etc.. as options to engage if the risk
materializes
mitigation -- steps taken before transition to make contingency actions possible
transition monitoring -- tracking of managed risks, looking for transitions and
materializations (risk management meetings).
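The steps above can be sketched as a simple risk register, with exposure computed as probability times cost; the entries, probabilities, and field names are illustrative assumptions:

```python
# Minimal risk register following the steps above: discovery (the list),
# exposure analysis, contingency planning, and transition monitoring.
risks = [
    {"risk": "schedule flaw", "probability": 0.4, "cost": 500_000,
     "contingency": "cut scope to core features", "materialized": False},
    {"risk": "supplier slips delivery", "probability": 0.2, "cost": 200_000,
     "contingency": "second-source the component", "materialized": False},
]

def exposure(risk):
    """Exposure analysis: expected cost if nothing is done."""
    return risk["probability"] * risk["cost"]

def problems(register):
    """Transition monitoring: risks that have materialized into problems."""
    return [r["risk"] for r in register if r["materialized"]]

for r in sorted(risks, key=exposure, reverse=True):
    print(r["risk"], exposure(r), "-> plan B:", r["contingency"])
```

Ranking by exposure tells the team which contingency plans deserve mitigation effort first.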
DeMarco and Lister. Waltzing with Bears: Managing Risk on Software Projects. 2003.
Common software project risks
Schedule flaw - almost always due to neglecting work or minimizing work that is necessary
Scope creep (requirements inflation) or scope shifting (because of market conditions or changes
in business requirements) -- inevitable -- don’t believe you can keep scope ‘frozen’ for very
long
recognize it, create a mitigation strategy, recognize transition, and create a contingency
for example, if requirements need to be added or changed, need to make sure ‘management’
is aware of the consequences and adjustments are made in capacity, expectation, timeline,
budget.
It is not bad to change scope -- it is bad to change scope and believe nothing else needs to
change
DeMarco and Lister. Waltzing with Bears: Managing Risk on Software Projects. 2003.
“Post mortem” evaluations
No project is “100% successful” -- they all have problems, some have less than
others, some have fatal problems.
It is critical to evaluate projects after they are completed to characterize common
risks/problems and establish methods of mitigation before the next project
Capability Maturity Model (CMM)
A measure of the ‘maturity’ of an organization in how they
approach projects
Originally developed as a tool for assessing the ability of
government contractors’ processes to perform a contracted software
project (can they do it?)
Maturity Levels -- 1-5. Level 5 is where a process is optimized by
continuous process improvement
CMM in detail
Level 1 - Ad hoc: -- processes are undocumented and in a state of dynamic change,
everything is ‘ad hoc’
Level 2 - Repeatable: -- some processes are repeatable with possibly consistent results
Level 3 - Defined: -- set of defined and documented standard processes subject to
improvement over time
Level 4 - Managed: -- using process metrics to control the process. Management can
identify ways to adjust and adapt the process
Level 5 - Optimized: -- process improvement objectives are established (post mortem
evaluation...), and process improvements are developed to address common causes of
process variation.
Why medical software is hard...
Courtesy Dr. Andy Coren, Health Information Technology: A Clinician’s View. 2008
Healthcare IT failures
Hard to discover -- nobody airs dirty laundry
West Virginia -- system had to be removed a week after
implementation
Mt Sinai -- 6 weeks after implementation, system is “rolled
back” due to staff complaints