Program Security
In the first two chapters, we learned about the need for computer security and we studied encryption, a
fundamental tool in implementing many kinds of security controls. In this chapter, we begin to study how
to apply security in computing. We start with why we need security at the program level and how we can
achieve it.
In one form or another, protecting programs is at the heart of computer security. So we need to ask two important questions: how do we keep programs free from flaws, and how do we protect computing resources against programs that do contain flaws?
In later chapters, we will examine particular types of programs (including operating systems, database management systems, and network implementations) and the specific kinds of security issues that are raised by the nature of their design and functionality. In this chapter, we address more general themes,
most of which carry forward to these special-purpose systems. Thus, this chapter not only lays the
groundwork for future chapters but also is significant on its own.
This chapter deals with the writing of programs. It defers to a later chapter what may be a much larger
issue in program security: trust. The trust problem can be framed as follows: Presented with a finished
program, for example, a commercial software package, how can you tell how secure it is or how to use it
in its most secure way? In part, the answer to these questions lies in independent, third-party evaluations,
presented for operating systems (but applicable to other programs, as well) in Chapter 5. The reporting
and fixing of discovered flaws is discussed in Chapter 9, as are liability and software warranties. For
now, however, the unfortunate state of commercial software development is largely a case of trust your
source, and buyer beware.
SECURE PROGRAMS
Consider what we mean when we say that a program is "secure." We saw in Chapter 1 that security
implies some degree of trust that the program enforces expected confidentiality, integrity, and
availability. From the point of view of a program or a programmer, how can we look at a software
component or code fragment and assess its security? This question is, of course, similar to the problem
of assessing software quality in general. One way to assess security or quality is to ask people to name
the characteristics of software that contribute to its overall security. However, we are likely to get
different answers from different people. This difference occurs because the importance of the
characteristics depends on who is analyzing the software. For example, one person may decide that
code is secure because it takes too long to break through its security controls. And someone else may
decide code is secure if it has run for a period of time with no apparent failures. But a third person may
decide that any potential fault in meeting security requirements makes code insecure.
An assessment of security can also be influenced by someone's general perspective on software
quality. For example, if your manager's idea of quality is conformance to specifications, then she might
consider the code secure if it meets security requirements, whether or not the requirements are
complete or correct. This security view played a role when a major computer manufacturer delivered all
its machines with keyed locks, since a keyed lock was written in the requirements. But the machines
were not secure, because all locks were configured to use the same key! Thus, another view of security
is fitness for purpose; in this view, the manufacturer clearly had room for improvement.
In general, practitioners often look at quantity and types of faults for evidence of a product's quality (or
lack of it). For example, developers track the number of faults found in requirements, design, and code
inspections and use them as indicators of the likely quality of the final product. Sidebar 3-1 explains the
importance of separating the faults (the causes of problems) from the failures (the effects of the faults).
Fixing Faults
One approach to judging quality in security has been fixing faults. You might argue that a module in
which 100 faults were discovered and fixed is better than another in which only 20 faults were
discovered and fixed, suggesting that more rigorous analysis and testing had led to the finding of the
larger number of faults. Au contraire, challenges your friend: a piece of software with 100 discovered
faults is inherently full of problems and could clearly have hundreds more waiting to appear. Your
friend's opinion is confirmed by the software testing literature; software that has many faults early on is
likely to have many others still waiting to be found.
Early work in computer security was based on the paradigm of "penetrate and patch," in which
analysts searched for and repaired faults. Often, a top-quality "tiger team" would be convened to test a
system's security by attempting to cause it to fail.
The test was considered to be a "proof" of security; if the system withstood the attacks, it was
considered secure. Unfortunately, far too often the proof became a counterexample, in which not just
one but several serious security problems were uncovered. The problem discovery in turn led to a rapid
effort to "patch" the system to repair or restore the security. (See Schell's analysis in [SCH79].)
However, the patch efforts were largely useless, making the system less secure rather than more
secure because they frequently introduced new faults. There are three reasons why.
The pressure to repair a specific problem encouraged a narrow focus on the fault itself and not
on its context. In particular, the analysts paid attention to the immediate cause of the failure
and not to the underlying design or requirements faults.
The fault often had nonobvious side effects in places other than the immediate area of the fault.
The fault could not be fixed properly because system functionality or performance would suffer
as a consequence.
Unexpected Behavior
The inadequacies of penetrate-and-patch led researchers to seek a better way to be confident that code
meets its security requirements. One way to do that is to compare the requirements with the behavior.
That is, to understand program security, we can examine programs to see whether they behave as their
designers intended or users expected. We call such unexpected behavior a program security flaw; it is
inappropriate program behavior caused by a program vulnerability. Unfortunately, the terminology in the
computer security field is not consistent with the IEEE standard described in Sidebar 3-1; there is no
direct mapping of the terms "vulnerability" and "flaw" into the characterization of faults and failures. A
flaw can be either a fault or failure, and a vulnerability usually describes a class of flaws, such as a
buffer overflow. In spite of the inconsistency, it is important for us to remember that we must view
vulnerabilities and flaws from two perspectives, cause and effect, so that we see what fault caused the
problem and what failure (if any) is visible to the user. For example, a Trojan horse may have been
injected in a piece of code (a flaw exploiting a vulnerability), but the user may not yet have seen the
Trojan horse's malicious behavior. Thus, we must address program security flaws from inside and
outside, to find causes not only of existing failures but also of incipient ones. Moreover, it is not enough
to identify these problems. We must also determine how to prevent harm caused by possible flaws.
Program security flaws can derive from any kind of software fault. That is, they cover everything from a
misunderstanding of program requirements to a one-character error in coding or even typing. The flaws
can result from problems in a single code component or from the failure of several programs or program
pieces to interact compatibly through a shared interface. The security flaws can reflect code that was
intentionally designed or coded to be malicious, or code that was simply developed in a sloppy or
misguided way. Thus, it makes sense to divide program flaws into two separate logical categories:
inadvertent human errors versus malicious, intentionally induced flaws.
Two factors make program security especially hard to achieve.
1. Program controls apply at the level of the individual program and programmer. When we test a
system, we try to make sure that the functionality prescribed in the requirements is
implemented in the code. That is, we take a "should do" checklist and verify that the code does
what it is supposed to do. However, security is also about preventing certain actions: a
"shouldn't do" list. It is almost impossible to ensure that a program does precisely what its
designer or user intended, and nothing more. Regardless of designer or programmer intent, in
a large and complex system, the many pieces that have to fit together can interact in an unmanageably large number of ways. We are forced to examine and test the code for
typical or likely cases; we cannot exhaustively test every state and data combination to verify a
system's behavior. So sheer size and complexity preclude total flaw prevention or mediation.
Programmers intending to implant malicious code can take advantage of this incompleteness
and hide some flaws successfully, despite our best efforts.
Sidebar 3-2 Dramatic Increase in Cyber Attacks
Carnegie Mellon University's Computer Emergency Response Team (CERT) tracks the
number and kinds of vulnerabilities and cyber attacks reported worldwide. Part of CERT's
mission is to warn users and developers of new problems and also to provide information on
ways to fix them. According to the CERT coordination center, fewer than 200 known
vulnerabilities were reported in 1995, and that number ranged between 200 and 400 from 1996
to 1999. But the number increased dramatically in 2000, with over 1,000 known vulnerabilities
in 2000, almost 2,420 in 2001, and an expectation of at least 3,750 in 2002 (over 1,000 in the
first quarter of 2002).
How does that translate into cyber attacks? The CERT reported 3,734 security incidents in
1998, 9,859 in 1999, 21,756 in 2000, and 52,658 in 2001. But in the first quarter of 2002 there
were already 26,829 incidents, so it seems as if the exponential growth rate will continue
[HOU02]. Moreover, as of June 2002, Symantec's Norton antivirus software checked for
61,181 known virus patterns, and McAfee's product could detect over 50,000 [BER01]. The
Computer Security Institute and the FBI cooperate to take an annual survey of approximately
500 large institutions: companies, government organizations, and educational institutions
[CSI02]. Of the respondents, 90 percent detected security breaches, 25 percent identified
between two and five events, and 37 percent reported more than ten. By a different count, the
Internet security firm Riptech reported that the number of successful Internet attacks was 28
percent higher for January-June 2002 compared with the previous six-month period [RIP02].
A survey of 167 network security personnel revealed that more than 75 percent of government
respondents experienced attacks to their networks; more than half said the attacks were
frequent. However, 60 percent of respondents admitted that they could do more to make their
systems more secure; the respondents claimed that they simply lacked time and staff to
address the security issues [BUS01]. In the CSI/FBI survey, 223, or 44 percent of respondents,
could and did quantify their loss from incidents; their losses totaled over $455,000,000.
It is clearly time to take security seriously, both as users and developers.
2. Programming and software engineering techniques change and evolve far more rapidly than do computer security techniques. So we often find ourselves trying to secure last year's technology while software developers are rapidly adopting today's (and next year's) technology.
Still, the situation is far from bleak. Computer security has much to offer to program security. By
understanding what can go wrong and how to protect against it, we can devise techniques and tools to
secure most computer applications.
Types of Flaws
To aid our understanding of the problems and their prevention or correction, we can define categories
that distinguish one kind of problem from another. For example, Landwehr et al. [LAN94] present a
taxonomy of program flaws, dividing them first into intentional and inadvertent flaws. They further divide
intentional flaws into malicious and nonmalicious ones. In the taxonomy, the inadvertent flaws fall into six categories: validation errors, domain errors, serialization and aliasing errors, inadequate identification and authentication, boundary condition violations, and other exploitable logic errors.
This list gives us a useful overview of the ways programs can fail to meet their security requirements.
We leave our discussion of the pitfalls of identification and authentication for Chapter 4, in which we also
investigate separation into execution domains. In this chapter, we address the other categories, each of
which has interesting examples.
Buffer Overflows
A buffer overflow is the computing equivalent of trying to pour two liters of water into a one-liter pitcher:
Some water is going to spill out and make a mess. And in computing, what a mess these errors have
made!
Definition
A buffer (or array or string) is a space in which data can be held. A buffer resides in memory. Because
memory is finite, a buffer's capacity is finite. For this reason, in many programming languages the
programmer must declare the buffer's maximum size so that the compiler can set aside that amount of
space.
Let us look at an example to see how buffer overflows can happen. Suppose a C language program
contains the declaration:
char sample[10];
The compiler sets aside 10 bytes to store this buffer, one byte for each of the ten elements of the array,
sample[0] through sample[9]. Now we execute the statement:
sample[10] = 'A';
The subscript is out of bounds (that is, it does not fall between 0 and 9), so we have a problem. The
nicest outcome (from a security perspective) is for the compiler to detect the problem and mark the error
during compilation. However, if the statement were
sample[i] = 'A';
we could not identify the problem until i was set during execution to a too-big subscript. It would be
useful if, during execution, the system produced an error message warning of a subscript out of bounds.
Unfortunately, in some languages, buffer sizes do not have to be predefined, so there is no way to
detect an out-of-bounds error. More importantly, the code needed to check each subscript against its
potential maximum value takes time and space during execution, and the resources are applied to catch
a problem that occurs relatively infrequently. Even if the compiler were careful in analyzing the buffer
declaration and use, this same problem can be caused with pointers, for which there is no reasonable
way to define a proper limit. Thus, some compilers do not generate the code to check for exceeding
bounds.
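To make that cost concrete, the following sketch shows the kind of test a compiler (or a careful programmer) would have to apply to every subscripted store. The helper checked_store is hypothetical, written here only to illustrate the check; it is not a standard library routine.

#include <stdio.h>

#define SAMPLE_SIZE 10
char sample[SAMPLE_SIZE];

/* Hypothetical bounds-checked store: this test is exactly the kind of
   run-time check, with its cost in time and space, that many compilers omit. */
int checked_store(char *buf, size_t size, size_t index, char value)
{
    if (index >= size)
        return -1;               /* subscript out of bounds: refuse the write */
    buf[index] = value;
    return 0;
}

int main(void)
{
    size_t i = 10;                                  /* set at run time, as in the text */
    if (checked_store(sample, SAMPLE_SIZE, i, 'A') != 0)
        fprintf(stderr, "subscript out of bounds\n");
    return 0;
}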
Let us examine this problem more closely. It is important to recognize that the potential overflow causes
a serious problem only in some instances. The problem's occurrence depends on what is adjacent to the array sample. For example, suppose each of the ten elements of the array sample is filled with the letter A and the erroneous reference uses the letter B. The extra B may land in the user's own data, in the user's own code, in system data, or in system code; these four cases are illustrated in Figure 3-1.
Security Implication
Let us suppose that a malicious person understands the damage that can be done by a buffer overflow;
that is, we are dealing with more than simply a normal, errant programmer. The malicious programmer
looks at the four cases illustrated in Figure 3-1 and thinks deviously about the last two: What data values
could the attacker insert just after the buffer so as to cause mischief or damage, and what planned
instruction codes could the system be forced to execute? There are many possible answers, some of
which are more malevolent than others. Here, we present two buffer overflow attacks that are used
frequently. (See [ALE96] for more details.) First, the attacker may replace code in the system space.
Remember that every program is invoked by the operating system and that the operating system may
run with higher privileges than those of a regular program. Thus, if the attacker can gain control by
masquerading as the operating system, the attacker can execute many commands in a powerful role.
Therefore, by replacing a few instructions right after returning from his or her own procedure, the
attacker can get control back from the operating system, possibly with raised privileges. If the buffer
overflows into system code space, the attacker merely inserts overflow data that correspond to the
machine code for instructions.
On the other hand, the attacker may make use of the stack pointer or the return register. Subprocedure
calls are handled with a stack, a data structure in which the most recent item inserted is the next one
removed (last arrived, first served). This structure works well because procedure calls can be nested,
with each return causing control to transfer back to the immediately preceding routine at its point of
execution. Each time a procedure is called, its parameters, the return address (the address immediately
after its call), and other local values are pushed onto a stack. An old stack pointer is also pushed onto
the stack, and a stack pointer register is reloaded with the address of these new values. Then, control is
transferred to the subprocedure.
As the subprocedure executes, it fetches parameters that it finds by using the address pointed to by the
stack pointer. Typically, the stack pointer is a register in the processor. Therefore, by causing an
overflow into the stack, the attacker can change either the old stack pointer (changing the context for the
calling procedure) or the return address (causing control to transfer where the attacker wants when the
subprocedure returns). Changing the context or return address allows the attacker to redirect execution
to a block of code the attacker wants.
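A minimal sketch of the vulnerability just described follows. The function names are illustrative only; the point is that an unbounded copy into a fixed-size local buffer can run past the buffer into the saved return address, while a bounded copy cannot.

#include <string.h>

/* Illustrative victim: the 16-byte local buffer lives on the stack near the
   saved frame pointer and return address.  If request is longer than 15
   characters, strcpy writes past the buffer; a long enough request can
   overwrite the return address, so that when handle() returns, control
   transfers wherever the overflowing bytes point. */
void handle(const char *request)
{
    char local[16];
    strcpy(local, request);              /* unbounded copy: the overflow happens here */
    /* ... use local ... */
}

/* One common fix: bound the copy so the input cannot exceed the buffer. */
void handle_safely(const char *request)
{
    char local[16];
    strncpy(local, request, sizeof(local) - 1);
    local[sizeof(local) - 1] = '\0';     /* strncpy does not always terminate */
    /* ... use local ... */
}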
In both these cases, a little experimentation is needed to determine where the overflow is and how to
control it. But the work to be done is relatively small, probably a day or two for a competent analyst.
These buffer overflows are carefully explained in a paper by Mudge [MUD95] of the famed l0pht
computer security group.
An alternative style of buffer overflow occurs when parameter values are passed into a routine,
especially when the parameters are passed to a web server on the Internet. Parameters are passed in
the URL line, with a syntax similar to
http://www.somesite.com/subpage/userinput&parm1=(808)5551212&parm2=2004Jan01
In this example, the page userinput receives two parameters, parm1 with value (808)555-1212 (perhaps
a U.S. telephone number) and parm2 with value 2004Jan01 (perhaps a date). The web browser on the
caller's machine will accept values from a user who probably completes fields on a form. The browser
encodes those values and transmits them back to the server's web site.
The attacker might question what the server would do with a really long telephone number, say, one
with 500 or 1000 digits. But, you say, no telephone in the world has such a telephone number; that is
probably exactly what the developer thought, so the developer may have allocated 15 or 20 bytes for an
expected maximum length telephone number. Will the program crash with 500 digits? And if it crashes,
can it be made to crash in a predictable and usable way? (For the answer to this question, see
Litchfield's investigation of the Microsoft dialer program [LIT99].) Passing a very long string to a web
server is a slight variation on the classic buffer overflow, but no less effective.
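A hedged sketch of the server-side handling shows how the developer's assumption plays out. The buffer size and function name here are assumptions made for illustration; the essential point is that the length of the incoming parameter is checked against the buffer before any copy is made.

#include <stdio.h>
#include <string.h>

#define MAX_PHONE 20    /* roughly the 15 or 20 bytes the developer expected */

/* Hypothetical server-side handler for parm1: measure the incoming value
   before copying it, and refuse anything that cannot fit the buffer. */
int store_phone(const char *parm1, char out[MAX_PHONE + 1])
{
    if (strlen(parm1) > MAX_PHONE) {
        fprintf(stderr, "rejecting parm1: %lu bytes is too long\n",
                (unsigned long) strlen(parm1));
        return -1;                       /* a 500-digit "number" is refused, not stored */
    }
    strcpy(out, parm1);                  /* safe only because of the length check above */
    return 0;
}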
As noted above, buffer overflows have existed almost as long as higher-level programming languages
with arrays. For a long time they were simply a minor annoyance to programmers and users, a cause of
errors and sometimes even system crashes. Rather recently, attackers have used them as vehicles to
cause first a system crash and then a controlled failure with a serious security implication. The large
number of security vulnerabilities based on buffer overflows shows that developers must pay more
attention now to what had previously been thought to be just a minor annoyance.
Incomplete Mediation
Incomplete mediation is another security problem that has been with us for decades. Attackers are
exploiting it to cause security problems.
Definition
Consider the example of the previous section:
http://www.somesite.com/subpage/userinput&parm1=(808)5551212&parm2=2004Jan01
The two parameters look like a telephone number and a date. Probably the client's (user's) web browser
enters those two values in their specified format for easy processing on the server's side. What would
happen if parm2 were submitted as 1800Jan01? Or 1800Feb30? Or 2048Min32? Or 1Aardvark2Many?
Something would likely fail. As with buffer overflows, one possibility is that the system would fail
catastrophically, with a routine's failing on a data type error as it tried to handle a month named "Min" or
even a year (like 1800) which was out of range. Another possibility is that the receiving program would
continue to execute but would generate a very wrong result. (For example, imagine the amount of
interest due today on a billing error with a start date of 1 Jan 1800.) Then again, the processing server
might have a default condition, deciding to treat 1Aardvark2Many as 3 July 1947. The possibilities are
endless.
One way to address the potential problems is to try to anticipate them. For instance, the programmer in
the examples above may have written code to check for correctness on the client's side (that is, the
user's browser). The client program can search for and screen out errors. Or, to prevent the use of
nonsense data, the program can restrict choices only to valid ones. For example, the program supplying
the parameters might have solicited them by using a drop-down box or choice list from which only the
twelve conventional months would have been possible choices. Similarly, the year could have been
tested to ensure that the value was between 1995 and 2005, and date numbers would have to have
been appropriate for the months in which they occur (no 30th of February, for example). Using these
verification techniques, the programmer may have felt well insulated from the possible problems a
careless or malicious user could cause.
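The checks just described can be written compactly; the sketch below is an illustration, not code from the text. It validates the month name, the 1995 to 2005 year range, and the day-of-month rule. Whether such checks run in the user's browser or on the server is exactly the issue raised next.

#include <string.h>

/* Illustrative validation of parm2: year restricted to 1995-2005, month one
   of the twelve conventional names, day legal for that month (no 30th of
   February).  Leap years are ignored to keep the sketch short. */
static const char *months[] = { "Jan", "Feb", "Mar", "Apr", "May", "Jun",
                                "Jul", "Aug", "Sep", "Oct", "Nov", "Dec" };
static const int days_in[] = { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };

int valid_date(int year, const char *month, int day)
{
    int m;
    if (year < 1995 || year > 2005)
        return 0;                                  /* 1800 and 2048 both fail here */
    for (m = 0; m < 12; m++)
        if (strcmp(month, months[m]) == 0)
            return day >= 1 && day <= days_in[m];  /* Feb 30 fails here */
    return 0;                                      /* "Min" and "Aardvark" fail here */
}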
However, the program is still vulnerable. By packing the result into the return URL, the programmer left
these data fields in a place accessible to (and changeable by) the user. In particular, the user could edit
the URL line, change any parameter values, and resend the line. On the server side, there is no way for
the server to tell if the response line came from the client's browser or as a result of the user's editing
the URL directly. We say in this case that the data values are not completely mediated: The sensitive
data (namely, the parameter values) are in an exposed, uncontrolled condition.
Security Implication
Incomplete mediation is easy to exploit, but it has been exercised less often than buffer overflows.
Nevertheless, unchecked data values represent a serious potential vulnerability.
To demonstrate this flaw's security implications, we use a real example; only the name of the vendor
has been changed to protect the guilty. Things, Inc., was a very large, international vendor of consumer
products, called Objects. The company was ready to sell its Objects through a web site, using what
appeared to be a standard e-commerce application. The management at Things decided to let some of
its in-house developers produce the web site so that its customers could order Objects directly from the
web.
To accompany the web site, Things developed a complete price list of its Objects, including pictures,
descriptions, and drop-down menus for size, shape, color, scent, and any other properties. For example,
a customer on the web could choose to buy 20 of part number 555A Objects. If the price of one such
part were $10, the web server would correctly compute the price of the 20 parts to be $200. Then the
customer could decide whether to have the Objects shipped by boat, by ground transportation, or sent
electronically. If the customer were to choose boat delivery, the customer's web browser would
complete a form with parameters like these:
http://www.things.com/order/final&custID=101&part=555A
&qy=20&price=10&ship=boat&shipcost=5&total=205
So far, so good; everything in the parameter passage looks correct. But this procedure leaves the
parameter statement open for malicious tampering. Things should not need to pass the price of the
items back to itself as an input parameter; presumably Things knows how much its Objects cost, and
they are unlikely to change dramatically since the time the price was quoted a few screens earlier.
A malicious attacker may decide to exploit this peculiarity by supplying instead the following URL, where the unit price has been changed from $10 to $1, reducing the total from $205 to $25:
http://www.things.com/order/final&custID=101&part=555A
&qy=20&price=1&ship=boat&shipcost=5&total=25
Surprise! It worked. The attacker could have ordered Objects from Things in any quantity at any price.
And yes, this code was running on the web site for a while before the problem was detected. From a
security perspective, the most serious concern about this flaw was the length of time that it could have
run undetected. Had the whole world suddenly made a rush to Things's web site and bought Objects at
a fraction of their price, Things probably would have noticed. But Things is large enough that it would
never have detected a few customers a day choosing prices that were similar to (but smaller than) the
real price, say 30 percent off. The e-commerce division would have shown a slightly smaller profit than
other divisions, but the difference probably would not have been enough to raise anyone's eyebrows;
the vulnerability could have gone unnoticed for years. Fortunately Things hired a consultant to do a
routine review of its code, and the consultant found the error quickly.
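One way to close the hole, sketched below under assumed names (catalog, order_total), is for the server to ignore any price or total supplied by the client and recompute the total from its own catalog. The shipping cost would likewise be derived from the chosen shipping method rather than taken from the URL.

#include <string.h>

/* Hypothetical server-side repair: the unit price comes from the server's
   own catalog, never from the URL, and the total is recomputed from it. */
struct catalog_entry { const char *part; double price; };

static const struct catalog_entry catalog[] = {
    { "555A", 10.00 },          /* the Object from the example */
    /* ... remaining parts ... */
};

double order_total(const char *part, int qty, double shipcost)
{
    size_t i;
    for (i = 0; i < sizeof(catalog) / sizeof(catalog[0]); i++)
        if (strcmp(catalog[i].part, part) == 0)
            return catalog[i].price * qty + shipcost;   /* client-sent price and total are ignored */
    return -1.0;                                        /* unknown part number */
}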
This web program design flaw is easy to imagine in other web settings. Those of us interested in security must ask ourselves: how many similar problems are in running code today, and how will those vulnerabilities ever be found?
Time-of-Check to Time-of-Use Errors
Definition
Access control is a fundamental part of computer security; we want to make sure that only those who
should access an object are allowed that access. (We explore the access control mechanisms in
operating systems in greater detail in Chapter 4.) Every requested access must be governed by an
access policy stating who is allowed access to what; then the request must be mediated by an access
policy enforcement agent. But an incomplete mediation problem occurs when access is not checked
universally. The time-of-check to time-of-use (TOCTTOU) flaw concerns mediation that is performed
with a "bait and switch" in the middle. It is also known as a serialization or synchronization flaw.
To understand the nature of this flaw, consider a person's buying a sculpture that costs $100. The buyer
removes five $20 bills from a wallet, carefully counts them in front of the seller, and lays them on the
table. Then the seller turns around to write a receipt. While the seller's back is turned, the buyer takes
back one $20 bill. When the seller turns around, the buyer hands over the stack of bills, takes the
receipt, and leaves with the sculpture. Between the time when the security was checked (counting the
bills) and the access (exchanging the sculpture for the bills), a condition changed: what was checked is
no longer valid when the object (that is, the sculpture) is accessed.
A similar situation can occur with computing systems. Suppose a request to access a file were
presented as a data structure, with the name of the file and the mode of access presented in the
structure. An example of such a structure is shown in Figure 3-2.
The data structure is essentially a "work ticket," requiring a stamp of authorization; once authorized, it
will be put on a queue of things to be done. Normally the access control mediator receives the data
structure, determines whether the access should be allowed, and either rejects the access and stops or
allows the access and forwards the data structure to the file handler for processing.
To carry out this authorization sequence, the access control mediator would have to look up the file
name (and the user identity and any other relevant parameters) in tables. The mediator could compare
the names in the table to the file name in the data structure to determine whether access is appropriate.
More likely, the mediator would copy the file name into its own local storage area and compare from
there. Comparing from the copy leaves the data structure in the user's area, under the user's control.
It is at this point that the incomplete mediation flaw can be exploited. While the mediator is checking
access rights for the file my_file, the user could change the file name descriptor to your_file, the value
shown in Figure 3-3. Having read the work ticket once, the mediator would not be expected to reread
the ticket before approving it; the mediator would approve the access and send the now-modified
descriptor to the file handler.
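The same check-then-use gap shows up in a well-known operating system idiom. The fragment below is a standard Unix illustration, not the work-ticket example from the text: the permission check (access) and the actual use (open) name the file separately, so the file can be swapped between the two calls.

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    const char *name = "my_file";

    if (access(name, R_OK) == 0) {       /* time of check */
        /* Window of vulnerability: before the open below runs, another
           process can replace my_file (for example, with a link to
           your_file), so the object used is not the object checked. */
        int fd = open(name, O_RDONLY);   /* time of use */
        if (fd >= 0) {
            /* ... read the file ... */
            close(fd);
        }
    } else {
        fprintf(stderr, "access denied\n");
    }
    return 0;
}

A common remedy is to let the open call itself perform the permission check, or to check the already opened file descriptor, so that the check and the use refer to the same object.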
Security Implication
The security implication here is pretty clear: Checking one action and performing another is an example
of ineffective access control. We must be wary whenever there is a time lag, making sure that there is
no way to corrupt the check's results during that interval.
Fortunately, there are ways to prevent exploitation of the time lag. One way to do so is to use digital
signatures and certificates. As described in Chapter 2, a digital signature is a sequence of bits applied
with public key cryptography, so that many peopleusing a public keycan verify the authenticity of the
bits, but only one personusing the corresponding private keycould have created them. In this case,
the time of check is when the person signs, and the time of use is when anyone verifies the signature.
Suppose the signer's private key is disclosed some time before its time of use. In that case, we do not
know for sure that the signer did indeed "sign" the digital signature; it might have been a malicious
attacker acting with the private key of the signer. To counter this vulnerability, a public key cryptographic
infrastructure includes a mechanism called a key revocation list, for reporting a revoked public key: one that had been disclosed, was feared disclosed or lost, became inoperative, or for any other reason
should no longer be taken as valid. The recipient must check the key revocation list before accepting a
digital signature as valid.
As these examples show, nonmalicious program flaws can be both annoying and dangerous. As we will see in the next section, innocuous-seeming program flaws can be exploited by malicious attackers to plant intentionally harmful code.
How can such a situation arise? When you last installed a major software package, such as a word
processor, a statistical package, or a plug-in from the Internet, you ran one command, typically called
INSTALL or SETUP. From there, the installation program took control, creating some files, writing in
other files, deleting data and files, and perhaps renaming a few that it would change. A few minutes and
quite a few disk accesses later, you had plenty of new code and data, all set up for you with a
minimum of human intervention. Other than the general descriptions on the box, in the documentation
files, or on the web pages, you had absolutely no idea exactly what "gifts" you had received. You hoped
all you received was good, and it probably was. The same uncertainty exists when you unknowingly
download an application, such as a Java applet or an ActiveX control, while viewing a web site.
Thousands or even millions of bytes of programs and data are transferred, and hundreds of
modifications may be made to your existing files, all occurring without your explicit consent or
knowledge.
You are likely to have been affected by a virus at one time or another, either because your computer
was infected by one or because you could not access an infected system while its administrators were
cleaning up the mess one made. In fact, your virus might actually have been a worm: The terminology of
malicious code is sometimes used imprecisely. A virus is a program that can pass on malicious code to
other nonmalicious programs by modifying them. The term "virus" was coined because the affected
program acts like a biological virus: It infects other healthy subjects by attaching itself to the program
and either destroying it or coexisting with it. Because viruses are insidious, we cannot assume that a
clean program yesterday is still clean today. Moreover, a good program can be modified to include a
copy of the virus program, so the infected good program itself begins to act as a virus, infecting other
programs. The infection usually spreads at a geometric rate, eventually overtaking an entire computing
system and spreading to all other connected systems.
A virus can be either transient or resident. A transient virus has a life that depends on the life of its host;
the virus runs when its attached program executes and terminates when its attached program ends.
(During its execution, the transient virus may have spread its infection to other programs.) A resident
virus locates itself in memory; then it can remain active or be activated as a stand-alone program, even
after its attached program ends.
A Trojan horse is malicious code that, in addition to its primary effect, has a second, nonobvious malicious effect. As an example of a computer Trojan horse, consider a login script that solicits a user's identification and password, passes the identification information on to the rest of the system for login processing, but also retains a copy of the information for later, malicious use.
A logic bomb is a class of malicious code that "detonates" or goes off when a specified condition
occurs. A time bomb is a logic bomb whose trigger is a time or date.
A trapdoor or backdoor is a feature in a program by which someone can access the program other
than by the obvious, direct call, perhaps with special privileges. For instance, an automated bank teller
program might allow anyone entering the number 990099 on the keypad to process the log of
everyone's transactions at that machine. In this example, the trapdoor could be intentional, for
maintenance purposes, or it could be an illicit way for the implementer to wipe out any record of a crime.
A worm is a program that spreads copies of itself through a network. The primary difference between a
worm and a virus is that a worm operates through networks, and a virus can spread through any
medium (but usually uses copied program or data files). Additionally, the worm spreads copies of itself
as a stand-alone program, whereas the virus spreads copies of itself as a program that attaches to or
embeds in other programs.
White et al. [WHI89] also define a rabbit as a virus or worm that self-replicates without bound, with the
intention of exhausting some computing resource. A rabbit might create copies of itself and store them
on disk, in an effort to completely fill the disk, for example.
These definitions match current careful usage. The distinctions among these terms are small, and often
the terms are confused, especially in the popular press. The term "virus" is often used to refer to any
piece of malicious code. Furthermore, two or more forms of malicious code can be combined to produce
a third kind of problem. For instance, a virus can be a time bomb if the viral code that is spreading will
trigger an event after a period of time has passed. The kinds of malicious code are summarized in Table
3-1.
Table 3-1. Types of Malicious Code
Code Type: Characteristics
Virus: Attaches itself to a program and propagates copies of itself to other programs
Trojan horse: Contains unexpected, additional functionality
Logic bomb: Triggers an action when a specified condition occurs
Time bomb: Triggers an action when a specified time or date occurs
Trapdoor: Allows unauthorized access to functionality
Worm: Propagates copies of itself through a network
Rabbit: Replicates itself without bound to exhaust a computing resource
Because "virus" is the popular name given to all forms of malicious code and because fuzzy lines exist
between different kinds of malicious code, we will not be too restrictive in the following discussion. We
want to look at how malicious code spreads, how it is activated, and what effect it can have. A virus is a
convenient term for mobile malicious code, and so in the following sections we use the term "virus"
almost exclusively. The points made apply also to other forms of malicious code.
Appended Viruses
A program virus attaches itself to a program; then, whenever the program is run, the virus is activated.
This kind of attachment is usually easy to program.
In the simplest case, a virus inserts a copy of itself into the executable program file before the first
executable instruction. Then, all the virus instructions execute first; after the last virus instruction, control
flows naturally to what used to be the first program instruction. Such a situation is shown in Figure 3-4.
Document Viruses
Currently, the most popular virus type is what we call the document virus, which is implemented within
a formatted document, such as a written document, a database, a slide presentation, or a spreadsheet.
These documents are highly structured files that contain both data (words or numbers) and commands
(such as formulas, formatting controls, links). The commands are part of a rich programming language,
including macros, variables and procedures, file accesses, and even system calls. The writer of a
document virus uses any of the features of the programming language to perform malicious actions.
The ordinary user usually sees only the content of the document (its text or data), so the virus writer
simply includes the virus in the commands part of the document, as in the integrated program virus.
From the virus writer's perspective, an effective virus has several desirable qualities:
It is hard to detect.
It is not easily destroyed or deactivated.
It spreads infection widely.
Few viruses meet all these criteria. The virus writer chooses from these objectives when deciding what
the virus will do and where it will reside.
Just a few years ago, the challenge for the virus writer was to write code that would be executed
repeatedly so that the virus could multiply. Now, however, one execution is enough to ensure
widespread distribution. Many viruses are transmitted by e-mail, using either of two routes. In the first
case, some virus writers generate a new e-mail message to all addresses in the victim's address book.
These new messages contain a copy of the virus so that it propagates widely. Often the message is a
brief, chatty, non-specific message that would encourage the new recipient to open the attachment from
a friend (the first recipient). For example, the subject line or message body may read "I thought you
might enjoy this picture from our vacation." In the second case, the virus writer can leave the infected
file for the victim to forward unknowingly. If the virus's effect is not immediately obvious, the victim may
pass the infected file unwittingly to other victims.
Let us look more closely at the issue of viral residence.
One-Time Execution
The majority of viruses today execute only once, spreading their infection and causing their effect in that
one execution. A virus often arrives as an e-mail attachment of a document virus. It is executed just by
being opened.
Memory-Resident Viruses
Some parts of the operating system and most user programs execute, terminate, and disappear, with
their space in memory being available for anything executed later. For very frequently used parts of the
operating system and for a few specialized user programs, it would take too long to reload the program
each time it was needed. Such code remains in memory and is called "resident" code. Examples of
resident code are the routine that interprets keys pressed on the keyboard, the code that handles error
conditions that arise during a program's execution, or a program that acts like an alarm clock, sounding
a signal at a time the user determines. Resident routines are sometimes called TSRs or "terminate and
stay resident" routines.
Virus writers also like to attach viruses to resident code because the resident code is activated many
times while the machine is running. Each time the resident code runs, the virus does too. Once
activated, the virus can look for and infect uninfected carriers. For example, after activation, a boot
sector virus might attach itself to a piece of resident code. Then, each time the virus was activated it
might check whether any removable disk in a disk drive was infected and, if not, infect it. In this way the
virus could spread its infection to all removable disks used during the computing session.
Code libraries are another attractive home for a virus, because library routines are shared among many users and transmitted from one user to another, a practice that spreads the infection. Finally, executing code
in a library can pass on the viral infection to other transmission media. Compilers, loaders, linkers,
runtime monitors, runtime debuggers, and even virus control programs are good candidates for hosting
viruses because they are widely shared.
Virus Signatures
A virus cannot be completely invisible. Code must be stored somewhere, and the code must be in
memory to execute. Moreover, the virus executes in a particular way, using certain methods to spread.
Each of these characteristics yields a telltale pattern, called a signature, that can be found by a
program that knows to look for it. The virus's signature is important for creating a program, called a
virus scanner, that can automatically detect and, in some cases, remove viruses. The scanner
searches memory and long-term storage, monitoring execution and watching for the telltale signatures
of viruses. For example, a scanner looking for signs of the Code Red worm can look for a pattern
containing the following characters:
/default.ida?NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
%u9090%u6858%ucbd3
%u7801%u9090%u6858%ucdb3%u7801%u9090%u6858
%ucbd3%u7801%u9090
%u9090%u8190%u00c3%u0003%ub00%u531b%u53ff
%u0078%u0000%u00=a
HTTP/1.0
When the scanner recognizes a known virus's pattern, it can then block the virus, inform the user, and
deactivate or remove the virus. However, a virus scanner is effective only if it has been kept up-to-date
with the latest information on current viruses. Sidebar 3-4 describes how viruses were the primary
security breach among companies surveyed in 2001.
Sidebar 3-4 The Viral Threat
Information Week magazine reports that viruses, worms, and Trojan horses represented the primary
method for breaching security among the 4,500 security professionals surveyed in 2001 [HUL01c].
Almost 70 percent of the respondents noted that virus, worm, and Trojan horse attacks occurred in the
12 months before April 2001. Second were the 15 percent of attacks using denial of service;
telecommunications or unauthorized entry was responsible for 12 percent of the attacks. (Multiple
responses were allowed.) These figures represent establishments in 42 countries throughout North
America, South America, Europe, and Asia.
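At its core, signature scanning is a pattern search. The sketch below is a deliberately minimal illustration (real scanners use large signature databases, wildcards, and heuristics, and examine far more than the first few kilobytes of a file); it merely shows how a telltale byte string, such as the start of the Code Red pattern shown earlier, can be matched.

#include <stdio.h>
#include <string.h>

/* Deliberately minimal signature check: report whether a known telltale
   byte string occurs in a buffer read from a file. */
static const char signature[] = "/default.ida?NNNNNNNN";   /* prefix of the Code Red pattern above */

int buffer_is_suspicious(const char *buf, size_t len)
{
    size_t siglen = strlen(signature);
    size_t i;
    if (len < siglen)
        return 0;
    for (i = 0; i + siglen <= len; i++)
        if (memcmp(buf + i, signature, siglen) == 0)
            return 1;                    /* signature found */
    return 0;
}

int scan_file(const char *path)
{
    char buf[4096];
    size_t n;
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;                       /* cannot open file */
    n = fread(buf, 1, sizeof(buf), f);   /* simplification: only the first 4 KB is examined */
    fclose(f);
    return buffer_is_suspicious(buf, n);
}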
Storage Patterns
Most viruses attach to programs that are stored on media such as disks. The attached virus piece is
invariant, so that the start of the virus code becomes a detectable signature. The attached piece is
always located at the same position relative to its attached file. For example, the virus might always be
at the beginning, 400 bytes from the top, or at the bottom of the infected file. Most likely, the virus will be
at the beginning of the file, because the virus writer wants to obtain control of execution before the bona
fide code of the infected program is in charge. In the simplest case, the virus code sits at the top of the
program, and the entire virus does its malicious duty before the normal code is invoked. In other cases,
the virus infection consists of only a handful of instructions that point or jump to other, more detailed
instructions elsewhere. For example, the infected code may consist of condition testing and a jump or
call to a separate virus module. In either case, the code to which control is transferred will also have a
recognizable pattern. Both of these situations are shown in Figure 3-9.
Execution Patterns
A virus writer may want a virus to do several things at the same time, namely, spread infection, avoid
detection, and cause harm. These goals are shown in Table 3-2, along with ways each goal can be
addressed. Unfortunately, many of these behaviors are perfectly normal and might otherwise go
undetected. For instance, one goal is modifying the file directory; many normal programs create files,
delete files, and write to storage media. Thus, there are no key signals that point to the presence of a
virus.
Most virus writers seek to avoid detection for themselves and their creations. Because a disk's boot
sector is not visible to normal operations (for example, the contents of the boot sector do not show on a
directory listing), many virus writers hide their code there. A resident virus can monitor disk accesses
and fake the result of a disk operation that would show the virus hidden in a boot sector by showing the
data that should have been in the boot sector (which the virus has moved elsewhere).
There are no limits to the harm a virus can cause. On the modest end, the virus might do nothing; some
writers create viruses just to show they can do it. Or the virus can be relatively benign, displaying a
message on the screen, sounding the buzzer, or playing music. From there, the problems can escalate.
One virus can erase files, another an entire disk; one virus can prevent a computer from booting, and
another can prevent writing to disk. The damage is bounded only by the creativity of the virus's author.
Table 3-2. Virus Effects and What Causes Them. The table pairs virus goals such as spreading infection and infecting disks with the ways each is accomplished: modifying the file directory, rewriting or appending to data, appending data to the virus itself, intercepting interrupts, intercepting operating system calls (to format a disk, for example), modifying system files, and modifying ordinary executable programs.
Transmission Patterns
A virus is effective only if it has some means of transmission from one location to another. As we have
already seen, viruses can travel during the boot process, by attaching to an executable file or traveling
within data files. The travel itself occurs during execution of an already infected program. Since a virus
can execute any instructions a program can, virus travel is not confined to any single medium or
execution pattern. For example, a virus can arrive on a diskette or from a network connection, travel
during its host's execution to a hard disk boot sector, reemerge next time the host computer is booted,
and remain in memory to infect other diskettes as they are accessed.
Polymorphic Viruses
The virus signature may be the most reliable way for a virus scanner to identify a virus. If a particular
virus always begins with the string 47F0F00E08 (in hexadecimal) and has string 00113FFF located at
word 12, it is unlikely that other programs or data files will have these exact characteristics. For longer
signatures, the probability of a correct match increases.
If the virus scanner will always look for those strings, then the clever virus writer can cause something
other than those strings to be in those positions. For example, the virus could have two alternative but
equivalent beginning words; after being installed, the virus will choose one of the two words for its initial
word. Then, a virus scanner would have to look for both patterns. A virus that can change its
appearance is called a polymorphic virus. (Poly means "many" and morph means "form".) A two-form
polymorphic virus can be handled easily as two independent viruses. Therefore, the virus writer intent
on preventing detection of the virus will want either a large or an unlimited number of forms so that the
number of possible forms is too large for a virus scanner to search for. Simply embedding a random
number or string at a fixed place in the executable version of a virus is not sufficient, because the
signature of the virus is just the constant code excluding the random part. A polymorphic virus has to
randomly reposition all parts of itself and randomly change all fixed data. Thus, instead of containing the
fixed (and therefore searchable) string "HA! INFECTED BY A VIRUS," a polymorphic virus has to
change even that pattern sometimes.
Trivially, assume a virus writer has 100 bytes of code and 50 bytes of data. To make two virus instances
different, the writer might distribute the first version as 100 bytes of code followed by all 50 bytes of data.
A second version could be 99 bytes of code, a jump instruction, 50 bytes of data, and the last byte of
code. Other versions are 98 code bytes jumping to the last two, 97 and three, and so forth. Just by
moving pieces around the virus writer can create enough different appearances to fool simple virus
scanners. Once the scanner writers became aware of these kinds of tricks, however, they refined their
signature definitions.
A more sophisticated polymorphic virus randomly intersperses harmless instructions throughout its
code. Examples of harmless instructions include addition of zero to a number, movement of a data value
to its own location, or a jump to the next instruction. These "extra" instructions make it more difficult to
locate an invariant signature.
A simple variety of polymorphic virus uses encryption under various keys to make the stored form of the
virus different. These are sometimes called encrypting viruses. This type of virus must contain three
distinct parts: a decryption key, the (encrypted) object code of the virus, and the (unencrypted) object
code of the decryption routine. For these viruses, the decryption routine itself or a call to a decryption
library routine must be in the clear, and so that becomes the signature.
To avoid detection, not every copy of a polymorphic virus has to differ from every other copy. If the virus
changes occasionally, not every copy will match a signature of every other copy.
The only way to prevent infection by a virus is not to share executable code with an infected source.
This philosophy used to be easy to follow because it was easy to tell if a file was executable or not. For
example, on PCs, a .exe extension was a clear sign that the file was executable. However, as we have
noted, today's files are more complex, and a seemingly nonexecutable file may have some executable
code buried deep within it. For example, a word processor may have commands within the document
file; as we noted earlier, these commands, called macros, make it easy for the user to do complex or
repetitive things. But they are really executable code embedded in the context of the document.
Similarly, spreadsheets, presentation slides, and other office- or business-related files can contain code
or scripts that can be executed in various waysand thereby harbor viruses. And, as we have seen, the
applications that run or use these files may try to be helpful by automatically invoking the executable
code, whether you want it run or not! Against the principles of good security, e-mail handlers can be set
to automatically open (without performing access control) attachments or embedded code for the
recipient, so your e-mail message can have animated bears dancing across the top.
Another approach virus writers have used is a little-known feature in the Microsoft file design. Although a
file with a .doc extension is expected to be a Word document, in fact, the true document type is hidden
in a field at the start of the file. This convenience ostensibly helps a user who inadvertently names a
Word document with a .ppt (Power-Point) or any other extension. In some cases, the operating system
will try to open the associated application but, if that fails, the system will switch to the application of the
hidden file type. So, the virus writer creates an executable file, names it with an inappropriate extension,
and sends it to the victim, describing it as a picture or a necessary code add-in or something else
desirable. The unwitting recipient opens the file and, without intending to, executes the malicious code.
More recently, executable code has been hidden in files containing large data sets, such as pictures or
read-only documents. These bits of viral code are not easily detected by virus scanners and certainly
not by the human eye. For example, a file containing a photograph may be highly granular; if every
sixteenth bit is part of a command string that can be executed, then the virus is very difficult to detect.
Since you cannot always know which sources are infected, you should assume that any outside source
is infected. Fortunately, you know when you are receiving code from an outside source; unfortunately, it
is not feasible to cut off all contact with the outside world.
In their interesting paper comparing computer virus transmission with human disease transmission,
Kephart et al. [KEP93] observe that individuals' efforts to keep their computers free from viruses lead to
communities that are generally free from viruses because members of the community have little
(electronic) contact with the outside world. In this case, transmission is contained not because of limited
contact but because of limited contact outside the community. Governments, for military or diplomatic
secrets, often run disconnected network communities. The trick seems to be in choosing one's
community prudently. However, as use of the Internet and the World Wide Web increases, such
separation is almost impossible to maintain.
Nevertheless, there are several techniques for building a reasonably safe community for electronic
contact, including the following:
Use only commercial software acquired from reliable, well-established vendors. There is
always a chance that you might receive a virus from a large manufacturer with a name
everyone would recognize. However, such enterprises have significant reputations that could
be seriously damaged by even one bad incident, so they go to some degree of trouble to keep
their products virus-free and to patch any problem-causing code right away. Similarly, software
distribution companies will be careful about products they handle.
Test all new software on an isolated computer. If you must use software from a questionable
source, test the software first on a computer with no hard disk, not connected to a network, and
with the boot disk removed. Run the software and look for unexpected behavior, even simple
behavior such as unexplained figures on the screen. Test the computer with a copy of an up-to-date virus scanner, created before running the suspect program. Only if the program passes
these tests should it be installed on a less isolated machine.
Open attachments only when you know them to be safe. What constitutes "safe" is up to you,
as you have probably already learned in this chapter. Certainly, an attachment from an
unknown source is of questionable safety. You might also distrust an attachment from a known
source but with a peculiar message.
Make a recoverable system image and store it safely. If your system does become infected,
this clean version will let you reboot securely because it overwrites the corrupted system files
with clean copies. For this reason, you must keep the image write-protected during reboot.
Prepare this image now, before infection; after infection it is too late. For safety, prepare an
extra copy of the safe boot image.
Make and retain backup copies of executable system files. This way, in the event of a virus
infection, you can remove infected files and reinstall from the clean backup copies (stored in a
secure, offline location, of course).
Use virus detectors (often called virus scanners) regularly and update them daily. Many of the
virus detectors available can both detect and eliminate infection from viruses. Several scanners
are better than one, because one may detect the viruses that others miss. Because scanners
search for virus signatures, they are constantly being revised as new viruses are discovered.
New virus signature files, or new versions of scanners, are distributed frequently; often, you
can request automatic downloads from the vendor's web site. Keep your detector's signature
file up-to-date.
Viruses can infect only Microsoft Windows systems. False. Among students and office
workers, PCs are popular computers, and there may be more people writing software (and
viruses) for them than for any other kind of processor. Thus, the PC is most frequently the
target when someone decides to write a virus. However, the principles of virus attachment and
infection apply equally to other processors, including Macintosh computers, Unix workstations,
and mainframe computers. In fact, no writeable stored-program computer is immune to
possible virus attack. As we noted in Chapter 1, this situation means that all devices containing
computer code, including automobiles, airplanes, microwave ovens, radios, televisions, and
radiation therapy machines, have the potential for being infected by a virus.
Viruses can modify "hidden" or "read only" files. True. We may try to protect files by using two
operating system mechanisms. First, we can make a file a hidden file so that a user or program
listing all files on a storage device will not see the file's name. Second, we can apply a read-only protection to the file so that the user cannot change the file's contents. However, each of
these protections is applied by software, and virus software can override the native software's
protection. Moreover, software protection is layered, with the operating system providing the
most elementary protection. If a secure operating system obtains control before a virus
contaminator has executed, the operating system can prevent contamination as long as it
blocks the attacks the virus will make.
Viruses can appear only in data files, or only in Word documents, or only in programs. False.
What are data? What is an executable file? The distinction between these two concepts is not
always clear, because a data file can control how a program executes and even cause a
program to execute. Sometimes a data file lists steps to be taken by the program that reads the
data, and these steps can include executing a program. For example, some applications
contain a configuration file whose data are exactly such steps. Similarly, word processing
document files may contain startup commands to execute when the document is opened; these
startup commands can contain malicious code. Although, strictly speaking, a virus can activate
and spread only when a program executes, in fact, data files are acted upon by programs.
Clever virus writers have been able to make data control files that cause programs to do many
things, including pass along copies of the virus to other data files.
Viruses spread only on disks or only in e-mail. False. File-sharing is often done as one user
provides a copy of a file to another user by writing the file on a transportable disk. However,
any means of electronic file transfer will work. A file can be placed in a network's library or
posted on a bulletin board. It can be attached to an electronic mail message or made available
for download from a web site. Any mechanism for sharing files (of programs, data, documents, and so forth) can be used to transfer a virus.
Viruses cannot remain in memory after a complete power off/power on reboot. True. If a virus
is resident in memory, the virus is lost when the memory loses power. That is, computer
memory (RAM) is volatile, so that all contents are deleted when power is lost. However,
viruses written to disk certainly can remain through a reboot cycle and reappear after the
reboot. Thus, you can receive a virus infection, the virus can be written to disk (or to network
storage), you can turn the machine off and back on, and the virus can be reactivated during the
reboot. Boot sector viruses gain control when a machine reboots (whether it is a hardware or
software reboot), so a boot sector virus may remain through a reboot cycle because it activates
immediately when a reboot has completed.
Viruses cannot infect hardware. True. Viruses can infect only things they can modify; memory,
executable files, and data are the primary targets. If hardware contains writeable storage (so-called firmware) that can be accessed under program control, that storage is subject to virus attack. There have been a few such attacks.
Viruses can be malevolent, benign, or benevolent. True. Not all viruses are bad. For example,
a virus might locate uninfected programs, compress them so that they occupy less memory,
and insert a copy of a routine that decompresses the program when its execution begins. At
the same time, the virus is spreading the compression function to other programs. This virus
could substantially reduce the amount of storage required for stored programs, possibly by up
to 50 percent. However, the compression would be done at the request of the virus, not at the
request, or even knowledge, of the program owner.
To see how viruses and other types of malicious code operate, we examine four types of malicious code
that affected many users worldwide: the Brain, the Internet worm, the Code Red worm, and web bugs.
What It Does
The Brain, like all viruses, seeks to pass on its infection. This virus first locates itself in upper memory
and then executes a system call to reset the upper memory bound below itself, so that it is not disturbed
as it works. It traps interrupt number 19 (disk read) by resetting the interrupt address table to point to it
and then sets the address for interrupt number 6 (unused) to the former address of the interrupt 19. In
this way, the virus screens disk read calls, handling any that would read the boot sector (passing back
the original boot contents that were moved to one of the bad sectors); other disk calls go to the normal
disk read handler, through interrupt 6.
The Brain virus appears to have no effect other than passing its infection, as if it were an experiment or
a proof of concept. However, variants of the virus erase disks or destroy the file allocation table (the
table that shows which files are where on a storage medium).
How It Spreads
The Brain virus positions itself in the boot sector and in six other sectors of the disk. One of the six
sectors will contain the original boot code, moved there from the original boot sector, while two others
contain the remaining code of the virus. The remaining three sectors contain a duplicate of the others.
The virus marks these six sectors "faulty" so that the operating system will not try to use them. (With
low-level calls, you can force the disk drive to read from what the operating system has marked as bad
sectors.) The virus allows the boot process to continue.
Once established in memory, the virus intercepts disk read requests for the disk drive under attack. With
each read, the virus reads the disk boot sector and inspects the fifth and sixth bytes for the hexadecimal
value 1234 (its signature). If it finds that value, it concludes the disk is infected; if not, it infects the disk
as described in the previous paragraph.
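As a sketch, the infection test just described might look like the following; we treat a 512-byte copy of the boot sector as an array and check the fifth and sixth bytes. The helper name and the byte ordering of the 0x1234 signature are our assumptions for illustration.

/* Sketch of Brain's infection test: the fifth and sixth bytes of the
   boot sector (offsets 4 and 5, counting from zero) hold the signature
   value 0x1234 on an already-infected disk. The byte order shown here,
   low byte first, is an assumption of this sketch. */
#include <stdio.h>

#define BOOT_SECTOR_SIZE 512

static int already_infected(const unsigned char boot[BOOT_SECTOR_SIZE])
{
    return boot[4] == 0x34 && boot[5] == 0x12;
}

int main(void)
{
    unsigned char sector[BOOT_SECTOR_SIZE] = { 0 };

    printf("before: %s\n", already_infected(sector) ? "infected" : "not infected");

    sector[4] = 0x34;          /* simulate a disk the virus has already marked */
    sector[5] = 0x12;
    printf("after:  %s\n", already_infected(sector) ? "infected" : "not infected");
    return 0;
}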
Other early viruses, such as the one that struck Lehigh University, the nVIR viruses that sprang from prototype code posted on bulletin boards, and the Scores virus that was first found at NASA in Washington D.C., circulated more widely and with greater
effect. Fortunately, most viruses seen to date have a modest effect, such as displaying a message or
emitting a sound. That is, however, a matter of luck, since the writers who could put together the simpler
viruses obviously had all the talent and knowledge to make much more malevolent viruses.
There is no general cure for viruses. Virus scanners are effective against today's known viruses and
general patterns of infection, but they cannot counter tomorrow's variant. The only sure prevention is
complete isolation from outside contamination, which is not feasible; in fact, you may even get a virus
from the software applications you buy from reputable vendors.
What It Did
Judging from its code, Morris programmed the Internet worm to accomplish three main objectives:
1. Determine where it could spread
2. Spread its infection
3. Remain undiscovered and undiscoverable
How It Worked
The worm exploited several known flaws and configuration failures of Berkeley version 4 of the Unix
operating system. It accomplished (or had code that appeared to try to accomplish) its three objectives.
Where to spread. The worm had three techniques for locating potential machines to victimize. It first
tried to find user accounts to invade on the target machine. In parallel, the worm tried to exploit a bug in
the finger program and then to use a trapdoor in the sendmail mail handler. All three of these security
flaws were well known in the general Unix community.
The first security flaw was a joint user and system error, in which the worm tried guessing passwords
and succeeded when it found one. The Unix password file is stored in encrypted form, but the ciphertext
in the file is readable by anyone. (This visibility is the system error.) The worm encrypted various
popular passwords and compared their ciphertext against the ciphertext of the stored password file. The
worm tried the account name, the owner's name, and a short list of 432 common passwords (such as
"guest," "password," "help," "coffee," "coke," "aaa"). If none of these succeeded, the worm used the
dictionary file stored on the system for use by application spelling checkers. (Choosing a recognizable
password is the user error.) When it got a match, the worm could log in to the corresponding account by
presenting the plaintext password. Then, as a user, the worm could look for other machines to which the
user could obtain access. (See the article by Robert T. Morris, Sr. and Ken Thompson [MOR79] on
selection of good passwords, published a decade before the worm.)
The second flaw concerned fingerd,
the program that runs continuously to respond to other computers' requests for information about
system users. The security flaw involved causing the input buffer to overflow, spilling into the return
address stack. Thus, when the finger call terminated, fingerd executed instructions that had been
pushed there as another part of the buffer overflow, causing the worm to be connected to a remote
shell.
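The coding pattern behind this flaw is worth seeing. The sketch below is ours, not fingerd's actual source: a fixed-size buffer on the stack is filled from the network (here, standard input), and the historical culprit, the C library routine gets(), has no way to know how large that buffer is, so excess input overwrites adjacent stack memory, including the saved return address. The fgets() call shown is the length-checked alternative.

/* Sketch of the unsafe pattern behind the fingerd flaw: a fixed-size
   stack buffer filled with no length check. Function names are ours.   */
#include <stdio.h>

static void handle_request(void)
{
    char line[512];                  /* fixed-size buffer on the stack   */

    /* UNSAFE: gets() copies until end of line, however long the input
       is, so extra bytes spill into adjacent stack memory, including
       the saved return address:
           gets(line);
       SAFE: read at most sizeof line - 1 characters.                    */
    if (fgets(line, sizeof line, stdin) != NULL)
        printf("finger request for: %s", line);
}

int main(void)
{
    handle_request();
    return 0;
}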
The third flaw involved a trapdoor in the sendmail program. Ordinarily, this program runs in the
background, awaiting signals from others wanting to send mail to the system. When it receives such a
signal, sendmail gets a destination address, which it verifies, and then begins a dialog to receive the
message. However, when running in debugging mode, the worm caused sendmail to receive and
execute a command string instead of the destination address.
Spread infection. Having found a suitable target machine, the worm would use one of these three
methods to send a bootstrap loader to the target machine. This loader consisted of 99 lines of C code to
be compiled and executed on the target machine. The bootstrap loader would then fetch the rest of the
worm from the sending host machine. There was an element of good computer security (or stealth)
built into the exchange between the host and the target. When the target's bootstrap requested the rest
of the worm, the worm supplied a one-time password back to the host. Without this password, the host
would immediately break the connection to the target, presumably in an effort to ensure against "rogue"
bootstraps (ones that a real administrator might develop to try to obtain a copy of the rest of the worm
for subsequent analysis).
Remain undiscovered and undiscoverable. The worm went to considerable lengths to prevent its
discovery once established on a host. For instance, if a transmission error occurred while the rest of the
worm was being fetched, the loader zeroed and then deleted all code already transferred and exited.
As soon as the worm received its full code, it brought the code into memory, encrypted it, and deleted
the original copies from disk. Thus, no traces were left on disk, and even a memory dump would not
readily expose the worm's code. The worm periodically changed its name and process identifier so that
no single name would run up a large amount of computing time.
The Internet worm was benign in that it only spread to other systems but did not destroy any part of
them. It collected sensitive data, such as account passwords, but it did not retain them. While acting as
a user, the worm could have deleted or overwritten files, distributed them elsewhere, or encrypted them
and held them for ransom. The next worm may not be so benign.
The worm's effects stirred several people to action. One positive outcome from this experience was
development in the United States of an infrastructure for reporting and correcting malicious and
nonmalicious code flaws. The Internet worm occurred at about the same time that Cliff Stoll [STO89]
reported his problems in tracking an electronic intruder (and his subsequent difficulty in finding anyone
to deal with the case). The computer community realized it needed to organize. The resulting Computer
Emergency Response Team (CERT) at Carnegie Mellon University was formed; it and similar response
centers around the world have done an excellent job of collecting and disseminating information on
malicious code attacks and their countermeasures. System administrators now exchange information on
problems and solutions. Security comes from informed protection and action, not from ignorance and
inaction.
What It Did
There are several versions of Code Red, malicious software that propagates itself on web servers
running Microsoft's Internet Information Server (IIS) software. Code Red takes two steps: infection and
propagation. To infect a server, the worm takes advantage of a vulnerability in Microsoft's IIS. It
overflows the buffer in the dynamic link library idq.dll to reside in the server's memory. Then, to
propagate, Code Red probes port 80 at other IP addresses to see whether each web server it reaches is vulnerable.
How It Worked
The Code Red worm looked for vulnerable personal computers running Microsoft IIS software.
Exploiting the unchecked buffer overflow, the worm crashed Windows NT-based servers but executed
code on Windows 2000 systems. The later versions of the worm created a trapdoor on an infected
server; then, the system was open to attack by other programs or malicious users. To create the
trapdoor, Code Red copied %windir%\cmd.exe to four locations:
c:\inetpub\scripts\root.exe
c:\progra~1\common~1\system\MSADC\root.exe
d:\inetpub\scripts\root.exe
d:\progra~1\common~1\system\MSADC\root.exe
Code Red also included its own copy of the file explorer.exe, placing it on the c: and d: drives so that
Windows would run the malicious copy, not the original copy. This Trojan horse first ran the original,
untainted version of explorer.exe, but it modified the system registry to disable certain kinds of file
protection and to ensure that some directories have read, write, and execute permission. As a result, the
Trojan horse had a virtual path that could be followed even when explorer.exe was not running. The
Trojan horse continues to run in background, resetting the registry every 10 minutes; thus, even if a
system administrator notices the changes and undoes them, the changes are applied again by the
malicious code.
To propagate, the worm created 300 or 600 threads (depending on the variant) and tried for 24 or 48
hours to spread to other machines. After that, the system was forcibly rebooted, flushing the worm in
memory but leaving the backdoor and Trojan horse in place.
To find a target to infect, the worm's threads worked in parallel. Although the early version of Code Red
targeted www.whitehouse.gov, later versions chose a random IP address close to the host computer's
own address. To speed its performance, the worm used a nonblocking socket so that a slow connection
would not slow down the rest of the threads as they scanned for a connection.
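A sketch of that scanning technique, using POSIX sockets, appears below; the address probed and the two-second timeout are invented for illustration. The key point is that connect() is started on a non-blocking socket and select() waits only briefly, so one silent host cannot stall the thread.

/* Sketch of a non-blocking probe of TCP port 80: start connect()
   without blocking, then wait at most a short timeout for it to
   complete. POSIX sockets assumed; the address below is illustrative. */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

static int port80_open(const char *ip, int timeout_sec)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return 0;
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);   /* non-blocking */

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port   = htons(80);
    inet_pton(AF_INET, ip, &addr.sin_addr);

    int is_open = 0;
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) == 0) {
        is_open = 1;                           /* connected immediately */
    } else if (errno == EINPROGRESS) {
        fd_set wfds;
        FD_ZERO(&wfds);
        FD_SET(fd, &wfds);
        struct timeval tv = { timeout_sec, 0 };
        /* Wait only briefly, so a silent host cannot stall the caller. */
        if (select(fd + 1, NULL, &wfds, NULL, &tv) == 1) {
            int err = 0;
            socklen_t len = sizeof err;
            getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len);
            is_open = (err == 0);
        }
    }
    close(fd);
    return is_open;
}

int main(void)
{
    printf("192.0.2.1: port 80 %s\n",
           port80_open("192.0.2.1", 2) ? "open" : "closed or filtered");
    return 0;
}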
What They Do
A web bug, sometimes called a pixel tag, clear gif, one-by-one gif, invisible gif, or beacon gif, is a
hidden image on any document that can display HTML tags, such as a web page, an HTML e-mail message, or even a spreadsheet.
Sidebar 3-5 Is the Cure Worse Than the Disease?
These days, a typical application program such as a word processor or spreadsheet package is sold to
its user with no guarantee of quality. As problems are discovered by users or developers, patches are
made available to be downloaded from the web and applied to the faulty system. This style of "quality
control" relies on the users and system administrators to keep up with the history of releases and
patches and to apply the patches in a timely manner. Moreover, each patch usually assumes that earlier
patches can be applied; ignore a patch at your peril.
For example, Forno [FOR01] points out that an organization hoping to secure a web server running
Windows NT 4.0's IIS had to apply over 47 patches as part of a service pack or available as a download
from Microsoft. Such stories suggest that it may cost more to maintain an application or system than it
cost to buy the application or system in the first place! Many organizations, especially small businesses,
lack the resources for such an effort. As a consequence, they neglect to fix known system problems,
which can then be exploited by hackers writing malicious code.
Blair [BLA01] describes a situation shortly after the end of the Cold War when the United States
discovered that Russia was tracking its nuclear weapons materials by using a paper-based system. That
is, the materials tracking system consisted of boxes of paper filled with paper receipts. In a gesture of
friendship, the Los Alamos National Lab donated to Russia the Microsoft software it uses to track its
own nuclear weapons materials. However, experts at the renowned Kurchatov Institute soon discovered
that over time some files became invisible and inaccessible! In early 2000, they warned the United
States. To solve the problem, the United States told Russia to upgrade to the next version of the
Microsoft software. But the upgrade had the same problem, plus a security flaw that would allow easy
access to the database by hackers or unauthorized parties.
Sometimes patches themselves create new problems as they are fixing old ones. It is well known in the
software reliability community that testing and fixing sometimes reduce reliability, rather than improve it.
And with the complex interactions between software packages, many computer system managers prefer
to follow the adage "if it ain't broke, don't fix it," meaning that if there is no apparent failure, they would
rather not risk causing one from what seems like an unnecessary patch. So there are several ways that
the continual bug-patching approach to security may actually lead to a less secure product than you
started with.
Its creator intends the bug to be invisible,
unseen by users but very useful nevertheless because it can track the activities of a web user.
For example, if you visit the Blue Nile home page, www.bluenile.com, web bug code is automatically downloaded as a one-by-one pixel image from Avenue A, a marketing agency; the request reports details of your visit to the agency's server, and more.
This information can be used to track where and when you read a document, what your buying habits
are, or what your personal information may be. More maliciously, the web bug can be cleverly used to
review the web server's log files and determine your IP address, opening your system to hacking via
the target IP address.
Trapdoors
A trapdoor is an undocumented entry point to a module. The trapdoor is inserted during code
development, perhaps to test the module, to provide "hooks" by which to connect future modifications or
enhancements or to allow access if the module should fail in the future. In addition to these legitimate
uses, trapdoors can allow a programmer access to a program once it is placed in production.
Examples of Trapdoors
Because computing systems are complex structures, programmers usually develop and test systems in
a methodical, organized, modular manner, taking advantage of the way the system is composed of
modules or components. Often, each small component of the system is tested first, separate from the
other components, in a step called unit testing, to ensure that the component works correctly by itself.
Then, components are tested together during integration testing, to see how they function as they
send messages and data from one to the other. Rather than paste all the components together in a "big
bang" approach, the testers group logical clusters of a few components, and each cluster is tested in a
way that allows testers to control and understand what might make a component or its interface fail. (For
a more detailed look at testing, see Pfleeger [PFL01].)
To test a component on its own, the developer or tester cannot use the surrounding routines that
prepare input or work with output. Instead, it is usually necessary to write "stubs" and "drivers," simple
routines to inject data in and extract results from the component being tested. As testing continues,
these stubs and drivers are discarded because they are replaced by the actual components whose
functions they mimic. For example, the two modules MODA and MODB in Figure 3-10 are being tested
with the driver MAIN and the stubs SORT, OUTPUT, and NEWLINE.
the three possibilities. A careless programmer may allow a failure simply to fall through the CASE
without being flagged as an error. The fingerd flaw exploited by the Morris worm occurs exactly that
way: A C library I/O routine fails to check whether characters are left in the input buffer before returning
a pointer to a supposed next character.
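The same careless pattern is easy to write in C. In the sketch below, which uses invented condition codes, the switch handles the three expected values; without the default clause, an unexpected value would simply fall through unflagged, which is exactly the kind of unchecked condition the text describes.

/* Sketch of incomplete error checking: three expected condition codes
   are handled; the default clause is what keeps an unexpected value
   from being silently ignored. Codes and names are invented.          */
#include <stdio.h>
#include <stdlib.h>

enum condition { COND_OK = 0, COND_RETRY = 1, COND_FATAL = 2 };

static void handle(int code)
{
    switch (code) {
    case COND_OK:
        printf("ok\n");
        break;
    case COND_RETRY:
        printf("retrying\n");
        break;
    case COND_FATAL:
        printf("fatal error, shutting down\n");
        break;
    default:
        /* Omit this clause and an unexpected code (say, 7) falls
           through with no error flagged at all.                       */
        fprintf(stderr, "unexpected condition code %d\n", code);
        exit(EXIT_FAILURE);
    }
}

int main(void)
{
    handle(COND_RETRY);
    handle(7);              /* exercises the default clause */
    return 0;
}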
Hardware processor design provides another common example of this kind of security flaw. Here, it
often happens that not all possible binary opcode values have matching machine instructions. The
undefined opcodes sometimes implement peculiar instructions, either because of an intent to test the
processor design or because of an oversight by the processor designer. Undefined opcodes are the
hardware counterpart of poor error checking for software.
As with viruses, trapdoors are not always bad. They can be very useful in finding security flaws. Auditors
sometimes request trapdoors in production programs to insert fictitious but identifiable transactions into
the system. Then, the auditors trace the flow of these transactions through the system. However,
trapdoors must be documented, access to them should be strongly controlled, and they must be
designed and used with full understanding of the potential consequences.
Causes of Trapdoors
Developers usually remove trapdoors during program development, once their intended usefulness is
spent. However, trapdoors can persist in production programs because the developers
forget to remove them
intentionally leave them in the program for testing
intentionally leave them in the program to assist in maintenance of the finished program
intentionally leave them in the program as a covert means of access after it becomes part of a production system
The first case is an unintentional security blunder, the next two are serious exposures of the system's
security, and the fourth is the first step of an outright attack. It is important to remember that the fault is
not with the trapdoor itself, which can be a very useful technique for program testing, correction, and
maintenance. Rather, the fault is with the system development process, which does not ensure that the
trapdoor is "closed" when it is no longer needed. That is, the trapdoor becomes a vulnerability if no one
notices it or acts to prevent or control its use in vulnerable situations.
In general, trapdoors are a vulnerability when they expose the system to modification during execution.
They can be exploited by the original developers or used by anyone who discovers the trapdoor by
accident or through exhaustive trials. A system is not secure when someone believes that no one else
would find the hole.
Salami Attack
We noted in Chapter 1 an attack known as a salami attack. This approach gets its name from the way
odd bits of meat and fat are fused together in a sausage or salami. In the same way, a salami attack
merges bits of seemingly inconsequential data to yield powerful results. For example, programs often
disregard small amounts of money in their computations, as when fractional pennies arise while interest or tax is calculated.
Such programs may be subject to a salami attack, because the small amounts are shaved from each
computation and accumulated elsewhere, such as the programmer's bank account! The shaved
amount is so small that an individual case is unlikely to be noticed, and the accumulation can be done
so that the books still balance overall. However, accumulated amounts can add up to a tidy sum,
supporting a programmer's early retirement or new car. It is often the resulting expenditure, not the
shaved amounts, that gets the attention of the authorities.
For example, suppose your bank pays 6.5 percent annual interest on a balance of $102.87, computing the interest daily at the rate 0.065/365 and crediting it monthly. For a 30-day month, the total interest is 30/365 * 0.065 * 102.87 = $0.5496 (more precisely, $0.549579). Since banks deal only in full cents, a typical practice is to round down if a residue is less
than half a cent, and round up if a residue is half a cent or more. However, few people check their
interest computation closely, and fewer still would complain about having the amount $0.5495 rounded
down to $0.54, instead of up to $0.55. Most programs that perform computations on currency recognize
that because of rounding, a sum of individual computations may be a few cents different from the
computation applied to the sum of the balances.
What happens to these fractional cents? The computer security folk legend is told of a programmer who
collected the fractional cents and credited them to a single account: hers! The interest program merely
had to balance total interest paid to interest due on the total of the balances of the individual accounts.
Auditors will probably not notice the activity in one specific account. In a situation with many accounts,
the roundoff error can be substantial, and the programmer's account pockets this roundoff.
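A small sketch shows how little code the folk-legend scheme needs; the balances and interest rate are invented, and a real system would use exact decimal arithmetic rather than doubles. Each account's interest is rounded down to whole cents, and the shaved residue is swept into one hidden total, so the money paid out plus the money skimmed still equals the interest owed.

/* Sketch of the fractional-cent salami: round each account's interest
   down to whole cents and sweep the residue into one hidden total.
   Balances and the rate are invented; doubles stand in for the exact
   decimal arithmetic a real system would use.                          */
#include <stdio.h>
#include <math.h>

int main(void)
{
    double balances[] = { 102.87, 4310.22, 89.15, 12000.00, 653.49 };
    int n = sizeof balances / sizeof balances[0];
    double monthly_rate = 30.0 / 365.0 * 0.065;   /* 6.5% a year, 30-day month */

    double owed = 0.0;        /* exact interest due across all accounts  */
    double credited = 0.0;    /* interest actually credited to customers */
    double skimmed = 0.0;     /* residue quietly credited elsewhere      */

    for (int i = 0; i < n; i++) {
        double interest = balances[i] * monthly_rate;
        double paid = floor(interest * 100.0) / 100.0;   /* round down to cents */
        owed += interest;
        credited += paid;
        skimmed += interest - paid;
    }

    printf("exact interest owed:        $%.4f\n", owed);
    printf("credited to customers:      $%.2f\n", credited);
    printf("swept into hidden account:  $%.4f\n", skimmed);  /* books still balance */
    return 0;
}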
But salami attacks can net more and be far more interesting. For example, instead of shaving fractional
cents, the programmer may take a few cents from each account, again assuming that no individual has
the desire or understanding to recompute the amount the bank reports. Most people finding a result a
few cents different from that of the bank would accept the bank's figure, attributing the difference to an
error in arithmetic or a misunderstanding of the conditions under which interest is credited. Or a program
might record a $20 fee for a particular service, while the company standard is $15. If unchecked, the
extra $5 could be credited to an account of the programmer's choice. One attacker was able to make
withdrawals of $10,000 or more against accounts that had shown little recent activity; presumably the
attacker hoped the owners were ignoring their accounts.
access the information). The user may not know that a Trojan horse is running and may not be in
collusion to leak information to the spy.
Storage Channels
Some covert channels are called storage channels because they pass information by using the
presence or absence of objects in storage.
A simple example of a covert channel is the file lock channel. In multiuser systems, files can be
"locked" to prevent two people from writing to the same file at the same time (which could corrupt the
file, if one person writes over some of what the other wrote). The operating system or database
management system allows only one program to write to a file at a time, by blocking, delaying, or
rejecting write requests from other programs. A covert channel can signal one bit of information by
whether or not a file is locked.
Remember that the service program contains a Trojan horse written by the spy but run by the
unsuspecting user. As shown in Figure 3-13, the service program reads confidential data (to which the
spy should not have access) and signals the data one bit at a time by locking or not locking some file
(any file, the contents of which are arbitrary and not even modified). The service program and the spy
need a common timing source, broken into intervals. To signal a 1, the service program locks the file for
the interval; for a 0, it does not lock. Later in the interval the spy tries to lock the file itself. If the spy
program cannot lock the file, it knows the service program must have, and thus it concludes the service
program is signaling a 1; if the spy program can lock the file, it knows the service program is signaling a
0.
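The spy's side of this channel is only a few lines of code. The sketch below uses the POSIX flock() call on an agreed-upon file whose contents never matter; the file name, interval length, and number of bits are invented. Once per interval the spy tries a non-blocking lock: failure means the service program is holding the lock (a 1), success means it is not (a 0).

/* Sketch of the spy's side of the file lock channel. The file's
   contents are irrelevant; only the lock state carries information.
   File name, interval, and bit count are invented; POSIX assumed.     */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/file.h>

#define CHANNEL_FILE  "/tmp/covert_channel_file"
#define INTERVAL_SEC  1
#define BITS_TO_READ  8

int main(void)
{
    int fd = open(CHANNEL_FILE, O_RDWR | O_CREAT, 0666);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    for (int i = 0; i < BITS_TO_READ; i++) {
        int bit;
        if (flock(fd, LOCK_EX | LOCK_NB) == 0) {
            bit = 0;                 /* lock was free: the service signals 0   */
            flock(fd, LOCK_UN);      /* release immediately                    */
        } else {
            bit = 1;                 /* lock held by the service: it signals 1 */
        }
        printf("%d", bit);
        fflush(stdout);
        sleep(INTERVAL_SEC);         /* wait for the next agreed interval      */
    }
    printf("\n");
    close(fd);
    return 0;
}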
Timing Channels
Other covert channels, called timing channels, pass information by using the speed at which things
happen. Actually, timing channels are shared resource channels in which the shared resource is time.
A service program uses a timing channel to communicate by using or not using an assigned amount of
computing time. In the simple case, a multiprogrammed system with two user processes divides time
into blocks and allocates blocks of processing alternately to one process and the other. A process is
offered processing time, but if the process is waiting for another event to occur and has no processing to
do, it rejects the offer. The service process either uses its block (to signal a 1) or rejects its block (to
signal a 0). Such a situation is shown in Figure 3-15, first with the service process and the spy's process
alternating, and then with the service process communicating the string 101 to the spy's process. In the
second part of the example, the service program wants to signal 0 in the third time block. It will do this
by using just enough time to determine that it wants to send a 0 and then pause. The spy process then
receives control for the remainder of the time block.
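The service program's half of such a channel can be sketched as follows; the block length and the transmitted string are invented, and the spy's process (not shown) would infer each bit from how much of the block it received. POSIX timing calls are assumed.

/* Sketch of the sending side of a timing channel: in each fixed block
   the service either consumes the processor (to signal 1) or gives the
   block up at once (to signal 0). Block length and message invented.  */
#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define BLOCK_USEC 200000L           /* 0.2-second time blocks */

static long elapsed_usec(const struct timespec *a, const struct timespec *b)
{
    return (b->tv_sec - a->tv_sec) * 1000000L + (b->tv_nsec - a->tv_nsec) / 1000L;
}

static void consume_block(void)
{
    struct timespec start, now;
    clock_gettime(CLOCK_MONOTONIC, &start);
    volatile unsigned long sink = 0;
    do {                             /* busy-work for the whole block   */
        sink++;
        clock_gettime(CLOCK_MONOTONIC, &now);
    } while (elapsed_usec(&start, &now) < BLOCK_USEC);
}

int main(void)
{
    const char *bits = "101";        /* the string sent in Figure 3-15  */

    for (const char *p = bits; *p != '\0'; p++) {
        if (*p == '1')
            consume_block();         /* use the block: signal a 1       */
        else
            usleep(BLOCK_USEC);      /* yield the block: signal a 0     */
        fprintf(stderr, "sent %c\n", *p);
    }
    return 0;
}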
(Shared resource matrix: the file lock can be read and modified, R, M, by both the service process and the spy's process; the confidential data can be read only by the service process.)
You then look for two columns and two rows having the following pattern:
This pattern identifies two resources and two processes such that the second process is not allowed to
read from the second resource. However, the first process can pass the information to the second by
reading from the second resource and signaling the data through the first resource. Thus, this pattern
implies the potential information flow as shown here.
Next, you complete the shared resource matrix by adding these implied information flows, and analyze it
for undesirable flows. Thus, you can tell that the spy's process can read the confidential data by using a
covert channel through the file lock, as shown in Table 3-4.
(Table 3-4. Shared resource matrix with the implied flow added: through the file lock channel, the spy's process now effectively has read access to the confidential data.)
Finally, we put all the pieces together to show which outputs are affected by which inputs. Although this
analysis sounds frightfully complicated, it can be automated during the syntax analysis portion of
compilation. This analysis can also be performed on the higher-level design specification.
Statement                    Flow
B:=A                         from A to B
IF C THEN B:=A               from A to B; from C to B
FOR K:=1 TO N DO stmts       from K to stmts
WHILE K>0 DO stmts           from K to stmts
B:=fcn(args)                 from fcn to B
OPEN FILE f                  none
READ (f, X)                  from file f to X
WRITE (f, X)                 from X to file f
Capacity and speed are not problems; our estimate of 1000 bits per second is unrealistically low, but
even at that rate much information leaks swiftly. With modern hardware architectures, certain covert
channels inherent in the hardware design have capacities of millions of bits per second. And the attack
does not require significant financial resources. Thus, the attack could be very effective in certain situations
involving highly sensitive data.
For these reasons, security researchers have worked diligently to develop techniques for closing covert
channels. The closure results have been bothersome; in ordinarily open environments, there is
essentially no control over the subversion of a service program, nor is there an effective way of
screening such programs for covert channels. And other than in a few very high security systems,
operating systems cannot control the flow of information from a covert channel. The hardware-based
channels cannot be closed, given the underlying hardware architecture.
For variety (or sobriety), Kurak and McHugh [KUR92] present a very interesting analysis of covert
signaling through graphic images. In their work they demonstrate that two different images can be
combined by some rather simple arithmetic on the bit patterns of digitized pictures. The second image in
a printed copy is undetectable to the human eye, but it can easily be separated and reconstructed by the
spy receiving the digital version of the image.
Although covert channel demonstrations are highly speculative (reports of actual covert channel attacks just do not exist), the analysis is sound. The mere possibility of their existence calls for more rigorous
attention to other aspects of security, such as program development analysis, system architecture
analysis, and review of output.
Developmental Controls
Many controls can be applied during software development to ferret out and fix problems. So let us
begin by looking at the nature of development itself, to see what tasks are involved in specifying,
designing, building, and testing software.
specify the system, by capturing the requirements and building a model of how the system
should work from the users' point of view
design the system, by proposing a solution to the problem described by the requirements and
building a model of the solution
implement the system, by using the design as a blueprint for building a working solution
test the system, to ensure that it meets the requirements and implements the solution as called
for in the design
review the system at various stages, to make sure that the end products are consistent with the
specification and design models
document the system, so that users can be trained and supported
manage the system, to estimate what resources will be needed for development and to track
when the system will be done
maintain the system, tracking problems found, changes needed, and changes made, and
evaluating their effects on overall quality and functionality
One person could do all these things. But more often than not, a team of developers works together to
perform these tasks. Sometimes a team member does more than one activity; a tester can take part in a
requirements review, for example, or an implementer can write documentation. Each team is different,
and team dynamics play a large role in the team's success.
We can examine both product and process to see how each contributes to quality and in particular to
security as an aspect of quality. Let us begin with the product, to get a sense of how we recognize high-quality secure software.
since changes to an isolated component do not affect other components. And it is easier to see where
vulnerabilities may lie if the component is isolated. We call this isolation encapsulation.
Information hiding is another characteristic of modular software. When information is hidden, each
component hides its precise implementation or some other design decision from the others. Thus, when
a change is needed, the overall design can remain intact while only the necessary changes are made to
particular components.
Let us look at these characteristics in more detail.
Modularity
Modularization is the process of dividing a task into subtasks. This division is done on a logical or
functional basis. Each component performs a separate, independent part of the task. Modularity is
depicted in Figure 3-16. The goal is to have each component meet four conditions: it should be single-purpose, small, simple, and independent, containing only what is needed to perform its required functions. There are several advantages to having small, independent components.
Security analysts must be able to understand each component as an independent unit and be assured
of its limited effect on other components.
A modular component usually has high cohesion and low coupling. By cohesion, we mean that all the
elements of a component have a logical and functional reason for being there; every aspect of the
component is tied to the component's single purpose. A highly cohesive component has a high degree
of focus on the purpose; a low degree of cohesion means that the component's contents are an
unrelated jumble of actions, often put together because of time-dependencies or convenience.
Coupling refers to the degree with which a component depends on other components in the system.
Thus, low or loose coupling is better than high or tight coupling, because the loosely coupled
components are free from unwitting interference from other components. This difference in coupling is
shown in Figure 3-17.
Encapsulation
Encapsulation hides a component's implementation details, but it does not necessarily mean complete
isolation. Many components must share information with other components, usually with good reason.
However, this sharing is carefully documented so that a component is affected only in known ways by
others in the system. Sharing is minimized so that the fewest interfaces possible are used. Limited
interfaces reduce the number of covert channels that can be constructed.
component] in such a way as to hide what should be hidden and make visible what is intended to be
visible."
Information Hiding
Developers who work where modularization is stressed can be sure that other components will have
limited effect on the ones they write. Thus, we can think of a component as a kind of black box, with
certain well-defined inputs and outputs and a well-defined function. Other components' designers do not
need to know how the module completes its function; it is enough to be assured that the component
performs its task in some correct manner.
This concealment is the information hiding, depicted in Figure 3-18. Information hiding is desirable,
because developers cannot easily and maliciously alter the components of others if they do not know
how the components work.
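In C, one common way to get this black-box effect is an opaque type: the visible interface exposes only a type name and functions, and the structure's layout stays inside the implementation file, so other components can neither depend on it nor tamper with it directly. The sketch below squeezes the "header" and "implementation" into one file for brevity; all the names are invented.

/* Sketch of information hiding in C: other components see only the type
   name and the functions below (normally placed in a header file); the
   struct layout stays private to this implementation file.             */
#include <stdio.h>
#include <stdlib.h>

/* ---- the visible interface (what counter.h would contain) ---- */
typedef struct counter counter;             /* opaque: no fields exposed */
counter *counter_create(void);
void     counter_increment(counter *c);
int      counter_value(const counter *c);
void     counter_destroy(counter *c);

/* ---- the hidden implementation (what stays in counter.c) ---- */
struct counter {
    int value;             /* could change without affecting any caller */
};

counter *counter_create(void)            { return calloc(1, sizeof(counter)); }
void     counter_increment(counter *c)   { c->value++; }
int      counter_value(const counter *c) { return c->value; }
void     counter_destroy(counter *c)     { free(c); }

/* ---- a client component: uses the interface, knows nothing else ---- */
int main(void)
{
    counter *c = counter_create();
    counter_increment(c);
    counter_increment(c);
    printf("count = %d\n", counter_value(c));   /* prints 2 */
    counter_destroy(c);
    return 0;
}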
Peer Reviews
We turn next to the process of developing software. Certain practices and techniques can assist us in
finding real and potential security flaws (as well as other faults) and fixing them before the system is
turned over to the users. Of the many practices available for building what they call "solid software,"
Pfleeger et al. recommend several key techniques: [PFL01a]
peer reviews
hazard analysis
testing
good design
prediction
static analysis
configuration management
analysis of mistakes
Here, we look at each practice briefly, and we describe its relevance to security controls. We begin with
peer reviews.
You have probably been doing some form of review for as many years as you have been writing code:
desk-checking your work or asking a colleague to look over a routine to ferret out any problems. Today,
a software review is associated with several formal process steps to make it more effective, and we
review any artifact of the development process, not just code. But the essence of a review remains the
same: sharing a product with colleagues able to comment about its correctness. There are careful
distinctions among three types of peer reviews:
Review: The artifact is presented informally to a team of reviewers; the goal is consensus and
buy-in before development proceeds further.
Walk-through: The artifact is presented to the team by its creator, who leads and controls the
discussion. Here, education is the goal, and the focus is on learning about a single document.
Inspection: This more formal process is a detailed analysis in which the artifact is checked
against a prepared list of concerns. The creator does not lead the discussion, and the fault
identification and correction are often controlled by statistical measurements.
A wise engineer who finds a fault can deal with it in at least three ways:
1. by learning how, when, and why errors occur
2. by taking action to prevent mistakes
3. by scrutinizing products to find the instances and effects of errors that were missed
Peer reviews address this problem directly. Unfortunately, many organizations give only lip service to
peer review, and reviews are still not part of mainstream software engineering activities.
But there are compelling reasons to do reviews. An overwhelming amount of evidence suggests that
various types of peer review in software engineering can be extraordinarily effective. For example, early
studies at Hewlett-Packard in the 1980s revealed that those developers performing peer review on their
projects enjoyed a very significant advantage over those relying only on traditional dynamic testing
techniques, whether black-box or white-box. Figure 3-19 compares the fault discovery rate (that is, faults
discovered per hour) among white-box testing, black-box testing, inspections, and software execution. It
is clear that inspections discovered far more faults in the same period of time than other alternatives [GRA87]. This result is particularly compelling for large, secure systems, where live running for fault
discovery may not be an option.
The fault log can also be used to build a checklist of items to be sought in future reviews. The review
team can use the checklist as a basis for questioning what can go wrong and where. In particular, the
checklist can remind the team of security breaches, such as unchecked buffer overflows, that should be
caught and fixed before the system is placed in the field. A rigorous design or code review can locate
trapdoors, Trojan horses, salami attacks, worms, viruses, and other program flaws. A crafty programmer
can conceal some of these flaws, but the chance of discovery rises when competent programmers
review the design and code, especially when the components are small and encapsulated. Management
should use demanding reviews throughout development to ensure the ultimate security of the programs.
Discovery Activity       Fault Discovery Rate
Requirements review      2.5
Design review            5.0
Code inspection          10.0
Integration test         3.0
Acceptance test          2.0
Hazard Analysis
Hazard analysis is a set of systematic techniques intended to expose potentially hazardous system
states. In particular, it can help us expose security concerns and then identify prevention or mitigation
strategies to address them. That is, hazard analysis ferrets out likely causes of problems so that we can
then apply an appropriate technique for preventing the problem or softening its likely consequences.
Thus, it usually involves developing hazard lists, as well as procedures for exploring "what if" scenarios
to trigger consideration of nonobvious hazards. The sources of problems can be lurking in any artifacts
of the development or maintenance process, not just in the code, so a hazard analysis must be broad in
its domain of investigation; in other words, hazard analysis is a system issue, not just a code issue.
Similarly, there are many kinds of problems, ranging from incorrect code to unclear consequences of a
particular action. A good hazard analysis takes all of them into account.
Although hazard analysis is generally good practice on any project, it is required in some regulated and
critical application domains, and it can be invaluable for finding security flaws. It is never too early to be
thinking about the sources of hazards; the analysis should begin when you first start thinking about
building a new system or when someone proposes a significant upgrade to an existing system. Hazard
analysis should continue throughout the system life cycle; you must identify potential hazards that can
be introduced during system design, installation, operation, and maintenance.
A variety of techniques support the identification and management of potential hazards. Among the most
effective are hazard and operability studies (HAZOP), failure modes and effects analysis (FMEA),
and fault tree analysis (FTA). HAZOP is a structured analysis technique originally developed for the
process control and chemical plant industries. Over the last few years it has been adapted to discover
potential hazards in safety-critical software systems. FMEA is a bottom-up technique applied at the
system component level. A team identifies each component's possible faults or fault modes; then, it
determines what could trigger the fault and what systemwide effects each fault might have. By keeping
system consequences in mind, the team often finds possible system failures that are not made visible by
other analytical means. FTA complements FMEA. It is a top-down technique that begins with a
postulated hazardous system malfunction. Then, the FTA team works backwards to identify the possible
precursors to the mishap. By tracing back from a specific hazardous malfunction, we can locate
unexpected contributors to mishaps, and we then look for opportunities to mitigate the risks.
Table 3-7
                   Known Cause                              Unknown Cause
Known effect       Description of system behavior           Deductive analysis, including
                                                            fault tree analysis
Unknown effect     Inductive analysis, including            Exploratory analysis, including
                   failure modes and effects analysis       hazard and operability studies
Each of these techniques is clearly useful for finding and preventing security breaches. We decide which
technique is most appropriate by understanding how much we know about causes and effects. For
example, Table 3-7 suggests that when we know the cause and effect of a given problem, we can
strengthen the description of how the system should behave. This clearer picture will help requirements
analysts understand how a potential problem is linked to other requirements. It also helps designers
understand exactly what the system should do and helps testers know how to test to verify that the
system is behaving properly. If we can describe a known effect with unknown cause, we use deductive
techniques such as fault tree analysis to help us understand the likely causes of the unwelcome
behavior. Conversely, we may know the cause of a problem but not understand all the effects; here, we
use inductive techniques such as failure modes and effects analysis to help us trace from cause to all
possible effects. For example, suppose we know that a subsystem is unprotected and might lead to a
security failure, but we do not know how that failure will affect the rest of the system. We can use FMEA
to generate a list of possible effects and then evaluate the trade-offs between extra protection and
possible problems. Finally, to find problems about which we may not yet be aware, we can perform an
exploratory analysis such as a hazard and operability study.
We see in Chapter 8 that hazard analysis is also useful for determining vulnerabilities and mapping
them to suitable controls.
Testing
Testing is a process activity that homes in on product quality: making the product failure free or failure
tolerant. Each software problem (especially when it relates to security) has the potential not only for
making software fail but also for adversely affecting a business or a life. Thomas Young, head of
NASA's investigation of the Mars lander failure, noted that "One of the things we kept in mind during the
course of our review is that in the conduct of space missions, you get only one strike, not three. Even if
thousands of functions are carried out flawlessly, just one mistake can be catastrophic to a mission."
[NAS00] This same sentiment is true for security: The failure of one control exposes a vulnerability that
is not ameliorated by any number of functioning controls. Testers improve software quality by finding as
many faults as possible and by writing up their findings carefully so that developers can locate the
causes and repair the problems if possible.
Testing usually involves several stages. First, each program component is tested on its own, isolated
from the other components in the system. Such testing, known as module testing, component testing, or
unit testing, verifies that the component functions properly with the types of input expected from a study
of the component's design. Unit testing is done in a controlled environment whenever possible so that
the test team can feed a predetermined set of data to the component being tested and observe what
output actions and data are produced. In addition, the test team checks the internal data structures,
logic, and boundary conditions for the input and output data.
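In its simplest form, the controlled unit test described here is just a driver that feeds the component predetermined inputs, including boundary values, and checks the outputs. The component under test in the sketch below is invented.

/* Sketch of a unit test in the sense described above: a small driver
   feeds a component predetermined inputs, including boundary cases,
   and checks the outputs. The function under test is invented.        */
#include <assert.h>
#include <stdio.h>

/* Component under test: clamp a value into the range [lo, hi]. */
static int clamp(int value, int lo, int hi)
{
    if (value < lo) return lo;
    if (value > hi) return hi;
    return value;
}

int main(void)
{
    /* typical cases */
    assert(clamp(5, 0, 10) == 5);
    assert(clamp(-3, 0, 10) == 0);
    assert(clamp(42, 0, 10) == 10);

    /* boundary conditions, as the text recommends */
    assert(clamp(0, 0, 10) == 0);
    assert(clamp(10, 0, 10) == 10);

    printf("all unit tests passed\n");
    return 0;
}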
When collections of components have been subjected to unit testing, the next step is ensuring that the
interfaces among the components are defined and handled properly. Indeed, interface mismatch can be
a significant security vulnerability. Integration testing is the process of verifying that the system
components work together as described in the system and program design specifications.
Once we are sure that information is passed among components in accordance with the design, we test
the system to ensure that it has the desired functionality. A function test evaluates the system to
determine whether the functions described by the requirements specification are actually performed by
the integrated system. The result is a functioning system.
The function test compares the system being built with the functions described in the developers'
requirements specification. Then, a performance test compares the system with the remainder of these
software and hardware requirements. It is during the function and performance tests that security
requirements are examined, and the testers confirm that the system is as secure as it is required to be.
When the performance test is complete, developers are certain that the system functions according to
their understanding of the system description. The next step is conferring with the customer to make
certain that the system works according to customer expectations. Developers join the customer to
perform an acceptance test, in which the system is checked against the customer's requirements
description. Upon completion of acceptance testing, the accepted system is installed in the environment
in which it will be used. A final installation test is run to make sure that the system still functions as it
should. However, security requirements often state that a system should not do something. As Sidebar
3-6 demonstrates, it is difficult to demonstrate absence rather than presence.
The objective of unit and integration testing is to ensure that the code implemented the design properly;
that is, that the programmers have written code to do what the designers intended. System testing has a
very different objective: to ensure that the system does what the customer wants it to do. Regression
testing, an aspect of system testing, is particularly important for security purposes. After a change is
made to enhance the system or fix a problem, regression testing ensures that all remaining functions
are still working and performance has not been degraded by the change.
Each of the types of tests listed here can be performed from two perspectives: black box and clear box
(sometimes called white box). Black-box testing treats a system or its components as black boxes;
testers cannot "see inside" the system, so they apply particular inputs and verify that they get the
expected output. Clear-box testing allows visibility. Here, testers can examine the design and code
directly, generating test cases based on the code's actual construction. Thus, clear-box testing knows that component X uses CASE statements and can look for instances in which the input causes control to drop through to an unexpected line.
Sidebar 3-6 Absence vs. Presence
Pfleeger [PFL97] points out that security requirements resemble those for any other computing task,
with one seemingly insignificant difference. Whereas most requirements say "the system will do this,"
security requirements add the phrase "and nothing more." As we pointed out in Chapter 1, security
awareness calls for more than a little caution when a creative developer takes liberties with the system's
specification. Ordinarily, we do not worry if a programmer or designer adds a little something extra. For
instance, if the requirement calls for generating a file list on a disk, the "something more" might be
sorting the list into alphabetical order or displaying the date it was created. But we would never expect
someone to meet the requirement by displaying the list and then erasing all the files on the disk!
If we could determine easily whether an addition was harmful, we could just disallow harmful additions.
But unfortunately we cannot. For security reasons, we must state explicitly the phrase "and nothing
more" and leave room for negotiation in requirements definition on any proposed extensions.
It is natural for programmers to want to exercise their creativity in extending and expanding the
requirements. But apparently benign choices, such as storing a value in a global variable or writing to a
temporary file, can have serious security implications. And sometimes the best design approach for
security is counterintuitive. For example, one cryptosystem attack depends on measuring the time to
perform an encryption. That is, an efficient implementation can undermine the system's security. The
solution, oddly enough, is to artificially pad the encryption process with unnecessary computation so that
short computations complete as slowly as long ones.
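One way to sketch that padding idea is to give every call a fixed time budget and sleep away whatever the secret-dependent work did not use; the budget and the stand-in operation below are invented, and production cryptographic code is far more careful (it also avoids secret-dependent branches and memory accesses).

/* Sketch of the padding countermeasure described above: whatever the
   secret-dependent operation actually takes, the caller always returns
   after a fixed time budget, so timing reveals little. The budget and
   the dummy operation are invented.                                    */
#include <stdio.h>
#include <time.h>

#define TIME_BUDGET_NSEC 50000000L      /* always take about 50 ms */

static long elapsed_nsec(const struct timespec *a, const struct timespec *b)
{
    return (b->tv_sec - a->tv_sec) * 1000000000L + (b->tv_nsec - a->tv_nsec);
}

static int secret_dependent_operation(int secret)
{
    /* Stand-in for an encryption step whose running time depends on the key. */
    int acc = 0;
    for (int i = 0; i < secret * 1000; i++)
        acc += i;
    return acc;
}

static int padded_operation(int secret)
{
    struct timespec start, now;
    clock_gettime(CLOCK_MONOTONIC, &start);

    int result = secret_dependent_operation(secret);

    /* Pad: sleep until the fixed budget has elapsed. */
    clock_gettime(CLOCK_MONOTONIC, &now);
    long remaining = TIME_BUDGET_NSEC - elapsed_nsec(&start, &now);
    if (remaining > 0) {
        struct timespec pad = { 0, remaining };
        nanosleep(&pad, NULL);
    }
    return result;
}

int main(void)
{
    struct timespec t0, t1;
    for (int secret = 1; secret <= 3; secret++) {
        clock_gettime(CLOCK_MONOTONIC, &t0);
        padded_operation(secret);
        clock_gettime(CLOCK_MONOTONIC, &t1);
        printf("secret=%d took about %ld ms\n",
               secret, elapsed_nsec(&t0, &t1) / 1000000L);
    }
    return 0;
}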
In another instance, an enthusiastic programmer added parity checking to a cryptographic procedure.
Because the keys were generated randomly, the result was that 255 of the 256 encryptions failed the
parity check, leading to the substitution of a fixed key, so that 255 of every 256 encryptions were being
performed under the same key!
No technology can automatically distinguish between malicious and benign code. For this reason, we
have to rely on a combination of approaches, including human-intensive ones, to help us detect when
we are going beyond the scope of the requirements and threatening the system's security.
Black-box testing must rely more on the required inputs and outputs because the
actual code is not available for scrutiny.
The mix of techniques appropriate for testing a given system depends on the system's size, application
domain, amount of risk, and many other factors. But understanding the effectiveness of each technique
helps us know what is right for each particular system. For example, Olsen [OLS93] describes the
development at Contel IPC of a system containing 184,000 lines of code. He tracked faults discovered
during various activities, and found differences:
17.3 percent of the faults were found during inspections of the system design
19.1 percent during component design inspection
15.1 percent during code inspection
29.4 percent during integration testing
16.6 percent during system and regression testing
Only 0.1 percent of the faults were revealed after the system was placed in the field. Thus, Olsen's work
shows the importance of using different techniques to uncover different kinds of faults during
development; it is not enough to rely on a single method for catching all problems.
Who does the testing? From a security standpoint, independent testing is highly desirable; it may
prevent a developer from attempting to hide something in a routine, or keep a subsystem from
controlling the tests that will be applied to it. Thus, independent testing increases the likelihood that a
test will expose the effect of a hidden feature.
Good Design
We saw earlier in this chapter that modularity, information hiding, and encapsulation are characteristics
of good design. Several design-related process activities are particularly helpful in building secure
software:
using a philosophy of fault tolerance and correction
having a consistent policy for handling failures
capturing the design rationale and history
using design patterns
We can build into the design a particular way of handling each problem, selecting from one of three
ways:
1. Retrying: restoring the system to its previous state and performing the service again, using a different strategy
2. Correcting: restoring the system to its previous state, correcting some system characteristic, and performing the service again, using the same strategy
3. Reporting: restoring the system to its previous state, reporting the problem to an error-handling component, and not providing the service again
This consistency of design helps us check for security vulnerabilities; we look for instances that are
different from the standard approach.
Design rationales and history tell us the reasons the system is built one way instead of another. Such
information helps us as the system evolves, so we can integrate the design of our security functions
without compromising the integrity of the system's overall design.
Moreover, the design history enables us to look for patterns, noting what designs work best in which
situations. For example, we can reuse patterns that have been successful in preventing buffer
overflows, in ensuring data integrity, or in implementing user password checks.
Prediction
Among the many kinds of prediction we do during software development, we try to predict the risks
involved in building and using the system. As we see in depth in Chapter 8, we must postulate which
unwelcome events might occur and then make plans to avoid them or at least mitigate their effects. Risk
prediction and management are especially important for security, where we are always dealing with
unwanted events that have negative consequences. Our predictions help us decide which controls to
use and how many. For example, if we think the risk of a particular security breach is small, we may not
want to invest a large amount of money, time, or effort in installing sophisticated controls. Or we may
use the likely risk impact to justify using several controls at once, a technique called "defense in depth."
Static Analysis
Before a system is up and running, we can examine its design and code to locate and repair security
flaws. We noted earlier that the peer review process involves this kind of scrutiny. But static analysis is
more than peer review, and it is usually performed before peer review. We can use tools and techniques
to examine the characteristics of design and code to see if the characteristics warn us of possible faults
lurking within. For example, a large number of levels of nesting may indicate that the design or code is
hard to read and understand, making it easy for a malicious developer to bury dangerous code deep
within the system.
To this end, we can examine several aspects of the design and code:
The control flow is the sequence in which instructions are executed, including iterations and loops. This
aspect of design or code can also tell us how often a particular instruction or routine is executed.
Data flow follows the trail of a data item as it is accessed and modified by the system. Many times,
transactions applied to data are complex, and we use data flow measures to show us how and when
each data item is written, read, and changed.
The data structure is the way in which the data are organized, independent of the system itself. For
instance, if the data are arranged as lists, stacks, or queues, the algorithms for manipulating them are
likely to be well understood and well defined.
There are many approaches to static analysis, especially because there are so many ways to create
and document a design or program. Automated tools are available to generate not only numbers (such
as depth of nesting or cyclomatic number) but also graphical depictions of control flow, data
relationships, and the number of paths from one line of code to another. These aids can help us see
how a flaw in one part of a system can affect other parts.
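As a small illustration of the kind of number such a tool reports, the sketch below applies McCabe's
formula for a single-entry, single-exit routine, cyclomatic number = number of decisions + 1, to counts
supplied by hand. A real analyzer derives the counts by parsing the code; the routine being measured and
the threshold used here are our own assumptions.

/* Toy illustration of one static-analysis metric: the cyclomatic number. */
#include <stdio.h>

static int cyclomatic(int ifs, int loops, int cases) {
    return ifs + loops + cases + 1;   /* one path plus one per decision */
}

int main(void) {
    /* decision counts for a hypothetical routine under review */
    int v = cyclomatic(4 /* if */, 2 /* while, for */, 3 /* case labels */);
    printf("cyclomatic number: %d\n", v);
    if (v > 10)   /* a commonly cited, and debatable, threshold */
        printf("routine may be hard to review; consider restructuring\n");
    return 0;
}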
Configuration Management
When we develop software, it is important to know who is making which changes to what and when:
We want some degree of control over the software changes so that one change does not inadvertently
undo the effect of a previous change. And we want to control what is often a proliferation of different
versions and releases. For instance, a product might run on several different platforms or in several
different environments, necessitating different code to support the same functionality. Configuration
management is the process by which we control changes during development and maintenance, and it
offers several advantages in security. In particular, configuration management scrutinizes new and
changed code to ensure, among other things, that security flaws have not been inserted, intentionally or
accidentally.
Four activities are involved in configuration management:
1. configuration identification
2. configuration control and change management
3. configuration auditing
4. status accounting
Configuration identification sets up baselines to which all other code will be compared after changes
are made. That is, we build and document an inventory of all components that comprise the system. The
inventory includes not only the code you and your colleagues may have created, but also database
management systems, third-party software, libraries, test cases, documents, and more. Then, we
"freeze" the baseline and carefully control what happens to it. When a change is proposed and made, it
is described in terms of how the baseline changes.
Configuration control and change management ensure that we can coordinate separate, related
versions. For example, there may be closely related versions of a system to execute on 16-bit and 32-bit
processors. Three ways to control the changes are separate files, deltas, and conditional compilation. If
we use separate files, we have different files for each release or version. For example, we might build
an encryption system in two configurations: one that uses a short key length, to comply with the law in
certain countries, and another that uses a long key. Then, version 1 may be composed of components
A1 through Ak and B1, while version 2 is A1 through Ak and B2, where B1 and B2 handle the key-length
processing. That is, the versions are the same except for the separate key-processing files.
Alternatively, we can designate a particular version as the main version of a system, and then define
other versions in terms of what is different. The difference file, called a delta, contains editing
commands to describe the ways to transform the main version into the variation.
Finally, we can do conditional compilation, whereby a single code component addresses all versions,
relying on the compiler to determine which statements to apply to which versions. This approach seems
appealing for security applications because all the code appears in one place. However, if the variations
are very complex, the code may be very difficult to read and understand.
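In C, conditional compilation is usually expressed with preprocessor directives. The fragment below is a
hypothetical example of the two key-length configurations just described; the macro names, key sizes, and
file name in the build commands are ours.

/* Hypothetical conditional compilation of two key-length configurations.
 * Build the export version with:   cc -DEXPORT_VERSION keylen.c
 * Build the domestic version with: cc keylen.c                    */
#include <stdio.h>

#ifdef EXPORT_VERSION
#define KEY_BITS 40       /* short key, to comply with export restrictions */
#else
#define KEY_BITS 128      /* long key for the unrestricted release         */
#endif

int main(void) {
    printf("configured key length: %d bits\n", KEY_BITS);
    return 0;
}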
Once a configuration management technique is chosen and applied, the system should be audited
regularly. A configuration audit confirms that the baseline is complete and accurate, that changes are
recorded, that recorded changes are made, and that the actual software (that is, the software as used in
the field) is reflected accurately in the documents. Audits are usually done by independent parties taking
one of two approaches: reviewing every entry in the baseline and comparing it with the software in use
or sampling from a larger set just to confirm compliance. For systems with strict security constraints, the
first approach is preferable, but the second approach may be more practical.
Finally, status accounting records information about the components: where they came from (for
instance, purchased, reused, or written from scratch), the current version, the change history, and
pending change requests.
All four sets of activities are performed by a configuration and change control board, or CCB. The
CCB contains representatives from all organizations with a vested interest in the system, perhaps
including customers, users, and developers. The board reviews all proposed changes and approves
changes based on need, design integrity, future plans for the software, cost, and more. The developers
implementing and testing the change work with a program librarian to control and update relevant
documents and components; they also write detailed documentation about the changes and test results.
Configuration management offers two advantages to those of us with security concerns: protecting
against unintentional threats and guarding against malicious ones. Both goals are addressed when the
configuration management processes protect the integrity of programs and documentation. Because
changes occur only after explicit approval from a configuration management authority, all changes are
also carefully evaluated for side effects. With configuration management, previous versions of programs
are archived, so a developer can retract a faulty change when necessary.
Malicious modification is made quite difficult with a strong review and configuration management
process in place. In fact, as presented in Sidebar 3-7, poor configuration control has resulted in at least
one system failure; that sidebar also confirms the principle of easiest penetration from Chapter 1. Once
a reviewed program is accepted for inclusion in a system, the developer cannot sneak in to make small,
subtle changes, such as inserting trapdoors. The developer has access to the running production
program only through the CCB, whose members are alert to such security breaches.
Sidebar 3-7 There's More Than One Way to Crack a System
In the 1970s the primary security assurance strategy was "penetration" or "tiger team" testing. A team of
computer security experts would be hired to test the security of a system prior to its being pronounced
ready to use. Often these teams worked for months to plan their tests.
The U.S. Department of Defense was testing the Multics system, which had been designed and built
under extremely high security quality standards. Multics was being studied as a base operating system
for the WWMCCS command and control system. The developers from M.I.T. were justifiably proud of
the strength of the security of their system, and the sponsoring agency invoked the penetration team
with a note of haughtiness. But the developers underestimated the security testing team.
Led by Roger Schell and Paul Karger, the team analyzed the code and performed their tests without
finding major flaws. Then one team member thought like an attacker. He wrote a slight modification to
the code to embed a trapdoor by which he could perform privileged operations as an unprivileged user.
He then made a tape of this modified system, wrote a cover letter saying that a new release of the
system was enclosed, and mailed the tape and letter to the site where the system was installed.
When it came time to demonstrate their work, the penetration team congratulated the Multics
developers on generally solid security, but said they had found this one apparent failure, which the team
member went on to show. The developers were aghast because they knew they had scrutinized the
affected code carefully. Even when told the nature of the trapdoor that had been added, the developers
could not find it. [KAR74, KAR02]
A security specialist wants to be certain that a given program computes a particular result, computes it
correctly, and does nothing beyond what it is supposed to do. Unfortunately, results in computer science
theory (see [PFL85] for a description) indicate that we cannot know with certainty that two programs do
exactly the same thing. That is, there can be no general decision procedure which, given any two
programs, determines if the two are equivalent. This difficulty results from the "halting problem," which
states that there is no general technique to determine whether an arbitrary program will halt when
processing an arbitrary input.
In spite of this disappointing general result, a technique called program verification can demonstrate
formally the "correctness" of certain specific programs. Program verification involves making initial
assertions about the inputs and then checking to see if the desired output is generated. Each program
statement is translated into a logical description about its contribution to the logical flow of the program.
Finally, the terminal statement of the program is associated with the desired output. By applying a logic
analyzer, we can prove that the initial assumptions, through the implications of the program statements,
produce the terminal condition. In this way, we can show that a particular program achieves its goal.
Sidebar 3-8 presents the case for appropriate use of formal proof techniques. We study an example of
program verification in Chapter 5.
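As a small taste of the style of reasoning involved, the C sketch below annotates a summation routine
with its precondition, a loop invariant, and an assertion tied to its terminal statement. Runtime assert
calls check only the executions we happen to run; a verifier would prove that the same assertions hold
for every input satisfying the precondition. The routine and its annotations are our illustration, not
output from a verification tool.

/* Verification-style annotations on a summation routine. */
#include <assert.h>
#include <stdio.h>

/* Precondition: k <= n.  Postcondition: result == k + (k+1) + ... + n. */
static long sum_range(long k, long n) {
    assert(k <= n);                    /* initial assertion about the inputs */
    long sum = 0;
    for (long i = k; i <= n; i++) {
        /* Invariant: at the top of each pass, sum == k + ... + (i - 1). */
        sum += i;
    }
    /* Terminal assertion: the loop agrees with the closed form. */
    assert(sum == (n - k + 1) * (k + n) / 2);
    return sum;
}

int main(void) {
    printf("sum 1..10 = %ld\n", sum_range(1, 10));   /* prints 55 */
    return 0;
}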
Proving program correctness, although desirable and useful, is hindered by several factors.
McDermid [MCD93] asserts that "these mathematical approaches provide us with the best
available approach to the development of high-integrity safety-critical systems." Formal
methods are becoming used routinely to evaluate communication protocols and proposed
security policies. Evidence from Heitmeyer's work [HEI01] at the U.S. Naval Research
Laboratory suggests that formal methods are becoming easier to use and more effective. Dill
and Rushby [DIL96] report that use of formal methods to analyze correctness of hardware
design "has become attractive because it has focused on reducing the cost and time required
for validation . . . [T]here are some lessons and principles from hardware verification that can
be transferred to the software world." And Pfleeger and Hatton report that an air traffic control
system built with several types of formal methods resulted in software of very high quality. For
these reasons, formal methods are being incorporated into standards and imposed on
developers. For instance, the interim UK defense standard for such systems, DefStd 00-55,
makes mandatory the use of formal methods.
However, more evaluation must be done. We must understand how formal methods contribute
to quality. And we must decide how to choose among the many competing formal methods,
which may not be equally effective in a given situation.
Program verification systems are being improved constantly. Larger programs are being verified in less
time than before. As program verification continues to mature, it may become a more important control
to ensure the security of programs.
Trusted Software
We say that software is trusted software if we know that the code has been rigorously developed and
analyzed, giving us reason to trust that the code does what it is expected to do and nothing more.
Typically, trusted code can be a foundation on which other, untrusted, code runs. That is, the untrusted
system's quality depends, in part, on the trusted code; the trusted code establishes the baseline for
security of the overall system. In particular, an operating system can be trusted software when there is a
basis for trusting that it correctly controls the accesses of components or systems run from it. For
example, the operating system might be expected to limit users' accesses to certain files. We look at
trusted operating systems in more detail in Chapter 5.
To trust any program, we base our trust on rigorous analysis and testing, looking for certain key
characteristics:
Functional correctness: The program does what it is supposed to, and it works correctly.
Enforcement of integrity: Even if presented with erroneous commands or commands from
unauthorized users, the program maintains the correctness of the data with which it has
contact.
Limited privilege: The program is allowed to access secure data, but the access is minimized
and neither the access rights nor the data are passed along to other untrusted programs or
back to an untrusted caller.
Appropriate confidence level: The program has been examined and rated at a degree of trust
appropriate for the kind of data and environment in which it is to be used.
Trusted software is often used as a safe way for general users to access sensitive data. Trusted
programs are used to perform limited (safe) operations for users without allowing the users to have
direct access to sensitive data.
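On UNIX-like systems, the limited-privilege characteristic is often realized by having a trusted program
perform its one privileged step and then permanently give up its extra rights before doing anything else.
The sketch below illustrates the idea using the standard setuid and getuid calls; the protected file name
is hypothetical.

/* Sketch of limited privilege on a POSIX system: open the sensitive file
 * while privileged, then drop privileges before any further processing. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    /* Privileged step, performed only while we still hold extra rights. */
    FILE *f = fopen("/etc/protected-data", "r");    /* hypothetical file */

    /* Drop privileges: set the effective user ID back to the real user. */
    if (setuid(getuid()) != 0) {
        perror("setuid");
        exit(1);
    }

    /* From here on the program runs with only the caller's rights, so a
     * later flaw cannot be exploited to regain the dropped access. */
    if (f == NULL) {
        perror("fopen");
        exit(1);
    }
    /* ... limited, well-defined processing of the sensitive data ... */
    fclose(f);
    return 0;
}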
Mutual Suspicion
Programs are not always trustworthy. Even with an operating system to enforce access limitations, it
may be impossible or infeasible to bound the access privileges of an untested program effectively. In
this case, the user U is legitimately suspicious of a new program P. However, program P may be
invoked by another program, Q. There is no way for Q to know that P is correct or proper, any more than
a user knows that of P.
Therefore, we use the concept of mutual suspicion to describe the relationship between two programs.
Mutually suspicious programs operate as if other routines in the system were malicious or incorrect. A
calling program cannot trust its called subprocedures to be correct, and a called subprocedure cannot
trust its calling program to be correct. Each protects its interface data so that the other has only limited
access. For example, a caller cannot trust a procedure that sorts the entries in a list not to modify those
elements, while that procedure cannot trust its caller to provide any list at all or to supply the number of
elements predicted.
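The sorting example can be made concrete. In the C sketch below (our own illustration), the called
routine validates everything it receives before using it, and the caller hands over only a private copy so
the callee cannot disturb the original list.

/* Sketch of a mutually suspicious caller and callee around a sort routine. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* The callee does not trust its caller: it validates its arguments. */
static int suspicious_sort(int *items, size_t count) {
    if (items == NULL || count == 0 || count > 1000000)
        return -1;                                /* refuse implausible input */
    for (size_t i = 0; i + 1 < count; i++)        /* simple bubble sort */
        for (size_t j = 0; j + 1 < count - i; j++)
            if (items[j] > items[j + 1]) {
                int t = items[j]; items[j] = items[j + 1]; items[j + 1] = t;
            }
    return 0;
}

int main(void) {
    int original[] = { 3, 1, 2 };
    size_t n = sizeof original / sizeof original[0];

    /* The caller does not trust the callee: it passes only a copy, so the
     * original list cannot be modified behind its back. */
    int *copy = malloc(n * sizeof *copy);
    if (copy == NULL) return 1;
    memcpy(copy, original, n * sizeof *copy);

    if (suspicious_sort(copy, n) == 0)
        printf("sorted copy: %d %d %d; original untouched: %d %d %d\n",
               copy[0], copy[1], copy[2], original[0], original[1], original[2]);
    free(copy);
    return 0;
}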
Confinement
Confinement is a technique used by an operating system on a suspected program. A confined program
is strictly limited in what system resources it can access. If a program is not trustworthy, the data it can
access are strictly limited. Strong confinement would be helpful in limiting the spread of viruses. Since a
virus spreads by means of transitivity and shared data, all the data and programs within a single
compartment of a confined program can affect only the data and programs in the same compartment.
Therefore, the virus can spread only to things in that compartment; it cannot get outside the
compartment.
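Operating systems offer several mechanisms from which confinement can be built. The sketch below is a
deliberately crude illustration using the setrlimit call (RLIMIT_NPROC is a common BSD/Linux extension
rather than strict POSIX) to deny a suspected program new files and new processes before it runs;
compartments, chroot jails, and mandatory access controls provide far stronger confinement.

/* Crude confinement with resource limits: deny the suspected code new
 * files and new processes before running it. Illustrative only. */
#include <stdio.h>
#include <sys/resource.h>

static void confine(void) {
    struct rlimit none = { 0, 0 };
    setrlimit(RLIMIT_FSIZE,  &none);   /* cannot write data into files   */
    setrlimit(RLIMIT_NPROC,  &none);   /* cannot create new processes    */
    setrlimit(RLIMIT_NOFILE, &none);   /* cannot open any further files  */
}

int main(void) {
    confine();
    /* From here on, the untrusted code runs inside these limits. */
    if (fopen("/tmp/leak.txt", "w") == NULL)
        perror("confined: fopen");     /* expected to fail */
    return 0;
}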
Access Log
An access or audit log is a listing of who accessed which computer objects, when, and for what
amount of time. Commonly applied to files and programs, this technique is less a means of protection
than an after-the-fact means of tracking down what has been done.
Typically, an access log is a protected file or a dedicated output device (such as a printer) to which a log
of activities is written. The logged activities can be such things as logins and logouts, accesses or
attempted accesses to files or directories, execution of programs, and uses of other devices.
Failures are also logged. It may be less important to record that a particular user listed the contents of a
permitted directory than that the same user tried to but was prevented from listing the contents of a
protected directory. One failed login may result from a typing error, but a series of failures in a short time
from the same device may result from the attempt of an intruder to break into the system.
Unusual events in the audit log should be scrutinized. For example, a new program might be tested in a
dedicated, controlled environment. After the program has been tested, an audit log of all files accessed
should be scanned to determine if there are any unexpected file accesses, the presence of which could
point to a Trojan horse in the new program. We examine these two important aspects of operating
system control in more detail in the next two chapters.
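Scanning of this kind is easily automated. The C sketch below assumes an invented log format of one
record per line ("device user event") and flags any device that accumulates several FAIL records; a real
scanner would also consider how closely in time the failures occurred.

/* Sketch of scanning an access log for repeated login failures.
 * Assumed (invented) record format: "<device> <user> <LOGIN|LOGOUT|FAIL>". */
#include <stdio.h>
#include <string.h>

#define MAX_DEVICES 100
#define THRESHOLD   3            /* failures that trigger a warning */

int main(void) {
    char device[MAX_DEVICES][32];
    int  failures[MAX_DEVICES] = { 0 };
    int  ndev = 0;
    char dev[32], user[32], event[16];

    while (scanf("%31s %31s %15s", dev, user, event) == 3) {
        if (strcmp(event, "FAIL") != 0)
            continue;                          /* only failures interest us */
        int i;
        for (i = 0; i < ndev; i++)             /* find or add the device */
            if (strcmp(device[i], dev) == 0) break;
        if (i == ndev && ndev < MAX_DEVICES)
            strcpy(device[ndev++], dev);
        if (i < MAX_DEVICES)
            failures[i]++;
    }
    for (int i = 0; i < ndev; i++)
        if (failures[i] >= THRESHOLD)
            printf("possible break-in attempt from %s: %d failed logins\n",
                   device[i], failures[i]);
    return 0;
}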
Administrative Controls
Not all controls can be imposed automatically by the computing system. Sometimes controls are applied
instead by the declaration that certain practices will be followed. These controls, encouraged by
managers and administrators, are called administrative controls. We look at them briefly here and in
more depth in Chapter 8.
standards of design, including using specified design tools, languages, or methodologies, using
design diversity, and devising strategies for error handling and fault tolerance
standards of documentation, language, and coding style, including layout of code on the page,
choices of names of variables, and use of recognized program structures
standards of programming, including mandatory peer reviews, periodic code audits for
correctness, and compliance with standards
standards of testing, such as using program verification techniques, archiving test results for
future reference, using independent testers, evaluating test thoroughness, and encouraging
test diversity
standards of configuration management, to control access to and changes of stable or
completed program units
Standardization improves the conditions under which all developers work by establishing a common
framework so that no one developer is indispensable. It also allows carryover from one project to
another; lessons learned on previous projects become available for use by all on the next project.
Standards also assist in maintenance, since the maintenance team can find required information in a
well-organized program. However, we must take care so that the standards do not unnecessarily
constrain the developers.
Firms concerned about security and committed to following software development standards often
perform security audits. In a security audit, an independent security evaluation team arrives
unannounced to check each project's compliance with standards and guidelines. The team reviews
requirements, designs, documentation, test data and plans, and code. Knowing that documents are
routinely scrutinized, a developer is unlikely to put suspicious code in a component in the first place.
Separation of Duties
Banks often break tasks into two or more pieces to be performed by separate employees. Employees
are less tempted to do wrong if they need the cooperation of another employee to do so. We can use
the same approach during software development. Modular design and implementation force developers
to cooperate in order to achieve illicit results. Independent test teams test a component or subsystem
more rigorously if they are not the authors or designers. These forms of separation lead to a higher
degree of security in programs.
Developers, regulators, and users place an increasing priority on high-confidence software: software for which
compelling evidence is required that it delivers a specified set of services in a manner that satisfies
specified critical properties. For this reason, we look to software engineering research to help us build
software that is not only more secure but also of generally higher quality than the software we build
and use today. Thus, the security field can leverage work being done in other domains on high-confidence
software development.
The software engineering practices that offer us the most benefit can involve processes, products, or
resources. When software has tight quality constraints, we do not want to wait until the system is
developed to see if they are met. Rather, we want to see some evidence during development that the
completed system is likely to meet the quality requirements we have imposed. We want to have
confidence, based on compelling and objective evidence, that the risk associated with using a system
conforms with our willingness to tolerate that risk.
An assurance argument lays out the evidence, not only in terms of software properties but also in
terms of steps taken, resources used, and any other relevant issue that may have bearing on our
confidence in the software's quality. The Common Criteria (studied in Chapter 5) require such an
assurance case for security-critical systems. A framework for assurance arguments includes a
description of what assurance is required for the system, how the case will be made that the required
confidence is justified, what evidence is to be gathered, and how the evidence will be combined and
evaluated. Some such frameworks exist and are being used. However, assurance argument frameworks
suffer from several deficiencies:
Researchers at RAND and MITRE are addressing these issues. MITRE is mapping existing assurance
arguments to a common, machine-processable form, using two kinds of notations: Toulmin structures,
developed as a general framework for presenting and analyzing arguments in legal and regulatory
contexts, and Goal Structuring Notation, developed in the U.K.'s safety-critical software community for
structuring safety arguments. RAND researchers are examining questions of confidence and assurance,
particularly about how bodies of evidence and constructions of arguments support confidence in the
assurance case. In particular, RAND is determining how assurance activities and techniques, such as
reliability modeling and design-by-contract, fit into the larger picture of providing an assurance
argument.
At the same time, researchers are examining ways to make code self-stabilizing or self-healing. Such
systems can sense when they reach an illegitimate state (that is, an insecure one) and can
automatically return to a legitimate, secure state. The self-healing process is not so simple as realizing a
failure and correcting it in one step. Imagine, instead, that you awaken one morning and discover that
you are in poor physical shape, overweight, and bored. A program of exercise, nutrition, and mental
stimulation can gradually bring you back, but there may be some missteps along the way. Similarly, a
program may realize that it has allowed many program extensions, some perhaps malicious, to
become integrated into the system and may want to return gradually to a secure configuration. Dijkstra
[DIJ74] introduced this concept, and Lamport
In looking to the future it is important not to forget the past. Every student of computer security should
know the foundational literature of computer security, including the works of Saltzer and Schroeder
[SAL75] and Lampson [LAM71]. Other historical papers of interest are listed in the "To Learn More"
section.
TO LEARN MORE
Some of the earliest examples of security vulnerabilities are programs that compromise data. To read
about them, start with the reports written by Anderson [AND72] and Ware [WAR79], both of which
contain observations that are still valid today. Then read the papers of Thompson [THO84] and Schell
[SCH79], and ask yourself why people act as if malicious code is a new phenomenon.
Various examples of program flaws are described by Parker [PAR83] and Denning [DEN82]. The
volumes edited by Hoffman [HOF90] and Denning [DEN90a] are excellent collections on malicious
code. A good summary of current malicious code techniques and examples is presented by Denning
[DEN99].
Stoll's accounts of finding and dealing with intrusions are worth reading, both for their lighthearted tone
and for the serious situation they describe [STO88, STO89].
Software engineering principles are discussed by numerous authors. The books by Pfleeger [PFL01]
and Pfleeger et al. [PFL01a] are good places to get an overview of the issues and approaches. Corbató
[COR91] reflects on why building complex systems is hard and how we can improve our ability to build
them.
The books by DeMarco and Lister [DEM87] and DeMarco [DEM95] are filled with sensible, creative
ways to address software development. More recent books about agile development and extreme
programming can give you a different perspective on software development; these techniques try to
address the need to develop products quickly in a constrained business environment.
EXERCISES
1. Suppose you are a customs inspector. You are responsible for checking suitcases for secret
compartments in which bulky items such as jewelry might be hidden. Describe the procedure
you would follow to check for these compartments.
2. Your boss hands you a microprocessor and its technical reference manual. You are asked to
check for undocumented features of the processor. Because of the number of possibilities, you
cannot test every operation code with every combination of operands. Outline the strategy you
would use to identify and characterize unpublicized operations.
3. Your boss hands you a computer program and its technical reference manual. You are asked
to check for undocumented features of the program. How is this activity similar to the task of
the previous exercises? How does it differ? Which is the most feasible? Why?
4. Could a computer program be used to automate testing for trapdoors? That is, could you
design a computer program that, given the source or object version of another program and a
suitable description, would reply Yes or No to show whether the program had any trapdoors?
Explain your answer.
5. A program is written to compute the sum of the integers from 1 to 10. The programmer, well
trained in reusability and maintainability, writes the program so that it computes the sum of the
numbers from k to n. However, a team of security specialists scrutinizes the code. The team
certifies that this program properly sets k to 1 and n to 10; therefore, the program is certified as
being properly restricted in that it always operates on precisely the range 1 to 10. List different
ways that this program can be sabotaged so that during execution it computes a different sum,
such as 3 to 20.
6. One means of limiting the effect of an untrusted program is confinement: controlling what
processes have access to the untrusted program and what access the program has to other
processes and data. Explain how confinement would apply to the earlier example of the
program that computes the sum of the integers 1 to 10.
7. List three controls that could be applied to detect or prevent salami attacks.
8. The distinction between a covert storage channel and a covert timing channel is not clear-cut.
Every timing channel can be transformed into an equivalent storage channel. Explain how this
transformation could be done.
9. List the limitations on the amount of information leaked per second through a covert channel in
a multiaccess computing system.
10. An electronic mail system could be used to leak information. First, explain how the leakage
could occur. Then, identify controls that could be applied to detect or prevent the leakage.
11. Modularity can have a negative as well as a positive effect. A program that is overmodularized
performs its operations in very small modules, so a reader has trouble acquiring an overall
perspective on what the system is trying to do. That is, although it may be easy to determine
what individual modules do and what small groups of modules do, it is not easy to understand
what they do in their entirety as a system. Suggest an approach that can be used during
program development to provide this perspective.
12. You are given a program that purportedly manages a list of items through hash coding. The
program is supposed to return the location of an item if the item is present, or return the
location where the item should be inserted if the item is not in the list. Accompanying the
program is a manual describing parameters such as the expected format of items in the table,
the table size, and the specific calling sequence. You have only the object code of this
program, not the source code. List the cases you would apply to test the correctness of the
program's function.
13. You are writing a procedure to add a node to a doubly linked list. The system on which this
procedure is to be run is subject to periodic hardware failures. The list your program is to
maintain is of very high importance. Your program must ensure the integrity of the list, even if
the machine fails in the middle of executing your procedure. Supply the individual statements
you would use in your procedure to update the list. (Your list should be fewer than a dozen
statements long.) Explain the effect of a machine failure after each instruction. Describe how
you would revise this procedure so that it would restore the integrity of the basic list after a
machine failure.
14. Explain how information in an access log could be used to identify the true identity of an
impostor who has acquired unauthorized access to a computing system. Describe several
different pieces of information in the log that could be combined to identify the impostor.
15. Several proposals have been made for a processor that could decrypt encrypted data and
machine instructions and then execute the instructions on the data. The processor would then
encrypt the results. How would such a processor be useful? What are the design requirements
for such a processor?